CN113905204A - Image display method, device, equipment and storage medium - Google Patents

Publication number
CN113905204A
Authority
CN
China
Prior art keywords
lens
image display
target image
image
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111047995.3A
Other languages
Chinese (zh)
Other versions
CN113905204B (en)
Inventor
陈文明
倪世坤
张世明
吕周谨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Emeet Technology Co ltd
Original Assignee
Shenzhen Emeet Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Emeet Technology Co ltd
Priority to CN202111047995.3A (CN113905204B)
Priority to PCT/CN2021/118489 (WO2022262134A1)
Publication of CN113905204A
Application granted
Publication of CN113905204B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules
    • H04N23/695: Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects

Abstract

The invention discloses an image display method, device, equipment and storage medium, and belongs to the technical field of video display. A target image display mode is determined from a conference request instruction input by a user, and the lens is adjusted to the orientation corresponding to that mode, so that the method is suitable for image display in multiple scenes. A suitable target image display strategy is then selected from the lens orientation in combination with the current conference target information, so that different display modes can be selected for different conference target information. This addresses the technical problems that a video conversation can only be used in a single application scene, that different display modes cannot be selected according to the number of participants and the conference mode, and that the user experience is poor.

Description

Image display method, device, equipment and storage medium
Technical Field
The present invention relates to the field of video display technologies, and in particular, to an image display method, apparatus, device, and storage medium.
Background
With the development of science and technology, remote working and remote conferencing have become increasingly popular, and communication is no longer limited by time and space. Users expect more functions and higher performance from conference communication products. Many audio and video conference products have therefore appeared, including fixed-focus video products, auto-focus video products and rotating pan-tilt video products. These products are all relatively simple and comprise only a single shooting device, so during a video conversation they can be used only in a single application scene, different display modes cannot be selected according to the number of participants, and the user experience is poor.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The main object of the invention is to provide an image display method, device, equipment and storage medium, aiming to solve the technical problems in the prior art that a video conversation can only be used in a single application scene, that different display modes cannot be selected according to the number of participants and the conference mode, and that the user experience is poor.
To achieve the above object, the present invention provides an image display method comprising the steps of:
when a conference request instruction is received, determining a target image display mode according to the conference request instruction, and acquiring current conference target information;
determining a lens orientation according to the target image display mode, and adjusting the lens to that lens orientation;
determining a target image display strategy based on the lens orientation and the current conference target information;
and when the lens has rotated to the lens orientation, displaying a preset privacy image or a current target image acquired by the camera according to the target image display strategy.
Optionally, the target image display strategy comprises a target image panoramic display strategy, and determining the target image display strategy comprises:
if the lens orientation is lens up, acquiring the number of conference targets and the conference participation degree from the current conference target information;
determining the target image panoramic display strategy according to the lens orientation, the number of conference targets and the conference participation degree;
and displaying, when the lens has rotated to the lens orientation, the preset privacy image or the current target image acquired by the camera according to the target image display strategy comprises:
when the lens has rotated to the lens orientation, displaying the current target image acquired by the camera according to the target image panoramic display strategy.
Optionally, the target image display strategy comprises a target image wide-angle display strategy, and determining the target image display strategy comprises:
if the lens orientation is lens forward, determining the target image wide-angle display strategy according to the lens orientation and the current conference target information;
and displaying, when the lens has rotated to the lens orientation, the preset privacy image or the current target image acquired by the camera according to the target image display strategy comprises:
when the lens has rotated to the lens orientation, displaying the current target image acquired by the camera according to the target image wide-angle display strategy.
Optionally, the target image display strategy comprises a target image privacy display strategy, and determining the target image display strategy comprises:
if the lens orientation is lens down, determining the target image privacy display strategy according to the lens orientation;
and displaying, when the lens has rotated to the lens orientation, the preset privacy image or the current target image acquired by the camera according to the target image display strategy comprises:
when the lens has rotated to the lens orientation, displaying the preset privacy image according to the target image privacy display strategy, and turning off the microphone.
Optionally, before displaying, when the lens has rotated to the lens orientation, the preset privacy image or the current target image acquired by the camera according to the target image display strategy, the method further comprises:
acquiring a current conference target image through the camera, and turning on the microphone;
performing image segmentation on the current conference target image through a preset image segmentation model to obtain a segmented image;
and performing image processing on the segmented image to obtain the current target image.
Optionally, performing image segmentation on the current conference target image through the preset image segmentation model to obtain the segmented image comprises:
determining a segmentation initial point of the current conference target image based on the lens orientation;
and performing image segmentation on the current conference target image through the preset image segmentation model, based on the segmentation initial point and a preset direction, to obtain the segmented image.
Optionally, after displaying, when the lens has rotated to the lens orientation, the preset privacy image or the current target image acquired by the camera according to the target image display strategy, the method further comprises:
performing mouth shape detection on the current target image through a preset mouth shape detection model to obtain an initial speaker image;
acquiring a current sound signal through the microphone, and determining speaker information according to the current sound signal;
and determining a target speaker image according to the speaker information and the initial speaker image, and marking the target speaker image.
Further, to achieve the above object, the present invention also proposes an image display device comprising:
Further, to achieve the above object, the present invention also proposes an image display apparatus comprising: a memory, a processor and an image display program stored on the memory and executable on the processor, the image display program being configured to implement the steps of the image display method as described above.
Furthermore, to achieve the above object, the present invention also proposes a storage medium having stored thereon an image display program which, when executed by a processor, implements the steps of the image display method as described above.
According to the invention, when a conference request instruction is received, a target image display mode is determined according to the conference request instruction and current conference target information is acquired; a lens orientation is determined according to the target image display mode and the lens is adjusted to that orientation; a target image display strategy is determined based on the lens orientation and the current conference target information; and when the lens has rotated to the lens orientation, a preset privacy image or a current target image acquired by the camera is displayed according to the target image display strategy. Compared with the prior art, the invention determines the target image display mode from the conference request instruction input by the user and adjusts the lens to the orientation corresponding to that mode, so it is suitable for image display in multiple scenes. A suitable target image display strategy is selected from the lens orientation in combination with the current conference target information, so different display modes can be selected for different conference target information. This solves the technical problems that a video conversation can only be used in a single application scene, that different display modes cannot be selected according to the number of participants and the conference mode, and that the user experience is poor.
Drawings
FIG. 1 is a schematic structural diagram of an image display apparatus in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of an image displaying method according to the present invention;
FIG. 3 is a schematic diagram of a video conference device according to an embodiment of an image display method of the present invention;
FIG. 4 is a flowchart illustrating a second embodiment of an image displaying method according to the present invention;
FIG. 5 is a schematic diagram of a VIP image display strategy according to an embodiment of the image display method of the present invention;
FIG. 6 is a schematic diagram of a presenter image display strategy according to an embodiment of the image display method of the present invention;
FIG. 7 is a diagram illustrating a multi-person session image display strategy according to an embodiment of the image display method of the present invention;
FIG. 8 is a schematic diagram of an alternate-speaking image display strategy according to an embodiment of the image display method of the present invention;
FIG. 9 is a flowchart illustrating a third exemplary embodiment of an image displaying method according to the present invention;
FIG. 10 is a view illustrating a scene in a wide-angle mode according to an embodiment of an image displaying method of the present invention;
FIG. 11 is a schematic diagram of a wide-angle display strategy of a target image according to an embodiment of an image display method of the present invention;
FIG. 12 is a flowchart illustrating a fourth exemplary embodiment of an image displaying method according to the present invention;
FIG. 13 is a block diagram of the image display device according to the first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an image display device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the image display apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable communication among these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM), or may be a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the image display apparatus, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a storage medium, may include therein an operating system, a network communication module, a user interface module, and an image display program.
In the image display apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 may be provided in the image display apparatus, which calls the image display program stored in the memory 1005 through the processor 1001 and executes the image display method provided by the embodiments of the present invention.
An embodiment of the present invention provides an image display method, and referring to fig. 2, fig. 2 is a flowchart illustrating a first embodiment of an image display method according to the present invention.
In this embodiment, the image display method includes the steps of:
Step S10: when a conference request instruction is received, determining a target image display mode according to the conference request instruction, and acquiring current conference target information.
It should be noted that the execution subject of this embodiment may be an image display device. The image display device may be the controller of a video conference device, for example a personal computer or a control chip, or the controller of another device capable of holding a video conference, and this embodiment is not specifically limited in this respect.
It can be understood that the conference request instruction may be a request instruction, input by a user through a button on the video conference device, a remote controller or a mobile phone APP, that causes the video conference device to start operating. The conference request instruction may include a start signal for the initial conference mode.
It should be noted that, in this embodiment, the video conference device includes five modules: 1. a microphone array module; 2. a motor and driving module; 3. a lens module; 4. a sensor module; 5. a controller. The microphone array module collects the sound signals of users during a video conference and sends them to the controller for sound direction detection. To make the collected sound signals more complete and clear, the microphone array may be a multi-microphone array, for example a 4-microphone array, a 6-microphone array or an 8-microphone array, and this embodiment is not specifically limited in this respect.
The motor and the driving module are used for rotating according to a signal sent by the controller, so that the lens can rotate and the adjustment of the position of the lens is realized.
In this embodiment, the lens in the lens module may be an image capturing lens capable of capturing an image over an angle of 220°, or may be another lens capable of achieving the same or similar functions. The sensor module detects the lens orientation while the motor and driving module rotate the lens, so as to judge whether the rotation is complete.
The target image display mode may be a panoramic mode, a wide-angle mode, a privacy mode, or the like, and the video conference device controller may start the corresponding target image display mode according to a conference request instruction input by a user.
It can be understood that the current conference target information may be information such as the number of people currently participating in the conference, face information, and conference participation degree.
Step S20: determining a lens orientation according to the target image display mode, and adjusting the lens to that lens orientation.
It should be noted that different target image display modes correspond to different lens orientations. In this embodiment, referring to fig. 3, 1 denotes the panoramic mode, whose lens orientation is lens up; 2 denotes the wide-angle mode, whose lens orientation is lens forward; and 3 denotes the privacy mode, whose lens orientation is lens down.
In a specific implementation, after the lens orientation is determined according to the target image display mode, the motor and driving module rotate the lens to the lens orientation corresponding to the target image display mode, which was itself determined from the conference request instruction input by the user.
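The correspondence between display mode and lens orientation described for fig. 3 can be written as a small lookup table. The following is only a minimal sketch; the names `DisplayMode`, `LensOrientation` and `lens_orientation_for` are hypothetical and do not come from the patent, and the actual motor control is not modelled.

```python
from enum import Enum


class DisplayMode(Enum):
    """Target image display modes named in the embodiment."""
    PANORAMIC = "panoramic"
    WIDE_ANGLE = "wide_angle"
    PRIVACY = "privacy"


class LensOrientation(Enum):
    """The three lens orientations named in the embodiment."""
    UP = "up"
    FORWARD = "forward"
    DOWN = "down"


# Mapping taken from the description of fig. 3:
# panoramic -> lens up, wide-angle -> lens forward, privacy -> lens down.
MODE_TO_ORIENTATION = {
    DisplayMode.PANORAMIC: LensOrientation.UP,
    DisplayMode.WIDE_ANGLE: LensOrientation.FORWARD,
    DisplayMode.PRIVACY: LensOrientation.DOWN,
}


def lens_orientation_for(mode: DisplayMode) -> LensOrientation:
    """Return the orientation the lens should be rotated to for `mode`."""
    return MODE_TO_ORIENTATION[mode]
```

A controller could call `lens_orientation_for(DisplayMode.PRIVACY)` and pass the result to whatever routine drives the motor.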
Step S30: determining a target image display strategy based on the lens orientation and the current conference target information.
It should be noted that, in this embodiment, the target image display strategy may be a multi-person conversation image display strategy, a host image display strategy, a VIP image display strategy or an alternate-speaking image display strategy in the panoramic mode, or a normal image display strategy in the wide-angle mode; the specific target image display strategy is determined according to the lens orientation and the current conference target information.
It can be understood that, before determining the target image display strategy, the lens orientation is determined by the image display mode, so in this embodiment, according to three different target image display modes, the lens of the video conference device has three orientations, which are: lens up, lens forward, and lens down.
Step S40: when the lens has rotated to the lens orientation, displaying a preset privacy image or a current target image acquired by the camera according to the target image display strategy.
It should be noted that the current target image may be a conference image acquired by a camera when the video conference device is in a panoramic mode or a wide-angle mode.
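The choice made in step S40 between the preset privacy image and the camera image can be sketched as below. This is an illustrative assumption rather than the patent's implementation: the function name `frame_to_display` and the string orientation labels are hypothetical, and the microphone state follows the disclosure's statements that the microphone is turned off in the privacy case and turned on when the camera image is acquired.

```python
def frame_to_display(orientation, camera_frame, privacy_image):
    """Return (image_to_show, microphone_on) for a lens orientation.

    `orientation` is one of the hypothetical labels "up", "forward", "down".
    Lens down corresponds to the privacy case: show the preset privacy image
    and keep the microphone off. Otherwise show the camera image with the
    microphone on.
    """
    if orientation == "down":
        return privacy_image, False
    return camera_frame, True
```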
Further, before step S40, in order to make the image captured by the camera more suitable for display, the method further includes:
acquiring a current conference target image through the camera, and turning on the microphone;
performing image segmentation on the current conference target image through a preset image segmentation model to obtain a segmented image;
and performing image processing on the segmented image to obtain the current target image.
It should be noted that the preset image segmentation model may be used to perform image segmentation on a target image acquired by a camera, extract a person image in the target image, and mark the person image as a segmented image.
It can be understood that the current target image is an expanded image obtained by expanding the image acquired by the camera. Before expanding, an expansion point must be determined. In this embodiment, the expansion point is determined as follows: a segmentation initial point of the current conference target image is determined based on the lens orientation, and the current conference target image is then segmented by the preset image segmentation model, starting from the segmentation initial point and proceeding in a preset direction, to obtain the segmented image.
In a specific implementation, when the current conference target image collected by the camera is segmented, the image must also be checked with a preset face detection model and a preset human shape detection model, so that a person is not cut in half and the target image is not left incomplete.
In addition, after the preset privacy image or the current target image acquired by the camera is displayed according to the target image display strategy, the participant who is speaking during the video conference can be marked, so that users can see more intuitively who is speaking, which gives a better experience.
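One way to avoid cutting a person in half is to move the segmentation initial point until it falls outside every detected person. The sketch below assumes, purely for illustration, that the detection models report each person as a horizontal pixel span `(left, right)` in an image of known width and that the preset direction is increasing column index with wrap-around; the function name `find_safe_cut` and this data layout are hypothetical and are not specified by the patent.

```python
def find_safe_cut(width, person_spans, preferred=0):
    """Return the first column at or after `preferred` (wrapping around the
    image width) that lies outside every person span, or None if every
    column is covered.

    person_spans: iterable of (left, right) column indices, inclusive,
    as a face/human-shape detector might report them (assumed format).
    """
    spans = list(person_spans)
    for offset in range(width):
        x = (preferred + offset) % width
        if all(not (left <= x <= right) for left, right in spans):
            return x
    return None
```

For example, with people detected at columns 0 to 10 and 40 to 60 of a 100-column image, a preferred start of column 5 is shifted to column 11 so that the cut does not pass through the first person.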
Further, after step S40, the method further includes:
performing mouth shape detection on the current target image through a preset mouth shape detection model to obtain an initial speaker image;
acquiring a current sound signal through the microphone, and determining speaker information according to the current sound signal;
and determining a target speaker image according to the speaker information and the initial speaker image, and marking the target speaker image.
It should be noted that the preset mouth shape detection model detects mouth movement of the conference targets in the current target image. When a conference target speaks, the mouth moves, which identifies that conference target as a possible speaker. However, a conference target may move the mouth without producing a sound signal, so the preset mouth shape detection model yields only initial speaker information.
It can be understood that direction prediction is performed on the sound signal obtained by the microphone through a preset direction prediction model; combining the predicted direction with the initial speaker information allows the target speaker to be determined accurately, and the target speaker image is then marked.
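The combination of mouth shape detection and sound direction can be illustrated with a small matching routine. This is a sketch under stated assumptions: each candidate from the mouth shape detection is assumed to have a known bearing in degrees relative to the device, the direction prediction model is assumed to output a single bearing, and the 20° tolerance is an arbitrary illustrative value. None of the names (`angular_difference`, `pick_target_speaker`) come from the patent.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def pick_target_speaker(mouth_candidates, sound_bearing, tolerance=20.0):
    """Pick the candidate whose bearing best matches the sound direction.

    mouth_candidates: dict mapping participant id -> bearing in degrees for
    participants whose mouth was detected moving (initial speakers).
    Returns the id of the closest candidate within `tolerance` degrees,
    or None when no candidate is close enough (for example, a mouth is
    moving but the sound comes from elsewhere).
    """
    best_id, best_diff = None, None
    for participant, bearing in mouth_candidates.items():
        diff = angular_difference(bearing, sound_bearing)
        if diff <= tolerance and (best_diff is None or diff < best_diff):
            best_id, best_diff = participant, diff
    return best_id
```

Requiring both cues to agree filters out a participant whose mouth moves silently as well as sound arriving from a direction where nobody is seen speaking.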
It should be noted that the marking of the target speaker may be to increase the brightness of the target speaker image in the display image, mark different colors, or the like, or may be other marking methods with the same or similar marking functions, and this embodiment is not limited in particular.
In this embodiment, when a conference request instruction is received, a target image display mode is determined according to the conference request instruction and current conference target information is acquired; a lens orientation is determined according to the target image display mode and the lens is adjusted to that orientation; a target image display strategy is determined based on the lens orientation and the current conference target information; and when the lens has rotated to the lens orientation, a preset privacy image or a current target image acquired by the camera is displayed according to the target image display strategy. This embodiment determines the target image display mode from the conference request instruction input by the user and adjusts the lens to the orientation corresponding to that mode, so it is suitable for image display in multiple scenes. A suitable target image display strategy is selected from the lens orientation in combination with the current conference target information, so different display modes can be selected for different conference target information. This avoids the technical problems that a video conversation can only be used in a single application scene, that different display modes cannot be selected according to the number of participants and the conference mode, and that the user experience is poor.
Referring to fig. 4, fig. 4 is a flowchart illustrating an image display method according to a second embodiment of the present invention.
Based on the first embodiment, in this embodiment, the step S30 includes:
Step S301: if the lens orientation is lens up, acquiring the number of conference targets and the conference participation degree from the current conference target information.
It should be noted that the conference participation degree may be calculated from the total number of sound signals collected by the microphone array of the video conference device and the number of sources of those sound signals: the larger the total number of sound signals, the higher the conference participation degree, and the larger the number of sound signal sources, the higher the conference participation degree.
Step S302: determining a target image panoramic display strategy according to the lens orientation, the number of conference targets and the conference participation degree.
It should be noted that, when the lens faces upward, the target image panoramic display strategy includes four target image display strategies: 1. a VIP image display strategy; 2. a host image display strategy; 3. a multi-person conversation image display strategy; 4. an alternate-speaking image display strategy. A suitable target image display strategy is selected according to the number of conference targets and the conference participation degree.
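The text states only that the participation degree rises with both the total number of sound signals and the number of distinct sources; it gives no formula. The following sketch therefore uses a weighted sum purely as an illustrative assumption. The function name `participation_degree`, the weights and the representation of each sound signal by a source label are all hypothetical.

```python
def participation_degree(signal_sources, weight_total=1.0, weight_sources=1.0):
    """Illustrative participation score (not the patent's formula).

    signal_sources: one source label per detected sound signal, e.g.
    ["A", "A", "B"] for two signals from A and one from B.
    The score grows with the total number of signals and with the number
    of distinct sources, matching the qualitative behaviour in the text.
    """
    total_signals = len(signal_sources)
    distinct_sources = len(set(signal_sources))
    return weight_total * total_signals + weight_sources * distinct_sources
```

Any other function that increases in both quantities would fit the description equally well; the resulting value is what would be compared against the first preset value.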
Further, since the number of conference targets and the conference participation degree vary, a different image display strategy needs to be selected in different situations; that is, step S301 includes:
when the lens orientation is lens up, acquiring the number of conference targets and the conference participation degree from the current conference target information;
judging whether host information exists in the current conference target information;
if the host information exists, comparing the conference participation degree with a first preset value, and comparing the number of conference targets with a second preset value;
when the conference participation degree exceeds the first preset value and the number of conference targets is greater than the second preset value, determining a VIP image display strategy according to the conference request instruction, the number of conference targets and the conference participation degree;
when the conference participation degree does not exceed the first preset value and the number of conference targets is smaller than the second preset value, determining a host image display strategy according to the conference request instruction, the number of conference targets and the conference participation degree.
It can be understood that the host information may be the image of a conference target that needs to be displayed for a long time. The host may be the conference host, an organizer or a person requiring fixed display, or the person directly in front of the video conference device may be taken as the default host. In actual operation, the host may be switched manually by adjusting the placement of the video conference device, or through keys on the video conference device, a remote controller, a mobile phone APP and the like, and this embodiment is not specifically limited in this respect.
Secondly, when host information exists, different image display strategies are selected according to the conference participation degree and the number of conference targets. In actual operation, when the conference participation degree exceeds the first preset value and the number of conference targets is greater than the second preset value, the target image display strategy is the VIP image display strategy; when the conference participation degree does not exceed the first preset value and the number of conference targets is smaller than the second preset value, the target image display strategy is the host image display strategy. The first preset value and the second preset value may be preset by the user, and this embodiment is not specifically limited in this respect.
It should be noted that in the VIP image display strategy one VIP position is set, generally fixed at the upper left corner, while the number and positions of the other frames need not be fixed. In this embodiment there are 5 non-VIP positions. When a participant is detected to be speaking, the target image of that participant is extracted and placed in one of the five non-VIP frames, which are filled clockwise starting from the VIP position.
It should be understood that in the host image display strategy all acquired conference target images are integrated into a panoramic image containing all conference targets. The panoramic image is fixed at the lower end of the display image and the host image is fixed at the upper end. Since there may be more than one host, a plurality of host images can be placed at the upper end of the display image.
In addition, if the conference participation degree and the number of participants change in the video conference process, the change can be detected through a preset human shape detection model and a preset human face detection model, and the target image display strategy is changed according to different conference participation degrees and different numbers of participants.
In a specific implementation, six people A, B, C, D, E and F are currently present. After the video conference device detects host information, A is determined to be the host. When the conference participation degree exceeds the first preset value and the number of participants is greater than the second preset value, the target image display strategy is the VIP image display strategy, and the display positions of the six people in the video image are as shown in fig. 5.
If A is the host, the conference participation degree does not exceed the first preset value, and the number of participants is smaller than the second preset value, the target image display strategy is the host image display strategy, and the display positions of the six people in the video image are as shown in fig. 6.
Further, a video conference may have no host, so step S302 further includes:
if no host information exists, comparing the number of participants with the second preset value;
when the number of participants does not exceed the second preset value, determining a multi-person conversation image display strategy according to the conference request instruction, the number of participants and the conference participation degree;
and when the number of participants exceeds the second preset value, determining an alternate speaking image display strategy according to the conference request instruction, the number of participants and the conference participation degree.
It should be noted that, in the multi-person conversation image display strategy, the acquired participant target images are integrated into a panorama containing all participants. The panorama is fixed at the lower end of the display image, and three image frames are fixed at the upper end; when a participant is detected to be speaking, the target images shown in these frames are adjusted. If there are fewer than three participants in the panorama, only as many image frames as there are participants are displayed at the upper end, and the vacant frames are displayed as black frames. In addition, the order of people within the three image frames is fixed. For example, with six people A, B, C, D, E and F currently present, the image frames at the upper end of the display image are displayed in one of the following 10 orders: ABC, ABD, ABE, ABF, ACD, ACE, ACF, ADE, ADF, AEF.
It should be understood that in the alternate speaking image display strategy there is no fixed number of image frames; the display image is cut according to the number of participants, and the cutting rule is:
when there is only 1 person, the display image contains a single portrait frame;
when there are 2 people, the display image is divided equally into left and right halves, and 2 people are displayed;
when there are 3 people, the display image is divided into 4 parts (upper, lower, left and right), and 3 people are displayed, with the lower right part displayed as a black screen;
when there are 4 people, the display image is divided into 4 parts (upper, lower, left and right), and 4 people are displayed;
when there are 5 people, the display image is divided into 6 parts as shown in fig. 12, and 5 people are displayed, with the lower right part displayed as a black screen;
when there are 6 people, 6 people are displayed in the 6 equal parts shown in fig. 12;
when there are 7 people, the display image is divided into 9 parts (upper, lower, left and right), and 7 people are displayed, with the 2 lower right parts displayed as a black screen;
when there are 8 people, the display image is divided into 9 parts (upper, lower, left and right), and 8 people are displayed, with the lower right part displayed as a black screen;
when there are 9 people, the display image is divided into 9 parts (upper, lower, left and right), and 9 people are displayed;
when there are more than 9 people, only 9 are displayed; the initially displayed people are arranged clockwise starting from the first person to speak, and when a 10th person speaks, the last speaking person is replaced.
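The cutting rule above can be summarised as a small lookup from the number of participants to a grid and a count of black cells. The following is a minimal sketch, not the patented implementation: the names `Layout` and `cut_layout` are hypothetical, and the 2-row by 3-column shape for the 6-part case is an assumption, since the text only refers to a figure for that division.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    rows: int
    cols: int
    shown: int  # number of portrait frames actually displayed
    black: int  # number of trailing cells displayed as a black screen


MAX_FRAMES = 9  # at most 9 people are displayed at once


def cut_layout(num_participants: int) -> Layout:
    """Map the number of participants to a grid following the cutting rule."""
    if num_participants < 1:
        raise ValueError("at least one participant is required")
    shown = min(num_participants, MAX_FRAMES)
    if shown == 1:
        rows, cols = 1, 1
    elif shown == 2:
        rows, cols = 1, 2  # left and right halves
    elif shown <= 4:
        rows, cols = 2, 2  # 4 parts
    elif shown <= 6:
        rows, cols = 2, 3  # 6 parts; the 2 x 3 shape is an assumption
    else:
        rows, cols = 3, 3  # 9 parts
    return Layout(rows, cols, shown, rows * cols - shown)
```

For example, `cut_layout(7)` yields a 3 x 3 grid with 2 black cells, matching the 7-person rule. Which of the nine displayed people is replaced when a 10th person speaks is left out of this sketch.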
It can be understood that when the video conference device does not detect host information, the number of participants is compared with the second preset value. When the number of participants does not exceed the second preset value, the target image display strategy is the multi-person conversation image display strategy; when the number of participants exceeds the second preset value, the target image display strategy is the alternate speaking image display strategy.
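The selection among the four panoramic display strategies can be sketched as a single decision function. This is an illustrative sketch only: the names `Strategy` and `select_panoramic_strategy` are hypothetical, and returning `None` for host cases that the text does not cover (for example, high participation with few participants) is an assumption rather than behaviour described in the embodiment.

```python
from enum import Enum
from typing import Optional


class Strategy(Enum):
    VIP = "VIP image display"
    HOST = "host image display"
    MULTI_PERSON = "multi-person conversation image display"
    ALTERNATE = "alternate speaking image display"


def select_panoramic_strategy(
    has_host: bool,
    participation: float,
    num_participants: int,
    first_preset: float,
    second_preset: int,
) -> Optional[Strategy]:
    """Pick a display strategy from host presence, participation and head count."""
    if has_host:
        if participation > first_preset and num_participants > second_preset:
            return Strategy.VIP
        if participation <= first_preset and num_participants < second_preset:
            return Strategy.HOST
        return None  # combination not specified in the text (assumption)
    if num_participants <= second_preset:
        return Strategy.MULTI_PERSON
    return Strategy.ALTERNATE
```

With illustrative preset values of 0.5 and 4, a hosted conference of six people with participation 0.8 selects the VIP strategy, while a conference of five people without a host selects the alternate speaking strategy.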
In a specific implementation, with six people A, B, C, D, E and F present, when the number of participants does not exceed the second preset value, the target image display strategy is the multi-person conversation image display strategy, and the display image is as shown in fig. 7.
When the number of participants exceeds the second preset value, the target image display strategy is the alternate speaking image display strategy, and the display image obtained according to the cutting rule is as shown in fig. 8.
In this embodiment, when a conference request instruction is received, a target image display mode is determined according to the conference request instruction and current participant information is acquired; a lens orientation is determined according to the target image display mode, and the direction of the lens is adjusted according to the lens orientation. If the lens orientation is lens-up, the number of participants and the conference participation degree are obtained from the current participant information, and a target image panoramic display strategy is determined according to the lens orientation, the number of participants and the conference participation degree. When the lens has rotated to the lens orientation, a preset privacy image or the current target image acquired by the camera is displayed according to the target image display strategy. In this embodiment the target image display mode is determined from the conference request instruction input by the user, and the lens is adjusted to a different orientation for each target image display mode, so the method suits image display in multiple scenarios. A suitable target image panoramic display strategy is selected by combining the lens orientation, the number of participants and the conference participation degree, so different image display modes can be selected for different participant information. This solves the technical problem that a video call can only be used in a single application scenario and cannot select different display modes according to the number of participants, which results in a poor user experience.
Referring to fig. 9, fig. 9 is a flowchart illustrating an image display method according to a third embodiment of the present invention.
Based on the first embodiment, in this embodiment, the step S30 includes:
Step S301A: if the lens orientation is lens-forward, determining a target image wide-angle display strategy according to the lens orientation and the current participant information.
It can be understood that if the target image display mode is the wide-angle mode, the lens orientation is lens-forward, and in that case the target image wide-angle display strategy is determined according to the lens orientation and the current participant information.
It should be noted that, in this embodiment, the lens of the video conference device can capture an image over a 220° wide-angle range when facing forward. Because image distortion grows toward both sides of the lens, 20° can be cut from each side of the captured 220° picture to obtain a good display effect, giving a 180° image.
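The trimming of 20° from each side can be turned into pixel columns under one simplifying assumption: that the horizontal angle is spread linearly across the image width. Real wide-angle lenses may use a different projection, so the following is only a rough sketch, and the function name `crop_columns` is hypothetical.

```python
from typing import Tuple


def crop_columns(
    width_px: int,
    full_fov_deg: float = 220.0,
    trim_deg: float = 20.0,
) -> Tuple[int, int]:
    """Return the (start, end) column range kept after trimming both sides.

    Assumes the field of view maps linearly onto pixel columns.
    """
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    if trim_deg < 0 or 2 * trim_deg >= full_fov_deg:
        raise ValueError("trim_deg must leave a positive field of view")
    trim_px = round(width_px * trim_deg / full_fov_deg)
    return trim_px, width_px - trim_px
```

Under this assumption, a frame 2200 pixels wide keeps columns 200 to 2000, which corresponds to the central 180° of the 220° picture.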
In a specific implementation, referring to fig. 10, four people A, B, C and D are currently seated in front of the video conference device, and the camera captures images of all four. Because B and C are at a different distance from the lens than A and D, the portraits of B and C in the displayed picture are slightly larger than those of A and D, which reflects the distance relationship between the participants and the video conference device; the final displayed image is shown in fig. 11.
If B, C and D leave and only A remains seated in the original position, A's portrait is magnified accordingly and placed in the middle. If A, C and D leave and only B remains seated in the original position, B's position is close to the side, so keeping the whole of B within the display frame takes priority during magnification; B's portrait is therefore magnified to a certain extent and offset to the left. When only A and D are at the conference table, their portraits are magnified in the same proportion and placed in the middle, because the two are at the same distance from the video conference device. When only B and C are at the conference table, their portraits are magnified in the same proportion after B and C are detected, but the magnification ratio for B and C is smaller than that for A and D because B and C sit close to each other. When only B and D are at the conference table, the portraits and the other scenery in the picture are magnified in the same proportion after B and D are detected; because B is closer to the lens and nearer the side, B's position is monitored during this magnification so that B's body does not extend beyond the display picture.
In this embodiment, when a conference request instruction is received, a target image display mode is determined according to the conference request instruction and current participant information is acquired; a lens orientation is determined according to the target image display mode, and the direction of the lens is adjusted according to the lens orientation. If the lens orientation is lens-forward, a target image wide-angle display strategy is determined according to the lens orientation and the current participant information, and when the lens has rotated to the lens orientation, a preset privacy image or the current target image acquired by the camera is displayed according to the target image wide-angle display strategy. In this embodiment the target image wide-angle display mode is determined from the conference request instruction input by the user, and the lens is adjusted to face forward according to that mode, so the method suits image display in multiple scenarios. A suitable target image display strategy is selected by combining the lens orientation and the current participant information, so different image display modes can be selected for different participant information. This solves the technical problem that a video call can only be used in a single application scenario and cannot select different display modes according to the number of participants and the conference mode, which results in a poor user experience.
Referring to fig. 12, fig. 12 is a flowchart illustrating an image display method according to a fourth embodiment of the present invention.
Based on the first embodiment, in this embodiment, the step S30 includes:
Step S301B: if the lens orientation is lens-down, determining a target image privacy display strategy according to the lens orientation.
It can be understood that if the target image display mode is the privacy mode, the lens orientation is lens-down, and in that case the target image privacy display strategy is determined according to the lens orientation.
It should be noted that this mode mainly protects conference privacy. It can be used when the local side needs to hold a discussion and does not want the other party to see the local picture or hear the local discussion, but does not want to end the ongoing video conference. In addition, the privacy mode can be entered from either the panoramic mode or the wide-angle mode.
In a specific implementation, when the current target image display mode is the panoramic mode, one party may need to hold an internal discussion and temporarily turn off the camera. At this time the user sends an instruction by pressing a button on the video conference device or a remote controller, or the like. On receiving a display prohibition instruction, the controller of the video conference device controls the camera to rotate according to the display prohibition instruction and controls the sensor to detect the current lens orientation; if the current lens orientation is lens-down, a preset privacy image is displayed and the microphone is turned off.
When the current target image display mode is the wide-angle mode, the user likewise sends an instruction by pressing a button on the video conference device or a remote controller, or the like. On receiving a display prohibition instruction, the controller of the video conference device controls the camera to rotate according to the display prohibition instruction and controls the sensor to detect the current lens orientation; if the current lens orientation is lens-down, a preset privacy image is displayed and the microphone is turned off.
It should be noted that after the discussion is finished, the video conference needs to continue. At this time the user sends an instruction by pressing a button on the video conference device or a remote controller, or the like. On receiving a display start instruction, the controller of the video conference device controls the camera to rotate according to the display start instruction; when the lens has rotated back to the lens orientation corresponding to the target image display strategy, the microphone is turned on, and the process returns to the steps of acquiring the current participating target image and performing image segmentation on it through the preset image segmentation model to obtain a plurality of segmented images.
It can be understood that the preset privacy image may be one or more images preset by the user; it is used to block the current video image and to remind the other video users that the video conference has not been interrupted.
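The entry into and exit from the privacy mode can be modelled as a small state holder. The sketch below is an illustration under stated assumptions: the class `Device`, its fields and its method names are hypothetical, and the motor rotation and sensor check are reduced to plain assignments and comparisons rather than real hardware calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Lens(Enum):
    UP = "up"            # panoramic mode
    FORWARD = "forward"  # wide-angle mode
    DOWN = "down"        # privacy mode


@dataclass
class Device:
    lens: Lens
    mic_on: bool = True
    showing_privacy_image: bool = False
    resume_lens: Optional[Lens] = None  # orientation to restore afterwards

    def on_display_prohibition(self) -> None:
        """Rotate the lens down, then show the privacy image and mute."""
        if self.lens is not Lens.DOWN:
            self.resume_lens = self.lens
        self.lens = Lens.DOWN  # stands in for motor rotation
        if self.lens is Lens.DOWN:  # stands in for the sensor check
            self.showing_privacy_image = True
            self.mic_on = False

    def on_display_start(self) -> None:
        """Rotate back to the previous orientation and re-enable the microphone."""
        if self.resume_lens is None:
            return
        self.lens = self.resume_lens  # stands in for motor rotation
        self.resume_lens = None
        self.showing_privacy_image = False
        self.mic_on = True
```

Starting from `Device(Lens.UP)`, a display prohibition instruction leaves the lens down with the microphone off, and a subsequent display start instruction restores the lens-up orientation with the microphone on.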
In this embodiment, when a conference request instruction is received, a target image display mode is determined according to the conference request instruction and current participant information is acquired; a lens orientation is determined according to the target image display mode, and the direction of the lens is adjusted according to the lens orientation. If the lens orientation is lens-down, a target image privacy display strategy is determined according to the lens orientation, and when the lens has rotated to the lens orientation, a preset privacy image is displayed according to the target image privacy display strategy. In this embodiment the target image privacy display mode is determined from the conference request instruction input by the user, and the lens is adjusted to face downward according to that mode, so the method suits image display in multiple scenarios. A suitable target image display strategy is selected by combining the lens orientation and the current participant information, so different image display modes can be selected for different participant information. This solves the technical problem that a video call can only be used in a single application scenario and cannot select different display modes according to the number of participants and the conference mode, which results in a poor user experience.
Furthermore, an embodiment of the present invention further provides a storage medium, on which an image display program is stored, which when executed by a processor implements the steps of the image display method as described above.
Since the storage medium adopts all the technical solutions of all the embodiments, at least all the advantages brought by the technical solutions of the embodiments are available, and are not described in detail herein.
Referring to fig. 13, fig. 13 is a block diagram of the image display device according to the first embodiment of the present invention.
As shown in fig. 13, an image display device according to an embodiment of the present invention includes:
The instruction receiving module 10 is configured to, when a conference request instruction is received, determine a target image display mode according to the conference request instruction and acquire current participant information.
The lens adjusting module 20 is configured to determine a lens orientation according to the target image display mode and adjust the direction of the lens according to the lens orientation.
The strategy confirmation module 30 is configured to determine a target image display strategy based on the lens orientation and the current participant information.
The image display module 40 is configured to display a preset privacy image or a current target image acquired by the camera according to the target image display strategy when the lens has rotated to the lens orientation.
In an embodiment, the strategy confirmation module 30 is further configured to: if the lens orientation is lens-up, obtain the number of participants and the conference participation degree from the current participant information; and determine a target image panoramic display strategy according to the lens orientation, the number of participants and the conference participation degree. Correspondingly, displaying a preset privacy image or the current target image acquired by the camera according to the target image display strategy when the lens has rotated to the lens orientation further includes: when the lens has rotated to the lens orientation, displaying the current target image acquired by the camera according to the target image panoramic display strategy.
In an embodiment, the strategy confirmation module 30 is further configured to: if the lens orientation is lens-forward, determine a target image wide-angle display strategy according to the lens orientation and the current participant information. Correspondingly, displaying a preset privacy image or the current target image acquired by the camera according to the target image display strategy when the lens has rotated to the lens orientation further includes: when the lens has rotated to the lens orientation, displaying the current target image acquired by the camera according to the target image wide-angle display strategy.
In an embodiment, the strategy confirmation module 30 is further configured to: if the lens orientation is lens-down, determine a target image privacy display strategy according to the lens orientation. Correspondingly, displaying a preset privacy image or the current target image acquired by the camera according to the target image display strategy when the lens has rotated to the lens orientation further includes: when the lens has rotated to the lens orientation, displaying a preset privacy image according to the target image privacy display strategy and turning off the microphone.
In an embodiment, the image display module 40 is further configured to obtain an image of a current meeting target through a camera, and turn on a microphone; performing image segmentation on the current participating target image through a preset image segmentation model to obtain a segmented image; and carrying out image processing on the segmented image to obtain a current target image.
In an embodiment, the image display module 40 is further configured to determine a segmentation initial point of the current participating target image based on the lens orientation; and perform image segmentation on the current participating target image through a preset image segmentation model based on the segmentation initial point and a preset direction to obtain a segmented image.
In an embodiment, the image display module 40 is further configured to perform mouth shape detection on the current target image through a preset mouth shape detection model, so as to obtain an initial speaker image; acquiring a current sound signal through a microphone, and determining speaker information according to the current sound signal; and determining a target speaker image according to the speaker information and the initial speaker image, and marking the target speaker image.
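One way to combine the mouth shape detection result with the sound signal is to match the candidates against the direction of the sound. The following sketch rests on assumptions that are not in the text: that the speaker information derived from the sound signal is a horizontal angle, that each candidate from mouth shape detection carries an angle in the same coordinate system, and that a fixed tolerance is acceptable. The function name `pick_target_speaker` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def pick_target_speaker(
    mouth_candidates: Sequence[Tuple[str, float]],
    sound_angle_deg: float,
    tolerance_deg: float = 15.0,
) -> Optional[str]:
    """Return the mouth-detected candidate closest to the sound direction.

    mouth_candidates holds (name, angle in degrees) pairs. None is returned
    when no candidate lies within tolerance_deg of the sound direction.
    """
    best: Optional[str] = None
    best_diff = tolerance_deg
    for name, angle in mouth_candidates:
        # Smallest absolute difference between two angles, handling wrap-around.
        diff = abs((angle - sound_angle_deg + 180.0) % 360.0 - 180.0)
        if diff <= best_diff:
            best, best_diff = name, diff
    return best
```

A candidate whose mouth is moving but who lies far from the sound direction is rejected, which is the point of combining the two signals; the selected candidate would then be marked as the target speaker.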
In this embodiment, when a conference request instruction is received, a target image display mode is determined according to the conference request instruction and current participant information is acquired; a lens orientation is determined according to the target image display mode, and the direction of the lens is adjusted according to the lens orientation. A target image display strategy is determined based on the lens orientation and the current participant information, and when the lens has rotated to the lens orientation, a preset privacy image or the current target image acquired by the camera is displayed according to the target image display strategy. In this embodiment the target image display mode is determined from the conference request instruction input by the user, and the lens is adjusted to a different orientation for each target image display mode, so the device suits image display in multiple scenarios. A suitable target image display strategy is selected by combining the lens orientation and the current participant information, so different image display modes can be selected for different participant information. This avoids the technical problem that a video call can only be used in a single application scenario and cannot select different display modes according to the number of participants and the conference mode, which results in a poor user experience.
It should be understood that the above is only an example and does not limit the technical solution of the present invention in any way; in a specific application, a person skilled in the art may configure it as needed, and the present invention is not limited in this respect.
It should be noted that the work flows described above are only exemplary and do not limit the scope of the present invention; in practical applications, a person skilled in the art may select some or all of them according to actual needs to achieve the purpose of the solution of the embodiment, and the present invention is not limited in this respect.
In addition, for technical details not described in detail in this embodiment, reference may be made to the image display method provided in any embodiment of the present invention; they are not repeated here.
Further, it is to be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g. Read Only Memory (ROM)/RAM, magnetic disk, optical disk), and includes several instructions for enabling a terminal device (e.g. a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An image display method, characterized in that the image display method comprises:
when a conference request instruction is received, determining a target image display mode according to the conference request instruction, and acquiring current participant information;
determining a lens orientation according to the target image display mode, and adjusting the direction of the lens according to the lens orientation;
determining a target image display strategy based on the lens orientation and the current participant information;
and when the lens has rotated to the lens orientation, displaying a preset privacy image or a current target image acquired by the camera according to the target image display strategy.
2. The image display method according to claim 1, wherein the target image display strategy includes a target image panoramic display strategy;
the determining a target image display strategy based on the lens orientation and the current participant information comprises:
if the lens orientation is lens-up, acquiring the number of participants and the conference participation degree in the current participant information;
determining the target image panoramic display strategy according to the lens orientation, the number of participants and the conference participation degree;
the displaying, when the lens has rotated to the lens orientation, of a preset privacy image or a current target image acquired by the camera according to the target image display strategy comprises:
when the lens has rotated to the lens orientation, displaying the current target image acquired by the camera according to the target image panoramic display strategy.
3. The image display method according to claim 2, wherein the target image display strategy includes a target image wide-angle display strategy;
the determining a target image display strategy based on the lens orientation and the current participant information comprises:
if the lens orientation is lens-forward, determining the target image wide-angle display strategy according to the lens orientation and the current participant information;
the displaying, when the lens has rotated to the lens orientation, of a preset privacy image or a current target image acquired by the camera according to the target image display strategy comprises:
when the lens has rotated to the lens orientation, displaying the current target image acquired by the camera according to the target image wide-angle display strategy.
4. The image display method according to any one of claims 1 to 3, wherein the target image display strategy includes a target image privacy display strategy;
the determining a target image display strategy based on the lens orientation and the current participant information comprises:
if the lens orientation is lens-down, determining the target image privacy display strategy according to the lens orientation;
the displaying, when the lens has rotated to the lens orientation, of a preset privacy image or a current target image acquired by the camera according to the target image display strategy comprises:
when the lens has rotated to the lens orientation, displaying a preset privacy image according to the target image privacy display strategy, and turning off the microphone.
5. The image display method according to any one of claims 1 to 3, wherein before displaying a preset privacy image or a current target image acquired by a camera according to the target image display policy when the lens orientation is rotated to the lens orientation, the method comprises:
acquiring a current participating target image through a camera, and starting a microphone;
performing image segmentation on the current participating target image through a preset image segmentation model to obtain a segmented image;
and carrying out image processing on the segmented image to obtain a current target image.
6. The image display method according to claim 5, wherein the performing image segmentation on the current participating target image through a preset image segmentation model to obtain a segmented image comprises:
determining a segmentation initial point of the current participating target image based on the lens orientation;
and carrying out image segmentation on the current participating target image through the preset image segmentation model based on the segmentation initial point and a preset direction to obtain a segmented image.
7. The image display method according to any one of claims 1 to 3, wherein after displaying a preset privacy image or a current target image acquired by a camera according to the target image display policy when the lens orientation is rotated to the lens orientation, the method further comprises:
carrying out mouth shape detection on the current target image through a preset mouth shape detection model to obtain an initial speaker image;
acquiring a current sound signal through a microphone, and determining speaker information according to the current sound signal;
and determining a target speaker image according to the speaker information and the initial speaker image, and marking the target speaker image.
8. An image display device characterized by comprising:
the instruction receiving module is used for determining a target image display mode according to a conference request instruction and acquiring current conference participating target information when the conference request instruction is received;
the lens adjusting module is used for determining the lens position according to the target image display mode and adjusting the lens orientation according to the lens position;
the strategy confirming module is used for determining a target image display strategy based on the lens orientation and the current conference participating target information;
and the image display module is used for displaying a preset privacy image or a current target image acquired by the camera according to the target image display strategy when the lens is rotated to the lens orientation.
9. An image display apparatus characterized by comprising: a memory, a processor, and an image display program stored on the memory and executable on the processor, the image display program configured to implement the image display method according to any one of claims 1 to 7.
10. A storage medium having stored thereon an image display program which, when executed by a processor, implements the image display method according to any one of claims 1 to 7.
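As an illustration only, the behaviour recited in claims 4, 5 and 7 can be sketched in Python as follows. The sketch is not the patented implementation: the names DisplayDecision, SpeakerCandidate, decide_display, angular_gap and pick_target_speaker are hypothetical, and it assumes that the "speaker information" derived from the sound signal is a sound-source direction in degrees and that each mouth-detection candidate also carries a direction, neither of which the claims specify. The 20-degree tolerance is an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DisplayDecision:
    source: str          # "privacy_image" or "camera"
    microphone_on: bool


@dataclass
class SpeakerCandidate:
    label: str           # hypothetical identifier for a detected face region
    azimuth_deg: float   # ASSUMPTION: direction of that region, in degrees


def decide_display(privacy_strategy: bool) -> DisplayDecision:
    """Privacy strategy: preset privacy image, microphone closed (claim 4).
    Otherwise: live camera image, microphone started (claim 5)."""
    if privacy_strategy:
        return DisplayDecision(source="privacy_image", microphone_on=False)
    return DisplayDecision(source="camera", microphone_on=True)


def angular_gap(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def pick_target_speaker(
    candidates: List[SpeakerCandidate],
    sound_azimuth_deg: float,
    tolerance_deg: float = 20.0,  # ASSUMPTION: placeholder threshold
) -> Optional[SpeakerCandidate]:
    """Combine mouth-detection candidates with the sound direction (claim 7).

    Returns the candidate closest to the sound source, or None when no
    candidate lies within the tolerance.
    """
    best: Optional[SpeakerCandidate] = None
    best_gap = tolerance_deg
    for candidate in candidates:
        gap = angular_gap(candidate.azimuth_deg, sound_azimuth_deg)
        if gap <= best_gap:
            best, best_gap = candidate, gap
    return best


if __name__ == "__main__":
    people = [
        SpeakerCandidate("A", 10.0),
        SpeakerCandidate("B", 350.0),
        SpeakerCandidate("C", 180.0),
    ]
    print(decide_display(privacy_strategy=True))
    print(pick_target_speaker(people, sound_azimuth_deg=355.0))
```

Requiring both cues to agree means a participant whose mouth is moving silently, or a sound arriving from a direction with no detected face, is not marked as the target speaker.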
CN202111047995.3A 2021-09-07 2021-09-07 Image display method, device, equipment and storage medium Active CN113905204B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111047995.3A CN113905204B (en) 2021-09-07 2021-09-07 Image display method, device, equipment and storage medium
PCT/CN2021/118489 WO2022262134A1 (en) 2021-09-07 2021-09-15 Image display method, apparatus and device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111047995.3A CN113905204B (en) 2021-09-07 2021-09-07 Image display method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113905204A (en) 2022-01-07
CN113905204B (en) 2023-02-14

Family

ID=79188827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111047995.3A Active CN113905204B (en) 2021-09-07 2021-09-07 Image display method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113905204B (en)
WO (1) WO2022262134A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116489502B (en) * 2023-05-12 2023-10-31 深圳星河创意科技开发有限公司 Remote conference method based on AI camera docking station and AI camera docking station
CN117640877B (en) * 2024-01-24 2024-03-29 浙江华创视讯科技有限公司 Picture reconstruction method for online conference and electronic equipment

Citations (8)

Publication number Priority date Publication date Assignee Title
CN102648626A (en) * 2009-10-14 2012-08-22 思科系统国际公司 Device and method for camera control
US20140176663A1 (en) * 2012-12-20 2014-06-26 Microsoft Corporation Privacy camera
US9113032B1 (en) * 2011-05-31 2015-08-18 Google Inc. Selecting participants in a video conference
WO2015192631A1 (en) * 2014-06-17 2015-12-23 中兴通讯股份有限公司 Video conferencing system and method
CN109257559A (en) * 2018-09-28 2019-01-22 苏州科达科技股份有限公司 A kind of image display method, device and the video conferencing system of panoramic video meeting
JP2020088618A (en) * 2018-11-27 2020-06-04 株式会社リコー Video conference system, communication terminal, and method for controlling microphone of communication terminal
CN112601044A (en) * 2020-12-08 2021-04-02 深圳市焦点数字科技有限公司 Conference scene picture self-adaption method
CN113139491A (en) * 2021-04-30 2021-07-20 厦门盈趣科技股份有限公司 Video conference control method, system, mobile terminal and storage medium

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN102833518B (en) * 2011-06-13 2015-07-08 华为终端有限公司 Method and device for optimally configuring MCU (multipoint control unit) multipicture
US9237140B1 (en) * 2013-03-07 2016-01-12 Cisco Technologies, Inc. Acceptance of policies for cross-company online sessions
CN112351237A (en) * 2020-11-05 2021-02-09 安徽马钢和菱实业有限公司 Automatic switching decision algorithm for main video of video conference

Patent Citations (9)

Publication number Priority date Publication date Assignee Title
CN102648626A (en) * 2009-10-14 2012-08-22 思科系统国际公司 Device and method for camera control
US9113032B1 (en) * 2011-05-31 2015-08-18 Google Inc. Selecting participants in a video conference
US20140176663A1 (en) * 2012-12-20 2014-06-26 Microsoft Corporation Privacy camera
CN108234920A (en) * 2012-12-20 2018-06-29 微软技术许可有限责任公司 Video camera with privacy mode
WO2015192631A1 (en) * 2014-06-17 2015-12-23 中兴通讯股份有限公司 Video conferencing system and method
CN109257559A (en) * 2018-09-28 2019-01-22 苏州科达科技股份有限公司 A kind of image display method, device and the video conferencing system of panoramic video meeting
JP2020088618A (en) * 2018-11-27 2020-06-04 株式会社リコー Video conference system, communication terminal, and method for controlling microphone of communication terminal
CN112601044A (en) * 2020-12-08 2021-04-02 深圳市焦点数字科技有限公司 Conference scene picture self-adaption method
CN113139491A (en) * 2021-04-30 2021-07-20 厦门盈趣科技股份有限公司 Video conference control method, system, mobile terminal and storage medium

Also Published As

Publication number Publication date
WO2022262134A1 (en) 2022-12-22
CN113905204B (en) 2023-02-14

Similar Documents

Publication Publication Date Title
CA2874715C (en) Dynamic video and sound adjustment in a video conference
US8289363B2 (en) Video conferencing
US9860486B2 (en) Communication apparatus, communication method, and communication system
EP2180703A1 (en) Displaying dynamic caller identity during point-to-point and multipoint audio/videoconference
CN113905204B (en) Image display method, device, equipment and storage medium
GB2440376A (en) Wide angle video conference imaging
US20080063389A1 (en) Tracking a Focus Point by a Remote Camera
US20170127017A1 (en) Communication system, communication apparatus and communication method
JP4581210B2 (en) Video conference system
SG187168A1 (en) Image processing apparatus, image processing method, and computer-readable recording medium
JPH1042264A (en) Video conference system
JP6149433B2 (en) Video conference device, video conference device control method, and program
JP2022054192A (en) Remote conference system, server, photography device, audio output method, and program
US10785445B2 (en) Audiovisual transmissions adjustments via omnidirectional cameras
EP4106326A1 (en) Multi-camera automatic framing
CN116614598A (en) Video conference picture adjusting method, device, electronic equipment and medium
JP2013016933A (en) Terminal device, imaging method, and program
JP2007251355A (en) Relaying apparatus for interactive system, interactive system, and interactive method
CN113473066A (en) Video conference picture adjusting method
JP6565777B2 (en) COMMUNICATION DEVICE, CONFERENCE SYSTEM, PROGRAM, AND DISPLAY CONTROL METHOD
JP2017092675A (en) Information processing apparatus, conference system, information processing method, and program
JP2010028299A (en) Conference photographed image processing method, conference device, and the like
JP2002262138A (en) Image pickup system, video conference system, monitoring system, and information terminal with image pickup function
CN106454128B (en) Self-shooting bar adjusting method and device
CN115834822A (en) Video conference control method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant