CN112073613A - Conference portrait shooting method, interactive tablet, computer equipment and storage medium - Google Patents

Conference portrait shooting method, interactive tablet, computer equipment and storage medium

Info

Publication number
CN112073613A
CN112073613A (application CN202010948507.5A)
Authority
CN
China
Prior art keywords: target, image, wide-angle, target person, face image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010948507.5A
Other languages
Chinese (zh)
Other versions
CN112073613B (en)
Inventor
吴文宪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shirui Electronics Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shirui Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd, Guangzhou Shirui Electronics Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN202010948507.5A
Publication of CN112073613A
Application granted
Publication of CN112073613B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)

Abstract

The application discloses a conference portrait shooting method, comprising the following steps: collecting a wide-angle image in a video conference room; determining a target position of a target person in the wide-angle image; and controlling a zoom lens to shoot a target face image of the target person according to the target position. The application also discloses an interactive tablet, a computer device and a computer-readable storage medium. By combining a wide-angle lens with a zoom lens, a clear close-up image of the target person in a video conference can be obtained.

Description

Conference portrait shooting method, interactive tablet, computer equipment and storage medium
Technical Field
The present application relates to the field of image capturing, and in particular, to a method for capturing a conference portrait, an interactive tablet, a computer device, and a computer-readable storage medium.
Background
In a common video conference scenario, a plurality of people in a conference room participate in a video call, and in order to achieve a better video effect, the speaker is often tracked and shot in close-up. However, since a wide-angle shot is generally used by the video camera to record images of all participants, when a close-up image of a target person (e.g., the speaker) needs to be displayed, the image of the target person is cropped from the wide-angle shot, and the resulting close-up image of the target person is often unclear.
The above is only for the purpose of assisting understanding of the technical solutions of the present application, and does not represent an admission that the above is prior art.
Disclosure of Invention
The application mainly aims to provide a conference portrait shooting method, an interactive tablet, a computer device and a computer readable storage medium, and aims to solve the problem that a clear close-up image of a target person in a video conference is difficult to obtain.
In order to achieve the above object, the present application provides a method for capturing a conference portrait, comprising the following steps:
collecting wide-angle images in a video conference room;
determining a target position of a target person in the wide-angle image;
and controlling a zoom lens to shoot a target face image of the target person according to the target position.
Further, the step of determining the target position of the target person in the wide-angle image comprises:
determining a first position of the target person in the wide-angle image by using a sound positioning algorithm, identifying a face image of the target person in the wide-angle image, and determining a second position of the face image in the wide-angle image;
and obtaining the target position of the target person in the wide-angle image according to the first position and the second position.
Further, after the step of identifying the face image of the target person in the wide-angle image, the method further includes:
determining a first distance between the target person and a shooting position according to the number of pixel points corresponding to the face image;
and determining the focal length of the zoom lens according to the first distance, wherein the focal length is applied to shooting the target face image.
Further, after the step of determining the target position of the target person in the wide-angle image, the method further includes:
when a plurality of target persons are determined, determining a second distance between target positions corresponding to the plurality of target persons;
judging whether the second distance is smaller than or equal to a preset threshold value;
and if so, executing the step of controlling a zoom lens to shoot the target face image of the target person according to the target position.
Further, after the step of determining whether the second distance is smaller than a preset threshold, the method further includes:
if not, determining a first target person and a second target person according to the speaking time corresponding to each target person;
and controlling the zoom lens to shoot a target face image of the first target person according to the target position corresponding to the first target person, and taking a face image of the second target person in the wide-angle image as a target face image of the second target person.
Further, after the step of controlling the zoom lens to capture the target face image of the target person according to the target position, the method further includes:
generating a picture-in-picture image according to the target face image and the wide-angle image, and outputting the picture-in-picture image;
or outputting the target face image.
Further, the conference portrait shooting method further comprises the following steps:
and after the picture-in-picture image or the target face image is output, if the voice information is not detected within a preset time, outputting the wide-angle image.
To achieve the above object, the present application further provides an interactive tablet, including:
the acquisition module is used for acquiring wide-angle images in a video conference room;
a determination module for determining a target position of a target person in the wide-angle image;
and the shooting module is used for controlling the zoom lens to shoot the target face image of the target person according to the target position.
To achieve the above object, the present application also provides a computer device, comprising:
the computer device comprises a memory, a processor and a conference portrait shooting program which is stored on the memory and can run on the processor, wherein the conference portrait shooting program realizes the steps of the conference portrait shooting method when being executed by the processor.
To achieve the above object, the present application further provides a computer-readable storage medium, on which a program for capturing a conference portrait is stored, and when the program for capturing a conference portrait is executed by a processor, the steps of the method for capturing a conference portrait are implemented.
The conference portrait shooting method, the interactive tablet, the computer device and the computer-readable storage medium collect a wide-angle image in a video conference room, determine a target position of a target person in the wide-angle image, and control a zoom lens to shoot a target face image of the target person according to the target position. Therefore, by combining a wide-angle lens with a zoom lens, a wide-angle image of the video conference is obtained and, at the same time, a clear target face image of the target person can be obtained.
Drawings
Fig. 1 is a schematic diagram illustrating a step of a method for capturing a portrait of a conference in an embodiment of the present application;
fig. 2 is a schematic diagram illustrating another step of a method for capturing a portrait of a conference in an embodiment of the present application;
fig. 3 is a schematic diagram illustrating another step of a method for capturing a portrait of a conference in an embodiment of the present application;
fig. 4 is a schematic diagram of a further step of a method for capturing a portrait of a conference in an embodiment of the present application;
FIG. 5 is a block diagram illustrating a schematic structure of an interactive tablet in an embodiment of the present application;
FIG. 6 is a block diagram illustrating a computer device according to an embodiment of the present application;
fig. 7 is a block diagram schematically illustrating a configuration of a terminal system according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Referring to fig. 1, in an embodiment, the method for capturing a meeting portrait includes:
and step S10, acquiring a wide-angle image in the video conference room.
And step S20, determining the target position of the target person in the wide-angle image.
And step S30, controlling a zoom lens to shoot a target face image of the target person according to the target position.
In this embodiment, the execution terminal may be an interactive tablet (also referred to as an interactive smart tablet), a conference machine, a computer device, or the like, or may be a dedicated conference portrait shooting device (such as an image processor).
As set forth in step S10: optionally, the terminal system may be configured as shown in fig. 7, and includes an image processor, a first image sensor, a second image sensor, a wide-angle lens, a zoom lens, a holder (pan-tilt head), a microphone array and a sound processing device. The wide-angle lens is used for shooting and recording the scene inside the video conference room, and the first image sensor transmits the wide-angle image acquired through the wide-angle lens to the image processor. The microphone array is used for collecting voice information in the conference room and transmitting it to the sound processing device; while outputting audio, the sound processing device can analyze the voice information, perform sound source localization on a speaker in the conference room, and transmit the sound source localization data to the image processor. The zoom lens is mounted on the holder, and the image processor can adjust the shooting angle of the zoom lens by controlling the holder to rotate; the second image sensor transmits the image collected through the zoom lens to the image processor. The image processor is further used for integrating the images transmitted by the first image sensor and the second image sensor into a corresponding video and outputting the video.
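For illustration only, the data flow described above can be summarised in the following sketch; all component names and method signatures are hypothetical placeholders rather than part of this disclosure.
```python
# Illustrative sketch of the terminal-system data flow described above.
# All objects (wide_camera, mic_array, etc.) are hypothetical placeholders.

def conference_frame_pipeline(wide_camera, mic_array, sound_processor,
                              pan_tilt_zoom, image_processor):
    """One iteration of the capture loop: wide image in, composed video frame out."""
    wide_image = wide_camera.capture()                   # first image sensor -> image processor
    audio = mic_array.read()                             # microphone array -> sound processor
    source_direction = sound_processor.localize(audio)   # sound source localization data

    # The image processor fuses the sound direction with face detection to find
    # the target position, then steers the pan-tilt so the zoom lens faces it.
    target_xy = image_processor.locate_target(wide_image, source_direction)
    pan_tilt_zoom.aim_at(target_xy)
    face_image = pan_tilt_zoom.capture()                 # second image sensor -> image processor

    # Integrate both streams into the output video frame.
    return image_processor.compose(wide_image, face_image)
```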
It should be noted that the wide-angle lens may be a fisheye lens; the holder is a supporting platform (pan-tilt head) for the camera; and the image sensor uses the photoelectric conversion function of a photoelectric device to convert the light image on its light-sensing surface into an electrical signal proportional to that image.
It should be understood that the zoom lens, the holder and the second image sensor may be integrally formed, for example as a pan-tilt-zoom (PTZ) camera; the wide-angle lens and the first image sensor may also be integrally formed, for example as a wide-angle camera.
Optionally, when detecting that the video conference is started, the terminal may acquire a wide-angle image in the video conference room by using the wide-angle lens.
As set forth in step S20: optionally, when the terminal collects the wide-angle image, the microphone array is further used for collecting the voice information in the conference room, and a speaker corresponding to the voice information is defined as the target person.
Further, the terminal determines a first position of a target person in the wide-angle image by utilizing a sound source positioning technology according to the voice information acquired by the microphone array.
It should be noted that, in sound source localization, the position of a sound source can be obtained from the difference between the arrival times (or sound intensities) of the same sound at two or more microphones of the microphone array, combined with plane geometry based on the known spacing between the microphones.
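As a concrete illustration of this principle, the following sketch estimates the horizontal bearing of a sound source from the time difference of arrival at two microphones; the microphone spacing, speed of sound and far-field approximation are assumptions for the example, not values from this disclosure.
```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room temperature

def sound_source_angle(tdoa_seconds, mic_spacing_m=0.1):
    """Estimate the bearing of a sound source (far-field approximation).

    tdoa_seconds: arrival-time difference between the two microphones
                  (positive when the sound reaches the reference microphone first).
    Returns the angle in degrees measured from the array's broadside direction.
    """
    # Path-length difference implied by the arrival-time difference.
    path_difference = SPEED_OF_SOUND * tdoa_seconds
    # Far-field plane-wave geometry: sin(theta) = path_difference / mic_spacing.
    ratio = max(-1.0, min(1.0, path_difference / mic_spacing_m))
    return math.degrees(math.asin(ratio))

# Example: a 0.15 ms lead at one microphone with 10 cm spacing.
print(round(sound_source_angle(0.00015), 1))  # ~31.0 degrees off broadside
```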
Optionally, a rectangular plane coordinate system is constructed on the plane of the wide-angle image: when the plane of the wide-angle image is perpendicular to the horizontal plane, the direction along the line where the two planes intersect is taken as the horizontal direction (X-axis), and the direction perpendicular to the horizontal plane is taken as the vertical direction (Y-axis). In addition, the direction perpendicular to the plane of the wide-angle image is defined as the Z-axis direction.
Optionally, the X-axis coordinate of the target person in the wide-angle image is analyzed by using a sound localization algorithm and recorded as the first position.
After the first position of the target person is obtained, the portrait corresponding to the first position in the wide-angle image (namely the portrait of the target person) is located, and the face image of the target person is then recognized using a face recognition technique. The Y-axis coordinate of the face image in the vertical direction is then obtained from the distance between the region occupied by the face image in the wide-angle image and the upper and/or lower boundary of the wide-angle image, and is used as the second position of the face image in the wide-angle image.
Optionally, when determining the second position corresponding to the face image, the central point of the image area of the face image may be determined, and the Y-axis coordinate is then obtained as the second position from the distance between this central point and the upper and/or lower boundary of the wide-angle image.
It should be understood that, when recognizing a face image, the recognized region may include the entire head of the person.
Combining the first position (X-axis coordinate) and the second position (Y-axis coordinate), the target position (X, Y) of the target person in the wide-angle image can be obtained. Thus, the target position of the target person can be quickly located.
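A minimal sketch of how the two coordinates could be combined, assuming the sound-derived X coordinate and the face bounding box are already available; all names and example values are illustrative.
```python
def face_center_y(face_box):
    """Y coordinate (second position): distance of the face-box center
    from the upper boundary of the wide-angle image, in pixels."""
    x, y, w, h = face_box
    return y + h / 2.0

def target_position(sound_x, face_box):
    """Combine the sound-derived X coordinate (first position) with the
    face-derived Y coordinate (second position) into the target position."""
    return (sound_x, face_center_y(face_box))

# Example: sound localization puts the speaker at x = 812 px and the
# recognized face box in the wide-angle image is (790, 210, 60, 72).
print(target_position(812, (790, 210, 60, 72)))  # (812, 246.0)
```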
Optionally, after obtaining the wide-angle image, the terminal may also first recognize the face images in the wide-angle image and obtain the second position corresponding to each face image, then determine, in real time or periodically, the first position of the target person in the wide-angle image with the sound localization algorithm, locate the face image in the wide-angle image that lies on the same vertical line as the first position to obtain its corresponding second position, and combine the first position and the second position to obtain the target position of the target person in the wide-angle image.
Optionally, the terminal may also store a reference face image of the target person in advance. After the terminal acquires the wide-angle image, the face image in the wide-angle image that is the same as or similar to the reference face image is identified directly, by means of face recognition, as the face image of the target person, and the X-axis and Y-axis coordinates of that face image in the wide-angle image are then determined to obtain the target position of the target person in the wide-angle image.
As set forth in step S30: when the target position corresponding to the face image of the target person in the wide-angle image is obtained, the holder is controlled to rotate so that the shooting angle of the zoom lens faces the direction of the target position. The zoom lens is then aimed at the head of the target person, its focal length is adjusted so that the proportion of the head in the frame falls within a set range and the head is in sharp focus, and a close-up shot of the target person is taken to obtain the target face image of the target person.
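One possible way to turn the target position into pan/tilt commands is to map the pixel offset from the image center to angular offsets using the wide-angle lens's field of view; the sketch below assumes a linear pixel-to-angle mapping and illustrative field-of-view values, which are not taken from this disclosure.
```python
def pan_tilt_angles(target_xy, image_size, h_fov_deg=120.0, v_fov_deg=70.0):
    """Approximate pan/tilt angles (degrees) that point the zoom lens at the
    target position, assuming a linear pixel-to-angle mapping for simplicity.

    target_xy:  (x, y) target position in the wide-angle image, pixels.
    image_size: (width, height) of the wide-angle image, pixels.
    """
    x, y = target_xy
    width, height = image_size
    pan = (x / width - 0.5) * h_fov_deg    # negative = left of center
    tilt = (0.5 - y / height) * v_fov_deg  # negative = below center
    return pan, tilt

# Example: target at (812, 246) in a 1920x1080 wide-angle frame.
print(pan_tilt_angles((812, 246), (1920, 1080)))  # approx (-9.25, 19.06)
```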
Optionally, the focal length of the zoom lens used for shooting the target face image may be preset according to actual needs (e.g., a preset factory value); or determining a first distance between the target person and the shooting position according to the number of pixel points corresponding to the face image, and then determining the focal length of the zoom lens according to the first distance.
Therefore, by using the mode of combining the wide-angle lens and the zoom lens, a wide-angle image of the video conference is obtained, and meanwhile, a clear target face image of a target person can be obtained.
In an embodiment, as shown in fig. 2, on the basis of the embodiment shown in fig. 1, the method for capturing a portrait of a conference further includes:
step S40, determining a first distance between the target person and the shooting position according to the number of pixel points corresponding to the face image of the target person in the wide-angle image;
and step S41, determining the focal length of the zoom lens according to the first distance, wherein the focal length is applied to shooting the target face image.
In this embodiment, while identifying the face image of the target person in the wide-angle image, the terminal may also obtain the region occupied by the face image, count the number of pixels in that region, determine the ratio of the number of pixels of the face image to the total number of pixels of the wide-angle image, and, according to this ratio, determine the first distance along the Z-axis between the target person (or the face image of the target person) and the shooting position of the wide-angle lens (or the zoom lens).
It should be understood that the smaller this ratio, the greater the resulting first distance.
The terminal only needs to be trained on data in advance so that the relationship between different ratios and the corresponding first distances is stored; once the ratio of the number of pixels of the face image to the total number of pixels of the wide-angle image is obtained, the corresponding first distance can be looked up.
Optionally, the terminal determines in advance the correspondence between different first distances and focal lengths of the zoom lens, setting a corresponding focal length for each first distance, so that after the zoom lens is adjusted to the focal length corresponding to the current first distance, the proportion of the target person's head in the image captured by the zoom lens falls within the set range and the target person is in sharp focus.
It should be understood that the setting range can be set according to the actual situation, and the application is not limited.
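A minimal sketch of the two lookups described above, using assumed calibration tables; the ratio, distance and focal-length values are placeholders, not data from this disclosure.
```python
import bisect

# Assumed calibration data: ratio of face pixels to total pixels -> distance (m),
# and distance (m) -> zoom-lens focal length (mm). Values are illustrative only.
RATIO_TO_DISTANCE = [(0.0005, 8.0), (0.001, 6.0), (0.002, 4.0), (0.005, 2.5), (0.01, 1.5)]
DISTANCE_TO_FOCAL = [(1.5, 12.0), (2.5, 24.0), (4.0, 35.0), (6.0, 50.0), (8.0, 70.0)]

def lookup(table, key):
    """Pick the table entry whose key is closest to the query key."""
    keys = [k for k, _ in table]
    i = bisect.bisect_left(keys, key)
    candidates = table[max(0, i - 1):i + 1]
    return min(candidates, key=lambda kv: abs(kv[0] - key))[1]

def focal_length_for_face(face_pixels, total_pixels):
    """Ratio of face pixels -> first distance -> zoom-lens focal length."""
    ratio = face_pixels / float(total_pixels)
    first_distance = lookup(RATIO_TO_DISTANCE, ratio)  # the smaller the ratio, the farther
    return first_distance, lookup(DISTANCE_TO_FOCAL, first_distance)

# Example: a 60x72 px face in a 1920x1080 wide-angle image.
print(focal_length_for_face(60 * 72, 1920 * 1080))  # ratio ~0.0021 -> (4.0, 35.0)
```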
Therefore, the corresponding focal length is determined according to the first distance between the target person and the shooting position, and the zoom lens is controlled with the obtained focal length to take a close-up shot of the target person, so that a clear target face image of the target person can be obtained more accurately.
In an embodiment, as shown in fig. 3, on the basis of the above embodiments of fig. 1 to 2, after the step of determining the target position of the target person in the wide-angle image, the method further includes:
in step S50, when the plurality of target persons are determined, a second distance between the target positions corresponding to the plurality of target persons is determined.
Step S51, judging whether the second distance is smaller than or equal to a preset threshold value;
and step S60, if yes, executing the step of controlling the zoom lens to shoot the target face image of the target person according to the target position.
In this embodiment, the terminal may perform sound localization using the voice information acquired by the microphone array within a first preset time period. When several people speak within the first preset time period, the terminal can determine a plurality of target persons and the first position of each target person in the wide-angle image.
It should be noted that the first preset time period may be set according to actual requirements, such as 30 seconds, one minute, and the like.
Further, after obtaining the first position of each target person, the terminal respectively determines the face image of each target person in the wide-angle image, respectively determines the second position of the face image corresponding to each target person in the wide-angle image, and obtains the target position of the target person by combining the first position and the second position.
Optionally, if the terminal controls a plurality of zoom lenses and the number of zoom lenses is greater than or equal to the number of target persons, the terminal assigns a corresponding zoom lens to each target person according to the target position of that target person and captures the target face image of each target person with the assigned zoom lens.
Optionally, if the terminal controls only one zoom lens, or the number of zoom lenses is smaller than the number of target persons, the terminal determines the second distance between the target positions corresponding to the face images of the plurality of target persons; if the number of target persons is greater than or equal to three, the second distance is determined between the leftmost target person and the rightmost target person in the wide-angle image.
After the second distance between the target positions corresponding to the target persons is obtained, it is detected whether the second distance is smaller than or equal to the preset threshold. The preset threshold represents the maximum lateral span over which a target face image shot by the zoom lens still meets the sharpness requirement.
Optionally, when the terminal detects that the second distance is smaller than or equal to the preset threshold, the step of controlling the zoom lens to shoot the target face image of the target person according to the target position is executed (step S30), so that the captured target face image includes the heads of all the target persons. The terminal can determine the center position of the plurality of target persons according to the target position corresponding to each target person, aim the focus of the zoom lens at that center position, and then shoot the target face image.
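As an illustration of this decision logic for the single-zoom-lens case, the following sketch checks the lateral span of the target positions against a threshold and computes the aiming center; the threshold value and coordinates are assumed placeholders, not values from this disclosure.
```python
def plan_zoom_shot(target_positions, threshold_px=600):
    """Decide whether one zoom-lens shot can cover all target persons.

    target_positions: list of (x, y) target positions in the wide-angle image.
    Returns (can_cover_all, aim_point): if the lateral span between the leftmost
    and rightmost targets (the second distance) is within the threshold, aim the
    zoom lens at the center of all targets; otherwise a fallback is needed.
    """
    xs = [x for x, _ in target_positions]
    ys = [y for _, y in target_positions]
    second_distance = max(xs) - min(xs)
    center = (sum(xs) / len(xs), sum(ys) / len(ys))
    return second_distance <= threshold_px, center

# Example: three speakers close enough for a single close-up.
# Span is 260 px <= 600 px, so one shot aimed near (786.7, 305.0) covers them all.
print(plan_zoom_shot([(640, 300), (820, 310), (900, 305)]))
```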
Therefore, when a plurality of target persons exist, the shooting of the target persons by the zoom lenses can be distributed or adjusted according to actual needs, and clear target face images of the target persons can be obtained to the maximum extent.
In an embodiment, as shown in fig. 4, on the basis of the embodiments of fig. 1 to fig. 3, after the step of determining whether the second distance is smaller than a preset threshold, the method further includes:
Step S70, if not, determining a first target person and a second target person according to the speaking time corresponding to each target person;
step S71, controlling the zoom lens to capture a target face image of the first target person according to the target position corresponding to the first target person, and taking a face image of the second target person in the wide-angle image as a target face image of the second target person.
In this embodiment, after obtaining the second distances between the target positions corresponding to the plurality of target persons, it is detected whether the second distances are smaller than or equal to a preset threshold.
Optionally, when the terminal detects that the second distance is greater than the preset threshold, the terminal determines the first target person and the second target person according to the speaking time corresponding to each target person.
Optionally, the speaking time may be the speaking duration, in which case the target person with the longest speaking duration within the first preset time period is taken as the first target person and the remaining target persons are taken as second target persons.
Optionally, the speaking time may be a speaking time point, in which case the target person whose speaking time point is closest to the current time is taken as the first target person and the remaining target persons are taken as second target persons.
Optionally, after the terminal distinguishes the first target person from the second target persons among the plurality of target persons, the zoom lens is controlled to capture the target face image of the first target person according to the target position corresponding to the first target person (that is, step S30 is executed only for the first target person, with the zoom lens used to capture that person's target face image); for each second target person, the face image of that person is cut directly from the wide-angle image and enlarged, and the resulting image is used as the target face image of the second target person.
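For illustration, the following sketch distinguishes the first target person from the second target persons by speaking time and crops a second target person from the wide-angle image; the data structure, timestamps, face boxes and the simple nearest-neighbour enlargement are assumptions made for the example.
```python
import numpy as np

def split_targets_by_speaking_time(targets):
    """targets: list of dicts with 'name', 'last_spoke_at' (seconds) and 'face_box'.
    The person who spoke most recently becomes the first target person (shot with
    the zoom lens); everyone else is a second target person (cropped from the
    wide-angle image)."""
    ordered = sorted(targets, key=lambda t: t["last_spoke_at"], reverse=True)
    return ordered[0], ordered[1:]

def crop_face(wide_image, face_box, scale=2):
    """Cut the face region out of the wide-angle image and enlarge it.
    wide_image is an HxWx3 numpy array; face_box is (x, y, w, h) in pixels."""
    x, y, w, h = face_box
    crop = wide_image[y:y + h, x:x + w]
    # Nearest-neighbour enlargement; a real system would use proper interpolation.
    return np.repeat(np.repeat(crop, scale, axis=0), scale, axis=1)

first, second = split_targets_by_speaking_time([
    {"name": "A", "last_spoke_at": 120.0, "face_box": (790, 210, 60, 72)},
    {"name": "B", "last_spoke_at": 95.0, "face_box": (300, 220, 58, 70)},
])
print(first["name"], [t["name"] for t in second])  # A ['B']
```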
Therefore, when a plurality of target persons exist, the shooting of the target persons by the zoom lenses can be distributed or adjusted according to actual needs, and clear target face images of the target persons can be obtained to the maximum extent.
In one embodiment, after the step of controlling the zoom lens to capture the target face image of the target person according to the target position is performed, the method further includes:
and step S80, generating a picture-in-picture image according to the target face image and the wide-angle image, and outputting the picture-in-picture image.
Or, in step S81, outputting the target face image.
In this embodiment, when a video conference is started, particularly in a remote conference scenario, the video captured by the local terminal in the conference room is generally transmitted over a network and output to the remote terminal for playback, so that participants at the remote terminal can know what is happening in the local conference room.
Optionally, after obtaining the target face image of the target person, the terminal may generate a picture-in-picture image from the target face image and the wide-angle image, for example by using the wide-angle image as the bottom layer, superimposing the target face image on the wide-angle image, and displaying the target face image outside the person area of the wide-angle image. After obtaining the picture-in-picture image, the terminal continuously outputs it and integrates the output picture-in-picture images into a video that is transmitted to the remote terminal.
Or, after obtaining the target face image of the target person, the terminal may output only the target face image and integrate the continuously output target face images into a video that is transmitted to the remote terminal.
Therefore, the attention of the participants to the target person can be improved by emphasizing the target face image of the target person.
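A minimal sketch of the picture-in-picture composition described above, using NumPy arrays as frames; the inset size, corner placement and subsampling resize are illustrative assumptions rather than the method prescribed by this disclosure.
```python
import numpy as np

def compose_picture_in_picture(wide_image, face_image, inset_scale=4, margin=16):
    """Overlay the target face image onto the wide-angle image.

    The wide-angle image is used as the bottom layer; the face image is shrunk
    to roughly 1/inset_scale of the frame width and placed in the top-right
    corner (assumed here to lie outside the person area)."""
    frame = wide_image.copy()
    h, w = wide_image.shape[:2]
    inset_w = w // inset_scale
    inset_h = int(face_image.shape[0] * inset_w / face_image.shape[1])
    # Crude subsampling resize to keep the sketch dependency-free.
    ys = np.linspace(0, face_image.shape[0] - 1, inset_h).astype(int)
    xs = np.linspace(0, face_image.shape[1] - 1, inset_w).astype(int)
    inset = face_image[ys][:, xs]
    frame[margin:margin + inset_h, w - margin - inset_w:w - margin] = inset
    return frame

# Example with synthetic frames: 1080p wide image, 720p face close-up.
wide = np.zeros((1080, 1920, 3), dtype=np.uint8)
face = np.full((720, 1280, 3), 200, dtype=np.uint8)
print(compose_picture_in_picture(wide, face).shape)  # (1080, 1920, 3)
```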
Optionally, after the terminal outputs the picture-in-picture image or the target face image, if the microphone array does not detect voice information within a preset time period (denoted the second preset time period), for example because it is the turn of the participants at the remote terminal to speak and the participants at the local terminal remain quiet, the terminal switches from outputting the picture-in-picture image or the target face image to outputting the wide-angle image, and integrates the continuously output wide-angle images into a video that is transmitted to the remote terminal.
It should be noted that the second preset time period may be set according to actual requirements, such as one minute, two minutes, and the like.
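A sketch of the output switch described above, assuming the terminal tracks the time of the last detected voice; the timeout value and function names are illustrative.
```python
def select_output(last_voice_time, now, close_up_frame, wide_frame,
                  silence_timeout_s=60.0):
    """Return the frame to output: keep the picture-in-picture / target face
    frame while people are speaking, and fall back to the wide-angle frame
    once no voice has been detected for the second preset duration."""
    if now - last_voice_time >= silence_timeout_s:
        return wide_frame          # remote side sees the whole room again
    return close_up_frame

# Example: last speech 75 s ago with a 60 s timeout -> wide-angle output.
print(select_output(100.0, 175.0, "close_up", "wide"))  # wide
```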
Optionally, when the terminal determines that there are a plurality of target persons and detects that the second distance between the target positions corresponding to their face images is greater than the preset threshold, it is determined that the target persons are too far apart for the zoom lens to shoot a single target face image; the terminal can then directly output the wide-angle image of the video conference room and integrate the continuously output wide-angle images into a video that is transmitted to the remote terminal.
Therefore, by outputting the wide-angle image, remote participants can know the global situation of the local terminal conference conveniently.
Referring to fig. 5, an interactive tablet 10 is further provided in the embodiment of the present application, including:
the acquisition module 11 is used for acquiring wide-angle images in a video conference room;
a determining module 12 for determining a target position of a target person in the wide-angle image;
and the shooting module 13 is used for controlling the zoom lens to shoot a target face image of the target person according to the target position.
Referring to fig. 6, a computer device is also provided in the embodiments of the present application; the computer device may be a server, and its internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data for the conference portrait shooting program. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the conference portrait shooting method.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the present teachings and is not intended to limit the scope of the present teachings as applied to computer devices.
Furthermore, the present application also proposes a computer-readable storage medium, which includes a program for capturing a conference portrait, and when the program for capturing a conference portrait is executed by a processor, the steps of the method for capturing a conference portrait as described in the above embodiments are implemented. It is to be understood that the computer-readable storage medium in the present embodiment may be a volatile-readable storage medium or a non-volatile-readable storage medium.
In summary, the conference portrait shooting method, interactive tablet, computer device and storage medium provided in the embodiments of the present application collect a wide-angle image in a video conference room, determine a target position of a target person in the wide-angle image, and control a zoom lens to shoot a target face image of the target person according to the target position. Therefore, by combining a wide-angle lens with a zoom lens, a wide-angle image of the video conference is obtained and, at the same time, a clear target face image of the target person can be obtained.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by instructing the relevant hardware through a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, database or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
The above description is only for the preferred embodiment of the present application and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are intended to be included within the scope of the present application.

Claims (10)

1. A conference portrait shooting method, characterized by comprising the following steps:
collecting wide-angle images in a video conference room;
determining a target position of a target person in the wide-angle image;
and controlling a zoom lens to shoot a target face image of the target person according to the target position.
2. The conference portrait shooting method according to claim 1, wherein the step of determining the target position of the target person in the wide-angle image comprises:
determining a first position of the target person in the wide-angle image by using a sound positioning algorithm, identifying a face image of the target person in the wide-angle image, and determining a second position of the face image in the wide-angle image;
and obtaining the target position of the target person in the wide-angle image according to the first position and the second position.
3. The conference portrait shooting method according to claim 2, wherein after the step of recognizing the face image of the target person in the wide-angle image, the method further comprises:
determining a first distance between the target person and a shooting position according to the number of pixel points corresponding to the face image;
and determining the focal length of the zoom lens according to the first distance, wherein the focal length is applied to shooting the target face image.
4. The conference portrait shooting method according to any one of claims 1 to 3, wherein after the step of determining the target position of the target person in the wide-angle image, the method further comprises:
when a plurality of target persons are determined, determining a second distance between target positions corresponding to the plurality of target persons;
judging whether the second distance is smaller than or equal to a preset threshold value;
and if so, executing the step of controlling a zoom lens to shoot the target face image of the target person according to the target position.
5. The conference portrait shooting method according to claim 4, wherein after the step of determining whether the second distance is smaller than a preset threshold value, the method further comprises:
if not, determining a first target person and a second target person according to the speaking time corresponding to each target person;
and controlling the zoom lens to shoot a target face image of the first target person according to the target position corresponding to the first target person, and taking a face image of the second target person in the wide-angle image as a target face image of the second target person.
6. The conference portrait shooting method according to claim 1, wherein after the step of controlling the zoom lens to shoot the target face image of the target person according to the target position, the method further comprises:
generating a picture-in-picture image according to the target face image and the wide-angle image, and outputting the picture-in-picture image;
or outputting the target face image.
7. The conference portrait shooting method according to claim 6, further comprising:
and after the picture-in-picture image or the target face image is output, if the voice information is not detected within a preset time, outputting the wide-angle image.
8. An interactive tablet, comprising:
the acquisition module is used for acquiring wide-angle images in a video conference room;
a determination module for determining a target position of a target person in the wide-angle image;
and the shooting module is used for controlling the zoom lens to shoot the target face image of the target person according to the target position.
9. A computer device, characterized in that the computer device comprises a memory, a processor and a conference portrait shooting program stored in the memory and executable on the processor, wherein the conference portrait shooting program, when executed by the processor, implements the steps of the conference portrait shooting method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a conference portrait shooting program is stored thereon, and the conference portrait shooting program, when executed by a processor, implements the steps of the conference portrait shooting method according to any one of claims 1 to 7.
CN202010948507.5A 2020-09-10 2020-09-10 Conference portrait shooting method, interactive tablet, computer equipment and storage medium Active CN112073613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010948507.5A CN112073613B (en) 2020-09-10 2020-09-10 Conference portrait shooting method, interactive tablet, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010948507.5A CN112073613B (en) 2020-09-10 2020-09-10 Conference portrait shooting method, interactive tablet, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112073613A true CN112073613A (en) 2020-12-11
CN112073613B CN112073613B (en) 2021-11-23

Family

ID=73663593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010948507.5A Active CN112073613B (en) 2020-09-10 2020-09-10 Conference portrait shooting method, interactive tablet, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112073613B (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120169895A1 (en) * 2010-03-24 2012-07-05 Industrial Technology Research Institute Method and apparatus for capturing facial expressions
CN103327250A (en) * 2013-06-24 2013-09-25 深圳锐取信息技术股份有限公司 Method for controlling camera lens based on pattern recognition
US20180082439A1 (en) * 2016-09-20 2018-03-22 Kabushiki Kaisha Toshiba Image collation system and image collation method
CN106650671A (en) * 2016-12-27 2017-05-10 深圳英飞拓科技股份有限公司 Human face identification method, apparatus and system
KR101899318B1 (en) * 2017-04-24 2018-10-29 주식회사 이엠따블유 Hierarchical face object search method and face recognition method using the same, hierarchical face object search system and face recognition system using the same
CN108933915A (en) * 2017-05-26 2018-12-04 和硕联合科技股份有限公司 Video conference device and video conference management method
CN108682032A (en) * 2018-04-02 2018-10-19 广州视源电子科技股份有限公司 Method and device for controlling video image output, readable storage medium and terminal
CN108377342A (en) * 2018-05-22 2018-08-07 Oppo广东移动通信有限公司 double-camera photographing method, device, storage medium and terminal
CN109257559A (en) * 2018-09-28 2019-01-22 苏州科达科技股份有限公司 A kind of image display method, device and the video conferencing system of panoramic video meeting
CN110113515A (en) * 2019-05-13 2019-08-09 Oppo广东移动通信有限公司 Camera control method and Related product
CN111163281A (en) * 2020-01-09 2020-05-15 北京中电慧声科技有限公司 Panoramic video recording method and device based on voice tracking
CN111263106A (en) * 2020-02-25 2020-06-09 厦门亿联网络技术股份有限公司 Picture tracking method and device for video conference
CN111586341A (en) * 2020-05-20 2020-08-25 深圳随锐云网科技有限公司 Shooting method and picture display method of video conference shooting device

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112907617A (en) * 2021-01-29 2021-06-04 深圳壹秘科技有限公司 Video processing method and device
WO2022160748A1 (en) * 2021-01-29 2022-08-04 深圳壹秘科技有限公司 Video processing method and apparatus
CN112907617B (en) * 2021-01-29 2024-02-20 深圳壹秘科技有限公司 Video processing method and device
CN113139422A (en) * 2021-03-02 2021-07-20 广州朗国电子科技有限公司 Conference portrait shooting verification method, equipment and storage medium
CN113139422B (en) * 2021-03-02 2023-12-15 广州朗国电子科技股份有限公司 Shooting verification method, equipment and storage medium for conference portrait
CN113473061A (en) * 2021-06-10 2021-10-01 荣耀终端有限公司 Video call method and electronic equipment
CN113473061B (en) * 2021-06-10 2022-08-12 荣耀终端有限公司 Video call method and electronic equipment
CN114040145A (en) * 2021-11-20 2022-02-11 深圳市音络科技有限公司 Video conference portrait display method, system, terminal and storage medium
WO2024087641A1 (en) * 2022-10-27 2024-05-02 深圳奥尼电子股份有限公司 Audio and video control method with intelligent wireless microphone tracking function
CN115529435A (en) * 2022-11-29 2022-12-27 长沙朗源电子科技有限公司 High-definition conference picture wireless transmission method, system, equipment and storage medium
WO2024145878A1 (en) * 2023-01-05 2024-07-11 广州视源电子科技股份有限公司 Video processing method and apparatus, device, and storage medium
CN116567385A (en) * 2023-06-14 2023-08-08 深圳市宗匠科技有限公司 Image acquisition method and image acquisition device

Also Published As

Publication number Publication date
CN112073613B (en) 2021-11-23

Similar Documents

Publication Publication Date Title
CN112073613B (en) Conference portrait shooting method, interactive tablet, computer equipment and storage medium
CN109754811B (en) Sound source tracking method, device, equipment and storage medium based on biological characteristics
TWI311286B (en)
US20210110188A1 (en) Stereo imaging device
KR100729280B1 (en) Iris Identification System and Method using Mobile Device with Stereo Camera
JP4639869B2 (en) Imaging apparatus and timer photographing method
CN101785306B (en) Method and system for automatic camera control
KR101831973B1 (en) Iris imaging apparatus and methods for configuring an iris imaging apparatus
US10614293B2 (en) Facial recognition apparatus, recognition method and program therefor, and information device
CN111263106B (en) Picture tracking method and device for video conference
CN104243800B (en) Control device and storage medium
CN109657576B (en) Image acquisition control method, device, storage medium and system
KR101530255B1 (en) Cctv system having auto tracking function of moving target
JP5477777B2 (en) Image acquisition device
JP2021114716A (en) Imaging apparatus
CN116665111A (en) Attention analysis method, system and storage medium based on video conference system
US8295605B2 (en) Method for identifying dimensions of shot subject
CN112839165A (en) Method and device for realizing face tracking camera shooting, computer equipment and storage medium
JP2023057090A (en) Photographing control system
CN116614598A (en) Video conference picture adjusting method, device, electronic equipment and medium
CN111062313A (en) Image identification method, image identification device, monitoring system and storage medium
Fiala et al. A panoramic video and acoustic beamforming sensor for videoconferencing
CN111918127B (en) Video clipping method and device, computer readable storage medium and camera
US20120188437A1 (en) Electronic camera
JP7363971B2 (en) Image processing device and image processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant