CN112073613A - Conference portrait shooting method, interactive tablet, computer equipment and storage medium - Google Patents
Conference portrait shooting method, interactive tablet, computer equipment and storage medium
- Publication number
- CN112073613A (application number CN202010948507.5A)
- Authority
- CN
- China
- Prior art keywords
- target
- image
- wide
- target person
- face image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
      - H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
      - G06V20/00—Scenes; Scene-specific elements
        - G06V20/40—Scenes; Scene-specific elements in video content
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
      - G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
        - G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
          - G06V40/16—Human faces, e.g. facial parts, sketches or expressions
            - G06V40/161—Detection; Localisation; Normalisation
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
      - H04N7/00—Television systems
        - H04N7/14—Systems for two-way working
          - H04N7/15—Conference systems
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Studio Devices (AREA)
Abstract
The application discloses a method for shooting a conference portrait, comprising the following steps: collecting a wide-angle image in a video conference room; determining a target position of a target person in the wide-angle image; and controlling a zoom lens to shoot a target face image of the target person according to the target position. The application also discloses an interactive tablet, a computer device and a computer-readable storage medium. In this way, a clear close-up image of the target person in the video conference can be obtained.
Description
Technical Field
The present application relates to the field of image capturing, and in particular, to a method for capturing a conference portrait, an interactive tablet, a computer device, and a computer-readable storage medium.
Background
In a common video conference scenario, several people in a conference room participate in a video call, and to achieve a better video effect, the speaker is often tracked and shot in close-up. However, since a camera generally uses a wide-angle shot to record all participants, when a close-up image of a target person (e.g., the speaker) needs to be displayed, it is cropped from the wide-angle shot, and the resulting close-up image is often unclear.
The above is only for the purpose of assisting understanding of the technical solutions of the present application, and does not represent an admission that the above is prior art.
Disclosure of Invention
The main purpose of the application is to provide a conference portrait shooting method, an interactive tablet, a computer device and a computer-readable storage medium, so as to solve the problem that it is difficult to obtain a clear close-up image of a target person in a video conference.
In order to achieve the above object, the present application provides a method for capturing a conference portrait, comprising the following steps:
collecting wide-angle images in a video conference room;
determining a target position of a target person in the wide-angle image;
and controlling a zoom lens to shoot a target face image of the target person according to the target position.
Further, the step of determining the target position of the target person in the wide-angle image comprises:
determining a first position of the target person in the wide-angle image by using a sound positioning algorithm, identifying a face image of the target person in the wide-angle image, and determining a second position of the face image in the wide-angle image;
and obtaining the target position of the target person in the wide-angle image according to the first position and the second position.
Further, after the step of identifying the face image of the target person in the wide-angle image, the method further includes:
determining a first distance between the target person and a shooting position according to the number of pixel points corresponding to the face image;
and determining the focal length of the zoom lens according to the first distance, wherein the focal length is applied to shooting the target face image.
Further, after the step of determining the target position of the target person in the wide-angle image, the method further includes:
when a plurality of target persons are determined, determining a second distance between target positions corresponding to the plurality of target persons;
judging whether the second distance is smaller than or equal to a preset threshold value;
and if so, executing the step of controlling a zoom lens to shoot the target face image of the target person according to the target position.
Further, after the step of judging whether the second distance is smaller than or equal to the preset threshold value, the method further includes:
if not, determining a first target person and a second target person according to the speaking time corresponding to each target person;
and controlling the zoom lens to shoot a target face image of the first target person according to the target position corresponding to the first target person, and taking a face image of the second target person in the wide-angle image as a target face image of the second target person.
Further, after the step of controlling the zoom lens to capture the target face image of the target person according to the target position, the method further includes:
generating a picture-in-picture image according to the target face image and the wide-angle image, and outputting the picture-in-picture image;
or outputting the target face image.
Further, the method for shooting the conference portrait further comprises the following steps:
and after the picture-in-picture image or the target face image is output, if the voice information is not detected within a preset time, outputting the wide-angle image.
To achieve the above object, the present application further provides an interactive tablet, including:
the acquisition module is used for acquiring wide-angle images in a video conference room;
a determination module for determining a target position of a target person in the wide-angle image;
and the shooting module is used for controlling the zoom lens to shoot the target face image of the target person according to the target position.
To achieve the above object, the present application also provides a computer device, comprising:
the computer device comprises a memory, a processor, and a conference portrait shooting program stored in the memory and executable on the processor; when executed by the processor, the conference portrait shooting program implements the steps of the conference portrait shooting method described above.
To achieve the above object, the present application further provides a computer-readable storage medium, on which a program for capturing a conference portrait is stored, and when the program for capturing a conference portrait is executed by a processor, the steps of the method for capturing a conference portrait are implemented.
The conference portrait shooting method, the interactive tablet, the computer device and the computer-readable storage medium collect a wide-angle image in a video conference room; determine a target position of a target person in the wide-angle image; and control a zoom lens to shoot a target face image of the target person according to the target position. Therefore, by combining a wide-angle lens and a zoom lens, a wide-angle image of the video conference is obtained while a clear target face image of the target person can also be obtained.
Drawings
Fig. 1 is a schematic diagram illustrating a step of a method for capturing a portrait of a conference in an embodiment of the present application;
fig. 2 is a schematic diagram illustrating another step of a method for capturing a portrait of a conference in an embodiment of the present application;
fig. 3 is a schematic diagram illustrating another step of a method for capturing a portrait of a conference in an embodiment of the present application;
fig. 4 is a schematic diagram of a further step of a method for capturing a portrait of a conference in an embodiment of the present application;
FIG. 5 is a block diagram illustrating a schematic structure of an interactive tablet in an embodiment of the present application;
FIG. 6 is a block diagram illustrating a computer device according to an embodiment of the present application;
fig. 7 is a block diagram schematically illustrating a configuration of a terminal system according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Referring to fig. 1, in an embodiment, the method for shooting a conference portrait includes:
and step S10, acquiring a wide-angle image in the video conference room.
And step S20, determining the target position of the target person in the wide-angle image.
And step S30, controlling a zoom lens to shoot a target face image of the target person according to the target position.
In this embodiment, the execution terminal may be an interactive tablet (also referred to as an interactive smart tablet), a conference machine, a computer device or the like, or may be a conference portrait shooting apparatus (for example, an image processor).
As set forth in step S10: optionally, the terminal system may be configured as shown in fig. 7, and includes an image processor, a first image sensor, a second image sensor, a wide-angle lens, a zoom lens, a pan-tilt head, a microphone array and a sound processing device. The wide-angle lens shoots and records the scene inside the video conference room, and the first image sensor transmits the wide-angle image acquired through the wide-angle lens to the image processor. The microphone array collects voice information in the conference room and transmits it to the sound processing device, which analyzes the voice information while outputting audio, performs sound source localization of the speaker in the conference room, and transmits the sound source localization data to the image processor. The zoom lens is mounted on the pan-tilt head; the image processor can adjust the shooting angle of the zoom lens by controlling the pan-tilt head to rotate, and the second image sensor transmits the image acquired through the zoom lens to the image processor. The image processor also integrates the images transmitted by the first image sensor and the second image sensor into a corresponding video and outputs the video.
It should be noted that the wide-angle lens may be a fisheye lens; the pan-tilt head is the supporting platform on which the camera is mounted; and the image sensor uses the photoelectric conversion function of a photoelectric device to convert the optical image on its photosensitive surface into an electrical signal in a corresponding proportional relationship with the optical image.
It should be understood that the zoom lens, the pan-tilt head and the second image sensor may be integrally formed, for example as a zoom pan-tilt camera; the wide-angle lens and the first image sensor may likewise be integrally formed, for example as a wide-angle camera.
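By way of illustration only, the cooperation of these components can be sketched as follows (a minimal Python sketch; the class and method names such as `WideCamera`, `PtzZoomCamera` and `MicArray` are hypothetical and are not prescribed by this disclosure):

```python
# Hypothetical structural sketch of the terminal described above; names are
# illustrative only and do not appear in the disclosure.

class WideCamera:
    """Wide-angle (e.g. fisheye) lens + first image sensor."""
    def capture(self):
        raise NotImplementedError  # would return a frame of the whole room

class PtzZoomCamera:
    """Zoom lens mounted on the pan-tilt head + second image sensor."""
    def point_at(self, pan_deg, tilt_deg):
        raise NotImplementedError  # rotate the pan-tilt head toward the target
    def set_focal_length(self, focal_mm):
        raise NotImplementedError  # zoom so the head portrait fills the set range
    def capture(self):
        raise NotImplementedError  # would return a close-up frame

class MicArray:
    """Microphone array + sound processing device (sound source localization)."""
    def locate_speaker(self):
        raise NotImplementedError  # horizontal position of the current speaker, or None
```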
Optionally, when detecting that the video conference is started, the terminal may acquire a wide-angle image in the video conference room by using the wide-angle lens.
As set forth in step S20: optionally, while the terminal collects the wide-angle image, the microphone array collects voice information in the conference room, and the speaker corresponding to the voice information is taken as the target person.
Further, the terminal determines a first position of a target person in the wide-angle image by utilizing a sound source positioning technology according to the voice information acquired by the microphone array.
It should be noted that, in sound source localization technology, the position of a sound source can be obtained from the difference between the arrival times (or sound intensities) at which at least two microphones in the microphone array receive the same sound source, combined with plane geometry and the known distances between the microphones.
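As a minimal illustration of this idea (not part of the original disclosure), the sketch below estimates a horizontal bearing from two microphones of the array, assuming far-field plane-wave geometry and a known microphone spacing; a practical array would use more capsules and a more robust estimator such as GCC-PHAT:

```python
import numpy as np

def estimate_bearing(sig_left, sig_right, mic_distance_m, fs, speed_of_sound=343.0):
    """Rough TDOA bearing estimate from two microphone channels.

    sig_left / sig_right are the sampled signals of two microphones spaced
    mic_distance_m apart, sampled at fs Hz.
    """
    # Arrival-time difference via cross-correlation of the two channels.
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag_samples = np.argmax(corr) - (len(sig_right) - 1)   # sign tells which side
    tdoa = lag_samples / fs                                  # seconds

    # Far-field plane-wave geometry: path difference = spacing * sin(bearing).
    sin_theta = np.clip(tdoa * speed_of_sound / mic_distance_m, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))                 # bearing from broadside
```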
Optionally, a rectangular plane coordinate system is constructed in the plane of the wide-angle image: with the plane of the wide-angle image perpendicular to the horizontal plane, the direction along which the two planes intersect is the horizontal direction (X-axis direction), and the direction perpendicular to the horizontal plane is the vertical direction (Y-axis direction). In addition, the direction perpendicular to the plane of the wide-angle image is defined as the Z-axis direction.
Optionally, the X-axis coordinate of the target person in the wide-angle image is analyzed by using a sound localization algorithm and recorded as the first position.
After the first position of the target person is obtained, the portrait corresponding to the first position in the wide-angle image (that is, the portrait of the target person) is located, the face image of the target person is then obtained through face recognition, and a Y-axis coordinate of the face image in the vertical direction is obtained according to the distance between the image area of the face image in the wide-angle image and the upper and/or lower boundary of the wide-angle image; this Y-axis coordinate is used as the second position of the face image in the wide-angle image.
Optionally, when the second position corresponding to the face image is determined, the central point of the image area of the face image may be determined, and then the Y-axis coordinate is obtained as the second position according to the distance between the central point of the image area and the upper boundary and/or the lower boundary of the wide-angle image.
It should be understood that when recognizing a face image, an image containing the entire head may be recognized.
Combining the first position (X-axis coordinate) and the second position (Y-axis coordinate), the target position (X, Y) of the target person in the wide-angle image can be obtained. Thus, the target position of the target person can be quickly located.
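A minimal sketch of combining the two positions might look as follows; the OpenCV Haar cascade detector merely stands in for the face recognition technology mentioned above, which the disclosure does not otherwise specify:

```python
import cv2

# Illustrative only: any face detector could be substituted here.
_face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def locate_target(wide_frame, sound_x):
    """Return the target position (X, Y) of the target person in the wide image.

    sound_x is the X-axis coordinate (first position) already obtained from the
    sound localization step; the Y-axis coordinate (second position) comes from
    the face image whose horizontal centre lies closest to that X.
    """
    gray = cv2.cvtColor(wide_frame, cv2.COLOR_BGR2GRAY)
    faces = _face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = min(faces, key=lambda f: abs((f[0] + f[2] / 2) - sound_x))
    face_center_y = y + h / 2          # second position: centre of the face area
    return sound_x, face_center_y
```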
Optionally, after obtaining the wide-angle image, the terminal may also first identify the face images in the wide-angle image to obtain the second position corresponding to each face image, then determine, in real time or periodically, the first position of the target person in the wide-angle image using the sound localization algorithm, locate the face image in the wide-angle image that lies on the same vertical line as the first position to obtain its corresponding second position, and combine the first position and the second position to obtain the target position of the target person in the wide-angle image.
Optionally, the terminal may also store a reference face image of the target person in advance; after acquiring the wide-angle image, the terminal directly identifies, using face recognition technology, the face image in the wide-angle image that is identical or similar to the reference face image as the face image of the target person, and then determines the X-axis and Y-axis coordinates of that face image in the wide-angle image to obtain the target position of the target person.
As set forth in step S30: when the target position corresponding to the face image of the target person in the wide-angle image is obtained, the pan-tilt head is controlled to rotate so that the shooting angle of the zoom lens faces the direction of the target position; the zoom lens is then aimed at the head portrait of the target person, the focal length of the zoom lens is adjusted so that the image proportion of the head portrait falls within a set range and the head portrait is in sharp focus, and a close-up shot of the target person is taken to obtain the target face image of the target person.
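How the target position is converted into a rotation of the pan-tilt head is not spelled out in the disclosure; the sketch below makes the simplifying assumption that pixel offsets map linearly onto the wide lens's field of view (the `h_fov_deg` and `v_fov_deg` defaults are assumed values), whereas a real fisheye lens would need a proper de-warping model:

```python
def to_pan_tilt(target_xy, frame_size, h_fov_deg=120.0, v_fov_deg=70.0):
    """Map a position in the wide image to pan/tilt angles for the pan-tilt head."""
    x, y = target_xy
    width, height = frame_size
    pan = (x / width - 0.5) * h_fov_deg       # left/right of the image centre
    tilt = (0.5 - y / height) * v_fov_deg     # image Y grows downward
    return pan, tilt
```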
Optionally, the focal length of the zoom lens used for shooting the target face image may be preset according to actual needs (e.g., a preset factory value); or determining a first distance between the target person and the shooting position according to the number of pixel points corresponding to the face image, and then determining the focal length of the zoom lens according to the first distance.
Therefore, by using the mode of combining the wide-angle lens and the zoom lens, a wide-angle image of the video conference is obtained, and meanwhile, a clear target face image of a target person can be obtained.
In an embodiment, as shown in fig. 2, on the basis of the embodiment shown in fig. 1, the method for capturing a portrait of a conference further includes:
step S40, determining a first distance between the target person and the shooting position according to the number of pixel points corresponding to the face image of the target person in the wide-angle image;
and step S41, determining the focal length of the zoom lens according to the first distance, wherein the focal length is applied to shooting the target face image.
In this embodiment, while identifying the face image of the target person in the wide-angle image, the terminal may further obtain the image area corresponding to the face image, calculate the number of pixels in that area, determine the ratio of the number of pixels corresponding to the face image to the total number of pixels in the wide-angle image, and determine, according to this ratio, a first distance along the Z-axis direction between the target person (or the face image of the target person) and the shooting position of the wide-angle lens (or the zoom lens).
It should be understood that the smaller the ratio, the farther the resulting first distance.
The terminal only needs to be calibrated through data analysis in advance and to store the relation between different ratio values and the corresponding first distances; once the ratio of the number of pixels of the face image to the total number of pixels of the wide-angle image is obtained, the corresponding first distance can be looked up.
Optionally, the terminal determines in advance the correspondence between different first distances and focal lengths of the zoom lens, setting a focal length for each first distance, so that after the zoom lens is adjusted to the focal length corresponding to the current first distance, the proportion of the head portrait of the target person captured by the zoom lens falls within the set range and the target person is in sharp focus.
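A minimal sketch of the two stored relations might look as follows; the calibration numbers are placeholders, since the disclosure only states that such correspondences are determined and stored in advance:

```python
# Hypothetical calibration tables (placeholder values, not disclosed data):
# face-pixel ratio -> first distance, and first distance -> focal length.
_RATIO_TO_DISTANCE_M = [(0.0200, 1.0), (0.0100, 2.0), (0.0050, 3.5), (0.0020, 6.0)]
_DISTANCE_TO_FOCAL_MM = [(1.0, 12.0), (2.0, 25.0), (3.5, 50.0), (6.0, 85.0)]

def first_distance(face_pixels, frame_pixels):
    """Smaller face-pixel ratio -> farther target (nearest calibrated entry)."""
    ratio = face_pixels / frame_pixels
    return min(_RATIO_TO_DISTANCE_M, key=lambda e: abs(e[0] - ratio))[1]

def focal_length_for(distance_m):
    """Pick the pre-set focal length for the current first distance."""
    return min(_DISTANCE_TO_FOCAL_MM, key=lambda e: abs(e[0] - distance_m))[1]
```

For instance, `focal_length_for(first_distance(face_w * face_h, frame_w * frame_h))` would give the focal length to apply to the zoom lens before the close-up shot.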
It should be understood that the set range can be configured according to the actual situation, which is not limited in this application.
Therefore, the corresponding focal length is determined according to the first distance between the target person and the shooting position, and the zoom lens is controlled with the obtained focal length to take a close-up shot of the target person, so that a clear target face image of the target person can be obtained more accurately.
In an embodiment, as shown in fig. 3, on the basis of the above embodiments of fig. 1 to 2, after the step of determining the target position of the target person in the wide-angle image, the method further includes:
in step S50, when the plurality of target persons are determined, a second distance between the target positions corresponding to the plurality of target persons is determined.
Step S51, judging whether the second distance is smaller than or equal to a preset threshold value;
and step S60, if yes, executing the step of controlling the zoom lens to shoot the target face image of the target person according to the target position.
In this embodiment, the terminal may perform sound localization by using the voice information acquired by the microphone array within the first preset time period. When a plurality of persons speak within the first preset time period, the terminal can determine a plurality of target persons and first positions of the target persons in the wide-angle image.
It should be noted that the first preset time period may be set according to actual requirements, such as 30 seconds, one minute, and the like.
Further, after obtaining the first position of each target person, the terminal respectively determines the face image of each target person in the wide-angle image, respectively determines the second position of the face image corresponding to each target person in the wide-angle image, and obtains the target position of the target person by combining the first position and the second position.
Optionally, if the terminal controls a plurality of zoom lenses, and the number of the zoom lenses is greater than or equal to the number of the target persons, the terminal assigns a corresponding zoom lens to each target person according to the target position corresponding to each target person, and captures a target face image of the target person by using the zoom lenses.
Optionally, if the terminal controls only one zoom lens, or the number of zoom lenses is smaller than the number of target persons, the terminal determines the second distance between the target positions corresponding to the face images of the plurality of target persons; if there are three or more target persons, the second distance is determined between the leftmost target person and the rightmost target person in the wide-angle image.
After the second distance between the target positions corresponding to the target persons is obtained, whether the second distance is smaller than or equal to a preset threshold is detected. The preset threshold represents the maximum lateral span over which a target face image shot by the zoom lens still meets the sharpness requirement.
Optionally, when the terminal detects that the second distance is smaller than or equal to the preset threshold, the step of controlling the zoom lens to shoot the target face image of the target person according to the target position (step S30) is executed, so that the shot target face image contains the head portraits of all the target persons. The terminal may determine the central position of the plurality of target persons according to the target position corresponding to each target person, aim the focus of the zoom lens at this central position, and then shoot the target face image.
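The decision of steps S50 to S60 could be sketched as follows (the function name and the pixel-based threshold are illustrative assumptions):

```python
def plan_multi_target_shot(target_positions, threshold_px):
    """Decide how to frame several target persons (sketch of steps S50-S60).

    target_positions is a list of (x, y) target positions in the wide image;
    threshold_px is the preset threshold. Returns the centre point to aim the
    zoom lens at when the targets are close enough, otherwise None so the
    caller falls back to the first/second target handling of steps S70-S71.
    """
    xs = [p[0] for p in target_positions]
    ys = [p[1] for p in target_positions]
    second_distance = max(xs) - min(xs)      # leftmost vs rightmost target
    if second_distance <= threshold_px:
        return sum(xs) / len(xs), sum(ys) / len(ys)
    return None
```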
Therefore, when a plurality of target persons exist, the shooting of the target persons by the zoom lenses can be distributed or adjusted according to actual needs, and clear target face images of the target persons can be obtained to the maximum extent.
In an embodiment, as shown in fig. 4, on the basis of the embodiments of fig. 1 to fig. 3, after the step of judging whether the second distance is smaller than or equal to the preset threshold, the method further includes:
step S70, if not, determining a first target person and a second target person according to the speaking time corresponding to each target person;
step S71, controlling the zoom lens to capture a target face image of the first target person according to the target position corresponding to the first target person, and taking a face image of the second target person in the wide-angle image as a target face image of the second target person.
In this embodiment, after obtaining the second distances between the target positions corresponding to the plurality of target persons, it is detected whether the second distances are smaller than or equal to a preset threshold.
Optionally, when the terminal detects that the second distance is greater than the preset threshold, the terminal determines the first target person and the second target person according to the speaking time corresponding to each target person.
Optionally, the speaking time may be a speaking duration: the target person with the longest speaking duration within the first preset time period is taken as the first target person, and the remaining target persons are taken as second target persons.
Optionally, the speaking time may be a speaking time point, and the target person whose corresponding speaking time point is closest to the current time point is taken as the first target person, and the remaining target persons are taken as the second target persons.
Optionally, after the terminal distinguishes the first target person from the second target person among the plurality of target persons, the zoom lens is controlled to capture the target face image of the first target person according to the target position corresponding to the first target person (that is, step S30 is executed only for the first target person); for the second target person, the face image of the second target person is directly cut out of the wide-angle image and enlarged, and the resulting image is used as the target face image of the second target person.
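Cutting the second target person's face image out of the wide-angle image and enlarging it could be sketched as follows; the padding and scale values are illustrative, and bicubic interpolation merely stands in for the image enlargement processing:

```python
import cv2

def crop_second_target(wide_frame, face_box, scale=2.0, margin=0.3):
    """Crop the second target person's face from the wide image and enlarge it.

    face_box is (x, y, w, h) from the face detection step; scale and margin
    are illustrative defaults, not values from the disclosure.
    """
    x, y, w, h = face_box
    frame_h, frame_w = wide_frame.shape[:2]
    # Pad the crop a little so the whole head fits, clamped to the frame.
    pad_w, pad_h = int(w * margin), int(h * margin)
    x0, y0 = max(x - pad_w, 0), max(y - pad_h, 0)
    x1, y1 = min(x + w + pad_w, frame_w), min(y + h + pad_h, frame_h)
    crop = wide_frame[y0:y1, x0:x1]
    return cv2.resize(crop, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)
```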
Therefore, when a plurality of target persons exist, the shooting of the target persons by the zoom lenses can be distributed or adjusted according to actual needs, and clear target face images of the target persons can be obtained to the maximum extent.
In one embodiment, after the step of controlling the zoom lens to capture the target face image of the target person according to the target position is performed, the method further includes:
and step S80, generating a picture-in-picture image according to the target face image and the wide-angle image, and outputting the picture-in-picture image.
Or, in step S81, outputting the target face image.
In this embodiment, when the video conference is in progress, especially in a remote conference scenario, the video of the conference room captured by the local terminal is generally transmitted over the network and output to a remote terminal for playback, so that the participants at the remote terminal can follow the situation in the local conference room.
Optionally, after obtaining the target face image of the target person, the terminal may generate a picture-in-picture image according to the target face image and the wide-angle image, for example, taking the wide-angle image as a bottom-layer image, superimpose the target face image on the wide-angle image, and display the target face image outside a person area in the wide-angle image, thereby obtaining the picture-in-picture image. And after the terminal obtains the picture-in-picture image, continuously outputting the picture-in-picture image, and integrating the picture-in-picture image into a video to be transmitted to the remote terminal.
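Generating the picture-in-picture image could be sketched as follows; placing the inset in the top-right corner, as well as the size and margin values, are assumptions, since the disclosure only requires the target face image to be shown outside the person area of the wide-angle image:

```python
import cv2

def compose_picture_in_picture(wide_frame, face_frame, inset_ratio=0.3, margin=16):
    """Superimpose the target face image on the wide-angle image (step S80 sketch)."""
    frame_h, frame_w = wide_frame.shape[:2]
    inset_w = int(frame_w * inset_ratio)
    inset_h = int(inset_w * face_frame.shape[0] / face_frame.shape[1])
    inset = cv2.resize(face_frame, (inset_w, inset_h))

    out = wide_frame.copy()
    x0, y0 = frame_w - inset_w - margin, margin      # top-right corner
    out[y0:y0 + inset_h, x0:x0 + inset_w] = inset
    return out
```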
Or, after obtaining the target face image of the target person, the terminal may only output the target face image, and integrate the continuously output target face image into a video to transmit to the remote terminal.
Therefore, the attention of the participants to the target person can be improved by emphasizing the target face image of the target person.
Optionally, after the terminal outputs the picture-in-picture image or the target face image, if the microphone array does not detect voice information within a preset time (recorded as the second preset time period), for example because it is the turn of the participants at the remote terminal to speak and the participants at the local terminal remain quiet, the terminal switches from outputting the picture-in-picture image or the target face image to outputting the wide-angle image, and integrates the continuously output wide-angle image into a video to be transmitted to the remote terminal.
It should be noted that the second preset time period may be set according to actual requirements, such as one minute, two minutes, and the like.
Optionally, when the terminal determines a plurality of target persons and detects that the second distance between the target positions corresponding to their face images is greater than the preset threshold, it is determined that the target persons are too dispersed to shoot the target face image with the zoom lens; the terminal may then directly output the wide-angle image of the video conference room and integrate the continuously output wide-angle image into a video to be transmitted to the remote terminal.
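The output switching described above could be sketched as follows; the 60-second default merely echoes the one-minute example of the second preset time period and is not a fixed value:

```python
import time

class OutputSelector:
    """Fall back to the wide-angle image when no voice has been detected for
    the second preset time period, or when the target persons are too
    dispersed for the zoom lens (sketch only)."""

    def __init__(self, silence_timeout_s=60.0):
        self.silence_timeout_s = silence_timeout_s
        self._last_voice_ts = time.monotonic()

    def on_voice_detected(self):
        self._last_voice_ts = time.monotonic()

    def select(self, wide_frame, close_up_frame, targets_too_dispersed=False):
        silent_too_long = (time.monotonic() - self._last_voice_ts
                           > self.silence_timeout_s)
        if close_up_frame is None or silent_too_long or targets_too_dispersed:
            return wide_frame
        return close_up_frame
```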
Therefore, by outputting the wide-angle image, remote participants can know the global situation of the local terminal conference conveniently.
Referring to fig. 5, an interactive tablet 10 is further provided in the embodiment of the present application, including:
the acquisition module 11 is used for acquiring wide-angle images in a video conference room;
a determining module 12 for determining a target position of a target person in the wide-angle image;
and the shooting module 13 is used for controlling the zoom lens to shoot a target face image of the target person according to the target position.
Referring to fig. 6, a computer device is also provided in the embodiment of the present application; the computer device may be a server, and its internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface and a database connected by a system bus, wherein the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data for the conference portrait shooting program. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements the conference portrait shooting method.
Those skilled in the art will appreciate that the structure shown in fig. 6 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution of the present application may be applied.
Furthermore, the present application also proposes a computer-readable storage medium, which includes a program for capturing a conference portrait, and when the program for capturing a conference portrait is executed by a processor, the steps of the method for capturing a conference portrait as described in the above embodiments are implemented. It is to be understood that the computer-readable storage medium in the present embodiment may be a volatile-readable storage medium or a non-volatile-readable storage medium.
In summary, with the conference portrait shooting method, the interactive tablet, the computer device and the storage medium provided in the embodiments of the present application, a wide-angle image in the video conference room is collected; a target position of a target person in the wide-angle image is determined; and a zoom lens is controlled to shoot a target face image of the target person according to the target position. Therefore, by combining a wide-angle lens and a zoom lens, a wide-angle image of the video conference is obtained while a clear target face image of the target person can also be obtained.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by instructing the relevant hardware through a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, apparatus, article or method that includes that element.
The above description is only for the preferred embodiment of the present application and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are intended to be included within the scope of the present application.
Claims (10)
1. A method for shooting a conference portrait, characterized by comprising the following steps:
collecting wide-angle images in a video conference room;
determining a target position of a target person in the wide-angle image;
and controlling a zoom lens to shoot a target face image of the target person according to the target position.
2. The method for shooting a conference portrait according to claim 1, wherein the step of determining the target position of the target person in the wide-angle image comprises:
determining a first position of the target person in the wide-angle image by using a sound positioning algorithm, identifying a face image of the target person in the wide-angle image, and determining a second position of the face image in the wide-angle image;
and obtaining the target position of the target person in the wide-angle image according to the first position and the second position.
3. The method for shooting a conference portrait according to claim 2, wherein after the step of identifying the face image of the target person in the wide-angle image, the method further comprises:
determining a first distance between the target person and a shooting position according to the number of pixel points corresponding to the face image;
and determining the focal length of the zoom lens according to the first distance, wherein the focal length is applied to shooting the target face image.
4. The method for shooting a conference portrait according to any one of claims 1 to 3, wherein after the step of determining the target position of the target person in the wide-angle image, the method further comprises:
when a plurality of target persons are determined, determining a second distance between target positions corresponding to the plurality of target persons;
judging whether the second distance is smaller than or equal to a preset threshold value;
and if so, executing the step of controlling a zoom lens to shoot the target face image of the target person according to the target position.
5. The method for shooting a conference portrait according to claim 4, wherein after the step of judging whether the second distance is smaller than or equal to the preset threshold value, the method further comprises:
if not, determining a first target person and a second target person according to the speaking time corresponding to each target person;
and controlling the zoom lens to shoot a target face image of the first target person according to the target position corresponding to the first target person, and taking a face image of the second target person in the wide-angle image as a target face image of the second target person.
6. The method for shooting a conference portrait according to claim 1, wherein after the step of controlling a zoom lens to shoot the target face image of the target person according to the target position, the method further comprises:
generating a picture-in-picture image according to the target face image and the wide-angle image, and outputting the picture-in-picture image;
or outputting the target face image.
7. The method for shooting a conference portrait according to claim 6, further comprising:
and after the picture-in-picture image or the target face image is output, if the voice information is not detected within a preset time, outputting the wide-angle image.
8. An interactive tablet, comprising:
the acquisition module is used for acquiring wide-angle images in a video conference room;
a determination module for determining a target position of a target person in the wide-angle image;
and the shooting module is used for controlling the zoom lens to shoot the target face image of the target person according to the target position.
9. A computer device, characterized in that the computer device comprises a memory, a processor and a conference portrait shooting program stored on the memory and executable on the processor, wherein the conference portrait shooting program, when executed by the processor, implements the steps of the method for shooting a conference portrait according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a conference portrait shooting program is stored on the computer-readable storage medium, and the conference portrait shooting program, when executed by a processor, implements the steps of the method for shooting a conference portrait according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010948507.5A CN112073613B (en) | 2020-09-10 | 2020-09-10 | Conference portrait shooting method, interactive tablet, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010948507.5A CN112073613B (en) | 2020-09-10 | 2020-09-10 | Conference portrait shooting method, interactive tablet, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112073613A (en) | 2020-12-11
CN112073613B CN112073613B (en) | 2021-11-23 |
Family
ID=73663593
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010948507.5A Active CN112073613B (en) | 2020-09-10 | 2020-09-10 | Conference portrait shooting method, interactive tablet, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112073613B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112907617A (en) * | 2021-01-29 | 2021-06-04 | 深圳壹秘科技有限公司 | Video processing method and device |
CN113139422A (en) * | 2021-03-02 | 2021-07-20 | 广州朗国电子科技有限公司 | Conference portrait shooting verification method, equipment and storage medium |
CN113473061A (en) * | 2021-06-10 | 2021-10-01 | 荣耀终端有限公司 | Video call method and electronic equipment |
CN114040145A (en) * | 2021-11-20 | 2022-02-11 | 深圳市音络科技有限公司 | Video conference portrait display method, system, terminal and storage medium |
CN115529435A (en) * | 2022-11-29 | 2022-12-27 | 长沙朗源电子科技有限公司 | High-definition conference picture wireless transmission method, system, equipment and storage medium |
CN116567385A (en) * | 2023-06-14 | 2023-08-08 | 深圳市宗匠科技有限公司 | Image acquisition method and image acquisition device |
WO2024087641A1 (en) * | 2022-10-27 | 2024-05-02 | 深圳奥尼电子股份有限公司 | Audio and video control method with intelligent wireless microphone tracking function |
WO2024145878A1 (en) * | 2023-01-05 | 2024-07-11 | 广州视源电子科技股份有限公司 | Video processing method and apparatus, device, and storage medium |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120169895A1 (en) * | 2010-03-24 | 2012-07-05 | Industrial Technology Research Institute | Method and apparatus for capturing facial expressions |
CN103327250A (en) * | 2013-06-24 | 2013-09-25 | 深圳锐取信息技术股份有限公司 | Method for controlling camera lens based on pattern recognition |
CN106650671A (en) * | 2016-12-27 | 2017-05-10 | 深圳英飞拓科技股份有限公司 | Human face identification method, apparatus and system |
US20180082439A1 (en) * | 2016-09-20 | 2018-03-22 | Kabushiki Kaisha Toshiba | Image collation system and image collation method |
CN108377342A (en) * | 2018-05-22 | 2018-08-07 | Oppo广东移动通信有限公司 | double-camera photographing method, device, storage medium and terminal |
CN108682032A (en) * | 2018-04-02 | 2018-10-19 | 广州视源电子科技股份有限公司 | Method and device for controlling video image output, readable storage medium and terminal |
KR101899318B1 (en) * | 2017-04-24 | 2018-10-29 | 주식회사 이엠따블유 | Hierarchical face object search method and face recognition method using the same, hierarchical face object search system and face recognition system using the same |
CN108933915A (en) * | 2017-05-26 | 2018-12-04 | 和硕联合科技股份有限公司 | Video conference device and video conference management method |
CN109257559A (en) * | 2018-09-28 | 2019-01-22 | 苏州科达科技股份有限公司 | A kind of image display method, device and the video conferencing system of panoramic video meeting |
CN110113515A (en) * | 2019-05-13 | 2019-08-09 | Oppo广东移动通信有限公司 | Camera control method and Related product |
CN111163281A (en) * | 2020-01-09 | 2020-05-15 | 北京中电慧声科技有限公司 | Panoramic video recording method and device based on voice tracking |
CN111263106A (en) * | 2020-02-25 | 2020-06-09 | 厦门亿联网络技术股份有限公司 | Picture tracking method and device for video conference |
CN111586341A (en) * | 2020-05-20 | 2020-08-25 | 深圳随锐云网科技有限公司 | Shooting method and picture display method of video conference shooting device |
- 2020-09-10: CN application CN202010948507.5A granted as patent CN112073613B (status: active)
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120169895A1 (en) * | 2010-03-24 | 2012-07-05 | Industrial Technology Research Institute | Method and apparatus for capturing facial expressions |
CN103327250A (en) * | 2013-06-24 | 2013-09-25 | 深圳锐取信息技术股份有限公司 | Method for controlling camera lens based on pattern recognition |
US20180082439A1 (en) * | 2016-09-20 | 2018-03-22 | Kabushiki Kaisha Toshiba | Image collation system and image collation method |
CN106650671A (en) * | 2016-12-27 | 2017-05-10 | 深圳英飞拓科技股份有限公司 | Human face identification method, apparatus and system |
KR101899318B1 (en) * | 2017-04-24 | 2018-10-29 | 주식회사 이엠따블유 | Hierarchical face object search method and face recognition method using the same, hierarchical face object search system and face recognition system using the same |
CN108933915A (en) * | 2017-05-26 | 2018-12-04 | 和硕联合科技股份有限公司 | Video conference device and video conference management method |
CN108682032A (en) * | 2018-04-02 | 2018-10-19 | 广州视源电子科技股份有限公司 | Method and device for controlling video image output, readable storage medium and terminal |
CN108377342A (en) * | 2018-05-22 | 2018-08-07 | Oppo广东移动通信有限公司 | double-camera photographing method, device, storage medium and terminal |
CN109257559A (en) * | 2018-09-28 | 2019-01-22 | 苏州科达科技股份有限公司 | A kind of image display method, device and the video conferencing system of panoramic video meeting |
CN110113515A (en) * | 2019-05-13 | 2019-08-09 | Oppo广东移动通信有限公司 | Camera control method and Related product |
CN111163281A (en) * | 2020-01-09 | 2020-05-15 | 北京中电慧声科技有限公司 | Panoramic video recording method and device based on voice tracking |
CN111263106A (en) * | 2020-02-25 | 2020-06-09 | 厦门亿联网络技术股份有限公司 | Picture tracking method and device for video conference |
CN111586341A (en) * | 2020-05-20 | 2020-08-25 | 深圳随锐云网科技有限公司 | Shooting method and picture display method of video conference shooting device |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112907617A (en) * | 2021-01-29 | 2021-06-04 | 深圳壹秘科技有限公司 | Video processing method and device |
WO2022160748A1 (en) * | 2021-01-29 | 2022-08-04 | 深圳壹秘科技有限公司 | Video processing method and apparatus |
CN112907617B (en) * | 2021-01-29 | 2024-02-20 | 深圳壹秘科技有限公司 | Video processing method and device |
CN113139422A (en) * | 2021-03-02 | 2021-07-20 | 广州朗国电子科技有限公司 | Conference portrait shooting verification method, equipment and storage medium |
CN113139422B (en) * | 2021-03-02 | 2023-12-15 | 广州朗国电子科技股份有限公司 | Shooting verification method, equipment and storage medium for conference portrait |
CN113473061A (en) * | 2021-06-10 | 2021-10-01 | 荣耀终端有限公司 | Video call method and electronic equipment |
CN113473061B (en) * | 2021-06-10 | 2022-08-12 | 荣耀终端有限公司 | Video call method and electronic equipment |
CN114040145A (en) * | 2021-11-20 | 2022-02-11 | 深圳市音络科技有限公司 | Video conference portrait display method, system, terminal and storage medium |
WO2024087641A1 (en) * | 2022-10-27 | 2024-05-02 | 深圳奥尼电子股份有限公司 | Audio and video control method with intelligent wireless microphone tracking function |
CN115529435A (en) * | 2022-11-29 | 2022-12-27 | 长沙朗源电子科技有限公司 | High-definition conference picture wireless transmission method, system, equipment and storage medium |
WO2024145878A1 (en) * | 2023-01-05 | 2024-07-11 | 广州视源电子科技股份有限公司 | Video processing method and apparatus, device, and storage medium |
CN116567385A (en) * | 2023-06-14 | 2023-08-08 | 深圳市宗匠科技有限公司 | Image acquisition method and image acquisition device |
Also Published As
Publication number | Publication date |
---|---|
CN112073613B (en) | 2021-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112073613B (en) | Conference portrait shooting method, interactive tablet, computer equipment and storage medium | |
CN109754811B (en) | Sound source tracking method, device, equipment and storage medium based on biological characteristics | |
TWI311286B (en) | ||
US20210110188A1 (en) | Stereo imaging device | |
KR100729280B1 (en) | Iris Identification System and Method using Mobile Device with Stereo Camera | |
JP4639869B2 (en) | Imaging apparatus and timer photographing method | |
CN101785306B (en) | Method and system for automatic camera control | |
KR101831973B1 (en) | Iris imaging apparatus and methods for configuring an iris imaging apparatus | |
US10614293B2 (en) | Facial recognition apparatus, recognition method and program therefor, and information device | |
CN111263106B (en) | Picture tracking method and device for video conference | |
CN104243800B (en) | Control device and storage medium | |
CN109657576B (en) | Image acquisition control method, device, storage medium and system | |
KR101530255B1 (en) | Cctv system having auto tracking function of moving target | |
JP5477777B2 (en) | Image acquisition device | |
JP2021114716A (en) | Imaging apparatus | |
CN116665111A (en) | Attention analysis method, system and storage medium based on video conference system | |
US8295605B2 (en) | Method for identifying dimensions of shot subject | |
CN112839165A (en) | Method and device for realizing face tracking camera shooting, computer equipment and storage medium | |
JP2023057090A (en) | Photographing control system | |
CN116614598A (en) | Video conference picture adjusting method, device, electronic equipment and medium | |
CN111062313A (en) | Image identification method, image identification device, monitoring system and storage medium | |
Fiala et al. | A panoramic video and acoustic beamforming sensor for videoconferencing | |
CN111918127B (en) | Video clipping method and device, computer readable storage medium and camera | |
US20120188437A1 (en) | Electronic camera | |
JP7363971B2 (en) | Image processing device and image processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |