CN110505399A - Control method, device and the acquisition terminal of Image Acquisition - Google Patents
- Publication number
- CN110505399A CN201910746092.0A
- Authority
- CN
- China
- Prior art keywords
- spokesman
- audio
- camera
- image
- acquisition
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/06—Decision making techniques; Pattern matching strategies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/67—Focus control based on electronic image sensor signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/695—Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
Abstract
The disclosure provides a control method for image acquisition, applied to an acquisition terminal, comprising: performing voiceprint recognition on collected audio, and determining through the voiceprint recognition whether the speaker has changed; if the speaker has changed, locating, according to the collected audio, the position in space of the speaker corresponding to the audio; adjusting a camera in the acquisition terminal according to the located position, such that after the adjustment the speaker corresponding to the audio is at the center of the camera's shooting picture, the adjustment including adjusting the shooting angle of the camera and/or adjusting the focal length of the camera; and performing image acquisition with the adjusted camera to obtain an image of the speaker corresponding to the audio. The method tracks and locates the speaker according to audio and adjusts the camera to acquire the speaker's image, solving the prior-art problem that the speaker's image cannot be acquired when the speaker is in a shooting blind area.
Description
Technical field
The present disclosure relates to the field of multimedia technology, and in particular to a control method and device for image acquisition, and an acquisition terminal.
Background
In the prior art, with the development of Internet technology and communication technology, multi-party video conferencing is used more and more widely at work.
In a multi-party video conference, a display device shows images in real time to present the state of each party to the conference. The images shown by the display device are images acquired by a camera.
The images a camera acquires are limited by where the camera is deployed, and the camera is not adjustable; participants located in the camera's shooting blind area therefore do not appear in the acquired images. Consequently, if the speaker is in the camera's shooting blind area, no image of the blind area can be acquired, the picture shown by the display device does not include the speaker, and the other participants cannot see the speaker.
It follows that how to perform image acquisition so that the speaker's image is reliably acquired is a problem that urgently needs to be solved.
Summary of the invention
To solve the problem in the related art that the speaker's image cannot be acquired when the speaker is in a shooting blind area, the present disclosure provides a control method and device for image acquisition.
In a first aspect, a control method for image acquisition, applied to an acquisition terminal, comprises:
performing voiceprint recognition on collected audio, and determining through the voiceprint recognition whether the speaker has changed;
if the speaker has changed, locating, according to the collected audio, the position in space of the speaker corresponding to the audio;
adjusting the camera in the acquisition terminal according to the located position, such that after the adjustment the speaker corresponding to the audio is at the center of the camera's shooting picture, the adjustment including adjusting the shooting angle of the camera and/or adjusting the focal length of the camera; and
performing image acquisition with the adjusted camera to obtain an image of the speaker corresponding to the audio.
In a second aspect, a control device for image acquisition, applied to an acquisition terminal, comprises:
a voiceprint recognition module, configured to perform voiceprint recognition on collected audio and determine through the voiceprint recognition whether the speaker has changed;
a locating module, configured to locate, according to the collected audio, the position in space of the speaker corresponding to the audio if the voiceprint recognition module determines that the speaker has changed;
a control module, configured to adjust the camera in the acquisition terminal according to the located position, such that after the adjustment the speaker corresponding to the audio is at the center of the camera's shooting picture, the adjustment including adjusting the shooting angle of the camera and/or adjusting the focal length of the camera; and
an image acquisition module, configured to perform image acquisition with the adjusted camera to obtain an image of the speaker corresponding to the audio.
In a third aspect, an acquisition terminal comprises:
a processor; and
a memory storing computer-readable instructions which, when executed by the processor, implement the method described above.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
When voiceprint recognition determines that the speaker has changed, the position of the speaker is determined according to the collected audio, and the camera is then adjusted according to that position so that the speaker is at the center of the camera's shooting picture, which ensures that the speaker's image is effectively acquired. The speaker is tracked and located according to audio and the camera is adjusted to acquire the speaker's image, which effectively solves the prior-art problem that the speaker's image cannot be acquired when the speaker is in a shooting blind area.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief description of the drawings
The drawings herein are incorporated into and form part of this specification; they show embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
Fig. 1 is a block diagram of a terminal according to an exemplary embodiment;
Fig. 2 is a flow chart of a control method for image acquisition according to an exemplary embodiment;
Fig. 3 is a flow chart, in one embodiment, of step 310 of the embodiment corresponding to Fig. 2;
Fig. 4 is a flow chart, in one embodiment, of step 330 of the embodiment corresponding to Fig. 2;
Fig. 5 is a flow chart, in one embodiment, of step 350 of the embodiment corresponding to Fig. 2;
Fig. 6 is a flow chart, in one embodiment, of step 370 of the embodiment corresponding to Fig. 2;
Fig. 7 is a flow chart, in one embodiment, of step 371 of the embodiment corresponding to Fig. 6;
Fig. 8 is a flow chart of a control method for image acquisition according to a specific embodiment;
Fig. 9 is a block diagram of a control device for image acquisition according to an exemplary embodiment.
The above drawings show specific embodiments of the present invention, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concept in any way, but to explain the concept of the invention to those skilled in the art by reference to specific embodiments.
Detailed description of the embodiments
Exemplary embodiments, examples of which are illustrated in the accompanying drawings, are described in detail here. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.
Fig. 1 is a block diagram of a terminal 200 according to an exemplary embodiment. The terminal 200 can serve as a fixed terminal for performing image acquisition according to the method of the present disclosure; the terminal 200 is, for example, a television or desktop computer with an integrated camera and sound acquisition module.
Referring to Fig. 1, the terminal 200 may include one or more of the following components: a processing component 202, a memory 204, a power supply component 206, a multimedia component 208, a sound acquisition component 210, a camera 214, and a communication component 216.
The processing component 202 generally controls the overall operation of the terminal 200, such as operations associated with display, image acquisition, data communication, camera rotation, and recording. The processing component 202 may include one or more processors 218 to execute instructions so as to complete all or part of the steps of the methods described below. In addition, the processing component 202 may include one or more modules that facilitate interaction between the processing component 202 and other components. For example, the processing component 202 may include a multimedia module to facilitate interaction between the multimedia component 208 and the processing component 202.
The memory 204 is configured to store various types of data to support operation of the terminal 200. Examples of such data include instructions for any application or method operated on the terminal 200. The memory 204 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. One or more modules are also stored in the memory 204 and are configured to be executed by the one or more processors 218 to complete all or part of the steps of any of the method embodiments described below.
The power supply component 206 supplies power to the various components of the terminal 200. It may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the terminal 200.
The multimedia component 208 includes a screen that provides an output interface between the terminal 200 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel. If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. The screen may also include an organic light-emitting display (OLED). Images acquired by the camera can be displayed on the screen.
The sound acquisition component 210 is configured for audio collection. It may include several sound acquisition modules, such as microphones (MIC), through which audio is collected.
The camera 214 performs image acquisition to obtain images. In the solution of the present disclosure, the terminal 200 includes at least one camera whose rotation can be controlled, so that after a change of speaker is determined, the camera is rotated according to the speaker's position to acquire the speaker's image.
The communication component 216 is configured to facilitate wired or wireless communication between the terminal 200 and other devices. The terminal 200 can access a wireless network based on a communication standard, such as WiFi (Wireless Fidelity). In one exemplary embodiment, the communication component 216 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 216 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth technology, and other technologies.
In exemplary embodiments, the terminal 200 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors, digital signal processing devices, programmable logic devices, field-programmable gate arrays, controllers, microcontrollers, microprocessors, or other electronic components for executing the methods described below.
Fig. 2 is a flow chart of a control method for image acquisition according to an exemplary embodiment. The control method is applied to an acquisition terminal, for example the terminal 200 shown in Fig. 1. As shown in Fig. 2, the method may include the following steps:
Step 310: perform voiceprint recognition on the collected audio, and determine through the voiceprint recognition whether the speaker has changed.
The acquisition terminal includes a sound acquisition module, such as a microphone, through which audio is collected. In specific embodiments, the sound acquisition module may be integrated inside the acquisition terminal or deployed outside it, for example connected to the acquisition terminal through an external interface.
The sound acquisition module of the acquisition terminal collects signals continuously. Because people do not talk without pause, the collected signal includes voiced signal and silent signal. The audio referred to in the present disclosure comes from the voiced signal collected by the sound acquisition module, for example a section of the voiced signal, or the whole voiced signal between two adjacent silent signals.
In specific embodiments, endpoint detection determines which parts of the signal collected by the sound acquisition module are voiced and which are silent.
To acquire the speaker's image according to the method of the present disclosure, the collected signal is segmented before step 310, and image acquisition control is performed, according to the method of the present disclosure, on each audio segment obtained. For example, once endpoint detection has identified the voiced and silent signals, the voiced signal between two adjacent silent signals is taken as one audio segment.
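The segmentation described above can be sketched as follows. This is a minimal illustration rather than the implementation of the disclosure: it assumes a simple short-time energy threshold as the endpoint detector, and the function name `voiced_segments`, the frame length, and the threshold value are all hypothetical.

```python
from typing import List, Tuple


def voiced_segments(samples: List[float], frame_len: int = 160,
                    energy_threshold: float = 0.01) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of each run of voiced frames.

    A frame counts as voiced when its mean squared amplitude exceeds
    energy_threshold (an assumed detector). Consecutive voiced frames
    between two silent stretches form one audio segment.
    """
    segments = []
    start = None  # start index of the voiced run currently open, if any
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy > energy_threshold:
            if start is None:
                start = f * frame_len
        elif start is not None:
            # A silent frame closes the open voiced run.
            segments.append((start, f * frame_len))
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments
```

Each returned index pair delimits one audio segment that could then be passed to voiceprint recognition.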
In another embodiment, the collected signal may also be segmented according to a set collection period, and each voiced signal section obtained by the segmentation is taken as one audio segment.
In one embodiment, to reduce the amount of computation, voiceprint recognition is performed only on a voiced signal section that immediately follows a silent signal. In other words, if the signal section immediately preceding the audio is also voiced, step 310 is not executed, and the speaker corresponding to the audio is assumed to be the speaker of that preceding voiced section.
Each person's vocal organs, such as the vocal cords, oral cavity, and nasal cavity, take different shapes during pronunciation, and vocal capacity and pronunciation frequency also differ from person to person. The sound produced by each person's vocal organs therefore has its own characteristics, which form that person's unique voiceprint.
A person's voiceprint is characterized by voiceprint features, which are obtained by feature extraction from the collected audio. Voiceprint features include, for example, Mel-frequency cepstral coefficients (MFCC), short-time energy, short-time average magnitude, short-time average zero-crossing rate, formants, and linear prediction cepstral coefficients (LPCC).
In specific embodiments, one or more voiceprint features may be extracted from the audio for voiceprint recognition; this is not specifically limited here.
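As an illustration of three of the simpler features named above, the sketch below computes short-time energy, short-time average magnitude, and short-time average zero-crossing rate for one frame of samples. The function name `short_time_features` and the frame representation (a list of floats) are assumptions made for this example; MFCC, formants, and LPCC need more machinery and are left out.

```python
from typing import List, Tuple


def short_time_features(frame: List[float]) -> Tuple[float, float, float]:
    """Return (energy, average magnitude, zero-crossing rate) for one frame."""
    n = len(frame)
    if n < 2:
        raise ValueError("a frame needs at least two samples")
    # Short-time energy: sum of squared samples in the frame.
    energy = sum(s * s for s in frame)
    # Short-time average magnitude: mean absolute amplitude.
    avg_magnitude = sum(abs(s) for s in frame) / n
    # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zero_crossing_rate = crossings / (n - 1)
    return energy, avg_magnitude, zero_crossing_rate
```

Values computed frame by frame could be concatenated or averaged into a voiceprint vector; how that is done is not specified in the text.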
Voiceprint recognition checks whether the voiceprint features of the currently collected audio are consistent with those of the previously collected audio. If they are inconsistent, the speaker of the currently collected audio differs from the speaker of the previously collected audio, that is, the speaker has changed; conversely, if they are consistent, the two speakers are the same, that is, the speaker has not changed.
Step 330: if the speaker has changed, locate, according to the collected audio, the position in space of the speaker corresponding to the audio.
The locating uses sound source localization: the position in space of the speaker corresponding to the audio is determined from the times at which the audio was collected.
Because the speaker has a certain volume, the speaker's position in space is actually a spatial region. For ease of calculation, a sub-region of the space occupied by the speaker (such as the region occupied by the head) or a single point is used to represent the speaker's position in space.
Sound source localization determines the position of the speaker corresponding to the audio from the time delays with which multiple sound acquisition modules collect the audio.
It follows that the acquisition terminal includes at least two sound acquisition modules. The acquisition terminal stores the time at which each sound acquisition module collected the audio, so the time delay between any two sound acquisition modules can be calculated from those times, and the speaker's position can then be located.
Step 350: adjust the camera in the acquisition terminal according to the located position, such that after the adjustment the speaker corresponding to the audio is at the center of the camera's shooting picture; the adjustment includes adjusting the shooting angle of the camera and/or adjusting the focal length of the camera.
From the located position, the bearing and distance of the speaker corresponding to the audio relative to the camera can be determined.
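A minimal sketch of deriving distance and bearing from a located position is given below. It assumes, beyond what the text states, that the speaker (the person talking) and the camera are expressed in the same Cartesian coordinate system with z pointing up, and that bearing is described by an azimuth in the x-y plane plus an elevation angle; the function name `distance_and_bearing` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def distance_and_bearing(speaker: Sequence[float],
                         camera: Sequence[float]) -> Tuple[float, float, float]:
    """Return (distance, azimuth in degrees, elevation in degrees).

    Azimuth is measured in the x-y plane from the +x axis; elevation is
    measured up from that plane. Both conventions are assumptions.
    """
    dx, dy, dz = (s - c for s, c in zip(speaker, camera))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.degrees(math.atan2(dy, dx))
    elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return distance, azimuth, elevation
```

The azimuth and elevation would drive the shooting angle, and the distance would drive the focal length.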
For image acquisition, and especially image acquisition that targets the speaker, the camera is adjusted with the aim of acquiring an image of the speaker that is clear and easy to recognize.
The adjustment may be to the shooting angle of the camera, so that the adjusted camera points at the speaker corresponding to the audio; to the focal length of the camera, so that the speaker's portrait takes up a sufficient proportion of the acquired image and viewers can accurately recognize the speaker from the image; or to both the shooting angle and the focal length. Which applies depends on the actual situation: the determined distance and bearing are used to decide whether the shooting angle and the focal length need adjusting.
If the bearing of the speaker relative to the camera shows that the speaker is outside the picture at the camera's current shooting angle, or deviates substantially from the current shooting angle, the camera is rotated according to the determined bearing, that is, the shooting angle is adjusted so that the camera points at the speaker after adjustment. Conversely, if the determined bearing shows that the speaker is at the center of the shooting picture at the camera's current shooting angle, the shooting angle is not adjusted.
If the distance of the speaker relative to the camera shows that the speaker is far from the camera, so that at the current focal length the portrait takes up only a small proportion of the acquired image, the focal length of the camera is adjusted so that the proportion of the speaker's portrait in the acquired image meets the set requirement. Conversely, if the proportion of the portrait in the acquired image at the current focal length already meets the requirement, the focal length is not adjusted.
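The two decisions above can be sketched as a small function. The thresholds are illustrative assumptions, since the text only speaks of a "larger" deviation and a "set requirement", and the function name `plan_adjustment` and its parameters are hypothetical.

```python
from typing import Dict


def plan_adjustment(angle_offset_deg: float, portrait_ratio: float,
                    max_offset_deg: float = 5.0,
                    min_portrait_ratio: float = 0.2) -> Dict[str, bool]:
    """Decide which camera adjustments are needed.

    angle_offset_deg: angle between the camera's current shooting
        direction and the bearing of the person talking.
    portrait_ratio: fraction of the picture the portrait would occupy
        at the current focal length.
    Both thresholds are assumed values, not taken from the text.
    """
    return {
        # Rotate when the person is off-center by more than the tolerance.
        "rotate": abs(angle_offset_deg) > max_offset_deg,
        # Zoom when the portrait is smaller than the required proportion.
        "zoom": portrait_ratio < min_portrait_ratio,
    }
```

For example, `plan_adjustment(12.0, 0.3)` asks only for a rotation, while `plan_adjustment(1.0, 0.05)` asks only for a focal length change.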
Step 370: perform image acquisition with the adjusted camera to obtain an image of the speaker corresponding to the audio.
As described above, after the camera is adjusted the speaker corresponding to the audio is at the center of the camera's shooting picture, so the acquisition yields an image of that speaker.
The image of the speaker may be a whole-body image, an upper-body image, and so on; this is not specifically limited here.
In one embodiment, the acquired image is one in which the speaker corresponding to the audio is the main subject.
The speaker's image acquired by the present disclosure is displayed on the acquisition terminal, so that the speaker's image is shown while the speaker is talking. The acquisition terminal may display the image on its own display screen or on an external display device; this is not specifically limited here.
In one embodiment, after step 370, the method further includes:
replacing the image displayed by the acquisition terminal with the image of the speaker.
In the technical solution of the present disclosure, when a change of speaker is determined from the audio, the speaker is located according to the audio and the camera is adjusted according to the located position to acquire the speaker's image. The speaker is thus tracked and located according to audio, and the speaker's image is acquired according to the speaker's position. This ensures that the picture displayed by the acquisition terminal is the acquired image of the speaker, and effectively solves the prior-art problem that the displayed picture contains no portrait of the speaker.
In one embodiment, before display, the speaker's image is enlarged according to the proportions of the acquisition terminal's display screen so that the image fits the display screen, ensuring the display effect.
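One way to fit an image to a screen is sketched below. It assumes that the aspect ratio of the image is preserved and that the largest scale that still fits the screen is wanted; neither detail is stated in the text, and the function names `fit_scale` and `scaled_size` are hypothetical.

```python
from typing import Tuple


def fit_scale(image_size: Tuple[int, int], screen_size: Tuple[int, int]) -> float:
    """Return the largest uniform scale at which the image fits the screen."""
    image_w, image_h = image_size
    screen_w, screen_h = screen_size
    return min(screen_w / image_w, screen_h / image_h)


def scaled_size(image_size: Tuple[int, int],
                screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """Return the image size in pixels after applying fit_scale."""
    scale = fit_scale(image_size, screen_size)
    return round(image_size[0] * scale), round(image_size[1] * scale)
```

A 640 x 480 image on a 1920 x 1080 screen is limited by height, so the scale is 2.25 and the displayed size is 1440 x 1080.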
In one embodiment, after step 310, if it is determined that the speaker has not changed, the shooting angle of the camera is kept unchanged so that the image of that speaker continues to be acquired and displayed.
In another embodiment, after step 310, if it is determined that the speaker has not changed, the image displayed by the acquisition terminal is not replaced; in other words, if the previously collected audio and the currently collected audio have the same speaker, the displayed image is kept unchanged.
In another embodiment, after step 310, if it is determined that the speaker has not changed, whether the position of the speaker corresponding to the audio has changed is determined according to the audio. If the speaker's position has changed, the camera is adjusted according to the speaker's position, where the adjustment includes adjusting the shooting angle of the camera and/or adjusting the focal length of the camera according to the distance between the speaker and the camera. This keeps the speaker at the center of the camera's shooting picture, so that a clear image of the speaker is acquired and viewers can recognize the speaker from it.
The method of the present disclosure can be applied to multi-party video conferencing: the speaker's image is acquired according to the method of the present disclosure from the audio collected in the conference, displayed on the screen, and synchronously displayed on the display screens of the other conference parties, so that participants in the multi-party video conference can identify the speaker from the displayed image.
In one embodiment, as shown in Fig. 3, step 310 includes:
Step 311: extract voiceprint features from the audio.
As described above, the extracted voiceprint features may be one or more of Mel-frequency cepstral coefficients, short-time energy, short-time average magnitude, short-time average zero-crossing rate, formants, and linear prediction cepstral coefficients. Provided the extracted features ensure the accuracy of voiceprint recognition, they are not specifically limited here.
Step 313: calculate the voiceprint similarity between the extracted voiceprint features and the voiceprint features of the previously collected audio.
The voiceprint similarity characterizes how similar the voiceprint features of the currently collected audio are to those of the previously collected audio.
In specific embodiments, to calculate the voiceprint similarity, a voiceprint vector is constructed for each audio from the voiceprint features extracted from it, and the similarity is calculated from the voiceprint vector of the current audio and that of the previously collected audio, for example by taking the Euclidean distance, cosine distance, or Mahalanobis distance between the two voiceprint vectors as the voiceprint similarity.
Step 315: determine whether the speaker has changed according to the voiceprint similarity.
If the calculated voiceprint similarity indicates that the two sets of voiceprint features are similar, it is determined that the speaker has not changed; conversely, if it indicates that they are dissimilar, it is determined that the speaker has changed.
In specific embodiments, a similarity range can be preset for this purpose: if the voiceprint similarity falls within the range, the two sets of voiceprint features are similar.
Whether the speaker has changed can thus be determined by checking whether the calculated voiceprint similarity falls within the set range: if it does, the speaker has not changed; if it falls outside the range, the speaker has changed.
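Steps 313 and 315 can be sketched together as follows. The sketch uses cosine similarity between two voiceprint vectors, one of several measures the text allows; the preset range of 0.8 to 1.0 is an illustrative assumption, and the function names `cosine_similarity` and `speaker_changed` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two voiceprint vectors, clamped to [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("voiceprint vectors must be non-zero")
    # Clamp to absorb floating-point rounding just outside [-1, 1].
    return max(-1.0, min(1.0, dot / (norm_u * norm_v)))


def speaker_changed(current: Sequence[float], previous: Sequence[float],
                    similarity_range: Tuple[float, float] = (0.8, 1.0)) -> bool:
    """Return True when the similarity falls outside the preset range."""
    low, high = similarity_range
    similarity = cosine_similarity(current, previous)
    return not (low <= similarity <= high)
```

With a distance measure such as Euclidean distance the range would instead bound small values, since a smaller distance means more similar vectors.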
In one embodiment, the acquisition terminal includes one reference sound acquisition module and at least three non-reference sound acquisition modules. As shown in Fig. 4, step 330 includes:
Step 331: from the times at which the reference sound acquisition module and the non-reference sound acquisition modules respectively collected the audio, calculate the time delay with which each non-reference sound acquisition module collected the audio relative to the reference sound acquisition module.
In this embodiment, each sound acquisition module stores the time at which it collected the audio, so the time delay of each non-reference sound acquisition module relative to the reference sound acquisition module can be calculated from those times.
Step 333: calculate the position coordinates of the speaker corresponding to the audio from the positions of the reference and non-reference sound acquisition modules and the time delays.
The position of the reference sound acquisition module is taken as the origin of a coordinate system, so that the coordinates of each non-reference sound acquisition module in that coordinate system can be obtained from the positions of the reference and non-reference modules.
From the time delay with which each non-reference sound acquisition module collected the audio relative to the reference sound acquisition module, the range difference can be calculated, that is, the difference between the speaker's distance to that non-reference module and the speaker's distance to the reference module.
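The conversion from collection times to time delays and range differences can be sketched as below. It assumes that the timestamps are in seconds on a shared clock and that sound travels at about 343 m/s (air at roughly 20 °C); the constant and the function names `time_delays` and `range_differences` are hypothetical.

```python
from typing import List, Sequence

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees Celsius


def time_delays(t_reference: float, t_non_reference: Sequence[float]) -> List[float]:
    """Delay of each non-reference module relative to the reference module."""
    return [t - t_reference for t in t_non_reference]


def range_differences(t_reference: float, t_non_reference: Sequence[float],
                      speed: float = SPEED_OF_SOUND_M_S) -> List[float]:
    """Range difference d_i for each non-reference module, in meters.

    A positive value means the sound source is farther from that module
    than from the reference module.
    """
    return [speed * delay for delay in time_delays(t_reference, t_non_reference)]
```

A module that hears the audio 1 ms after the reference module is therefore about 0.343 m farther from the source.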
The following matrix equation is constructed from the coordinates of each non-reference sound acquisition module and the calculated distance differences:
AX = B
where A is an n × 4 matrix, n is the number of non-reference sound acquisition modules, and the i-th row of A is [x_i, y_i, z_i, d_i]; x_i, y_i and z_i are the x-axis, y-axis and z-axis coordinates of the i-th non-reference sound acquisition module, and d_i is the difference between the distance from the speaker corresponding to the audio to the i-th non-reference sound acquisition module and the distance from the speaker to the reference sound acquisition module; X = [x, y, z, R]^T; B is an n × 1 column vector (as required by A being n × 4 and X being 4 × 1), and the expression for its i-th element is not reproduced in this text.
Solving the above matrix equation gives the position coordinates (x, y, z) of the speaker corresponding to the audio.
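A minimal sketch of solving AX = B in the least-squares sense follows. Several things here are assumptions rather than statements from the text: R is taken to be the distance from the speaker to the reference module at the origin, d_i is taken as (distance to module i) minus R, and the i-th element of B is derived from those assumptions as (x_i^2 + y_i^2 + z_i^2 - d_i^2) / 2, obtained by squaring R + d_i = |speaker - module_i| and cancelling R^2. The example also assumes at least four non-reference modules in non-degenerate positions so that the four unknowns are determined by the linear system alone; with only three, an extra constraint such as R^2 = x^2 + y^2 + z^2 would be needed.

```python
def solve_linear(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: module geometry is degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def locate_speaker(module_coords, range_diffs):
    """Least-squares estimate of (x, y, z) from A X = B, X = [x, y, z, R]^T."""
    a = [[xi, yi, zi, di] for (xi, yi, zi), di in zip(module_coords, range_diffs)]
    # Assumed right-hand side, derived from the definitions (not from the source).
    b = [(xi * xi + yi * yi + zi * zi - di * di) / 2.0
         for (xi, yi, zi), di in zip(module_coords, range_diffs)]
    # Normal equations: (A^T A) X = A^T B.
    ata = [[sum(row[i] * row[j] for row in a) for j in range(4)] for i in range(4)]
    atb = [sum(row[i] * bi for row, bi in zip(a, b)) for i in range(4)]
    x, y, z, _distance_to_reference = solve_linear(ata, atb)
    return x, y, z
```

With noisy delays the estimate degrades quickly when the modules are close together, so a practical implementation would likely add weighting or iterative refinement.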
In one embodiment, as shown in FIG. 5, step 350 includes:
Step 351: according to the located position, determine the distance and bearing of the speaker corresponding to the audio relative to the camera.
Step 353: adjust the focal length of the camera according to the determined distance, and adjust the shooting angle of the camera according to the determined bearing.
To adjust the shooting angle, the camera is controlled to rotate according to the determined bearing, so that after rotation the camera is aimed at the speaker corresponding to the audio.
The focal length adjustment may be carried out according to a configuration file, which maps distances to focal lengths. After the distance between the speaker corresponding to the audio and the camera is determined, the focal length mapped to that distance is obtained from the configuration file, and the focal length of the camera is adjusted to the obtained value.
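The two steps above might be sketched as follows. The distance-to-focal-length table stands in for the configuration file; its values, the pan/tilt angle convention, and all function names are hypothetical.

```python
import math
from bisect import bisect_left

# Hypothetical configuration: (maximum distance in metres, focal length in mm).
FOCAL_LENGTH_BY_DISTANCE = [(1.0, 24.0), (2.0, 35.0), (4.0, 50.0), (8.0, 85.0)]


def distance_and_bearing(speaker_pos, camera_pos):
    """Return (distance, pan_degrees, tilt_degrees) of the speaker from the camera."""
    dx, dy, dz = (s - c for s, c in zip(speaker_pos, camera_pos))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    pan = math.degrees(math.atan2(dy, dx))                   # rotation in the x-y plane
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))  # elevation above that plane
    return distance, pan, tilt


def focal_length_for(distance, table=FOCAL_LENGTH_BY_DISTANCE):
    """Look up the focal length mapped to the first bucket covering the distance."""
    upper_bounds = [bound for bound, _ in table]
    index = min(bisect_left(upper_bounds, distance), len(table) - 1)
    return table[index][1]
```

Distances beyond the last bucket fall back to the longest focal length in the table; that fallback is a design choice of this sketch, not something the text prescribes.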
In one embodiment, as shown in FIG. 6, step 370 includes:
Step 371: perform speaker identification on the image collected by the adjusted camera, and locate the portrait of the speaker in the image.
In one application scenario, the camera is far from the speaker and the space where the acquisition terminal is located holds many people. Even if the speaker corresponding to the audio is at the center of the camera's shooting picture, the image collected at the camera's shooting angle after rotation may contain several people.
In this scenario, in order to accurately obtain the image of the speaker corresponding to the audio, speaker identification is performed to determine the position of the speaker's portrait in the collected image.
A person's lips move while the person speaks, so speaker identification can be performed by recognizing the lip movement of each person in the collected images. For example, the lip pixels of each person are extracted from continuously collected images, and whether the person's lips move is judged by comparing the lip pixels extracted from consecutive images. If the lips move, the portrait containing those lip pixels is determined to be the portrait of the speaker; conversely, if the lips do not move, the portrait containing those lip pixels is determined not to be the portrait of the speaker.
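The comparison of lip pixels across consecutive images can be illustrated with a simple frame-difference test. The mean absolute difference measure, the threshold of 10 grey levels, and the data layout (a flat list of grayscale values per lip region) are assumptions for this sketch; the text does not name a specific comparison method.

```python
def lips_moving(lip_frames, threshold=10.0):
    """Return True if any pair of consecutive lip regions differs noticeably.

    lip_frames: list of equally sized lists of grayscale pixel values,
    one list per consecutive image (hypothetical layout).
    """
    for previous, current in zip(lip_frames, lip_frames[1:]):
        mean_abs_diff = sum(abs(a - b) for a, b in zip(previous, current)) / len(current)
        if mean_abs_diff > threshold:
            return True
    return False


def find_speakers(lip_frames_by_person, threshold=10.0):
    """Return the ids of the portraits whose lip pixels change between images."""
    return [person_id
            for person_id, frames in lip_frames_by_person.items()
            if lips_moving(frames, threshold)]
```

For example, a person whose lip region is identical in two consecutive images is not reported, while a person whose lip pixels change by 30 grey levels on average is.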
In other embodiments, an action may be agreed in advance for speaker identification, for example that the speaker raises a hand while speaking, or that the speaker stands up to speak. The agreed action, such as raising a hand or standing, is then recognized in the collected image, and the portrait presenting that action in the image is determined to be the portrait of the speaker.
Step 373: crop the image according to the located portrait to obtain the image of the speaker.
In this way, an image in which the speaker is the main subject, that is, the image of the speaker, is cropped from the image containing multiple portraits. The obtained image of the speaker includes at least the face of the speaker.
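Cropping to the located portrait can be sketched as below, assuming the portrait is described by a bounding box and the image is a list of pixel rows. The box format (left, top, right, bottom, with right and bottom exclusive) and the optional margin are assumptions of this example.

```python
def crop_to_portrait(image, box, margin=0):
    """Crop a row-major image to a bounding box, clamped to the image borders.

    image: list of rows, each a list of pixels.
    box: (left, top, right, bottom), right and bottom exclusive (assumed format).
    """
    height, width = len(image), len(image[0])
    left, top, right, bottom = box
    left = max(0, left - margin)
    top = max(0, top - margin)
    right = min(width, right + margin)
    bottom = min(height, bottom + margin)
    return [row[left:right] for row in image[top:bottom]]
```

A margin larger than zero keeps some context around the portrait, which helps ensure the speaker's face is fully inside the cropped image.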
In conference scenarios with many participants, the display device shows a panoramic view, so the displayed picture contains many portraits, and the other parties to the conference cannot quickly locate the portrait of the current speaker in the displayed picture.
In the solution of this embodiment, the portrait of the speaker is located and cropped so that the resulting image has the speaker as its main subject, which increases the speed at which people identify the speaker from the image of the speaker.
In one embodiment, as shown in FIG. 7, step 371 includes:
Step 410: for each portrait in the image collected by the adjusted camera, extract the pixels of a designated organ.
As described above, speaker identification can be based on the lip movement or an agreed action of each person in the image. Whether it is lip movement or an agreed action, the action is performed by an organ, such as the lips or a hand.
The organ that performs the action used for speaker identification is the designated organ. For example, if speaker identification is based on lip movement, the lips are the designated organ; if it is based on a gesture, the hand is the designated organ.
To perform speaker identification in the collected image, the designated organ is first located in the image, and the pixels of the designated organ are then extracted.
Step 430: perform action recognition according to the extracted pixels, and determine the action that the extracted pixels represent.
The shape of the designated organ can be reconstructed from the extracted pixels, and the action that the pixels represent is then determined from the reconstructed shape.
Step 450: determine the portrait containing the pixels whose represented action matches a predetermined action to be the portrait of the speaker.
The predetermined action is, for example, an action agreed for speaker identification, such as raising a hand, standing, or moving the lips; it is not specifically limited here.
Thus, if the action that the extracted pixels represent matches the predetermined action, the portrait containing those pixels is determined to be the portrait of the speaker.
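Steps 410, 430 and 450 can be pictured as a small pipeline. In the sketch below, the pixel extractor and the action recognizer are passed in as functions because the text does not specify how they work; the function name, the dictionary layout and the action labels are hypothetical.

```python
def locate_speaker_portraits(portraits, extract_pixels, recognize_action,
                             predetermined_actions):
    """Return ids of portraits whose recognized action matches a predetermined one.

    portraits: dict mapping a portrait id to that portrait's image data.
    extract_pixels: function returning the designated organ's pixels (step 410).
    recognize_action: function mapping those pixels to an action label (step 430).
    predetermined_actions: set of action labels agreed in advance (step 450).
    """
    speakers = []
    for portrait_id, portrait in portraits.items():
        pixels = extract_pixels(portrait)
        action = recognize_action(pixels)
        if action in predetermined_actions:
            speakers.append(portrait_id)
    return speakers
```

Keeping the extractor and recognizer as parameters lets the same loop serve lip movement, raised hands, or standing, depending on which designated organ is configured.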
In one embodiment, the method further includes:
Detecting whether no audio has been collected after a set time period has elapsed.
If so, controlling the camera to rotate to a default shooting angle.
If not, performing the step of performing voiceprint recognition on the collected audio.
If no audio has been collected after the set time period has elapsed, the camera is controlled to rotate to the default shooting angle. Further, the image collected at that shooting angle is displayed on the acquisition terminal.
Conversely, if audio has been collected after the set time period has elapsed, the process goes to step 310.
FIG. 8 is a flowchart of an image acquisition control method according to a specific embodiment. In this embodiment, the acquisition terminal is a television set that includes a camera and sound acquisition modules. As shown in FIG. 8, the method includes the following steps:
Step 510, speaker identification: identify the portrait of the speaker from the image collected by the camera. The speaker identification may be based on lip movement or on an agreed action.
Step 520, speaker image cropping: after the portrait of the speaker is identified in the image, crop the collected image to obtain the image of the speaker, and display the obtained image of the speaker on the television set.
Step 530, whether audio continues to be collected: detect the audio collection state in real time (for example, once per second). If audio continues to be collected, go to step 540; if no audio is collected, go to step 560.
Step 540, whether the speaker has changed: perform voiceprint recognition on the collected audio to determine whether the speaker has changed. If the speaker has changed, go to step 550; if the speaker has not changed, no processing is performed, that is, the television set continues to display the currently displayed image.
Step 550, adjust the camera according to the position of the speaker: determine the position of the speaker from the times at which the audio was collected, and adjust the camera accordingly. For example, the shooting angle of the camera is adjusted according to the angle of the speaker relative to the camera, the focal length of the camera is adjusted according to the distance of the speaker from the camera, or both the shooting angle and the focal length are adjusted. The adjusted camera then collects an image, and the process goes to step 510.
Step 560, whether the set time is exceeded: start timing when it is detected that audio is no longer being collected. If no audio has been collected after the set time (for example, 30 s) is exceeded, go to step 570; if the time without audio is less than the set time, continue timing.
Step 570, control the camera to rotate to the default shooting angle: collect an image at the default shooting angle and display the collected image on the television set. While the image is displayed, speaker identification is performed on the collected image, that is, the process goes to step 510.
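The transitions between steps 510 to 570 can be summarized as a transition function. Two transitions below are assumptions, since the flow as written does not state them: after step 540 finds no change, monitoring is assumed to resume at step 530, and if audio reappears while step 560 is timing, the flow is assumed to return to step 530. The function name and parameters are hypothetical.

```python
def next_step(step, audio_present=False, speaker_changed=False,
              silent_seconds=0.0, timeout_seconds=30.0):
    """Return the step that follows `step` in the FIG. 8 flow (sketch)."""
    if step == 510:   # speaker identified, go on to cropping
        return 520
    if step == 520:   # image cropped and displayed, go on to monitoring audio
        return 530
    if step == 530:
        return 540 if audio_present else 560
    if step == 540:
        # Assumption: with no change, keep the current image and resume monitoring.
        return 550 if speaker_changed else 530
    if step == 550:   # camera adjusted, identify the speaker again
        return 510
    if step == 560:
        if audio_present:
            return 530  # assumption: audio resumed before the timeout
        return 570 if silent_seconds > timeout_seconds else 560
    if step == 570:   # default shooting angle, identify the speaker again
        return 510
    raise ValueError(f"unknown step: {step}")
```

A driver loop would call this once per detection interval (for example, once per second) and run the action associated with the returned step.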
The following are apparatus embodiments of the present disclosure, which can be used to execute the embodiments of the image acquisition control method executed by the above terminal 200 of the present disclosure. For details not disclosed in the apparatus embodiments, refer to the embodiments of the image acquisition control method of the present disclosure.
FIG. 9 is a block diagram of an image acquisition control apparatus according to an exemplary embodiment. The apparatus can be used in the terminal 200 shown in FIG. 1 to execute all or some of the steps of any of the method embodiments. As shown in FIG. 9, the apparatus includes, but is not limited to, a voiceprint recognition module 610, a locating module 630, an adjustment module 650 and an image acquisition module 670, wherein:
The voiceprint recognition module 610 is configured to perform voiceprint recognition on the collected audio and to determine through the voiceprint recognition whether the speaker has changed.
The locating module 630 is configured to locate, according to the collected audio, the position in space of the speaker corresponding to the audio if the voiceprint recognition module determines that the speaker has changed.
The adjustment module 650 is configured to adjust the camera in the acquisition terminal according to the located position, so that after the adjustment the speaker corresponding to the audio is at the center of the camera's shooting picture. The adjustment includes adjusting the shooting angle of the camera and/or adjusting the focal length of the camera.
The image acquisition module 670 is configured to collect an image with the adjusted camera to obtain the image of the speaker corresponding to the audio.
For the implementation of the functions and roles of the modules in the above apparatus, see the implementation of the corresponding steps in the above image acquisition control method; details are not repeated here.
It can be understood that these modules can be implemented by hardware, software, or a combination of both. When implemented in hardware, these modules may be implemented as one or more hardware modules, such as one or more application-specific integrated circuits. When implemented in software, these modules may be implemented as one or more computer programs executed on one or more processors, for example a program stored in the memory 204 and executed by the processor 218 of FIG. 1.
In one embodiment, the voiceprint recognition module 610 includes:
A feature extraction unit, configured to extract voiceprint features from the audio.
A calculation unit, configured to calculate the voiceprint similarity of the extracted voiceprint features relative to the voiceprint features corresponding to the previously collected audio.
A determination unit, configured to determine according to the voiceprint similarity whether the speaker has changed.
In one embodiment, the acquisition terminal includes one reference sound acquisition module and at least three non-reference sound acquisition modules, and the locating module 630 includes:
A time delay calculation unit, configured to calculate, according to the times at which the reference sound acquisition module and the non-reference sound acquisition modules respectively collect the audio, the time delay with which each non-reference sound acquisition module collects the audio relative to the reference sound acquisition module.
A coordinate calculation unit, configured to perform a calculation according to the positions of the reference sound acquisition module and the non-reference sound acquisition modules and the time delays, to obtain the position coordinates of the speaker corresponding to the audio.
In one embodiment, the adjustment module 650 includes:
An angle and orientation determination unit, configured to determine, according to the located position, the distance and bearing of the speaker corresponding to the audio relative to the camera.
An adjustment unit, configured to adjust the focal length of the camera according to the determined distance, and to adjust the shooting angle of the camera according to the determined bearing.
In one embodiment, the image acquisition module 670 includes:
A portrait locating unit, configured to perform speaker identification on the image collected by the adjusted camera and to locate the portrait of the speaker in the image.
A cropping unit, configured to crop the image according to the located portrait to obtain the image of the speaker.
In one embodiment, the portrait locating unit includes:
A pixel extraction unit, configured to extract, for each portrait in the image collected by the adjusted camera, the pixels of a designated organ.
An action recognition unit, configured to perform action recognition according to the extracted pixels and to determine the action that the extracted pixels represent.
A portrait determination unit, configured to determine the portrait containing the pixels whose represented action matches a predetermined action to be the portrait of the speaker.
In one embodiment, the apparatus further includes:
A display replacement module, configured to replace the image displayed by the acquisition terminal with the image of the speaker.
In one embodiment, the apparatus further includes:
A detection module, configured to detect whether no audio has been collected after a set time period has elapsed.
A rotation adjustment module, configured to control the camera to rotate to the default shooting angle if the detection module detects that no audio has been collected after the set time period has elapsed.
If the detection module detects that audio has been collected after the set time period has elapsed, control passes to the voiceprint recognition module 610.
For the implementation of the functions and roles of the modules and units in the above apparatus, see the implementation of the corresponding steps in the above image acquisition control method; details are not repeated here.
Optionally, the present disclosure also provides an acquisition terminal, which may be the terminal 200 shown in FIG. 1 and which executes all or some of the steps of any of the above method embodiments. The acquisition terminal includes:
a processor; and a memory storing computer-readable instructions which, when executed by the processor, implement the method of any of the above method embodiments.
The specific manner in which the processor of the apparatus in this embodiment performs operations has been described in detail in the embodiments of the image acquisition control method and is not elaborated here.
In an exemplary embodiment, a computer-readable storage medium is also provided, storing computer-readable instructions which, when executed by a processor, implement the method of any of the above method embodiments.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (10)
1. An image acquisition control method, applied to an acquisition terminal, characterized in that the method comprises:
performing voiceprint recognition on collected audio, and determining through the voiceprint recognition whether the speaker has changed;
if the speaker has changed, locating, according to the collected audio, the position in space of the speaker corresponding to the audio;
adjusting a camera in the acquisition terminal according to the located position, so that after the adjustment the speaker corresponding to the audio is at the center of the shooting picture of the camera, the adjustment comprising adjusting the shooting angle of the camera and/or adjusting the focal length of the camera;
collecting an image with the adjusted camera to obtain an image of the speaker corresponding to the audio.
2. The method according to claim 1, characterized in that performing voiceprint recognition on the audio and determining through the voiceprint recognition whether the speaker has changed comprises:
extracting voiceprint features from the audio;
calculating the voiceprint similarity of the extracted voiceprint features relative to the voiceprint features corresponding to the previously collected audio;
determining according to the voiceprint similarity whether the speaker has changed.
3. The method according to claim 1, characterized in that the acquisition terminal comprises one reference sound acquisition module and at least three non-reference sound acquisition modules, and locating, according to the collected audio, the position in space of the speaker corresponding to the audio comprises:
calculating, according to the times at which the reference sound acquisition module and the non-reference sound acquisition modules respectively collect the audio, the time delay with which each non-reference sound acquisition module collects the audio relative to the reference sound acquisition module;
performing a calculation according to the positions of the reference sound acquisition module and the non-reference sound acquisition modules and the time delays, to obtain the position coordinates of the speaker corresponding to the audio.
4. The method according to claim 1, characterized in that adjusting the camera in the acquisition terminal according to the located position comprises:
determining, according to the located position, the distance and bearing of the speaker corresponding to the audio relative to the camera;
adjusting the focal length of the camera according to the determined distance, and adjusting the shooting angle of the camera according to the determined bearing.
5. The method according to claim 1, characterized in that collecting an image with the adjusted camera to obtain the image of the speaker corresponding to the audio comprises:
performing speaker identification on the image collected by the adjusted camera, and locating the portrait of the speaker in the image;
cropping the image according to the located portrait to obtain the image of the speaker.
6. The method according to claim 5, characterized in that performing speaker identification on the image collected by the adjusted camera and locating the portrait of the speaker in the image comprises:
extracting, for each portrait in the image collected by the adjusted camera, the pixels of a designated organ;
performing action recognition according to the extracted pixels, and determining the action that the extracted pixels represent;
determining the portrait containing the pixels whose represented action matches a predetermined action to be the portrait of the speaker.
7. The method according to claim 1, characterized in that after collecting an image with the adjusted camera to obtain the image of the speaker corresponding to the audio, the method further comprises:
replacing the image displayed by the acquisition terminal with the image of the speaker.
8. The method according to claim 1, characterized in that the method further comprises:
detecting whether no audio has been collected after a set time period has elapsed;
if so, controlling the camera to rotate to a default shooting angle;
if not, performing the step of performing voiceprint recognition on the collected audio.
9. An image acquisition control apparatus, applied to an acquisition terminal, characterized in that the apparatus comprises:
a voiceprint recognition module, configured to perform voiceprint recognition on collected audio and to determine through the voiceprint recognition whether the speaker has changed;
a locating module, configured to locate, according to the collected audio, the position in space of the speaker corresponding to the audio if the voiceprint recognition module determines that the speaker has changed;
a control module, configured to adjust a camera in the acquisition terminal according to the located position, so that after the adjustment the speaker corresponding to the audio is at the center of the shooting picture of the camera, the adjustment comprising adjusting the shooting angle of the camera and/or adjusting the focal length of the camera;
an image acquisition module, configured to collect an image with the adjusted camera to obtain an image of the speaker corresponding to the audio.
10. An acquisition terminal, characterized by comprising:
a processor; and
a memory storing computer-readable instructions which, when executed by the processor, implement the method according to any one of claims 1 to 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910746092.0A CN110505399A (en) | 2019-08-13 | 2019-08-13 | Control method, device and the acquisition terminal of Image Acquisition |
PCT/CN2020/099455 WO2021027424A1 (en) | 2019-08-13 | 2020-06-30 | Image acquisition control method and acquisition terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910746092.0A CN110505399A (en) | 2019-08-13 | 2019-08-13 | Control method, device and the acquisition terminal of Image Acquisition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110505399A true CN110505399A (en) | 2019-11-26 |
Family
ID=68587511
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910746092.0A Pending CN110505399A (en) | 2019-08-13 | 2019-08-13 | Control method, device and the acquisition terminal of Image Acquisition |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110505399A (en) |
WO (1) | WO2021027424A1 (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111586341A (en) * | 2020-05-20 | 2020-08-25 | 深圳随锐云网科技有限公司 | Shooting method and picture display method of video conference shooting device |
CN111901524A (en) * | 2020-07-22 | 2020-11-06 | 维沃移动通信有限公司 | Focusing method and device and electronic equipment |
CN112073639A (en) * | 2020-09-11 | 2020-12-11 | Oppo(重庆)智能科技有限公司 | Shooting control method and device, computer readable medium and electronic equipment |
CN112312042A (en) * | 2020-10-30 | 2021-02-02 | 维沃移动通信有限公司 | Display control method, display control device, electronic equipment and storage medium |
WO2021027424A1 (en) * | 2019-08-13 | 2021-02-18 | 聚好看科技股份有限公司 | Image acquisition control method and acquisition terminal |
CN112541402A (en) * | 2020-11-20 | 2021-03-23 | 北京搜狗科技发展有限公司 | Data processing method and device and electronic equipment |
CN113542604A (en) * | 2021-07-12 | 2021-10-22 | 口碑(上海)信息技术有限公司 | Video focusing method and device |
CN113556499A (en) * | 2020-04-07 | 2021-10-26 | 上海汽车集团股份有限公司 | Vehicle-mounted video call method and vehicle-mounted system |
CN113824916A (en) * | 2021-08-19 | 2021-12-21 | 深圳壹秘科技有限公司 | Image display method, device, equipment and storage medium |
CN115242971A (en) * | 2022-06-21 | 2022-10-25 | 海南视联通信技术有限公司 | Camera control method and device, terminal equipment and storage medium |
TWI798867B (en) * | 2021-06-27 | 2023-04-11 | 瑞昱半導體股份有限公司 | Video processing method and associated system on chip |
CN117640877A (en) * | 2024-01-24 | 2024-03-01 | 浙江华创视讯科技有限公司 | Picture reconstruction method for online conference and electronic equipment |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113682319B (en) * | 2021-08-05 | 2023-08-01 | 地平线(上海)人工智能技术有限公司 | Camera adjustment method and device, electronic equipment and storage medium |
CN114554095B (en) * | 2022-02-25 | 2024-04-16 | 深圳锐取信息技术股份有限公司 | Target object determining method and related device of 4k camera |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100033585A1 (en) * | 2007-05-10 | 2010-02-11 | Huawei Technologies Co., Ltd. | System and method for controlling an image collecting device to carry out a target location |
CN104902203A (en) * | 2015-05-19 | 2015-09-09 | 广东欧珀移动通信有限公司 | Video recording method based on rotary camera, and terminal |
CN104991573A (en) * | 2015-06-25 | 2015-10-21 | 北京品创汇通科技有限公司 | Locating and tracking method and apparatus based on sound source array |
CN107144820A (en) * | 2017-06-21 | 2017-09-08 | 歌尔股份有限公司 | Sound localization method and device |
CN107247923A (en) * | 2017-05-18 | 2017-10-13 | 珠海格力电器股份有限公司 | A kind of instruction identification method, device, storage device, mobile terminal and electrical equipment |
CN109754811A (en) * | 2018-12-10 | 2019-05-14 | 平安科技(深圳)有限公司 | Sound-source follow-up method, apparatus, equipment and storage medium based on biological characteristic |
CN109783642A (en) * | 2019-01-09 | 2019-05-21 | 上海极链网络科技有限公司 | Structured content processing method, device, equipment and the medium of multi-person conference scene |
CN110082723A (en) * | 2019-05-16 | 2019-08-02 | 浙江大华技术股份有限公司 | A kind of sound localization method, device, equipment and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10219095B2 (en) * | 2017-05-24 | 2019-02-26 | Glen A. Norris | User experience localizing binaural sound during a telephone call |
CN110505399A (en) * | 2019-08-13 | 2019-11-26 | 聚好看科技股份有限公司 | Control method, device and the acquisition terminal of Image Acquisition |
- 2019-08-13: CN application CN201910746092.0A (publication CN110505399A), status: Pending
- 2020-06-30: WO application PCT/CN2020/099455 (publication WO2021027424A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2021027424A1 (en) | 2021-02-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191126 | |