CN1997161A - A video terminal and audio code stream processing method - Google Patents


Info

Publication number
CN1997161A
Authority
CN
China
Prior art keywords
lip
sound
speaker
source
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200610064656
Other languages
Chinese (zh)
Other versions
CN100556151C (en)
Inventor
詹五洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FUGUE ACOUSTICS TECHNOLOGY CO., LTD.
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNB2006100646565A (CN100556151C)
Publication of CN1997161A
Application granted
Publication of CN100556151C
Expired - Fee Related
Anticipated expiration

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses an audio code stream processing method comprising the following steps: decoding a compressed video code stream to obtain an image containing a sound source, and detecting the position information of the sound source in the image; decoding the compressed audio code stream corresponding to the compressed video code stream to obtain sound information; and processing the sound information according to the position information, so that the direction of the played-back sound matches the position of the sound source. The invention also discloses a video terminal.

Description

A video terminal and an audio code stream processing method
Technical field
The present invention relates to communication technology, and in particular to a video terminal and an audio code stream processing method.
Background art
With the spread of broadband, video communication occupies an increasingly important position in social life, and the era of video communication has begun. However, television screens are becoming larger and larger, and some video communication systems use projectors or video walls for display, so the position of a participant can vary considerably across the picture. The sound of current multimedia communication systems does not change with the position of the speaker; that is, the sound carries no direction information, and video communication therefore lacks a sense of reality.
The prior art discloses one solution to this problem: a bar-shaped device containing a plurality of microphones, a plurality of loudspeakers and a camera is placed on top of the television set. After the sound signals collected by the microphones are processed, one sound signal is obtained, together with direction information of the speaker relative to the bar-shaped device. The transmitting end of the video communication system sends the sound signal and the speaker direction information to the receiving end over the network, and the receiving end selects one or more loudspeakers for playback according to the received direction information, so that the speaker's direction information is reproduced at the receiving end.
In this scheme, the speaker direction information collected at the transmitting end is relative to the bar-shaped device rather than to the camera lens. When the camera lens is rotated, a speaker directly in front of the bar-shaped device appears at the side of the picture, or even outside it, while the collected sound direction information still indicates directly ahead. The position of the speaker in the picture and the collected sound direction information therefore do not match.
In addition, the transmitting end must send the direction information to the receiving end over the network. If the transmitting end and the receiving end are devices from different manufacturers, an interoperability problem arises: the receiving end may be unable to process the direction information from the transmitting end correctly.
Summary of the invention
Embodiments of the present invention provide a video terminal and an audio code stream processing method, so that the transmitting end does not need to send sound source position information to the receiving end over the network, and the played-back sound can still be matched accurately to the position of the sound source.
An audio code stream processing method, comprising:
decoding a compressed video code stream to obtain an image containing a sound source, and detecting position information of the sound source in the image;
decoding a compressed audio code stream corresponding to the compressed video code stream to obtain sound information;
processing the sound information according to the position information of the sound source, so that the direction of the played-back sound matches the position of the sound source.
A video terminal, comprising:
a video decoding module, configured to decode a received compressed video code stream and output the decoded image;
an audio decoding module, configured to decode a received compressed audio code stream corresponding to the compressed video code stream and output the decoded sound information;
a sound source position detection module, configured to receive the image sent by the video decoding module and extract a feature of the sound source, thereby detecting the position information of the sound source;
a sound direction processing module, configured to receive the sound information sent by the audio decoding module and the sound source position information sent by the sound source position detection module, and to match the sound direction to the position of the sound source.
Embodiments of the present invention process the played-back sound according to the position information of the sound source detected in the image, so that the direction of the sound played back through the loudspeakers matches the position of the sound source in the image. At the same time, the receiving end does not need to rely on the transmitting end to provide sound source position information.
Description of drawings
Fig. 1 is a flowchart of the method according to an embodiment of the present invention;
Fig. 2 shows an application scenario of an embodiment of the present invention;
Fig. 3 is a flowchart of lip-movement detection in an embodiment of the present invention;
Fig. 4 is a structural diagram of the video terminal in an embodiment of the present invention.
Detailed description
An embodiment of the present invention provides an audio code stream processing method. As shown in Fig. 1, the method consists of the following steps:
decoding a compressed video code stream to obtain an image containing a sound source, and detecting position information of the sound source in the image;
decoding a compressed audio code stream corresponding to the compressed video code stream to obtain sound information;
processing the sound information according to the position information of the sound source, so that the direction of the played-back sound matches the position of the sound source.
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The invention is described in detail below using a video conference as an application scenario of an embodiment. This application scenario does not limit the invention.
Fig. 2 is a schematic diagram of a video communication system. In Fig. 2, 10 is the transmitting-end conference site, 11 is the receiving-end conference site, and 12 is the communication network, which may be an IP network, a PSTN network, a wireless network, or the like. In site 10, 101 is a camera, 102 is a video communication terminal, 103 is a television set, 104 is a participant, and 105 and 106 are loudspeakers. A microphone is built into terminal 102; it may also be a separate external unit connected to terminal 102 by a transmission line. In site 11, 111 is a camera, 112 is a video communication terminal, 113 is a television set, 104a is the image of participant 104, and 115 and 116 are loudspeakers. A microphone is built into terminal 112; it may also be a separate external unit connected to terminal 112 by a transmission line. After camera 101 in the transmitting-end site 10 captures an image, the image is sent to terminal 102, which encodes and otherwise processes it and transmits it over network 12 to terminal 112. Terminal 112 decodes the received image code stream and sends the decoded image to television set 113 for display. After the microphone in site 10 captures the sound signal, it passes the signal to terminal 102, which encodes the audio and transmits the encoded audio code stream over network 12 to terminal 112. Terminal 112 decodes the received audio code stream and sends the result to loudspeakers 115 and 116 for playback.
In site 11 of Fig. 2, to give the sound a sense of presence, the sound played back by loudspeakers 115 and 116 is matched to the position of speaker 104a.
The method of the invention is described below taking, as an example, a video conference in which the person speaking is the sound source:
Step1: Decode the compressed video code stream sent by the transmitting end to obtain the image of the transmitting end, then detect the position information of the speaker in the image.
Video decoding of the compressed video code stream yields a sequence of frames; the images in the frame sequence are then examined to obtain the speaker's position information.
There are many ways to detect the speaker's position. For example, image recognition can be used, with certain characteristics of the speaker serving as features for locating the speaker in the picture. Usable features include the face, eyes, lips, and so on. The following takes the speaker's lips as the example feature and explains how the speaker's position information is determined by detecting the speaker's lip-movement position.
Refer to the lip-movement detection flow in Fig. 3.
S11: Detect the lip-movement position in the current frame. If the current frame contains lip movement, go to step S12; otherwise go to step S14.
S12: Determine whether there are multiple lip-movement positions. If there are, either select one of them, or compute the center of the multiple lip-movement positions and use that center as the lip-movement position, then go to step S13; otherwise go directly to step S13.
S13: Output the lip-movement position.
S14: Output no lip-movement position.
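The S11 to S14 flow can be sketched for a single frame as follows. This is a minimal illustration, not the patent's implementation: it assumes that an upstream detector has already produced the list of moving-lip candidates, and the names `lip_movement_output`, `center_of` and `pick` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def center_of(points: List[Point]) -> Point:
    """Arithmetic mean of the candidate positions."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def lip_movement_output(
    candidates: List[Point],
    pick: Callable[[List[Point]], Point] = center_of,
) -> Optional[Point]:
    """Steps S11 to S14 for one frame, given the moving-lip candidates."""
    if not candidates:           # S11: no lip movement, so S14: output nothing
        return None
    if len(candidates) > 1:      # S12: several positions, reduce them to one
        return pick(candidates)  # S13
    return candidates[0]         # S13: single position, output directly
```

Passing a different `pick` function corresponds to the "select one of them" branch of S12; the default corresponds to the "compute the center" branch.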
The lip-movement position is the position of the speaker's lips. It can be detected with existing detection methods. A simple and effective method is based on lip color; the search for lip color can be carried out in the YIQ or YUV color space. For example, in the YIQ space, statistics and experimental results give the following optimal thresholds for the lip-color components: Y ∈ [80, 220], I ∈ [12, 78], Q ∈ [7, 25]. With these thresholds the lip position can be found relatively easily. Searching by lip color alone inevitably produces some false detections, so after a lip position has been found, a further check can be made against the skin color around the lips. Skin color also falls within a relatively concentrated threshold range, which can be used to judge whether the color surrounding the lips is skin color: if it is, the lip position is judged correct; otherwise it is judged incorrect. Other usable features include eye features.
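A lip-color search with the quoted YIQ thresholds might look like the sketch below. The text does not state the RGB-to-YIQ conversion or the value scale, so the NTSC coefficients applied to 0..255 RGB values are an assumption, as are the function names. The skin-color check around the lips is omitted because the text gives no thresholds for it.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]

# Thresholds quoted in the text for the lip-color components.
Y_RANGE, I_RANGE, Q_RANGE = (80, 220), (12, 78), (7, 25)


def rgb_to_yiq(r, g, b):
    """Assumed conversion: NTSC coefficients applied to 0..255 values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    return y, i, q


def is_lip_color(pixel: RGB) -> bool:
    """True when all three YIQ components fall inside the quoted ranges."""
    y, i, q = rgb_to_yiq(*pixel)
    return (Y_RANGE[0] <= y <= Y_RANGE[1]
            and I_RANGE[0] <= i <= I_RANGE[1]
            and Q_RANGE[0] <= q <= Q_RANGE[1])


def lip_position(image: List[List[RGB]]) -> Optional[Tuple[float, float]]:
    """Center (x, y) of all lip-colored pixels, or None if there are none."""
    hits = [(x, y) for y, row in enumerate(image)
            for x, px in enumerate(row) if is_lip_color(px)]
    if not hits:
        return None
    return (sum(x for x, _ in hits) / len(hits),
            sum(y for _, y in hits) / len(hits))
```

Here an image is simply a list of rows of RGB tuples; a real terminal would run the same test on decoded frame buffers.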
After the lip position has been determined, it must also be determined whether the lips are moving. This is easily judged from the size of the lips at the same position in several preceding and following frames and the speed at which that size changes. Because the lip-movement position is continuous, it is not necessary to search the entire image for the lip-movement position in every frame. Specifically, if a lip-movement position was detected in the previous frame, detection in the current frame first checks whether lips are present near the previous frame's lip-movement position. If not, the lip-movement position is searched for over the entire image. If lips are present, it is further determined whether they are moving. If they are moving, the position of the moving lips is taken as the lip-movement position. Otherwise, a predetermined number of frames is set, and the lip-movement position is kept unchanged for that number of frames after the current frame; if the lips have still not moved once the predetermined number of frames is exceeded, the search for the lip-movement position restarts over the entire image. This method greatly reduces the amount of computation and preserves the continuity of the sound direction.
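The frame-to-frame tracking rule can be sketched as a small state machine. Everything named here is an assumption made for illustration: the `LipTracker` class, the `detect(frame, near)` interface, the default of 25 hold frames (the text only says "predetermined"), and the toy `demo_detect` detector that treats a frame as a list of (position, is_moving) pairs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]
Lips = Tuple[Point, bool]                 # (position, is_moving)
# detect(frame, near): search near a point, or the whole image when near is None.
Detector = Callable[[object, Optional[Point]], Optional[Lips]]


@dataclass
class LipTracker:
    detect: Detector
    hold_frames: int = 25                 # assumed value, not from the text
    position: Optional[Point] = None
    still_count: int = 0

    def update(self, frame) -> Optional[Point]:
        if self.position is not None:
            near = self.detect(frame, self.position)     # local search first
            if near is not None:
                pos, moving = near
                if moving:
                    self.position, self.still_count = pos, 0
                    return self.position
                self.still_count += 1
                if self.still_count <= self.hold_frames:
                    return self.position                 # hold the old position
        # No previous position, lips gone, or still for too long: full search.
        full = self.detect(frame, None)
        moving_found = full is not None and full[1]
        self.position = full[0] if moving_found else None
        self.still_count = 0
        return self.position


def demo_detect(frame: List[Lips], near: Optional[Point]) -> Optional[Lips]:
    """Hypothetical detector: a frame is just a list of (position, is_moving)."""
    for pos, moving in frame:
        if near is None and moving:
            return pos, moving
        if near is not None and abs(pos[0] - near[0]) <= 20 and abs(pos[1] - near[1]) <= 20:
            return pos, moving
    return None
```

With `hold_frames=2`, a speaker who pauses keeps the reported position for two frames; on the third still frame the tracker searches the whole image again and can switch to another moving pair of lips.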
In video communication, and particularly in video conferencing, a single site may have several participants. Because someone may yawn or whisper, multiple lip-movement positions may be detected, and a suitable one must be selected from among them. As described above, if the previous frame has a lip-movement position, detection is limited to the vicinity of that position; multiple lip-movement positions can therefore be detected only when the search covers the entire image. Several strategies exist for selecting one lip-movement position from many: for example, selecting frontal lip-movement positions and filtering out those seen from the side; or selecting the lip-movement position near the middle of the picture and filtering out those at the picture's edges. Sometimes several people in a site speak at the same time. If none of the above strategies yields a suitable lip-movement position, the center of the speakers' lip-movement positions can be computed and output as the lip-movement position.
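One of the selection strategies, preferring positions near the middle of the picture and discarding those at its edges, could be sketched as below. The width of the edge band (`edge_ratio`, 15% of the picture width per side) and the function name `pick_central` are assumptions; the text gives no numeric criterion for what counts as the edge.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def pick_central(candidates: List[Point], picture_width: float,
                 edge_ratio: float = 0.15) -> Optional[Point]:
    """Drop candidates in the left/right edge bands, keep the most central one."""
    margin = picture_width * edge_ratio
    central = [p for p in candidates
               if margin <= p[0] <= picture_width - margin]
    if not central:
        return None   # caller falls back, e.g. to the center of all candidates
    mid = picture_width / 2
    return min(central, key=lambda p: abs(p[0] - mid))
```

Returning `None` when every candidate lies in an edge band leaves the fallback described in the text, outputting the center of all positions, to the caller.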
Step2: Decode the compressed audio code stream sent by the transmitting end to obtain sound information.
The decoding of the compressed audio code stream and of the compressed video code stream described in Step1 and Step2 may be carried out simultaneously or separately, in no particular order.
Step3: Process the received sound information according to the speaker's position information, so that the direction of the speaker's sound matches the speaker's position.
Processing the sound information according to the speaker's position can be implemented with prior-art methods. An example follows. For the application scenario of Fig. 2, if playback uses two loudspeakers placed on the left and right sides of the television set, one sound processing scheme is to adjust the amplitudes of the left-channel and right-channel sound so that the horizontal direction of the sound matches the speaker's position in the picture. The adjustment can be described by the following two formulas:
D = (g1 - g2)/(g1 + g2)
C = g1*g1 + g2*g2
In the two formulas above, C is a fixed value, g1 is the left-channel amplitude gain, g2 is the right-channel amplitude gain, and D is the speaker's relative horizontal distance on the picture, calculated from the lip-movement position information. Let D' be the distance of the lip-movement position from the vertical center line of the picture (positive when the lip-movement position is in the left half of the picture, negative in the right half), and let W be the horizontal width of the television picture. Then D is calculated as:
D = D'/(W/2)
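The two gain formulas can be solved in closed form. Writing g1 = k(1 + D) and g2 = k(1 - D) satisfies the first formula for any k, and substituting into the second gives k = sqrt(C / (2(1 + D*D))). The sketch below applies this under two assumptions that are not in the text: both gains are non-negative, and D is clamped to [-1, 1] in case the detected position falls outside the picture. The function names are hypothetical.

```python
import math
from typing import Tuple


def relative_offset(d_prime: float, width: float) -> float:
    """D = D'/(W/2), clamped to [-1, 1]; D' is positive left of center."""
    return max(-1.0, min(1.0, d_prime / (width / 2)))


def channel_gains(d: float, c: float = 1.0) -> Tuple[float, float]:
    """Solve D = (g1-g2)/(g1+g2) and C = g1^2 + g2^2 for (g1, g2)."""
    k = math.sqrt(c / (2.0 * (1.0 + d * d)))
    return k * (1.0 + d), k * (1.0 - d)   # (left gain g1, right gain g2)
```

For a speaker in the middle of the picture (D = 0) both gains are equal; for a speaker at the far left (D = 1) the whole signal goes to the left channel. Each decoded sample would then be multiplied by g1 for the left output and by g2 for the right output.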
The sound can also be processed according to the sound source position information using HRTF (Head-Related Transfer Functions). Techniques that use HRTF to synthesize a virtual sound source are disclosed in the existing technical literature and are not described in detail here.
In the method provided by embodiments of the present invention, the speaker's position information is detected and obtained at the playback side, so the receiving end does not need to rely on the transmitting end to provide it. Once the position information has been obtained, the sound information to be played back is processed according to it, so that the played-back sound is matched accurately to the speaker's position in the image.
It should be noted that the audio code stream processing method provided by the invention is not limited to audio code streams received from a transmitting end; it applies equally to video and audio code streams stored locally.
Embodiments of the present invention also provide a video terminal. As shown in Fig. 4, the video communication terminal contains a video decoding module, an audio decoding module, a sound source position detection module and a sound direction processing module. After the compressed video code stream is decoded by the video decoding module, the result is output both to the television set for display and to the sound source position detection module. The sound source position detection module receives the image output by the video decoding module, examines it and extracts features of the sound source to obtain the sound source position information, which it outputs to the sound direction processing module. The compressed audio code stream is decoded by the audio decoding module and output to the sound direction processing module. The sound direction processing module processes the received audio according to the sound source position information so that the processed sound direction is consistent with the position of the sound source, and produces left and right audio outputs that are fed to the left and right loudspeakers for playback. For better playback, the video communication terminal can be connected to three or more external loudspeakers, in which case the sound direction processing module outputs a corresponding three or more audio streams.
The purpose of the sound source position detection module in the video terminal is to examine the image output by the video decoding module and obtain the position information of the sound source in it. When the sound source is a speaker, position detection can be implemented by extracting the speaker's lip features, or by detecting other features of the speaker such as the face; it suffices that the module can detect the speaker's position in the image output by the video decoding module.
If the speaker's lips are used as the feature for detecting the speaker's position, the sound source position detection module comprises:
a first receiving module, configured to receive the image containing the speaker sent by the video decoding module;
a feature extraction module, configured to extract the lip features of the speaker in the image received by the first receiving module;
a position detection module, configured to determine the speaker's position according to the lip features extracted by the feature extraction module.
The lip-movement position can be detected with the lip-movement detection method described above.
The sound direction processing module comprises:
a second receiving module, configured to receive the sound information sent by the audio decoding module and the speaker's position information sent by the position detection module;
a matching module, configured to match the direction of the played-back sound to the speaker's position according to the sound information and the speaker's position information received by the second receiving module.
In summary, the above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (11)

1. An audio code stream processing method, characterized by comprising:
decoding a compressed video code stream to obtain an image containing a sound source, and detecting position information of the sound source in the image;
decoding a compressed audio code stream corresponding to the compressed video code stream to obtain sound information;
processing the sound information according to the position information of the sound source, so that the direction of the played-back sound matches the position of the sound source.
2. The method of claim 1, characterized in that, when the sound source is a speaker, detecting the position information of the sound source in the image comprises:
extracting the speaker's lip features from the image, and detecting a lip-movement position according to the lip features, thereby determining the speaker's position information.
3. The method of claim 2, wherein, if a lip-movement position was detected in the previous frame image obtained by decoding the compressed video code stream, the current frame is checked for the presence of lips near the lip-movement position of the previous frame.
4. The method of claim 2, characterized in that, when the sound is played back through at least two loudspeakers, processing the sound information according to the position information of the sound source comprises:
adjusting the amplitudes of the left-channel and right-channel sound of the loudspeakers so that the horizontal direction of the sound matches the speaker's position.
5. The method of claim 2, characterized in that detecting the position information of the sound source in the image further comprises:
when the image contains multiple lip-movement positions, computing the center of the multiple lip-movement positions and outputting that center as the speaker's position.
6. The method of claim 2, characterized in that the lip features include lip color.
7. The method of claim 6, characterized in that, after the lip position is determined according to lip color, it is further determined whether the color around the lips is skin color.
8. The method of claim 6 or 7, wherein, after the lip position is detected, it is further determined whether the lips are moving; if they are moving, the position of the moving lips is taken as the lip-movement position; otherwise, a predetermined number of frames is set and the lip-movement position is kept unchanged for that number of frames after the current frame, and if the lips have still not moved once the predetermined number of frames is exceeded, the search for the lip-movement position restarts over the entire image.
9. A video terminal, characterized by comprising:
a video decoding module, configured to decode a received compressed video code stream and output the decoded image;
an audio decoding module, configured to decode a received compressed audio code stream corresponding to the compressed video code stream and output the decoded sound information;
a sound source position detection module, configured to receive the image sent by the video decoding module and extract a feature of the sound source, thereby detecting the position information of the sound source;
a sound direction processing module, configured to receive the sound information sent by the audio decoding module and the sound source position information sent by the sound source position detection module, and to match the sound direction to the position of the sound source.
10. The device of claim 9, characterized in that the sound source position detection module comprises:
a first receiving module, configured to receive the image containing the speaker sent by the video decoding module;
a feature extraction module, configured to extract the lip features of the speaker in the image received by the first receiving module;
a position detection module, configured to determine the speaker's position according to the lip features extracted by the feature extraction module.
11. The device of claim 10, characterized in that the sound direction processing module comprises:
a second receiving module, configured to receive the sound information sent by the audio decoding module and the speaker's position information sent by the position detection module;
a matching module, configured to match the direction of the played-back sound to the speaker's position according to the sound information and the speaker's position information received by the second receiving module.
CNB2006100646565A 2006-12-30 2006-12-30 A kind of video terminal and a kind of audio code stream processing method Expired - Fee Related CN100556151C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100646565A CN100556151C (en) 2006-12-30 2006-12-30 A kind of video terminal and a kind of audio code stream processing method


Publications (2)

Publication Number Publication Date
CN1997161A true CN1997161A (en) 2007-07-11
CN100556151C CN100556151C (en) 2009-10-28

Family

ID=38252055

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100646565A Expired - Fee Related CN100556151C (en) 2006-12-30 2006-12-30 A kind of video terminal and a kind of audio code stream processing method

Country Status (1)

Country Link
CN (1) CN100556151C (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101132516B (en) * 2007-09-28 2010-07-28 华为终端有限公司 Method, system for video communication and device used for the same
CN102136269A (en) * 2010-01-22 2011-07-27 微软公司 Speech recognition analysis via identification information
CN102170552A (en) * 2010-02-25 2011-08-31 株式会社理光 Video conference system and processing method used therein
CN102186049A (en) * 2011-04-22 2011-09-14 华为终端有限公司 Conference terminal audio signal processing method, conference terminal and video conference system
CN104584539A (en) * 2012-08-24 2015-04-29 高通股份有限公司 Person identification within video call
CN104735582A (en) * 2013-12-20 2015-06-24 华为技术有限公司 Sound signal processing method, equipment and device
CN107404682A (en) * 2017-08-10 2017-11-28 京东方科技集团股份有限公司 A kind of intelligent earphone
CN109413563A (en) * 2018-10-25 2019-03-01 Oppo广东移动通信有限公司 The sound effect treatment method and Related product of video
FR3074584A1 (en) * 2017-12-05 2019-06-07 Orange PROCESSING DATA OF A VIDEO SEQUENCE FOR A ZOOM ON A SPEAKER DETECTED IN THE SEQUENCE
CN112135226A (en) * 2020-08-11 2020-12-25 广东声音科技有限公司 Y-axis audio reproduction method and Y-axis audio reproduction system
CN113794830A (en) * 2021-08-04 2021-12-14 深圳市沃特沃德信息有限公司 Target track calibration method and device based on video and audio and computer equipment
CN114422935A (en) * 2022-03-16 2022-04-29 荣耀终端有限公司 Audio processing method, terminal and computer readable storage medium
CN115002401A (en) * 2022-08-03 2022-09-02 广州迈聆信息科技有限公司 Information processing method, electronic equipment, conference system and medium
WO2024066799A1 (en) * 2022-09-28 2024-04-04 华为技术有限公司 Playback control method and apparatus

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101132516B (en) * 2007-09-28 2010-07-28 华为终端有限公司 Method, system for video communication and device used for the same
US8259625B2 (en) 2007-09-28 2012-09-04 Huawei Technologies Co., Ltd. Method, system, and device of video communication
CN102136269A (en) * 2010-01-22 2011-07-27 微软公司 Speech recognition analysis via identification information
CN102170552A (en) * 2010-02-25 2011-08-31 株式会社理光 Video conference system and processing method used therein
CN102186049A (en) * 2011-04-22 2011-09-14 华为终端有限公司 Conference terminal audio signal processing method, conference terminal and video conference system
CN102186049B (en) * 2011-04-22 2013-03-20 华为终端有限公司 Conference terminal audio signal processing method, conference terminal and video conference system
CN104584539A (en) * 2012-08-24 2015-04-29 高通股份有限公司 Person identification within video call
CN104735582A (en) * 2013-12-20 2015-06-24 华为技术有限公司 Sound signal processing method, equipment and device
CN107404682B (en) * 2017-08-10 2019-11-05 京东方科技集团股份有限公司 A kind of intelligent earphone
CN107404682A (en) * 2017-08-10 2017-11-28 京东方科技集团股份有限公司 A kind of intelligent earphone
US10511910B2 (en) 2017-08-10 2019-12-17 Boe Technology Group Co., Ltd. Smart headphone
US11076224B2 (en) 2017-12-05 2021-07-27 Orange Processing of data of a video sequence in order to zoom to a speaker detected in the sequence
FR3074584A1 (en) * 2017-12-05 2019-06-07 Orange PROCESSING DATA OF A VIDEO SEQUENCE FOR A ZOOM ON A SPEAKER DETECTED IN THE SEQUENCE
WO2019110913A1 (en) * 2017-12-05 2019-06-13 Orange Processing of data of a video sequence in order to zoom on a speaker detected in the sequence
WO2020082902A1 (en) * 2018-10-25 2020-04-30 Oppo广东移动通信有限公司 Sound effect processing method for video, and related products
CN109413563A (en) * 2018-10-25 2019-03-01 Oppo广东移动通信有限公司 The sound effect treatment method and Related product of video
CN112135226A (en) * 2020-08-11 2020-12-25 广东声音科技有限公司 Y-axis audio reproduction method and Y-axis audio reproduction system
CN112135226B (en) * 2020-08-11 2022-06-10 广东声音科技有限公司 Y-axis audio reproduction method and Y-axis audio reproduction system
CN113794830A (en) * 2021-08-04 2021-12-14 深圳市沃特沃德信息有限公司 Target track calibration method and device based on video and audio and computer equipment
CN114422935A (en) * 2022-03-16 2022-04-29 荣耀终端有限公司 Audio processing method, terminal and computer readable storage medium
CN114422935B (en) * 2022-03-16 2022-09-23 荣耀终端有限公司 Audio processing method, terminal and computer readable storage medium
CN115002401A (en) * 2022-08-03 2022-09-02 广州迈聆信息科技有限公司 Information processing method, electronic equipment, conference system and medium
CN115002401B (en) * 2022-08-03 2023-02-10 广州迈聆信息科技有限公司 Information processing method, electronic equipment, conference system and medium
WO2024066799A1 (en) * 2022-09-28 2024-04-04 华为技术有限公司 Playback control method and apparatus

Also Published As

Publication number Publication date
CN100556151C (en) 2009-10-28

Similar Documents

Publication Publication Date Title
CN100556151C (en) A video terminal and audio code stream processing method
CN102186049B (en) Conference terminal audio signal processing method, conference terminal and video conference system
US8115799B2 (en) Method and apparatus for obtaining acoustic source location information and a multimedia communication system
CN1984310B (en) Method and communication apparatus for reproducing a moving picture
US9641585B2 (en) Automated video editing based on activity in video conference
US8705778B2 (en) Method and apparatus for generating and playing audio signals, and system for processing audio signals
CN100459711C (en) Video compression method and video system using the method
CN101132516B (en) Method, system for video communication and device used for the same
CN111343411B (en) Intelligent remote video conference system
AU2012265335B2 (en) Audio decoding method and device
US20090252481A1 (en) Methods, apparatus, system and computer program product for audio input at video recording
JP2009501476A (en) Processing method and apparatus using video time up-conversion
WO2018209879A1 (en) Method and device for automatically selecting camera image, and audio and video system
CN103096020B (en) video conference system, video conference device and method thereof
EP1938208A1 (en) Face annotation in streaming video
US20080273116A1 (en) Method of Receiving a Multimedia Signal Comprising Audio and Video Frames
CN103888713A (en) Video conference communication method
CN102202206A (en) Communication device
CN114793287B (en) Audio and video content monitoring and broadcasting method based on two-way broadcasting guide
KR100836609B1 (en) Mobile communication terminal for frame rate controlling and its controlling method
US20220415003A1 (en) Video processing method and associated system on chip
KR20060081497A (en) Mobile phone and method for improving the quality of displaying part of mobile phone
Timofeyev et al. Modeling of software development processes with hidden Markov models
CN104639730A (en) Mobile phone media interaction platform based on 3G network
GB2594942A (en) Capturing and enabling rendering of spatial audio signals

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20171218

Address after: No. 2, F District, Enping City, Jiangmen, Guangdong

Patentee after: FUGUE ACOUSTICS TECHNOLOGY CO., LTD.

Address before: 518129 Huawei headquarters office building, Bantian, Longgang District, Shenzhen, Guangdong

Patentee before: Huawei Technologies Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091028

Termination date: 20181230