WO2008075726A1 - ビデオ会議装置 (Video Conference Apparatus) - Google Patents

ビデオ会議装置 (Video Conference Apparatus)

Info

Publication number
WO2008075726A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
sound
data
unit
video data
Prior art date
Application number
PCT/JP2007/074449
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
Toshiyuki Hata
Takuya Tamaru
Original Assignee
Yamaha Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corporation filed Critical Yamaha Corporation
Publication of WO2008075726A1 publication Critical patent/WO2008075726A1/ja

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/141Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/0007Image acquisition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/698Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture

Definitions

  • The present invention relates to a video conference apparatus that communicates video, images, and audio used when a video conference is held between conference rooms separated from each other.
  • Conventionally, a video conference apparatus as shown in Patent Document 1 is arranged at each location, and conference participants attend so as to surround the apparatus.
  • Each conference participant is equipped with a microphone fitted with a radio wave generator, and radio waves are radiated from the microphone that picks up the highest sound level.
  • The person-photographing camera detects the direction of the speaker by receiving these radio waves, turns toward the speaker, and captures an image centered on the speaker.
  • The video data and audio data are encoded and transmitted to the destination video conference apparatus.
  • Patent Document 1 JP-A-6-276514
  • An object of the present invention is to provide a highly flexible video conference apparatus that can transmit materials, together with audio and video, accurately and clearly even when such materials are present.
  • The present invention provides a video conference apparatus comprising: an imaging unit that captures a predetermined area; a video data generation unit that generates video data based on the video captured by the imaging unit; a housing including a sound emission and collection unit that collects surrounding sound to generate collected sound data and emits sound based on sound-emission sound data; a communication unit that generates communication data including the collected sound data and the video data, transmits the communication data to the outside, acquires sound-emission sound data from external communication data, and gives it to the sound emission and collection unit; and a support unit that supports the imaging unit in a predetermined manner.
  • The support unit supports the imaging unit in either a first mode, in which the imaging unit is directed toward the conference participant imaging area around the housing, or a second mode, in which the imaging unit is directed toward an area close to the imaging unit in the vicinity of the housing.
  • In the first mode, the video data generation unit cuts out from the video data only the azimuth area corresponding to the sound collection direction information of the collected sound data, and corrects the cut-out video data by a first correction process according to the first mode.
  • In the second mode, the video data generation unit cuts out from the video data a predetermined area centered on the front direction of the imaging unit, and corrects the cut-out video data by a second correction process different from the first correction process.
  • When the imaging unit is set to the first mode, facing the conference participant imaging area, the video conference apparatus of the present invention cuts out only the video data in the sound collection direction and adjusts it by the first correction process so that it is easy to view. The apparatus then generates communication data from the video data and the collected sound data and transmits it to the counterpart apparatus.
  • When the imaging unit is set to the second mode, it captures a document or the like placed in the close area near the housing.
  • The image captured from the front by the imaging unit is corrected by the second correction process so that it can be easily viewed.
  • The video conference apparatus generates communication data including this video data and transmits it to the counterpart apparatus.
  • In this way, the video is corrected by the first correction process or the second correction process, which differ according to the respective modes.
  • Since the conference participant video and still images such as documents are corrected according to their respective shooting conditions, appropriately corrected conference participant video and document images can be transmitted to the destination apparatus.
  • The support unit of the video conference apparatus of the present invention includes a joint mechanism for switching between the first mode and the second mode, and this joint mechanism forms a switch.
  • The video data generation unit of this video conference apparatus detects the selection between the first mode and the second mode based on the state of the switch formed by the joint mechanism.
  • In this configuration, the first mode or the second mode is set by operating the joint mechanism of the support unit, which toggles the switch.
  • The present invention also provides a video conference apparatus comprising: an imaging unit that images a predetermined area; a video data generation unit that generates video data based on the video captured by the imaging unit; a housing including a sound emission and collection unit that collects sound around the apparatus to generate collected sound data and emits sound based on sound-emission sound data; a communication unit that generates communication data including the collected sound data and the video data, transmits it to the outside, acquires sound-emission sound data from external communication data, and gives it to the sound emission and collection unit; and a support unit that supports the imaging unit with respect to the housing. In this video conference apparatus, the imaging unit simultaneously captures the conference participant imaging area and the area close to the imaging unit in the vicinity of the housing.
  • The video data generation unit cuts out, from the first partial video data corresponding to the conference participant imaging area, only the azimuth area corresponding to the sound collection direction information of the collected sound data, and corrects it by a third correction process.
  • The second partial video data, corresponding to the area close to the imaging unit, is corrected by a fourth correction process different from the third correction process.
  • In this configuration, the first partial video data corresponding to the conference participant imaging area and the second partial video data corresponding to the area where material is placed close to the imaging unit are acquired simultaneously by a single imaging unit. From the first partial video data, only the azimuth area corresponding to the collected sound data is cut out and appropriately corrected by the third correction process, while the second partial video data is adjusted so that it can be easily viewed by the corresponding fourth correction process.
  • In this way, the conference video and still images such as documents are acquired at the same time, and each is adjusted according to its own shooting conditions.
  • The video conference apparatus includes a selection unit that selects the partial video data used for the communication data.
  • The video data generation unit of the video conference apparatus gives the partial video data selected by the selection unit to the communication unit.
  • The imaging unit has a fisheye lens.
  • The central region of the area imaged through the fisheye lens is set as the area close to the imaging unit, and at least the peripheral region outside the central region is set as the conference participant imaging area.
  • In this configuration, a fisheye lens is used as a specific form of the imaging unit. The area corresponding to the center of the fisheye lens is set as the area close to the imaging unit, and is appropriately corrected by a correction process suited to this area.
  • The center area may also be used when the mode is changed, but the peripheral area is mainly used, so the video of the conference area is appropriately adjusted by the correction process corresponding to the selected area in each case. As a result, even when an image of the area close to the imaging unit and an image of the conference area are captured through the same fisheye lens, each image is corrected appropriately.
  • The video data generation unit of the video conference apparatus of the present invention is formed integrally with the imaging unit.
  • The communication unit of the video conference apparatus according to the present invention is formed integrally with the housing, together with the sound emission and collection unit.
  • The video data generation unit of the video conference apparatus according to the present invention is likewise formed integrally with the housing, together with the sound emission and collection unit.
  • The video conference apparatus of the present invention further includes a display monitor for reproducing video data.
  • The communication unit of this video conference apparatus acquires the video data included in the received communication data and supplies it to the display monitor.
  • The video conference apparatus of the present invention is arranged at, and connects, each location where a communication conference is held, making it possible to easily share conference video and materials between the two sites.
  • The video of the speaker is corrected by a correction process suited to speaker video.
  • The image of the material is corrected by a correction process suited to material images, selected by a simple operation of the imaging unit. It is therefore possible to transmit both the speaker video and the material image to the counterpart apparatus accurately and clearly. As a result, a video conference using this apparatus can be held more simply and comfortably.
  • FIG. 1 is an external view of a video conferencing apparatus according to a first embodiment in a conference shooting mode.
  • FIG. 2 is an external view of the video conference apparatus according to the first embodiment in a document shooting mode.
  • FIG. 3 is a block diagram illustrating a main configuration of the video conference apparatus according to the first embodiment.
  • FIG. 4 is a diagram illustrating a situation (conference shooting mode) in which the video conference apparatus according to the first embodiment is arranged and a video conference is performed with another point connected to the network.
  • FIG. 5 is an explanatory diagram used for explaining video data generation in the conference shooting mode.
  • FIG. 6 is a diagram showing a situation (document shooting mode) in which the video conference apparatus according to the first embodiment is arranged and a video conference is performed with another location connected to the network.
  • FIG. 7 is an explanatory diagram used for explaining video data generation in the document shooting mode.
  • FIG. 8 is an external view of an assembly member including a sound emission and collection device 1, a camera 2, and a support 7 in a video conference device according to a second embodiment.
  • FIG. 9 is a diagram showing a usage situation of the video conference apparatus of the second embodiment.
  • FIG. 10 is a diagram for explaining generation of video data by the video conference apparatus according to the second embodiment.
  • FIGS. 1 and 2 are external views of the video conference apparatus of the present embodiment, where (A) is a plan view and (B) is a side view.
  • FIGS. 1 and 2 show only the mechanically characteristic structure of the sound emission and collection device, the camera, and the stay; the communication terminal and the cables that electrically connect the sound emission and collection device and the camera are omitted.
  • FIG. 1 shows the mechanism state in the conference shooting mode, and FIG. 2 shows the mechanism state in the document shooting mode.
  • FIG. 3 is a block diagram showing the main configuration of the video conference apparatus according to the present embodiment.
  • The video conference apparatus includes the sound emission and collection device 1, which has a disk shape in plan view, the camera 2, which has an imaging function and a video data generation function, and the stay 3, which installs the camera 2 at a predetermined position with respect to the sound emission and collection device 1.
  • The sound emission and collection device 1 and the camera 2 are electrically connected, and the video conference apparatus further includes a communication terminal 5 electrically connected to both of them.
  • The communication terminal 5 demodulates the communication data received from the communication terminal of the counterpart video conference apparatus connected via the network 500, acquires the sound signal for sound emission, the counterpart apparatus ID, and the speaker orientation data, and gives them to the sound emission and collection device 1 on its own side, connected by a cable.
  • The communication terminal 5 also generates communication data based on the collected sound signal and speaker position data received from the sound emission and collection device 1 on its own side and the video data received from the camera 2.
  • The communication terminal 5 transmits the generated communication data to the communication terminal of the destination video conference apparatus. Further, depending on the situation, the communication terminal 5 mediates transmission and reception of the speaker position data between the sound emission and collection device 1 and the camera 2.
  • The sound emission and collection device 1 includes the disk-shaped housing 11.
  • The housing 11 has a circular shape in plan view; in side view, the areas of the top surface and the bottom surface are smaller than the area of the middle portion in the vertical direction, so that the housing narrows from a point in the height direction toward the top surface and likewise narrows from that point toward the bottom surface. That is, it has inclined surfaces above and below that point.
  • A concave portion 110 of a predetermined depth, with an area smaller than that of the top surface, is formed in the top surface of the housing 11 so that the center of the concave portion 110 coincides with the center of the top surface.
  • Each of the microphones MC1 to MC16 has a unidirectional characteristic.
  • Each microphone is arranged so as to have strong directivity toward the center as viewed from above, that direction being the center of its directivity.
  • The number of microphones is not limited to sixteen and may be set as appropriate according to the specifications.
  • Each of the speakers SP1 to SP4 has strong directivity in the front direction of its sound-emitting surface.
  • The speakers SP1 to SP4 are arranged on the lower side of the housing 11, and the microphones MC1 to MC16 are arranged on the upper side of the housing 11.
  • With this arrangement, the microphones MC1 to MC16 are less likely to pick up the wraparound sound from the speakers SP1 to SP4.
  • As a result, the speaker position detection described later is less affected by wraparound speech, and the speaker position can be detected with higher accuracy.
  • The operation unit 111 is installed on the upper inclined surface of the housing 11 and includes various operation buttons and a liquid crystal display panel (not shown).
  • The input/output I/F 102 (not shown in FIGS. 1 and 2) is installed on the lower inclined surface of the housing 11, at positions where the speakers SP1 to SP4 are not installed, and is equipped with terminals through which various control data can be communicated. By connecting a terminal of the input/output I/F 102 to the communication terminal with a cable or the like, communication is performed between the sound emission and collection device 1 and the communication terminal.
  • In addition to this structural configuration, the sound emission and collection device 1 has the functional configuration shown in FIG. 3.
  • The control unit 101 performs overall control such as setting, sound collection, and sound emission of the sound emission and collection device 1, and controls each part of the device based on operation instructions input through the operation unit 111.
  • The input/output I/F 102 outputs the sound-emission audio signals S1 to S3 received from the communication terminal 5 to the channels CH1 to CH3, respectively.
  • The channel assignment may be set as appropriate according to the number of received sound-emission audio signals.
  • The input/output I/F 102 receives the counterpart apparatus IDs from the communication terminal 5 and assigns a channel CH to each counterpart apparatus ID. For example, when one counterpart apparatus is connected, the audio data from that apparatus is assigned to channel CH1 as sound-emission audio signal S1. When two counterpart apparatuses are connected, the audio data from the two apparatuses are individually assigned to channels CH1 and CH2 as sound-emission audio signals S1 and S2, respectively.
  • When three counterpart apparatuses are connected, the audio data from the three apparatuses are individually assigned to channels CH1, CH2, and CH3 as sound-emission audio signals S1, S2, and S3, respectively.
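As a rough illustration of the channel assignment just described, the following Python sketch maps counterpart apparatus IDs to channels CH1 to CH3. The function and variable names are hypothetical; the patent does not specify an implementation.

```python
# Hypothetical sketch of the channel assignment described above.

MAX_CHANNELS = 3  # channels CH1 to CH3

def assign_channels(counterpart_ids):
    """Assign each connected counterpart apparatus ID to a channel:
    one apparatus -> S1 on CH1; two -> S1/S2 on CH1/CH2; three -> CH1..CH3."""
    if len(counterpart_ids) > MAX_CHANNELS:
        raise ValueError("more counterpart apparatuses than channels")
    return {dev_id: ch for ch, dev_id in enumerate(counterpart_ids, start=1)}

print(assign_channels(["apparatus-A", "apparatus-B"]))
# {'apparatus-A': 1, 'apparatus-B': 2}
```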
  • The channels CH1 to CH3 are connected to the sound emission control unit 103 via the echo cancellation unit 107.
  • The input/output I/F 102 also extracts, from the communication terminal 5, the speaker orientation data Py detected at the counterpart sound emission and collection device, and provides it to the sound emission control unit 103 together with the channel information.
  • The sound emission control unit 103 generates the speaker output signals SPD1 to SPD4 to be given to the speakers SP1 to SP4, based on the sound-emission audio signals S1 to S3 and the speaker orientation information Py.
  • The D/A-AMP 104 converts each of the speaker output signals SPD1 to SPD4 from digital to analog, amplifies it with a constant amplification factor, and supplies it to the corresponding speaker SP1 to SP4.
  • The speakers SP1 to SP4 convert the given speaker output signals SPD1 to SPD4 into sound and emit it.
  • Because the sounds emitted from the speakers SP1 to SP4 are given a predetermined delay relationship and amplitude relationship, a sense of the position of the sound (a sound image) can be given to the conference participants.
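The delay and amplitude relationship mentioned above can be pictured with a toy model: one sound-emission signal is given a per-speaker delay and gain to derive the four speaker outputs. This is an illustrative sketch, not the patent's algorithm, and the delay and gain values are arbitrary.

```python
import numpy as np

def speaker_outputs(signal, delays_samples, gains):
    """Derive speaker output signals SPD1..SPD4 from one emission signal by
    applying a per-speaker delay (in samples) and amplitude gain -- a toy
    model of the delay/amplitude relationship described above."""
    outputs = []
    for d, g in zip(delays_samples, gains):
        delayed = np.concatenate([np.zeros(d), signal])[:len(signal)]
        outputs.append(g * delayed)
    return outputs

fs = 16000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)                 # sound-emission signal S1
spd1, spd2, spd3, spd4 = speaker_outputs(s1, [0, 8, 16, 24],
                                         [1.0, 0.8, 0.6, 0.4])
```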
  • The microphones MC1 to MC16 collect sound from the outside, such as the speech of the conference participants, and generate the collected sound signals MS1 to MS16.
  • Each A/D-AMP 105 amplifies the corresponding collected sound signal MS1 to MS16 with a predetermined amplification factor, converts it from analog to digital, and outputs it to the sound collection control unit 106.
  • The sound collection control unit 106 synthesizes the acquired collected sound signals MS1 to MS16 with different delay-control patterns and amplitude patterns, generating collected sound beam signals whose centers of directivity point in different directions. For example, with the sound emission and collection device 1 as the center, eight collected sound beam signals are generated that divide the full 360° of the circumference into eight sectors, that is, with the center direction of directivity shifted every 45°.
  • The sound collection control unit 106 compares the amplitude levels of these collected sound beam signals, selects the collected sound beam signal MBS with the highest amplitude level, and outputs it to the echo cancellation unit 107.
  • The sound collection control unit 106 also acquires the speaker orientation corresponding to the selected beam signal, generates the speaker orientation information Pm, and provides it to the input/output I/F 102.
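The generation and selection of the eight collected sound beam signals can be sketched as delay-and-sum beamforming over the sixteen microphone signals. The sketch below assumes a circular array of known radius, far-field sound, and integer-sample delays; the geometry constants and function names are illustrative assumptions, not taken from the patent.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
RADIUS = 0.05           # assumed array radius in metres
FS = 16000              # assumed sample rate

def delay_and_sum(mic_signals, mic_angles, steer_angle):
    """Form one collected sound beam steered toward steer_angle (radians).
    mic_signals: (16, N) array; mic_angles: plan-view angle of each mic."""
    beam = np.zeros(mic_signals.shape[1])
    for sig, ang in zip(mic_signals, mic_angles):
        # far-field delay of this microphone relative to the array centre
        tau = RADIUS * np.cos(ang - steer_angle) / SPEED_OF_SOUND
        beam += np.roll(sig, -int(round(tau * FS)))  # circular shift: a simplification
    return beam / len(mic_signals)

def pick_speaker_beam(mic_signals):
    """Generate 8 beams spaced 45 degrees apart and select the strongest,
    yielding the beam signal MBS and the speaker orientation (cf. Pm)."""
    mic_angles = np.arange(16) * 2 * np.pi / 16
    beams = [delay_and_sum(mic_signals, mic_angles, np.deg2rad(a))
             for a in range(0, 360, 45)]
    levels = [np.sqrt(np.mean(b ** 2)) for b in beams]  # amplitude level
    best = int(np.argmax(levels))
    return beams[best], best * 45
```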
  • The echo cancellation unit 107 consists of an adaptive filter, which generates a pseudo-regression sound signal from the sound-emission audio signals S1 to S3 for the input collected sound beam signal MBS, and a post processor, which subtracts the pseudo-regression sound signal from the collected sound beam signal MBS.
  • While sequentially optimizing the filter coefficients of the adaptive filter, the echo cancellation circuit subtracts the pseudo-regression sound signal from the collected sound beam signal MBS, thereby removing the component that wraps around from the speakers SP1 to SP4 to the microphones MC1 to MC16.
  • The collected sound beam signal MBS from which the wraparound component has been removed is output to the input/output I/F 102.
  • The input/output I/F 102 associates the collected sound beam signal MBS, from which the return sound has been removed by the echo cancellation unit 107, with the speaker orientation information Pm from the sound collection control unit 106, and outputs them to the communication terminal 5.
  • The camera 2 is installed at a position fixed relative to the sound emission and collection device 1 by the stay 3, as shown in FIGS. 1 and 2. The camera 2 is installed by the stay 3 so as to be rotatable between the horizontal direction (the direction the camera 2 faces in FIG. 1) and the vertically downward direction (the direction the camera 2 faces in FIG. 2).
  • The stay 3 includes the main body portion 31, the camera support portion 32, the main body support portion 33, and the sound emission and collection device attachment portion 34.
  • The main body portion 31 is formed of a straight member of predetermined width and is held by the main body support portion 33 so as to extend at a predetermined angle with respect to the vertical direction.
  • The camera support portion 32 is installed at one end of the main body portion 31 in its extending direction via a hinge 203, and the sound emission and collection device attachment portion 34 is installed at the other end.
  • The sound emission and collection device attachment portion 34 is formed of a flat plate with an opening into which the leg portion 12 of the housing 11 fits, and is formed integrally with the main body portion 31, for example.
  • The end of the main body portion 31 on the camera support portion 32 side has a shape in which only the walls at both ends in the width direction remain, with the central portion in the width direction open.
  • The opening is shaped so that the camera 2 installed on the camera support portion 32 does not contact the main body portion 31 when it rotates between the horizontal direction and the vertically downward direction.
  • The hinge 203 has a structure in which the camera support portion 32 is rotatably installed with respect to the main body portion 31. The hinge 203 and the camera support portion 32 are also structured so as to be semi-fixed when the camera 2 and the camera support portion 32 face the horizontal direction and when they face the vertically downward direction. For example, the hinge 203 is fixed to the main body portion 31, and recesses are formed at the horizontal position and the vertically downward position of the hinge 203.
  • The hinge side of the camera support portion 32 is provided with a protrusion that fits into these recesses, and the protrusion is biased from within the camera support portion 32 by a spring or the like. As a result, the camera 2 can rotate between the horizontal direction and the vertically downward direction and can hold its mechanical state in either direction.
  • The mechanism comprising the hinge 203 and the camera support portion 32 also functions as the switch 4.
  • The connections or detection signals are set so that different signals are obtained at the horizontal recess and the vertically downward recess; this forms the switch 4, and the detection result of the switch 4 is given to the camera 2.
  • The camera 2 can thereby identify whether it is capturing video facing the horizontal direction or facing vertically downward.
  • The camera 2 includes the imaging unit 21 and the video processing unit 22.
  • The imaging unit 21 includes a fisheye lens and images an area extending from infinity down to the installation surface, in all directions around the front direction of the camera 2.
  • The imaging data is given to the video processing unit 22.
  • The video processing unit 22 acquires the direction the camera 2 faces (hereinafter, the shooting direction) from the switch 4 (the hinge 203 and camera support portion 32) of the stay 3. Based on the acquired shooting direction and the speaker orientation data Pm received from the sound emission and collection device 1 via the communication terminal 5, the video processing unit 22 extracts only the necessary part of the imaging data and corrects the image, thereby generating video data. The generated video data is given to the communication terminal 5.
  • In the following, an example with five conference participants on the apparatus side is shown, but the number of participants is not limited to this.
  • FIG. 4 shows a situation in which the video conference apparatus of the present embodiment is arranged and a video conference is held with another location connected to the network, with the camera 2 capturing the conference participants 601 to 605.
  • FIG. 5 is an explanatory diagram used to explain video data generation.
  • (A) shows the video (image) captured through the fisheye lens, and (B) and (C) show the concept of image correction for each conference participant direction.
  • FIG. 6 shows a situation in which the video conference apparatus of the present embodiment is arranged and a video conference is held with another location connected to the network, illustrating the case where the camera 2 captures the material 650.
  • FIG. 7 is an explanatory diagram used to explain the video data generation.
  • (A) shows the video (image) captured through the fisheye lens, and (B) shows the concept of image correction in the document shooting mode.
  • The conference participants 601 to 605 are seated around the oval table 700, at positions other than one end in its longitudinal direction.
  • The integrated assembly of the disk-shaped sound emission and collection device 1 and the camera 2, fixed to it by the stay 3, is installed on the table 700.
  • The camera 2 is installed so that, in its horizontal orientation, the central axis of the fisheye lens coincides with an axis parallel to the longitudinal direction of the table 700.
  • The communication terminal 5 is installed under the table 700.
  • The communication terminal 5 is electrically connected to the sound emission and collection device 1 and the camera 2, and is connected to the network 500.
  • The communication terminal 5 is also electrically connected to the display 6.
  • The display 6 consists of, for example, a liquid crystal display, and is installed near the end of the table 700 where no participants 601 to 605 are seated, with its display surface facing the table 700.
  • The video conference apparatus consisting of the sound emission and collection device 1, the camera 2, and the communication terminal 5 transmits the conference video to the destination video conference apparatus in one of two modes.
  • In the conference shooting mode, the video processing unit 22 of the camera 2 detects from the detection signal of the switch 4 that the conference shooting mode has been selected.
  • When the video processing unit 22 detects the conference shooting mode, it gives a selection signal for that mode to the communication terminal 5.
  • The imaging unit 21 of the camera 2 acquires imaging data capturing all the conference participants 601 to 605 present on the apparatus side through the fisheye lens, and outputs the acquired imaging data to the video processing unit 22.
  • Since the imaging data is captured through the fisheye lens, the imaging area is circular, as shown in FIG. 5(A).
  • The sound emission and collection device 1 acquires the voice of the conference participant who is speaking by the processing described above, detects the speaker direction, and gives the collected sound data and the speaker orientation information θ to the communication terminal 5.
  • For example, when the conference participant 601 speaks, the sound emission and collection device 1 detects the direction θ1 of the participant 601 and gives the collected sound data based on the voice from that direction, together with the speaker orientation information θ1, to the communication terminal 5.
  • Likewise, when the conference participant 605 speaks, the sound emission and collection device 1 detects the direction θ2 of the participant 605 and gives the collected sound data based on the voice from that direction, together with the speaker orientation information θ2, to the communication terminal 5.
  • The communication terminal 5 gives the speaker orientation information θ to the video processing unit 22 of the camera 2.
  • The video processing unit 22 corrects the imaging data based on the speaker orientation information θ from the communication terminal 5.
  • The video processing unit 22 stores in advance the relationship between the speaker orientation information θ and the azimuth angle ψ set in the imaging data, and reads out the azimuth angle ψ corresponding to the received orientation. For example, when it receives the speaker orientation information θ1 for the conference participant 601, it reads out the corresponding azimuth angle and sets the image extraction area accordingly.
  • The video processing unit 22 performs correction conversion for each acquired image extraction area. Specifically, each pixel defined by the two angular directions, the ψ (azimuth) direction and the φ (elevation) direction, is corrected so as to map onto a pixel in an orthogonal two-dimensional plane coordinate system (the X–Y coordinate system). The video processing unit 22 stores in advance a conversion table between the ψ–φ coordinate system and the X–Y coordinate system, calculates the X–Y coordinates of each pixel from its ψ–φ coordinates, and performs the correction conversion. Alternatively, the video processing unit 22 may store a coordinate conversion formula in advance and perform the correction conversion using that formula.
  • For example, the video processing unit 22 converts the image data 621, set by the azimuth angle range ψ1 to ψ2 and the elevation angle range φ1 to φ2, into the corrected image data 621′, set by x1 to x2 and y1 to y2 in a plane coordinate system with the horizontal direction as the X axis and the vertical direction as the Y axis.
  • By this conversion, the person image 611 of the conference participant 601 acquired in the ψ–φ coordinate system is converted into the corrected person image 631 in the X–Y (plane) coordinate system.
  • By converting into the X–Y coordinate system in this way, the corrected person image 631 becomes close to the natural appearance of the conference participant 601.
  • Similarly, the video processing unit 22 converts the image data 622, set by the azimuth angle range ψ3 to ψ4 and the elevation angle range φ3 to φ4, into the corrected image data 622′, set by x3 to x4 and y3 to y4 in the plane coordinate system with the horizontal direction as the X axis and the vertical direction as the Y axis.
  • By this conversion, the person image 615 of the conference participant 605 acquired in the ψ–φ coordinate system is converted into the corrected person image 635 in the X–Y (plane) coordinate system.
  • The corrected person image 635 likewise becomes close to the natural appearance of the conference participant 605.
  • The video processing unit 22 attaches time information to the corrected image data containing the corrected, natural-looking person image, and outputs it to the communication terminal 5 as video data. Such generation and output of corrected image data are performed sequentially, and if the received speaker orientation information θ changes, the center direction of the corrected image data is switched according to the change.
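The correction conversion from the ψ–φ (azimuth–elevation) coordinate system to the X–Y plane can be sketched as an inverse mapping over an assumed equidistant fisheye model (image radius proportional to the off-axis angle). The projection model, the sampling method, and all names below are illustrative assumptions; the patent only states that a conversion table or formula is stored.

```python
import numpy as np

def dewarp_region(fisheye, center, f, psi_range, phi_range, out_w, out_h):
    """Cut out the image-extraction area set by the azimuth range psi_range
    and the elevation range phi_range, and remap it to plane (X-Y) coordinates.
    Assumes an equidistant fisheye model (image radius = f * off-axis angle)
    with psi as the position angle around the image centre."""
    cx, cy = center
    out = np.zeros((out_h, out_w) + fisheye.shape[2:], dtype=fisheye.dtype)
    for j, phi in enumerate(np.linspace(phi_range[1], phi_range[0], out_h)):
        r = f * phi                              # equidistant projection
        for i, psi in enumerate(np.linspace(psi_range[0], psi_range[1], out_w)):
            u = int(round(cx + r * np.cos(psi)))
            v = int(round(cy - r * np.sin(psi)))  # image rows grow downward
            if 0 <= v < fisheye.shape[0] and 0 <= u < fisheye.shape[1]:
                out[j, i] = fisheye[v, u]         # nearest-neighbour lookup
    return out

# Example: correct image data 621 (azimuth psi1..psi2, elevation phi1..phi2);
# the numeric values here are placeholders.
# corrected_621 = dewarp_region(img, (640, 480), f=300.0,
#                               psi_range=(0.2, 0.9), phi_range=(0.1, 0.6),
#                               out_w=320, out_h=240)
```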
  • The communication terminal 5 generates communication data by associating the video data from the video processing unit 22 with the collected sound data and the speaker orientation information θ, and transmits it to the counterpart video conference apparatus via the network 500.
  • In the document shooting mode, the video processing unit 22 of the camera 2 detects from the detection signal of the switch 4 that the document shooting mode has been selected.
  • When the video processing unit 22 detects the document shooting mode, it gives a selection signal for that mode to the communication terminal 5.
  • One of the participants 601 to 605 places the material 650 on the table 700 around the position vertically below the hinge 203. If a placement marking for the material is made on the table 700 in advance, the material 650 can be placed easily and appropriately.
  • The imaging unit 21 of the camera 2 acquires imaging data capturing the material 650 placed on the table 700 through the fisheye lens, and outputs it to the video processing unit 22.
  • Since the imaging data is captured through the fisheye lens, the imaging area is circular, as shown in FIG. 7(A).
  • The video processing unit 22 obtains the imaging data in an r–η coordinate system, with the center of the imaging data as the origin, r the distance extending radially from the origin, and η the angle with respect to a predetermined direction (in FIG. 7, the rightward (0°) direction from the origin).
  • The video processing unit 22 cuts out image data 680 of a preset range from the acquired imaging data.
  • The video processing unit 22 corrects the image data 680 in the r–η coordinate system by converting it into the corrected image data 680′ in the X–Y plane coordinate system. The video processing unit 22 stores in advance a coordinate conversion table in which the centers of the r–η coordinate system and the X–Y coordinate system coincide, calculates the X–Y coordinates of each pixel from its acquired r–η coordinates, and performs the correction conversion. Alternatively, it may store a coordinate conversion formula in advance and perform the correction conversion using that formula.
  • By this conversion, the material image 660 of the material 650 acquired in the r–η coordinate system is converted into the corrected material image 670 in the X–Y (plane) coordinate system.
  • By converting into the X–Y coordinate system in this way, the corrected material image 670 becomes close to the natural appearance of the material 650; that is, undistorted image data of the material 650 can be acquired.
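For the central region, the r–η to X–Y correction amounts to resampling the fisheye image through a polar-to-Cartesian mapping whose centers coincide. The sketch below stands in for the stored conversion table, assuming an equidistant fisheye model and a flat document at a known distance below the lens; both assumptions and all parameter names are illustrative.

```python
import numpy as np

def correct_document_image(fisheye, center, f, d, size, extent):
    """Convert the central image data 680 (r-eta polar system) into corrected
    plane image data 680'. Assumes an equidistant fisheye (image radius =
    f * angle) and a flat document at distance d below the lens."""
    cx, cy = center
    ys, xs = np.mgrid[0:size, 0:size]
    # plane (X-Y) coordinates on the document, origin on the lens axis
    x = (xs / (size - 1) - 0.5) * extent
    y = (ys / (size - 1) - 0.5) * extent
    rho = np.hypot(x, y)              # radial distance on the document plane
    eta = np.arctan2(y, x)            # angle from the 0-degree direction
    r_img = f * np.arctan2(rho, d)    # where each point lands on the sensor
    u = np.clip(np.round(cx + r_img * np.cos(eta)).astype(int),
                0, fisheye.shape[1] - 1)
    v = np.clip(np.round(cy + r_img * np.sin(eta)).astype(int),
                0, fisheye.shape[0] - 1)
    return fisheye[v, u]              # nearest-neighbour resampling
```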
  • The communication terminal 5 generates communication data including the image data of the material 650 acquired from the video processing unit 22 and transmits it to the counterpart video conference apparatus via the network 500. This makes it possible to provide a clear, easy-to-read material image to the conference participants present around the counterpart video conference apparatus. If collected sound data has been acquired from the sound emission and collection device 1 at this time, the communication terminal 5 may generate and transmit communication data including the collected sound data together with the image data of the material 650.
  • FIG. 8 is an external view of the assembly consisting of the sound emission and collection device 1, the camera 2, and the support 7 in the video conference apparatus of the present embodiment, where (A) is a plan view and (B) is a side view.
  • FIG. 9 shows a usage situation of the video conference apparatus of the present embodiment, where (A) is a plan view and (B) is a side view. In FIGS. 8 and 9, the cables connected to the sound emission and collection device 1 and the camera 2 are not shown.
  • FIG. 10 is a diagram for explaining the generation of video data by the video conference apparatus of the present embodiment.
  • (A) shows the imaging data, (B) shows the concept of image correction at the center of the imaging data, and (C) shows the concept of image correction at the periphery of the imaging data.
  • The configuration and processing of the sound emission and collection device 1 and the communication terminal 5 are the same as in the video conference apparatus of the first embodiment.
  • The video conference apparatus of the present embodiment differs from the first embodiment in the structure supporting the camera 2, that is, the structure of the support 7, and in the video processing method of the video processing unit 22 of the camera 2; the switch 4 is omitted.
  • The support 7 is disposed around the disk-shaped sound emission and collection device 1.
  • The support 7 consists of four vertical support shafts extending in the vertical direction, two horizontal support shafts disposed at a distance h1 from the top surface of the sound emission and collection device 1, and four horizontal support shafts disposed at a distance h2 (> h1) from that top surface.
  • The two horizontal support shafts at the distance h1 cross at the approximate center of the sound emission and collection device 1 in plan view, and are held at the distance h1 by the four vertical support shafts.
  • The horizontal support shafts at the distance h2 are assembled into an approximate square in plan view, and are held at the distance h2 by the four vertical support shafts.
  • The camera 2 is installed at the intersection of the two horizontal support shafts at the distance h1, with its shooting direction vertically upward.
  • The mounting table 8 is supported by the four horizontal support shafts at the distance h2 and is formed of highly transmissive glass, an acrylic plate, or the like. The mounting table 8 and the camera 2 are installed so that, in plan view, the center of the mounting table 8 substantially coincides with the axis of the fisheye lens of the camera 2.
  • The material 650 is placed with its printed surface facing vertically downward, that is, in contact with the mounting table 8.
  • The height of the camera 2 and the height of the mounting table 8, that is, the distances h1 and h2, should be set so that, as shown in FIG. 9, the material can be photographed and is not hidden by the horizontal support shafts that support the mounting table 8.
  • When the video conference apparatus having such a configuration is used, the imaging data acquired by the imaging unit 21 of the camera 2 is as shown in FIG. 10(A).
  • The entire imaging area forms the circular whole-area image data 610: the material image 660 of the material 650 appears at the center, and the person images 641 to 644 of the conference participants 601 to 604 appear in the surrounding area.
  • The video processing unit 22 obtains the imaging data in an r–η coordinate system, with the center of the imaging data as the origin, r the distance extending radially from the origin, and η the angle with respect to a predetermined direction (in FIG. 10, the rightward (0°) direction from the origin). The video processing unit 22 cuts out image data 681 of a predetermined range from the acquired imaging data.
  • The video processing unit 22 corrects the image data 681 in the r–η coordinate system by converting it into the corrected image data 681′ in the X–Y plane coordinate system.
  • The video processing unit 22 stores in advance a coordinate conversion table in which the centers of the r–η coordinate system and the X–Y coordinate system coincide, calculates the X–Y coordinates of each pixel from its acquired r–η coordinates, and performs the correction conversion.
  • Alternatively, the video processing unit 22 may store a coordinate conversion formula in advance and perform the correction conversion using that formula.
  • By this conversion, the material image 660 of the material 650 acquired in the r–η coordinate system is converted into the corrected material image 670 in the X–Y (plane) coordinate system.
  • By converting into the X–Y coordinate system in this way, the corrected material image 670 becomes close to the natural appearance of the material 650; in other words, undistorted image data of the material 650 can be obtained.
  • The video processing unit 22 also acquires the peripheral image data 682 by removing the image data 681 near the center from the whole-area image data 610. Based on the speaker position information acquired from the sound emission and collection device 1 via the communication terminal 5, the video processing unit 22 sets the area to be extracted, as in the first embodiment. That is, the video processing unit 22 extracts the area containing the image of the conference participant who is speaking and acquires the partial image data 683. The video processing unit 22 acquires this partial image data in the r–η coordinate system. Specifically, as shown in FIG. 10(C), based on the speaker orientation information, the video processing unit 22 determines the coordinates of the four corners of the fan-shaped region containing the image of the corresponding conference participant, {(r10, η10), (r10, η20), (r20, η20), (r20, η10)}, and acquires the region they enclose.
  • The video processing unit 22 then performs correction conversion on the acquired partial image data 683. Specifically, each pixel defined in the r–η coordinate system is corrected and converted so as to map onto a pixel in the orthogonal two-dimensional plane (X–Y) coordinate system. The video processing unit 22 stores in advance a conversion table between the r–η coordinate system and the X–Y coordinate system, calculates the X–Y coordinates of each pixel from its acquired r–η coordinates, and performs the correction. Alternatively, the video processing unit 22 may store a coordinate conversion formula in advance and perform the correction conversion using that formula.
  • For example, the video processing unit 22 converts the partial image data 683, set by the distance range r10 to r20 and the azimuth angle range η10 to η20, into the corrected image data 683′, set by x10 to x20 and y10 to y20 in a plane coordinate system with the horizontal direction as the X axis and the vertical direction as the Y axis. By this conversion, the person image 644 of the conference participant 604 acquired in the r–η coordinate system is converted into the corrected person image 654 in the X–Y (plane) coordinate system, which becomes close to the natural appearance of the conference participant 604.
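Extracting the fan-shaped region {(r10, η10), (r10, η20), (r20, η20), (r20, η10)} around the speaker orientation and remapping it to X–Y plane coordinates can be sketched as follows. The angular half-width of the fan and the nearest-neighbour sampling are assumptions; the patent stores the mapping in a conversion table or formula.

```python
import numpy as np

def cut_fan_region(fisheye, center, theta, half_width, r10, r20, out_w, out_h):
    """Cut the fan-shaped partial image data 683 around the speaker
    orientation theta (eta10..eta20 = theta -/+ half_width, radii r10..r20)
    out of the whole-area image data and remap it to X-Y plane coordinates
    (cf. corrected image data 683'). half_width is an assumed parameter."""
    cx, cy = center
    etas = np.linspace(theta - half_width, theta + half_width, out_w)
    rs = np.linspace(r20, r10, out_h)    # top row = outer edge of the fan
    out = np.zeros((out_h, out_w) + fisheye.shape[2:], dtype=fisheye.dtype)
    for j, r in enumerate(rs):
        for i, eta in enumerate(etas):
            u = int(round(cx + r * np.cos(eta)))
            v = int(round(cy + r * np.sin(eta)))
            if 0 <= v < fisheye.shape[0] and 0 <= u < fisheye.shape[1]:
                out[j, i] = fisheye[v, u]
    return out
```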
  • The video processing unit 22 attaches time information to the corrected image data containing the acquired corrected material image 670 and to the corrected image data containing the corrected person image 654, and outputs them to the communication terminal 5 as video data. Such generation and output of corrected image data are performed sequentially. If the received speaker orientation information θ changes, only the corrected image data containing the corrected person image is switched according to the change, and the video data is output.
  • The communication terminal 5 generates communication data by associating the video data from the video processing unit 22 with the collected sound data and the speaker orientation information θ, and transmits it to the counterpart video conference apparatus via the network 500.
  • In this case, the processing load and the network load are reduced by the data amount of the material image, so that processing and transmission can be performed at higher speed.
  • The acquisition timing of the material image may be set by an acquisition operation input from the operation unit when a new material is placed, or, by providing an image analysis unit, the time at which the captured image differs from the previous image may be used as the new acquisition timing.
  • Although the above examples show the video processing unit provided in the camera, it may be realized as a device independent of the camera, or it may be provided in the sound emission and collection device or the communication terminal.
  • In that case, a general-purpose video camera can be used, as long as it has a lens capable of photographing the necessary area described above.
  • Although the above examples show the communication terminal provided independently of the sound emission and collection device, the function of the communication terminal may be provided in the sound emission and collection device.
  • This reduces the number of components of the video conference apparatus, so that a simpler and smaller video conference apparatus can be realized.
  • The present invention is based on a Japanese patent application filed on December 19, 2006 (Japanese Patent Application No. 2006-341175), the contents of which are incorporated herein by reference.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
PCT/JP2007/074449 2006-12-19 2007-12-19 ビデオ会議装置 WO2008075726A1 (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2006341175A JP4862645B2 (ja) 2006-12-19 2006-12-19 ビデオ会議装置
JP2006-341175 2006-12-19

Publications (1)

Publication Number Publication Date
WO2008075726A1 true WO2008075726A1 (ja) 2008-06-26

Family

ID=39536354

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2007/074449 WO2008075726A1 (ja) 2006-12-19 2007-12-19 ビデオ会議装置

Country Status (3)

Country Link
JP (1) JP4862645B2 (zh)
CN (1) CN101518049A (zh)
WO (1) WO2008075726A1 (zh)


Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101969541A (zh) * 2010-10-28 2011-02-09 上海杰图软件技术有限公司 全景视频通讯系统和方法
JP2013009304A (ja) 2011-05-20 2013-01-10 Ricoh Co Ltd 画像入力装置、会議装置、画像処理制御プログラム、記録媒体
CN104932665B (zh) * 2014-03-19 2018-07-06 联想(北京)有限公司 一种信息处理方法以及一种电子设备
CN105100677A (zh) * 2014-05-21 2015-11-25 华为技术有限公司 用于视频会议呈现的方法、装置和系统
CN104410778A (zh) * 2014-10-09 2015-03-11 深圳市金立通信设备有限公司 一种终端
CN104320729A (zh) * 2014-10-09 2015-01-28 深圳市金立通信设备有限公司 一种拾音方法
JP6450604B2 (ja) * 2015-01-28 2019-01-09 オリンパス株式会社 画像取得装置及び画像取得方法
CN105163024A (zh) * 2015-08-27 2015-12-16 华为技术有限公司 一种获取目标图像的方法以及目标追踪设备
CN107066039B (zh) * 2016-12-25 2020-02-18 重庆警蜂科技有限公司 便携式巡多功能数字庭审终端
CN106791538B (zh) * 2016-12-25 2019-08-27 重庆警蜂科技有限公司 用于巡回法庭的数字系统
CN108200515B (zh) * 2017-12-29 2021-01-22 苏州科达科技股份有限公司 多波束会议拾音系统及方法
JP7135360B2 (ja) 2018-03-23 2022-09-13 ヤマハ株式会社 発光表示スイッチ及び収音装置
CN113923305B (zh) * 2021-12-14 2022-06-21 荣耀终端有限公司 一种多屏协同的通话方法、系统、终端及存储介质


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005311619A (ja) * 2004-04-20 2005-11-04 Yakichiro Sakai 通信システム

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5436654A (en) * 1994-02-07 1995-07-25 Sony Electronics, Inc. Lens tilt mechanism for video teleconferencing unit
JPH07327217A (ja) * 1994-06-02 1995-12-12 Canon Inc 画像入力装置
JPH11331827A (ja) * 1998-05-12 1999-11-30 Fujitsu Ltd テレビカメラ装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012039195A (ja) * 2010-08-03 2012-02-23 Kokuyo Co Ltd テレビ会議用テーブルシステム
CN104580992A (zh) * 2014-12-31 2015-04-29 广东欧珀移动通信有限公司 一种控制方法及移动终端
CN104967777A (zh) * 2015-06-11 2015-10-07 广东欧珀移动通信有限公司 一种控制摄像头拍摄方法及终端
CN104967777B (zh) * 2015-06-11 2018-03-27 广东欧珀移动通信有限公司 一种控制摄像头拍摄方法及终端

Also Published As

Publication number Publication date
JP4862645B2 (ja) 2012-01-25
JP2008154055A (ja) 2008-07-03
CN101518049A (zh) 2009-08-26

Similar Documents

Publication Publication Date Title
WO2008075726A1 (ja) ビデオ会議装置
EP1377041B1 (en) Integrated design for omni-directional camera and microphone array
CN109218651B (zh) 视频会议中的最佳视图选择方法
US5079627A (en) Videophone
US5612733A (en) Optics orienting arrangement for videoconferencing system
US7227566B2 (en) Communication apparatus and TV conference apparatus
JP2007228070A (ja) テレビ会議装置
JP6551155B2 (ja) 通信システム、通信装置、通信方法およびプログラム
US20040008423A1 (en) Visual teleconferencing apparatus
JP2017034502A (ja) 通信装置、通信方法、プログラムおよび通信システム
JP2008288785A (ja) テレビ会議装置
WO2001011881A1 (fr) Videophone
JP2007274463A (ja) 遠隔会議装置
JPS62167506A (ja) 会議用テ−ブル
JP2006121709A (ja) 天井マイクロホンアセンブリ
WO2008047804A1 (fr) Dispositif de conférence audio et système de conférence audio
JP4411959B2 (ja) 音声集音・映像撮像装置
JP2007274462A (ja) テレビ会議装置、テレビ会議システム
US7940923B2 (en) Speakerphone with a novel loudspeaker placement
JP2009171486A (ja) テレビ会議システム
JP2008005346A (ja) 音響反射装置
CN213213666U (zh) 一种视音频通讯设备
JP2005151471A (ja) 音声集音・映像撮像装置および撮像条件決定方法
KR20100006029A (ko) 원격 화상회의시스템
KR100565184B1 (ko) 단체화상회의시스템의음량제어회로

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200780034288.2

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07850919

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07850919

Country of ref document: EP

Kind code of ref document: A1