CN112511757B - Video conference implementation method and system based on mobile robot - Google Patents


Info

Publication number
CN112511757B
CN112511757B (application CN202110157473.2A)
Authority
CN
China
Prior art keywords
speaker
conference
module
camera
point cloud
Prior art date
Legal status
Active
Application number
CN202110157473.2A
Other languages
Chinese (zh)
Other versions
CN112511757A (en)
Inventor
焦显伟
孟夏冰
Current Assignee
Beijing Telecom Easiness Information Technology Co Ltd
Original Assignee
Beijing Telecom Easiness Information Technology Co Ltd
Priority date: 2021-02-05
Filing date: 2021-02-05
Publication date: 2021-05-04
Application filed by Beijing Telecom Easiness Information Technology Co Ltd filed Critical Beijing Telecom Easiness Information Technology Co Ltd
Priority to CN202110157473.2A priority Critical patent/CN112511757B/en
Publication of CN112511757A publication Critical patent/CN112511757A/en
Application granted granted Critical
Publication of CN112511757B publication Critical patent/CN112511757B/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/695: Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • H04N 23/67: Focus control based on electronic image sensor signals
    • H04N 7/00: Television systems
    • H04N 7/14: Systems for two-way working
    • H04N 7/15: Conference systems

Abstract

The invention provides a video conference implementation method and system based on a mobile robot. The method comprises the following steps: calibrating the positions of the microphones and cameras; calibrating the positions and postures of the annular lens and the laser radar; building a map with laser SLAM to obtain a point cloud distribution map of the conference room and identifying the main areas of the conference room; determining the position of the speaker and generating the speaker's central axis; selecting different processing logic according to the position of the speaker; planning a path trajectory in the point cloud map and moving to the target position; adjusting the pan-tilt head according to the mobile robot's position and posture and the speaker's position, and shooting; and transmitting the data to the intelligent conference management system. According to the invention, the position of the mobile robot is adjusted automatically according to the position of the speaker among the participants; when no fixed camera can acquire the speaker's complete front view on its own, the camera on the robot automatically moves to a suitable position to collect front-view information of the speaker, improving the efficiency of the video conference.

Description

Video conference implementation method and system based on mobile robot
Technical Field
The invention relates to the technical field of mobile robots, and in particular to a video conference implementation method and system based on a mobile robot.
Background
In existing multi-person video conference scenarios, switching to the speaker's front view and close-up shot is a required function of a multi-person video conference, as it gives the other participants a realistic sense of being in the meeting room.
In current schemes, multiple cameras are usually adopted to capture the speaker's close-up shot. The cameras generally need to be controlled manually, which wastes manpower and is inefficient; moreover, when the speaker is outside the ideal shooting range of the cameras, the close-up shot cannot be captured at all.
Disclosure of Invention
In view of this, the present invention provides a video conference implementation method based on a mobile robot, which can automatically adjust the position of the robot according to the position of the speaker among the participants; the robot carries a camera and can automatically move to a suitable position to collect front-view information of the speaker when no fixed camera can acquire the speaker's front view on its own.
The invention provides a video conference implementation method based on a mobile robot, which comprises the following steps:
S1, installing all components of the intelligent conference management system under the same coordinate system, calibrating the positions of the microphones and cameras of the intelligent conference management system, calibrating the positions and postures of the annular lens and the laser radar, and performing laser SLAM mapping with a laser point cloud segmentation algorithm to obtain a point cloud distribution map of the conference room;
preferably, the point cloud distribution map of the conference room is built from the A3_1 point cloud data;
the laser point cloud segmentation algorithm comprises the following steps:
A1: placing the robot, with the annular lens and the laser radar installed, in an open room;
A2: the laser radar starts working and acquires all point cloud data, which are clustered by their range characteristics into three classes: farthest, common, and nearest;
A3: owing to the hardware structure, a laser beam can take one of three paths, each yielding a different processing result:
A3_1: the laser beam strikes a transparent lens and passes straight through it; the laser point falls on a wall surface, and the beam's angle information is marked on the point cloud data with the farthest ranging distance;
A3_2: the laser beam strikes a reflector and is reflected down to the ground; the laser point falls on the ground, and the beam's angle information is marked on the point cloud data with the common ranging distance;
A3_3: the laser beam undergoes neither A3_1 nor A3_2 and directly measures the distance between the laser radar and the annular lens; this point cloud is the nearest class, its value is fixed, and it is discarded directly;
S2, identifying the main areas of the conference room in the point cloud map;
preferably, the main areas of the conference room are, for example, the desk and the projector screen;
S3, determining the position of the speaker from the speaker's speech and generating the speaker's central axis;
the central axis is a directed line segment starting at the speaker's position and ending at the central area of the desk;
the method for resolving the speaker's position is as follows:
$$
\begin{cases}
\sqrt{(x-X_b)^2+(y-Y_b)^2+(z-Z_b)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ba}\\
\sqrt{(x-X_c)^2+(y-Y_c)^2+(z-Z_c)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ca}\\
\sqrt{(x-X_d)^2+(y-Y_d)^2+(z-Z_d)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{da}
\end{cases}
$$
where (Xa, Ya, Za), (Xb, Yb, Zb), (Xc, Yc, Zc) and (Xd, Yd, Zd) are the coordinates of microphones a, b, c and d; Tba, Tca and Tda are the time differences of the sound reaching microphones b, c and d relative to microphone a; V is the propagation speed of sound; and the x and y of the solution are the final resolving result, i.e. the position of the person;
S4, selecting different processing logic according to the position of the speaker;
S5, planning a path trajectory in the point cloud map according to the mobile robot's own position and the best shooting position, and moving to the target position;
during this process, the obstacle avoidance and collision avoidance sub-module ensures the safety of equipment and personnel, using the A3_2 point cloud data;
S6, after the target position is reached, adjusting the pan-tilt head according to the mobile robot's position and posture and the speaker's position, and starting shooting;
preferably, the pan-tilt head can already be adjusted for shooting during the movement, according to the robot's position and posture and the speaker's position;
and S7, transmitting the data to the intelligent conference management system, whereupon the process ends.
Further, step S4 comprises:
4a: if a camera of the intelligent conference management system can directly acquire the front view of the speaker's position, i.e. the close-up shot, the image data of that camera is selected directly and the process ends;
4b: if the intelligent conference management system has no camera suitable for collecting the front view of the speaker's position, the camera on the mobile robot is selected; the intelligent conference management system establishes a connection with the mobile robot through the second communication sub-module and sends an instruction containing the speaker's position and the best shooting position;
preferably, the best shooting position is the position opposite the speaker, mirrored about the desk as the axis of symmetry.
Further, in step S3 the speaker's position may be equivalently replaced by a conference focus: participant information is collected by the cameras and processed by the image acquisition processing sub-module, the face orientation of each current participant is analyzed and judged, and a focus is fitted by extending the face orientations of several participants; this focus serves as the conference focus.
Further, when the conference focus is outside the conference room, the mobile robot shoots the PPT position; when the conference focus is among the participants, the mobile robot shoots that position;
when the conference focus is outside the conference room, the participants are presumably watching the PPT (or behaving similarly), so the mobile robot preferably shoots the PPT position; when the conference focus is among the participants, they are presumably watching an object the speaker is discussing or displaying, so the mobile robot preferably shoots that position.
The invention also provides a system implementing the above video conference implementation method, comprising:
a calibration module: for installing all components of the intelligent conference management system under the same coordinate system and calibrating the positions of the microphones and cameras; calibrating the positions and postures of the annular lens and the laser radar; performing laser SLAM mapping with the laser point cloud segmentation algorithm to obtain a point cloud distribution map of the conference room; and identifying the main areas of the conference room in the point cloud map;
a path planning and moving module: for planning a path trajectory in the point cloud map according to the robot's own position and the best shooting position, and moving to the target position;
a robot module: for adjusting the pan-tilt head according to the robot's position and posture and the speaker's position after the target position is reached, and starting shooting;
a voiceprint positioning module: for determining the position of the speaker from the speaker's speech and generating the speaker's central axis;
a conference management module: for selecting different processing logic according to the position of the speaker;
a data transmission module: for transmitting data to the intelligent conference management system.
Further, the calibration module comprises a sensor and a positioning calculation unit;
wherein the sensor comprises a laser radar, which can also serve the obstacle avoidance and collision avoidance sub-module;
and the positioning calculation unit selects a different algorithm according to the selected sensor to calculate the final position.
Furthermore, the voiceprint positioning module comprises a synchronizer, several microphones and an operation unit, and obtains the speaker's position from the arrival times of the speaker's voice;
the synchronizer generates a synchronization signal that unifies the time references of the microphones;
the microphones collect the sound signals and record the arrival times of the signals;
and the operation unit calculates the speaker's position from the arrival times of the sound signals collected by the microphones.
Furthermore, the robot module further comprises an obstacle avoidance and collision avoidance sub-module, a video acquisition and tracking sub-module and a first communication sub-module;
the video acquisition and tracking sub-module comprises a camera and a pan-tilt head; the camera collects image information, and the pan-tilt head rotates the camera and tracks the conference focus;
the first communication sub-module is a wireless communication device used to communicate with the second communication sub-module and transmit data and commands;
the obstacle avoidance and collision avoidance sub-module comprises the laser radar, the annular lens and a gas collision detector;
the annular lens reflects laser lines of the laser radar to other angles, so as to measure regions outside the laser radar's scanning plane;
the gas collision detector comprises a gas bag and an air pressure sensor, wherein the gas bag is distributed over the surface of the robot and filled with gas; the air pressure sensor is located inside the gas bag and measures the air pressure. When the robot collides with something, the gas bag is compressed; the pressure sensor detects the pressure change and triggers the collision protection.
Further, the intelligent conference management system comprises an image acquisition processing sub-module and the second communication sub-module;
the image acquisition processing sub-module comprises a conference focus judgment unit and a conference focus extraction unit, which together process the image information and extract the focus of the conference;
the focus of the conference is extracted by collecting participant information through the cameras, analyzing the face orientation of each current participant, and fitting a focus by extending the face orientations of several participants; this focus serves as the conference focus;
and the second communication sub-module is a wireless communication device used to communicate with the first communication sub-module and transmit data and commands.
Further, the path planning and moving module comprises a path planning algorithm, a motor control algorithm and the robot vehicle body.
Compared with the prior art, the invention has the following beneficial effects:
the invention automatically adjusts the position of the robot according to the position of the speaker among the participants; since the mobile robot carries a camera, it can automatically move to a suitable position to collect front-view information of the speaker when no fixed camera can acquire the speaker's complete front view on its own, thereby improving the efficiency of the video conference.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention.
In the drawings:
FIG. 1 is a block diagram of a system module according to an embodiment of the present invention;
FIG. 2 is a block diagram of a positioning module according to an embodiment of the present invention;
FIG. 3 is a block diagram of an obstacle avoidance and collision avoidance module according to an embodiment of the present invention;
FIG. 4 is a schematic front view and top view of the annular lens structure according to an embodiment of the present invention;
fig. 5 is a flowchart of a video conference implementation method based on a mobile robot according to the present invention.
Reference numerals in fig. 4 denote:
121, laser radar; 122, annular lens; 122_1, reflective mirror; 122_2, transparent lens; 13, video acquisition and tracking subsystem; 14, path planning and moving subsystem.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, and third may be used in this disclosure to describe various information, this information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon" or "when" or "in response to a determination", depending on the context.
The embodiment of the invention provides a video conference implementation method based on a mobile robot, as shown in fig. 5, comprising the following steps:
S1, installing all components of the intelligent conference management system under the same coordinate system, calibrating the positions of the microphones and cameras of the intelligent conference management system, calibrating the positions and postures of the annular lens and the laser radar, and performing laser SLAM mapping with a laser point cloud segmentation algorithm to obtain a point cloud distribution map of the conference room;
preferably, the point cloud distribution map of the conference room is built from the A3_1 point cloud data;
the laser point cloud segmentation algorithm comprises the following steps:
A1: placing the robot, with the annular lens and the laser radar installed, in an open room;
A2: the laser radar starts working and acquires all point cloud data, which are clustered by their range characteristics into three classes: farthest, common, and nearest;
A3: owing to the hardware structure, a laser beam can take one of three paths, each yielding a different processing result:
A3_1: the laser beam strikes a transparent lens and passes straight through it; the laser point falls on a wall surface, and the beam's angle information is marked on the point cloud data with the farthest ranging distance;
A3_2: the laser beam strikes a reflector and is reflected down to the ground; the laser point falls on the ground, and the beam's angle information is marked on the point cloud data with the common ranging distance;
A3_3: the laser beam undergoes neither A3_1 nor A3_2 and directly measures the distance between the laser radar and the annular lens; this point cloud is the nearest class, its value is fixed, and it is discarded directly;
S2, identifying the main areas of the conference room in the point cloud map;
preferably, the main areas of the conference room are, for example, the desk and the projector screen;
S3, determining the position of the speaker from the speaker's speech and generating the speaker's central axis;
the central axis is a directed line segment starting at the speaker's position and ending at the central area of the desk;
the method for resolving the speaker's position is as follows:
$$
\begin{cases}
\sqrt{(x-X_b)^2+(y-Y_b)^2+(z-Z_b)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ba}\\
\sqrt{(x-X_c)^2+(y-Y_c)^2+(z-Z_c)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ca}\\
\sqrt{(x-X_d)^2+(y-Y_d)^2+(z-Z_d)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{da}
\end{cases}
$$
where (Xa, Ya, Za), (Xb, Yb, Zb), (Xc, Yc, Zc) and (Xd, Yd, Zd) are the coordinates of microphones a, b, c and d; Tba, Tca and Tda are the time differences of the sound reaching microphones b, c and d relative to microphone a; V is the propagation speed of sound; and the x and y of the solution are the final resolving result, i.e. the position of the person;
S4, selecting different processing logic according to the position of the speaker;
S5, planning a path trajectory in the point cloud map according to the mobile robot's own position and the best shooting position, and moving to the target position;
during this process, the obstacle avoidance and collision avoidance sub-module ensures the safety of equipment and personnel, using the A3_2 point cloud data;
S6, after the target position is reached, adjusting the pan-tilt head according to the mobile robot's position and posture and the speaker's position, and starting shooting;
preferably, the pan-tilt head can already be adjusted for shooting during the movement, according to the robot's own position and posture and the speaker's position;
and S7, transmitting the data to the intelligent conference management system, whereupon the process ends.
Step S4 comprises:
4a: if a camera of the intelligent conference management system can directly acquire the front view of the speaker's position, i.e. the close-up shot, the image data of that camera is selected directly and the process ends;
4b: if the intelligent conference management system has no camera suitable for collecting the front view of the speaker's position, the camera on the mobile robot is selected; the intelligent conference management system establishes a connection with the mobile robot through the second communication sub-module and sends an instruction containing the speaker's position and the best shooting position;
preferably, the best shooting position is the position opposite the speaker, mirrored about the desk as the axis of symmetry, as in the sketch below.
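For illustration only, a minimal sketch of this mirroring, assuming the desk's axis of symmetry is given by two calibrated map points (the function and parameter names are hypothetical, not from the patent):

```python
import numpy as np

def best_shooting_position(speaker, desk_a, desk_b):
    """Mirror the speaker's position across the desk's axis of symmetry.

    speaker        : (x, y) from the voiceprint positioning module
    desk_a, desk_b : two map points defining the desk's long axis
    """
    s, a, b = np.asarray(speaker), np.asarray(desk_a), np.asarray(desk_b)
    axis = (b - a) / np.linalg.norm(b - a)   # unit vector along the desk
    rel = s - a
    foot = a + axis * (rel @ axis)           # projection of the speaker onto the axis
    return 2 * foot - s                      # reflected point opposite the speaker
```

The reflected point then serves as the target position sent to the path planning and moving module.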
In step S3, the speaker's position may be equivalently replaced by a conference focus: participant information is collected by the cameras and processed by the image acquisition processing sub-module, the face orientation of each current participant is analyzed and judged, and a focus is fitted by extending the face orientations of several participants; this focus serves as the conference focus.
When the conference focus is outside the conference room, the mobile robot shoots the PPT position; when the conference focus is among the participants, the mobile robot shoots that position;
when the conference focus is outside the conference room, the participants are presumably watching the PPT (or behaving similarly), so the mobile robot preferably shoots the PPT position; when the conference focus is among the participants, they are presumably watching an object the speaker is discussing or displaying, so the mobile robot preferably shoots that position.
The invention also provides a system implementing the above video conference implementation method, comprising:
a calibration module: for installing all components of the intelligent conference management system under the same coordinate system and calibrating the positions of the microphones and cameras; calibrating the positions and postures of the annular lens and the laser radar; performing laser SLAM mapping with the laser point cloud segmentation algorithm to obtain a point cloud distribution map of the conference room; and identifying the main areas of the conference room in the point cloud map;
a path planning and moving module: for planning a path trajectory in the point cloud map according to the robot's own position and the best shooting position, and moving to the target position;
a robot module: for adjusting the pan-tilt head according to the robot's position and posture and the speaker's position after the target position is reached, and starting shooting;
a voiceprint positioning module: for determining the position of the speaker from the speaker's speech and generating the speaker's central axis;
a conference management module: for selecting different processing logic according to the position of the speaker;
a data transmission module: for transmitting data to the intelligent conference management system.
The calibration module comprises a sensor and a positioning calculation unit;
wherein the sensor comprises a laser radar, which can also serve the obstacle avoidance and collision avoidance sub-module;
and the positioning calculation unit selects a different algorithm according to the selected sensor to calculate the final position.
The voiceprint positioning module comprises a synchronizer, several microphones and an operation unit, and obtains the speaker's position from the arrival times of the speaker's voice;
the synchronizer generates a synchronization signal that unifies the time references of the microphones;
the microphones collect the sound signals and record the arrival times of the signals;
and the operation unit calculates the speaker's position from the arrival times of the sound signals collected by the microphones.
The robot module further comprises an obstacle avoidance and collision avoidance sub-module, a video acquisition and tracking sub-module and a first communication sub-module;
the video acquisition and tracking sub-module comprises a camera and a pan-tilt head; the camera collects image information, and the pan-tilt head rotates the camera and tracks the conference focus;
the first communication sub-module is a wireless communication device used to communicate with the second communication sub-module and transmit data and commands;
the obstacle avoidance and collision avoidance sub-module comprises the laser radar, the annular lens and a gas collision detector;
the annular lens reflects laser lines of the laser radar to other angles, so as to measure regions outside the laser radar's scanning plane;
the gas collision detector comprises a gas bag and an air pressure sensor, wherein the gas bag is distributed over the surface of the robot and filled with gas; the air pressure sensor is located inside the gas bag and measures the air pressure. When the robot collides with something, the gas bag is compressed; the pressure sensor detects the pressure change and triggers the collision protection.
The intelligent conference management system comprises an image acquisition processing sub-module and the second communication sub-module;
the image acquisition processing sub-module comprises a conference focus judgment unit and a conference focus extraction unit, which together process the image information and extract the focus of the conference;
the focus of the conference is extracted by collecting participant information through the cameras, analyzing the face orientation of each current participant, and fitting a focus by extending the face orientations of several participants; this focus serves as the conference focus;
and the second communication sub-module is a wireless communication device used to communicate with the first communication sub-module and transmit data and commands.
The path planning and moving module comprises a path planning algorithm, a motor control algorithm and the robot vehicle body, and is used for planning the robot's moving path and controlling the robot to move to the target position.
An embodiment of the invention, as shown in figures 1, 2 and 3, comprises a mobile robot and an intelligent conference management system.
The mobile robot comprises a calibration module, an obstacle avoidance and collision avoidance sub-module, a video acquisition and tracking sub-module, a path planning and moving module and a first communication sub-module.
The calibration module comprises a sensor and a positioning calculation unit.
The sensor is generally a laser radar, which requires no additional sensors to be installed in the environment; alternatively, ultra-wideband or RFID positioning sensors can be selected, which must cooperate with matching sensors installed in the external environment. The laser radar can also be used for obstacle avoidance and collision avoidance.
The positioning calculation unit selects a different algorithm according to the selected sensor to calculate the final position: generally, when a laser radar or a camera is used, positioning is performed with algorithms such as SLAM; when ultra-wideband or RFID positioning sensors are used, positioning is performed with triangulation methods based on electromagnetic wave arrival time/angle.
The obstacle avoidance and collision avoidance sub-module comprises the laser radar, the annular lens and the gas collision detector.
The annular lens reflects laser lines of the laser radar to other angles, so as to measure regions outside the laser radar's scanning plane.
The gas collision detector comprises a gas bag and an air pressure sensor. The gas bag covers the surface of the robot like an outer skin and is filled with gas; the air pressure sensor, located inside the gas bag, measures the air pressure. When the robot collides with something, the gas bag is compressed; the pressure sensor detects the pressure change and triggers the collision protection, as sketched below.
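A minimal sketch of this trigger logic, assuming a calibrated resting pressure and a hypothetical trip threshold (neither value is specified in the patent):

```python
def collision_triggered(pressure_pa, baseline_pa, threshold_pa=500.0):
    """Collision test for the gas collision detector.

    pressure_pa  : current reading of the air pressure sensor inside the gas bag
    baseline_pa  : calibrated resting pressure of the gas bag
    threshold_pa : assumed trip level; compressing the bag raises the pressure
    """
    return pressure_pa - baseline_pa > threshold_pa
```

When the test fires, the motor control algorithm would command an immediate stop.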
As shown in the schematic front and top views of the annular lens structure in fig. 4, the annular lens 122 is a ring-shaped structure composed of reflective mirrors 122_1 and transparent lenses 122_2 distributed alternately, installed parallel to the scanning plane of the laser radar 121. When a laser beam strikes a reflective mirror 122_1, it is reflected straight down to the ground (beam 121_1), which detects whether the vertical angles at the robot's sides are blocked; when a laser beam strikes a transparent lens 122_2, it passes straight through (beam 121_2), which detects obstacles at the robot's horizontal angles.
Preferably, the distribution of the reflective mirrors 122_1 and the transparent lenses 122_2 should be calibrated before use.
Preferably, in order to match the laser radar 121 with the annular lens 122, a laser point cloud segmentation algorithm is further included (a sketch follows the steps):
A1: placing the robot, with the annular lens and the laser radar installed, in an open room.
A2: the laser radar starts working and acquires all point cloud data, which are clustered by their range characteristics into three classes: farthest, common, and nearest.
A3: owing to the hardware structure, a laser beam can take one of three paths, each yielding a different processing result:
A3_1: the laser beam strikes a transparent lens and passes straight through it; the laser point falls on a wall surface, and the beam's angle information is marked on the point cloud data with the farthest ranging distance.
A3_2: the laser beam strikes a reflector and is reflected down to the ground; the laser point falls on the ground, and the beam's angle information is marked on the point cloud data with the common ranging distance.
A3_3: the laser beam undergoes neither A3_1 nor A3_2 and directly measures the distance between the laser radar and the annular lens; this point cloud is the nearest class, its value is fixed, and it is discarded directly.
The video acquisition and tracking sub-module comprises a camera and a pan-tilt head. The camera collects image information, and the pan-tilt head rotates the camera and tracks the conference focus, as sketched below.
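For illustration, a minimal sketch of pointing the pan-tilt head at the conference focus from the robot's pose, assuming a planar pose (x, y, yaw) and a known camera height (all names are hypothetical):

```python
import math

def aim_pan_tilt(robot_x, robot_y, robot_yaw, cam_height, target):
    """Pan/tilt angles that point the camera at the conference focus.

    (robot_x, robot_y, robot_yaw) : robot pose in the point cloud map
    cam_height                    : camera height above the floor (assumed known)
    target                        : (x, y, z) of the speaker / conference focus
    """
    dx, dy = target[0] - robot_x, target[1] - robot_y
    pan = math.atan2(dy, dx) - robot_yaw                # rotate into the robot frame
    pan = math.atan2(math.sin(pan), math.cos(pan))      # wrap to [-pi, pi]
    tilt = math.atan2(target[2] - cam_height, math.hypot(dx, dy))
    return pan, tilt
```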
The path planning and moving module comprises a path planning algorithm, a motor control algorithm and the robot vehicle body, and is used for planning the robot's moving path and controlling the robot to move to the target position; a sketch of one possible planner follows.
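The patent does not name a specific planner; as one plausible choice, a minimal A* sketch over an occupancy grid rasterized from the point cloud map:

```python
import heapq

def plan_path(grid, start, goal):
    """A* on an occupancy grid derived from the point cloud map.

    grid        : 2-D list of bool, True where a cell is blocked (desk, walls)
    start, goal : (row, col) cells, e.g. robot pose and best shooting position
    """
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, None)]
    came_from, cost = {}, {start: 0}
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in came_from:
            continue                                  # already expanded
        came_from[cur] = parent
        if cur == goal:                               # walk the parents back
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and not grid[nxt[0]][nxt[1]] and g + 1 < cost.get(nxt, 1e9)):
                cost[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, cur))
    return None                                       # no path found
```

The resulting cell sequence would then be handed to the motor control algorithm.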
The first communication sub-module is a wireless communication device used to communicate data and commands with the second communication sub-module.
The intelligent conference management system comprises a voiceprint positioning module, an image acquisition processing sub-module and a second communication sub-module.
The voiceprint positioning module comprises a synchronizer, several microphones and an operation unit, and obtains the speaker's position from the arrival times of the speaker's voice.
The synchronizer generates a synchronization signal that makes all the microphones work under the same clock system.
There are four or more microphones, which collect the sound during the conference and record the times at which it reaches them.
The microphones must be calibrated before use; the calibration content is each microphone's installation position, measured with equipment such as a total station.
When collecting the sound, each microphone also marks its arrival time. According to the microphone labels (a, b, c, d, etc.), these times are denoted ta, tb, tc, td, etc., where ta is the time the signal reaches microphone a, tb the time it reaches microphone b, and so on.
The operation unit analyzes the collected information to resolve and determine the position of the current speaker.
Further, the position is resolved as follows:

$$
\begin{cases}
\sqrt{(x-X_b)^2+(y-Y_b)^2+(z-Z_b)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ba}\\
\sqrt{(x-X_c)^2+(y-Y_c)^2+(z-Z_c)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ca}\\
\sqrt{(x-X_d)^2+(y-Y_d)^2+(z-Z_d)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{da}
\end{cases}
$$

where (Xa, Ya, Za), (Xb, Yb, Zb), (Xc, Yc, Zc) and (Xd, Yd, Zd) are the coordinates of microphones a, b, c and d; Tba, Tca and Tda are the time differences of the sound reaching microphones b, c and d relative to microphone a; V is the propagation speed of sound; and the x and y of the solution are the final result, i.e. the position of the person. A numerical sketch of this solve follows.
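A numerical sketch of solving the three range-difference equations above by least squares (scipy is an assumed dependency; the starting guess and return convention are illustrative, not from the patent):

```python
import numpy as np
from scipy.optimize import least_squares

V = 343.0  # propagation speed of sound in air, m/s

def locate_speaker(mics, t):
    """Solve the TDOA system above for the speaker position.

    mics : (4, 3) array of calibrated microphone coordinates a, b, c, d
    t    : arrival times (ta, tb, tc, td) from the synchronized microphones
    """
    tdoa = np.array([t[1] - t[0], t[2] - t[0], t[3] - t[0]])  # Tba, Tca, Tda

    def residuals(p):
        d = np.linalg.norm(mics - p, axis=1)   # distances to each microphone
        return (d[1:] - d[0]) - V * tdoa       # range differences vs. V*T

    # Start from the centroid of the array; any interior point works.
    sol = least_squares(residuals, x0=mics.mean(axis=0))
    x, y, z = sol.x
    return x, y                                # the patent keeps x, y as the result
```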
The image acquisition processing sub-module comprises several cameras and a conference focus judgment unit, and is used for collecting image information and processing it to obtain the focus of the conference.
The conference focus is extracted by collecting participant information through the cameras, analyzing the face orientation of each current participant, and fitting a focus by extending the face orientations of several participants; this focus serves as the conference focus. A sketch of this ray fitting follows.
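As an illustration, a minimal sketch of fitting the conference focus as the least-squares intersection of the participants' gaze rays, treating each face orientation as a 2-D line in the map frame (the names and the planar simplification are assumptions):

```python
import numpy as np

def fit_conference_focus(heads, gazes):
    """Least-squares point nearest to all face-orientation rays.

    heads : (N, 2) head positions of the participants in the map frame
    gazes : (N, 2) unit vectors of their face orientations
    Each ray is p_i + t * d_i; the focus minimizes the summed squared
    distance to all rays (closed form; assumes the gazes are not all parallel).
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, d in zip(heads, gazes):
        P = np.eye(2) - np.outer(d, d)   # projector onto the ray's normal space
        A += P
        b += P @ p
    return np.linalg.solve(A, b)         # the fitted conference focus
```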
And the second communication sub-module is a wireless communication device and is used for communicating with the first communication sub-module and transmitting data and commands.
Compared with the prior art, the invention has the following beneficial effects:
the invention automatically adjusts the position of the robot according to the position of the speaker among the participants; since the mobile robot carries a camera, it can automatically move to a suitable position to collect front-view information of the speaker when no fixed camera can acquire the speaker's complete front view on its own, thereby improving the efficiency of the video conference.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Without departing from the principle of the invention, a person skilled in the art can make the same changes or substitutions on the related technical features, and the technical solutions after the changes or substitutions will fall within the protection scope of the invention.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, substitution and improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A video conference implementation method based on a mobile robot, characterized by comprising the following steps:
S1, installing all components of the intelligent conference management system under the same coordinate system, calibrating the positions of the microphones and cameras of the intelligent conference management system, calibrating the positions and postures of the annular lens and the laser radar, and performing laser SLAM mapping with a laser point cloud segmentation algorithm to obtain a point cloud distribution map of the conference room;
the laser point cloud segmentation algorithm comprises the following steps:
A1: placing the robot, with the annular lens and the laser radar installed, in an open room;
A2: the laser radar starts working and acquires all point cloud data, which are clustered by their range characteristics into three classes: farthest, common, and nearest;
A3: owing to the hardware structure, a laser beam can take one of three paths, each yielding a different processing result:
A3_1: the laser beam strikes a transparent lens and passes straight through it; the laser point falls on a wall surface, and the beam's angle information is marked on the point cloud data with the farthest ranging distance;
A3_2: the laser beam strikes a reflector and is reflected down to the ground; the laser point falls on the ground, and the beam's angle information is marked on the point cloud data with the common ranging distance;
A3_3: the laser beam undergoes neither A3_1 nor A3_2 and directly measures the distance between the laser radar and the annular lens; this point cloud is the nearest class, its value is fixed, and it is discarded directly;
S2, identifying the main areas of the conference room in the point cloud map;
S3, determining the position of the speaker from the speaker's speech and generating the speaker's central axis;
the central axis is a directed line segment starting at the speaker's position and ending at the central area of the desk;
the method for resolving the speaker's position is as follows:
$$
\begin{cases}
\sqrt{(x-X_b)^2+(y-Y_b)^2+(z-Z_b)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ba}\\
\sqrt{(x-X_c)^2+(y-Y_c)^2+(z-Z_c)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{ca}\\
\sqrt{(x-X_d)^2+(y-Y_d)^2+(z-Z_d)^2}-\sqrt{(x-X_a)^2+(y-Y_a)^2+(z-Z_a)^2}=V\,T_{da}
\end{cases}
$$
where (Xa, Ya, Za), (Xb, Yb, Zb), (Xc, Yc, Zc) and (Xd, Yd, Zd) are the coordinates of microphones a, b, c and d; Tba, Tca and Tda are the time differences of the sound reaching microphones b, c and d relative to microphone a; V is the propagation speed of sound; and the x and y of the solution are the final resolving result, i.e. the position of the person;
S4, selecting different processing logic according to the position of the speaker, comprising:
4a: if a camera of the intelligent conference management system can directly acquire the front view of the speaker's position, i.e. the close-up shot, the image data of that camera is selected directly and the process ends;
4b: if the intelligent conference management system has no camera suitable for collecting the front view of the speaker's position, the camera on the mobile robot is selected; the intelligent conference management system establishes a connection with the mobile robot through the second communication sub-module and sends an instruction containing the speaker's position and the best shooting position;
S5, planning a path trajectory in the point cloud map according to the mobile robot's own position and the best shooting position, and moving to the target position;
S6, after the target position is reached, adjusting the pan-tilt head according to the mobile robot's position and posture and the speaker's position, and starting shooting;
and S7, transmitting the data to the intelligent conference management system, whereupon the process ends.
2. The video conference implementation method of claim 1, wherein in step S3 the speaker's position may be equivalently replaced by a conference focus: participant information is collected by the cameras and processed by image acquisition processing, the face orientation of each current participant is analyzed and judged, and a focus is fitted by extending the face orientations of several participants; this focus serves as the conference focus.
3. The video conference implementation method of claim 2, wherein when the conference focus is outside the conference room, the mobile robot shoots the PPT position; when the conference focus is among the participants, the mobile robot shoots that position.
4. A system implementing the video conference implementation method of any one of claims 1 to 3, characterized by comprising:
a calibration module: for installing all components of the intelligent conference management system under the same coordinate system and calibrating the positions of the microphones and cameras; calibrating the positions and postures of the annular lens and the laser radar; performing laser SLAM mapping with the laser point cloud segmentation algorithm to obtain a point cloud distribution map of the conference room; and identifying the main areas of the conference room in the point cloud map;
a path planning and moving module: for planning a path trajectory in the point cloud map according to the robot's own position and the best shooting position, and moving to the target position;
a robot module: for adjusting the pan-tilt head according to the robot's position and posture and the speaker's position after the target position is reached, and starting shooting;
a voiceprint positioning module: for determining the position of the speaker from the speaker's speech and generating the speaker's central axis;
a conference management module: for selecting different processing logic according to the position of the speaker, comprising:
if a camera of the intelligent conference management system can directly acquire the front view of the speaker's position, i.e. the close-up shot, the image data of that camera is selected directly and the process ends;
if the intelligent conference management system has no camera suitable for collecting the front view of the speaker's position, the camera on the mobile robot is selected; the intelligent conference management system establishes a connection with the mobile robot through the second communication sub-module and sends an instruction containing the speaker's position and the best shooting position;
a data transmission module: for transmitting data to the intelligent conference management system.
5. The video conference implementation system of claim 4, wherein the calibration module comprises a sensor and a positioning calculation unit;
wherein the sensor comprises a laser radar, which can also serve the obstacle avoidance and collision avoidance sub-module;
and the positioning calculation unit selects a different algorithm according to the selected sensor to calculate the final position.
6. The video conference implementation system of claim 4, wherein the voiceprint positioning module comprises a synchronizer, several microphones and an operation unit, and obtains the speaker's position from the arrival times of the speaker's voice;
the synchronizer generates a synchronization signal that unifies the time references of the microphones;
the microphones collect the sound signals and record the arrival times of the signals;
and the operation unit calculates the speaker's position from the arrival times of the sound signals collected by the microphones.
7. The video conference implementation system of claim 4, wherein the robot module further comprises an obstacle avoidance and collision avoidance sub-module, a video acquisition and tracking sub-module and a first communication sub-module;
the video acquisition and tracking sub-module comprises a camera and a pan-tilt head; the camera collects image information, and the pan-tilt head rotates the camera and tracks the conference focus;
the first communication sub-module is a wireless communication device used to communicate with the second communication sub-module and transmit data and commands;
the obstacle avoidance and collision avoidance sub-module comprises the laser radar, the annular lens and a gas collision detector;
the annular lens reflects laser lines of the laser radar to other angles, so as to measure regions outside the laser radar's scanning plane;
the gas collision detector comprises a gas bag and an air pressure sensor, wherein the gas bag is distributed over the surface of the robot and filled with gas; the air pressure sensor is located inside the gas bag and measures the air pressure.
8. The video conference implementation system of claim 4, wherein the intelligent conference management system comprises an image acquisition processing sub-module and the second communication sub-module;
the image acquisition processing sub-module comprises a conference focus judgment unit and a conference focus extraction unit, which together process the image information and extract the focus of the conference;
the focus of the conference is extracted by collecting participant information through the cameras, analyzing the face orientation of each current participant, and fitting a focus by extending the face orientations of several participants; this focus serves as the conference focus;
and the second communication sub-module is a wireless communication device used to communicate with the first communication sub-module and transmit data and commands.
9. The video conference implementation system of claim 4, wherein the path planning and moving module comprises a path planning algorithm, a motor control algorithm and the robot vehicle body.
CN202110157473.2A 2021-02-05 2021-02-05 Video conference implementation method and system based on mobile robot Active CN112511757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110157473.2A CN112511757B (en) 2021-02-05 2021-02-05 Video conference implementation method and system based on mobile robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110157473.2A CN112511757B (en) 2021-02-05 2021-02-05 Video conference implementation method and system based on mobile robot

Publications (2)

Publication Number Publication Date
CN112511757A CN112511757A (en) 2021-03-16
CN112511757B (en) 2021-05-04

Family

ID=74952681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110157473.2A Active CN112511757B (en) 2021-02-05 2021-02-05 Video conference implementation method and system based on mobile robot

Country Status (1)

Country Link
CN (1) CN112511757B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115695708A (en) * 2022-10-27 2023-02-03 深圳奥尼电子股份有限公司 Audio and video control method with wireless microphone intelligent tracking function
CN116996801B (en) * 2023-09-25 2023-12-12 福州天地众和信息技术有限公司 Intelligent conference debugging speaking system with wired and wireless access AI

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7593030B2 (en) * 2002-07-25 2009-09-22 Intouch Technologies, Inc. Tele-robotic videoconferencing in a corporate environment
JP4989596B2 (en) * 2008-09-17 2012-08-01 日本放送協会 Shooting shot control device, cooperative shooting system and program thereof
CN102572373A (en) * 2011-12-30 2012-07-11 南京超然科技有限公司 Image acquisition automatic control system and method for video conference
CN105611167B (en) * 2015-12-30 2020-01-31 联想(北京)有限公司 focusing plane adjusting method and electronic equipment
CN109413359B (en) * 2017-08-16 2020-07-28 华为技术有限公司 Camera tracking method, device and equipment
CN109492521B (en) * 2018-09-13 2022-05-13 北京米文动力科技有限公司 Face positioning method and robot
CN109451207B (en) * 2018-10-22 2020-09-04 中研电子(杭州)有限公司 Camera robot with complex illumination change
KR20190095181A (en) * 2019-07-25 2019-08-14 엘지전자 주식회사 Video conference system using artificial intelligence

Also Published As

Publication number Publication date
CN112511757A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
CN112511757B (en) Video conference implementation method and system based on mobile robot
US7769203B2 (en) Target object detection apparatus and robot provided with the same
CN109737981B (en) Unmanned vehicle target searching device and method based on multiple sensors
US20090167867A1 (en) Camera control system capable of positioning and tracking object in space and method thereof
JP2016177640A (en) Video monitoring system
CN105718862A (en) Method, device and recording-broadcasting system for automatically tracking teacher via single camera
JP2008087140A (en) Speech recognition robot and control method of speech recognition robot
CN106162144A (en) A kind of visual pattern processing equipment, system and intelligent machine for overnight sight
CN108259827B (en) Method, device, AR equipment and system for realizing security
Aliakbarpour et al. An efficient algorithm for extrinsic calibration between a 3d laser range finder and a stereo camera for surveillance
CN106370160A (en) Robot indoor positioning system and method
CN109857112A (en) Obstacle Avoidance and device
CN109773783A (en) A kind of patrol intelligent robot and its police system based on spatial point cloud identification
CN112863113A (en) Intelligent fire-fighting system and method for automatic detector alarming and fire extinguishing and storage medium
CN106352871A (en) Indoor visual positioning system and method based on artificial ceiling beacon
KR101750390B1 (en) Apparatus for tracing and monitoring target object in real time, method thereof
JP2018173707A (en) Person estimation system and estimation program
KR20200067286A (en) 3D scan and VR inspection system of exposed pipe using drone
KR20170058612A (en) Indoor positioning method based on images and system thereof
JP2009129058A (en) Position specifying apparatus, operation instruction apparatus, and self-propelled robot
CN111596259A (en) Infrared positioning system, positioning method and application thereof
US11832016B2 (en) 3D tour photographing apparatus and method
CN113014658B (en) Device control, device, electronic device, and storage medium
CN112601021B (en) Method and system for processing monitoring video of network camera
WO2018150515A1 (en) Image database creation device, location and inclination estimation device, and image database creation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant