CN112911189B - Intelligent base station system supporting non-terminal users, and communication method

Info

Publication number: CN112911189B (granted publication; application published as CN112911189A)
Application number: CN201911225930.6A
Authority: CN (China); original language: Chinese (zh)
Inventors: 吴建军, 李昊尘, 刘宇邦, 徐开明, 王凯
Current assignee: Peking University
Legal status: Active (granted)

Classifications

    • H04N 7/141 - Systems for two-way working between two video terminals, e.g. videophone
    • G10L 13/02 - Methods for producing synthetic speech; speech synthesisers
    • G10L 15/26 - Speech to text systems
    • G10L 21/0216 - Noise filtering characterised by the method used for estimating noise
    • H04B 11/00 - Transmission systems employing sonic, ultrasonic or infrasonic waves
    • H04N 7/22 - Adaptations for optical transmission
    • H04W 4/02 - Services making use of location information
    • G10L 2021/02161 - Number of inputs available containing the signal or the noise to be suppressed
    • G10L 2021/02166 - Microphone arrays; beamforming


Abstract

An intelligent base station system supporting non-terminal users comprises: 1) a voice subsystem, comprising a microphone array, an ultrasonic transducer, an ultrasonic carrier generation and modulation unit, and a voice signal processing unit; 2) a video subsystem, comprising a camera (or camera group), a projection unit, and a video signal processing unit; 3) a communication subsystem, comprising an antenna radio frequency unit (AAU), a distributed unit (DU) of the baseband processing unit, and a group of network interfaces; 4) a base station tower and an accompanying cabinet. In the corresponding communication method, access and data transmission for the non-terminal user are completed through the cooperation of the voice, video and communication subsystems. Based on mobile edge computing, the system can be deployed outdoors or indoors as required, allowing a user to complete network access and data transmission without any terminal and thereby obtain some of the same functions as a user with a terminal.

Description

Intelligent base station system supporting non-terminal user and communication method
Technical Field
The invention discloses an intelligent base station system and a communication method supporting non-terminal users. It relates to mobile communication and to signal acquisition and processing, and in particular to a device, a system and a communication method for mobile communication without a user terminal, and belongs to the fields of communication technology and signal processing technology.
Background
With the rapid development and deployment of 5G and future communication technologies, the frequency bands used for mobile communication keep moving higher [3GPP Release 15, TS 38 series], and base station density increases accordingly: typical base station coverage radii from 2G to 5G are roughly 5-10 km, 2-5 km, 1-3 km and 100-300 m, respectively. With the use of millimeter waves (mmWave), the coverage area of a base station shrinks further, so that full coverage of an area requires an even higher deployment density. It is expected that, as 5G, 6G and future communication technologies develop, ultra-dense cells will gradually be deployed and the inter-site distance will shrink to tens of meters or even a few meters; the range of mobile communication thus enters the distance range over which sound and video can be used directly.
Mobile Edge Computing (MEC) is an architecture for the evolution toward 5G proposed by the European Telecommunications Standards Institute (ETSI) [Yun Chao Hu, Milan Patel, Dario Sabella, Nurit Sprecher, Valerie Young. Mobile Edge Computing - A key technology towards 5G. ETSI White Paper No. 11]. It deeply fuses the mobile access network with Internet services, providing the services and cloud computing capabilities a user needs close by, within the radio access network, thereby creating a carrier-grade service environment with high performance, low latency and high bandwidth, accelerating the delivery of content, services and applications in the network, and letting users enjoy an uninterrupted, high-quality network experience.
Mobile communication using sound waves mainly involves two steps: far-field speech signal extraction and far-field speech signal transmission. Far-field speech extraction is already applied in current smart-home systems: with a microphone array and a speech signal processing system, speech can be captured accurately within a range of 5-10 m. With the development of nonlinear acoustics, and in particular the proposal of the acoustic parametric array [P. J. Westervelt. Parametric Acoustic Array [J]. J. Acoust. Soc. Am., 1963], directional loudspeakers with highly directive sound fields have developed and found application rapidly. Their working principle is to modulate audible sound onto an ultrasonic carrier, exploiting the high directivity of ultrasound, and to use the nonlinear interaction of the ultrasound with the air to generate a highly directional, self-demodulated difference-frequency audible sound, thereby achieving directional transmission of audible sound [Zheng Zhen. Research on the theory and control of highly directional audible sound [D]. Wuhan: Huazhong University of Science and Technology, 2005] [M. Yoneyama, J. Fujimoto. The audio spotlight: An application of nonlinear interaction of sound waves to a new type of loudspeaker design [J]. Journal of the Acoustical Society of America, May 1983].
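The self-demodulation mentioned above is commonly approximated by Berktay's far-field solution: if the audible signal is carried as the envelope E(t) of the ultrasonic carrier, the demodulated audible pressure p_a is proportional to the second time derivative of the squared envelope. The symbols below are introduced only for illustration and do not appear in the patent:

$$ p_a(t) \;\propto\; \frac{\partial^2}{\partial t^2}\, E^2(t) $$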
Face recognition is an identification technology based on a person's facial features: a camera collects images or video streams containing faces, faces are automatically detected and tracked in the images, and recognition is then performed on the detected faces. The technology has gone through three stages of development: face recognition from visible-light images, three-dimensional and thermal-imaging face recognition, and multi-light-source face recognition based on active near-infrared images, gradually overcoming the influence of environmental factors such as lighting changes. With the continuous development of traditional algorithms [Zhou Jie, Lu Chunyu, et al. A survey of automatic face recognition methods [J]. Acta Electronica Sinica, 2000] and deep neural network algorithms [A survey of face recognition technology based on deep learning, 2018], face recognition is entering more and more new application fields. It already plays an increasingly important role in retail, traffic, public security, banking and other fields, and personal identity authentication systems based on face recognition have broad application prospects.
With the rapid development of projection technology, and in particular holographic projection, real-time and convenient human-computer interaction is gradually becoming possible. Projection is mainly implemented with cathode ray tubes (CRT) or liquid crystal displays (LCD). Holographic projection (front-projected holographic display) [Improvements in and relating to microscopy, GB685286A, Dennis Gabor] records and reproduces a true three-dimensional image of an object using the principles of interference and diffraction; it can not only produce a stereoscopic image in the air, but also let that image interact with the user.
However, in existing mobile communication systems, a user still needs a mobile terminal such as a mobile phone to access the communication network and complete data transmission.
Disclosure of Invention
The invention aims to provide an intelligent base station system and a communication method that support users without terminals, overcoming the limitation that the prior art does not support terminal-free users in mobile communication.
An intelligent base station system supporting a non-terminal user, comprising:
1) a voice subsystem, comprising:
a. a microphone array, for receiving voice signals;
b. an ultrasonic transducer, for transmitting ultrasound-borne audio signals;
c. an ultrasonic carrier generation and modulation unit;
d. a voice signal processing unit, comprising a baseband signal processing module, a speech recognition module, a natural language processing module and a speech synthesis module;
2) a video subsystem, comprising:
a. one camera or a group of cameras, for video data acquisition;
b. a projection unit, for video interaction;
c. a video signal processing unit, comprising a baseband signal processing module, a video compression module and an image recognition module;
3) a communication subsystem, comprising:
a. an antenna radio frequency unit (active antenna unit, AAU);
b. a distributed unit (DU) of the baseband processing unit (BBU);
c. a group of network interfaces for connecting the DU to the other subsystems;
4) a base station tower and an accompanying cabinet, wherein:
a. the microphone array, the ultrasonic transducer, the camera, the projection unit and the antenna radio frequency unit (AAU) are mounted on the base station tower body;
b. the voice signal processing unit, the video signal processing unit and the DU are placed in the accompanying cabinet;
The AAU on the tower body is connected to the DU in the accompanying cabinet, the microphone array and the ultrasonic transducer are connected to the voice signal processing unit in the cabinet, and the camera and the projection unit are connected to the video signal processing unit in the cabinet; the DU is connected to both the voice subsystem and the video subsystem and is in turn connected to the core network.
The ultrasonic transducer in the voice subsystem may be either an ultrasonic directional transducer group, i.e. a group of individual ultrasonic directional transducers, or an ultrasonic transducer array, i.e. an array composed of ultrasonic transducer elements. In the directional transducer group, each transducer needs a corresponding mechanical adjustment device; the transducer array needs no mechanical adjustment, since each element of the array can be electrically controlled to perform signal weighting and beam synthesis. Furthermore, the ultrasonic carrier generation and modulation unit of the voice subsystem may either be integrated into the voice signal processing unit and placed in the accompanying cabinet, or be integrated with the ultrasonic transducer to form an ultrasonic loudspeaker placed on the base station tower body.
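As an illustration of the electrically controlled signal weighting and beam synthesis mentioned above, the following minimal sketch computes phase-only weights that steer a uniform linear ultrasonic transducer array toward a user at a given angle. The uniform linear geometry, the 40 kHz carrier and all names are illustrative assumptions, not parameters taken from the patent.

```python
import numpy as np

def steering_weights(num_elements, spacing_m, carrier_hz, steer_deg, c=343.0):
    """Phase-only weights steering a uniform linear ultrasonic array to steer_deg.

    0 degrees is broadside; each returned complex weight is applied to one
    transducer element before the modulated carrier is emitted.
    """
    k = 2 * np.pi * carrier_hz / c                      # wavenumber in air
    n = np.arange(num_elements)                         # element indices
    phase = -k * n * spacing_m * np.sin(np.deg2rad(steer_deg))
    return np.exp(1j * phase)

# Example: 16 elements at half-wavelength spacing for a 40 kHz carrier, beam at 20 degrees
wavelength = 343.0 / 40e3
w = steering_weights(16, spacing_m=wavelength / 2, carrier_hz=40e3, steer_deg=20.0)
```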
The voice signal processing unit and the video signal processing unit are embedded system boards; they may share a single embedded system board or each use their own. If the voice and video subsystems share one board, data can be exchanged through shared memory; if each uses its own board, data can be exchanged through interconnected network ports. The ultrasonic transducer and the microphone array of the voice subsystem are connected to the voice signal processing unit through network ports, the camera and the projection unit of the video subsystem are connected to the video signal processing unit through network ports, the voice and video signal processing units are connected to the DU through network ports, the AAU is connected to the DU through optical fiber, and the DU accesses the core network through optical fiber.
Each intelligent base station can deploy its own voice and video signal processing units, or several intelligent base stations can share one group of voice and video signal processing units for joint processing of data.
The voice and video subsystems of the system can be integrated with the communication subsystem of a base station to form an intelligent base station for outdoor communication scenarios, or integrated with an indoor micro base station to form an indoor intelligent micro base station for indoor communication scenarios.
A communication method of the intelligent base station system supporting non-terminal users comprises the following steps (a schematic sketch of this per-user flow is given after the steps):
Step one: a camera group in the video subsystem captures video data of the non-terminal user in real time and feeds it to the video signal processing unit for face detection and recognition, so as to determine the user ID and real-time position;
Step two: the projection unit projects in front of the user, according to the user's position, for video interaction with the user;
Step three: the microphone array of the voice subsystem forms a synthesized beam pointing at the user, for the user's voice input;
Step four: the ultrasonic transducer of the voice subsystem forms an ultrasonic beam pointing at the user, for voice broadcasting and voice information output;
Step five: the voice subsystem and the video subsystem are connected to the local DU through network interfaces, and the local audio and video data are exchanged with other user data or control data through the local DU.
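Read together, steps one to five form a per-user service loop. The sketch below only illustrates that flow; every object and method name (video_sub.detect_users, voice_sub.capture_beam, du.uplink and so on) is hypothetical and is not defined in the patent.

```python
def serve_coverage_area(video_sub, voice_sub, du):
    """One pass of the terminal-free service loop sketched in steps one to five."""
    users = video_sub.detect_users()                      # step one: user ID and position
    for user in users:
        video_sub.project_ui(user.position)               # step two: project in front of the user
        speech = voice_sub.capture_beam(user.position)    # step three: microphone-array beam
        voice_sub.broadcast_beam(user.position,           # step four: ultrasonic beam output
                                 du.downlink_audio(user.user_id))
        du.uplink(user.user_id, speech)                   # step five: exchange data via the local DU
```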
One or more cameras in the video subsystem form a camera group; according to the coverage area and usage requirements of the intelligent base station, video monitoring of the entire coverage area is completed either by a single base station working independently or by multiple base stations in cooperation.
The video signal processing unit supports processing of the video signals collected by one or more cameras, including face detection, recognition and positioning in the captured video so as to determine the ID and position of the non-terminal user; for usage scenarios with multiple cameras on a single base station or with multiple cooperating base stations, it also supports map construction and mapping.
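As a minimal illustration of the face detection step, the sketch below uses OpenCV's stock Haar-cascade detector on a single camera frame; the patent does not prescribe a particular detector, and the recognition (user ID) and positioning stages are only indicated in comments.

```python
import cv2

# Stock frontal-face Haar cascade shipped with OpenCV (illustrative choice)
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame_bgr):
    """Return face bounding boxes (x, y, w, h) found in one camera frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Each box would then go to a face-recognition model to obtain the user ID,
# and to a camera-geometry / mapping step to estimate the user's position.
```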
The projection unit projects images or video directly into the air in front of the user, without any additional medium, according to the position of the non-terminal user; further, the projection may use 3D holographic projection.
The microphone array of the voice subsystem feeds the collected signals into the voice signal processing unit, which performs A/D conversion and band-pass filtering on each channel and then, based on the phased-array principle, computes beamforming weighting coefficients from the users' positions and its own position, generating one or more narrow synthesized beams pointed at the users to collect each user's voice signal directionally.
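A minimal sketch of the receive beamforming described above, for a uniform linear microphone array under a far-field assumption; the geometry, sample rate and integer-sample delay approximation are illustrative choices, not taken from the patent.

```python
import numpy as np

def delay_and_sum(mic_signals, spacing_m, fs, steer_deg, c=343.0):
    """Far-field delay-and-sum beamformer for a uniform linear microphone array.

    mic_signals: array of shape (num_mics, num_samples), assumed already
    A/D converted and band-pass filtered as described in the text above.
    """
    num_mics, num_samples = mic_signals.shape
    # Relative arrival delay of a plane wave from steer_deg at each microphone
    delays = np.arange(num_mics) * spacing_m * np.sin(np.deg2rad(steer_deg)) / c
    output = np.zeros(num_samples)
    for m in range(num_mics):
        shift = int(round(delays[m] * fs))         # integer-sample approximation
        output += np.roll(mic_signals[m], -shift)  # time-align channel m
    return output / num_mics                       # uniform weighting

# Example: 8 microphones spaced 5 cm apart, 16 kHz sampling, beam at 30 degrees
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16000))
y = delay_and_sum(x, spacing_m=0.05, fs=16000, steer_deg=30.0)
```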
The voice signal processing unit feeds the collected voice signal of each user into the speech recognition module; according to the recognition result, it either calls the natural language processing module and the speech synthesis module to generate synthesized speech for interaction with the user, or sends the voice data directly to the DU through the network interface for transmission to the far end.
The ultrasonic transducer of the voice subsystem may be an ultrasonic directional transducer group, i.e. a group of individual ultrasonic directional transducers, or an ultrasonic transducer array, i.e. an array composed of ultrasonic transducer elements. In the directional transducer group, each transducer is equipped with a mechanical adjustment device; a transducer is selected according to the user's position, the base station's own position and an optimization rule, the adjustment device rotates it to point at the user for audio output, and each user is allocated one transducer for communication. The transducer array needs no mechanical adjustment: based on the phased-array principle, the beamforming weighting coefficients of the element array are computed from the users' positions and the array's own position, so that one or more narrow synthesized beams are generated toward each user, delivering either the locally synthesized speech from the voice signal processing unit or the voice data received from the far end.
Before the ultrasonic transducer transmits a voice signal, the ultrasonic carrier generation unit generates a local carrier signal; the carrier and the voice signal pass together through the ultrasonic modulation unit, which modulates the voice signal into the ultrasonic band, and the result is then sent to each user through the ultrasonic transducer.
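A minimal sketch of the carrier generation and modulation step, assuming simple amplitude modulation onto a 40 kHz carrier; the patent does not fix the carrier frequency or the modulation scheme, so both are illustrative assumptions.

```python
import numpy as np

def modulate_to_ultrasound(voice, fs, carrier_hz=40e3, mod_index=0.8):
    """Amplitude-modulate a voice signal onto an ultrasonic carrier.

    voice: mono samples scaled to [-1, 1]; fs must exceed 2 * carrier_hz.
    Returns the drive signal handed to the ultrasonic transducer (array).
    """
    t = np.arange(len(voice)) / fs
    carrier = np.sin(2 * np.pi * carrier_hz * t)   # locally generated carrier
    envelope = 1.0 + mod_index * voice             # offset keeps the envelope positive
    return envelope * carrier

# Example: a 1 kHz test tone sampled at 192 kHz
fs = 192_000
tone = 0.5 * np.sin(2 * np.pi * 1e3 * np.arange(fs) / fs)
tx = modulate_to_ultrasound(tone, fs)
```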
The ultrasonic signal either self-demodulates directly through its nonlinear interaction with the air, yielding at the listener's ear a sound signal whose frequency falls within the audible range; or the ultrasonic transducer simultaneously transmits an ultrasonic carrier reference signal, and the original voice signal is recovered by coherent superposition at the ear. Further, the carrier reference signal may be transmitted by the ultrasonic transducer of the same base station or by the ultrasonic transducers of other base stations, depending on the operating mode of the base station.
The voice subsystem and the video subsystem are connected to the DU through network interfaces, and the DU must support both non-terminal users and terminal users, i.e. it must effectively manage and transmit, in both directions, data from the voice and video subsystems as well as data from the AAU.
Mobile communication for non-terminal users can be provided either by a single base station working independently or by multiple base stations in cooperation. In single-station operation, each base station serves the users within its own coverage through its local voice and video subsystems; in multi-station cooperation, several adjacent or nearby base stations jointly use all of their voice and video subsystems to serve the users in the coverage area.
Multi-base-station cooperation may include:
1) the cameras of several base stations jointly collect user video within the coverage area; a high-precision map of the area is built through map construction and mapping; the user images with the best quality are then selected for subsequent recognition, or the video data collected by the cameras are combined, compressed and transmitted as uplink video data;
2) the microphone arrays of several base stations together form a larger distributed microphone array; the system can adjust the beamforming coefficient of every microphone to obtain a narrower beam and thus improve the accuracy of user voice acquisition;
3) the ultrasonic transducer arrays of several base stations can transmit ultrasonic modulated signals simultaneously with adjusted phase differences, reducing the lobe width and obtaining a narrower synthesized beam; alternatively, different base stations transmit the ultrasonic modulated signal and the ultrasonic carrier reference signal separately, and coherent demodulation at the user's ear increases the signal strength;
4) the base stations perform cooperative computation and data exchange through the data exchange paths between them.
Compared with existing base stations and technical solutions, the beneficial effects of the invention are as follows:
the invention provides an intelligent base station system supporting a user without a terminal and a communication method, which can enable the user to complete network access and data transmission under the condition of no terminal, thereby realizing some functions the same as those of the user with the terminal.
For the radio access network of 5G and future mobile communication systems, the invention proposes a new terminal-free application scenario and, for this scenario, an intelligent base station system supporting non-terminal users based on Mobile Edge Computing (MEC), together with a corresponding system scheme. As future mobile communication develops and is deployed at scale, base station density will keep increasing, laying the foundation for non-terminal communication; with the maturing of far-field speech extraction, directional ultrasonic transmission and array techniques, and the realization of machine learning algorithms such as speech recognition and face recognition, direct communication through voice and images within a range of less than about 10 m gradually becomes feasible. The intelligent base station system for non-terminal users comprises voice, video and communication subsystems; being based on mobile edge computing, it greatly reduces the load on the core network, offers good real-time performance and high service quality, can be deployed outdoors or indoors as required, is compatible with existing base station schemes and easily extensible, and thus provides a useful reference for future intelligent base station systems and terminal-free communication modes.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the specific embodiments. The drawings are only for purposes of illustrating the particular embodiments and are not to be construed as limiting the invention. In the drawings:
fig. 1 is a schematic diagram of the components and functions of the intelligent base station of the invention, wherein: 1 - base station tower body, 2 - antenna radio frequency unit (AAU), 3 - ultrasonic transducer, 4 - microphone array, 5 - projection unit, 6 - camera (group), 7 - data connections (network and optical fiber) between the base station tower and the cabinet, 8 - accompanying cabinet (housing the DU, the voice and video signal processing units, etc.);
FIG. 2 is an organizational chart of subsystems of the intelligent base station of the present invention;
fig. 3 is a schematic diagram of two composition modes of the ultrasonic transducer of the speech subsystem of the intelligent base station of the present invention, wherein: 201-mechanical adjustment means (e.g. motor), 202-ultrasonic directional transducer, 203-ultrasonic transducer array;
FIG. 4 is a schematic diagram of the architecture and processing flow of the voice subsystem of the intelligent base station according to the present invention;
FIG. 5 is a schematic diagram of the architecture and processing flow of the video subsystem of the smart base station according to the present invention;
FIG. 6 is a schematic diagram of an integration scheme of subsystems of an intelligent base station according to the present invention;
FIG. 7 is a schematic diagram of a multi-base station cooperative working mode according to the present invention;
fig. 8 is a schematic diagram of the mobile communication process for a non-terminal user.
Detailed Description
The first embodiment is as follows:
specifically, according to a first aspect of the present invention, the present invention provides an intelligent base station system supporting a non-terminal user, comprising:
1) a voice subsystem, comprising:
a. a microphone array for voice signal reception;
b. an ultrasonic transducer for ultrasound-borne audio transmission;
c. an ultrasonic carrier generation and modulation unit;
d. a voice signal processing unit covering baseband signal processing, speech recognition, natural language processing, speech synthesis, etc.;
2) a video subsystem, comprising:
a. one camera or a group of cameras for video data acquisition;
b. a projection unit for video interaction;
c. a video signal processing unit covering baseband signal processing, video compression, image recognition, etc.;
3) a communication subsystem, comprising:
a. an antenna radio frequency unit (active antenna unit, AAU);
b. a distributed unit (DU) of the baseband processing unit (BBU);
c. a group of network interfaces for connecting the DU to the other subsystems;
4) a base station tower and an accompanying cabinet, wherein:
a. the microphone array, the ultrasonic transducer, the camera, the projection unit and the antenna radio frequency unit (AAU) are mounted on the base station tower body;
b. the voice signal processing unit, the video signal processing unit and the DU are placed in the accompanying cabinet;
The AAU on the tower body is connected to the DU in the accompanying cabinet, the microphone array and the ultrasonic transducer are connected to the voice signal processing unit in the cabinet, and the camera and the projection unit are connected to the video signal processing unit in the cabinet; the DU is connected to both the voice subsystem and the video subsystem and is in turn connected to the core network.
Furthermore, the voice signal processing unit and the video signal processing unit are embedded system boards; they may share a single embedded system board or each use their own. Each embedded system board comprises a central processing unit, storage and input/output interfaces, where the storage includes memory and external storage and the input/output interfaces are a group of network interfaces. If the voice and video subsystems share one board, data can be exchanged through shared memory; if each uses its own board, data can be exchanged through interconnected network ports.
Furthermore, the ultrasonic transducer and the microphone array of the voice subsystem are connected to the voice signal processing unit through network ports, the camera and the projection unit of the video subsystem are connected to the video signal processing unit through network ports, the voice and video signal processing units are connected to the DU through network ports, the AAU is connected to the DU through optical fiber, and the DU is connected to the core network through optical fiber.
Further, the ultrasonic transducer may be an ultrasonic directional transducer group, i.e. a group of individual ultrasonic directional transducers, or an ultrasonic transducer array, i.e. an array composed of ultrasonic transducer elements. In the former, each transducer needs a corresponding mechanical adjustment device (such as an electric motor); the latter needs no mechanical adjustment, since each element of the array can be electrically controlled to perform signal weighting and beam synthesis.
Furthermore, the ultrasonic carrier generation and modulation unit of the voice subsystem may be integrated into the voice signal processing unit and placed in the accompanying cabinet, or be integrated with the ultrasonic transducer to form an ultrasonic loudspeaker placed on the base station tower body.
Furthermore, each intelligent base station can deploy its own voice and video signal processing units, or several intelligent base stations can share one group of voice and video signal processing units for joint processing of data.
Furthermore, the voice and video subsystems of the system can be integrated with the communication subsystem of a base station to form an intelligent base station for outdoor communication scenarios, or integrated with an indoor micro base station to form an indoor intelligent micro base station for indoor communication scenarios.
Specifically, according to a second aspect of the invention, a communication method supporting non-terminal users based on the intelligent base station is further provided. The method completes access and data transmission for the non-terminal user through cooperation of the voice, video and communication subsystems, and includes: a camera (group) in the video subsystem captures the user's video data in real time and feeds it to the video signal processing unit for face detection and recognition, determining the user ID and real-time position; the projection unit projects in front of the user, according to the user's position, for video interaction with the user; the microphone array of the voice subsystem forms a synthesized beam pointing at the user for voice input; the ultrasonic transducer of the voice subsystem forms an ultrasonic beam pointing at the user for voice broadcasting and voice information output; and the voice and video subsystems are connected to the local DU through network interfaces, completing the exchange of local audio and video data with other user data or control data.
Furthermore, one or more cameras in the video subsystem form a camera group; according to the coverage area and usage requirements of the intelligent base station, video monitoring of the entire coverage area can be completed either by a single base station working independently or by multiple base stations in cooperation.
Further, the video signal processing unit must support processing of the video signals collected by one or more cameras, including face detection, recognition and positioning in the captured video so as to determine the ID and position of the non-terminal user. For scenarios with multiple cameras on a single base station or with multiple cooperating base stations, map construction and mapping should also be supported.
Further, the projection unit projects images or video directly into the air in front of the user, without any additional medium, according to the position of the non-terminal user; the projection may use 3D holographic projection technology.
Furthermore, the microphone array of the voice subsystem feeds the collected signals into the voice signal processing unit, which performs A/D conversion and band-pass filtering on each channel and then, based on the phased-array principle, computes beamforming weighting coefficients from the users' positions and its own position, generating one or more narrow synthesized beams pointed at the users to collect each user's voice signal directionally.
Further, the voice signal processing unit feeds the collected voice signal of each user into the speech recognition module; according to the recognition result, it either calls the natural language processing module and the speech synthesis module to generate synthesized speech for interaction with the user, or sends the voice data directly to the DU through the network interface for transmission to the far end.
Further, the ultrasonic transducer of the voice subsystem may be an ultrasonic directional transducer group, i.e. a group of individual ultrasonic directional transducers, or an ultrasonic transducer array, i.e. an array composed of ultrasonic transducer elements. In the former, each transducer is equipped with a mechanical adjustment device (such as a motor); a transducer is selected according to the user's position, the base station's own position and an optimization rule, the adjustment device rotates it to point at the user for audio output, and each user is allocated one transducer for communication. The latter needs no mechanical adjustment: based on the phased-array principle, the beamforming weighting coefficients of the element array are computed from the users' positions and the array's own position, so that one or more narrow synthesized beams are generated toward each user, delivering either the locally synthesized speech from the voice signal processing unit or the voice data received from the far end.
Further, before the ultrasonic transducer transmits a voice signal, the ultrasonic carrier generation unit generates a local carrier signal; the carrier and the voice signal pass together through the ultrasonic modulation unit, which modulates the voice signal into the ultrasonic band, and the result is then sent to each user through the ultrasonic transducer.
Further, the ultrasonic signal either self-demodulates directly through its nonlinear interaction with the air, yielding at the listener's ear a sound signal whose frequency falls within the audible range; or the ultrasonic transducer simultaneously transmits an ultrasonic carrier reference signal, and the original voice signal is recovered by coherent superposition at the ear. The carrier reference signal may be transmitted by the ultrasonic transducer of the same base station or by the ultrasonic transducers of other base stations, depending on the operating mode of the base station.
Further, the voice subsystem and the video subsystem are connected to the DU through network interfaces, and the DU must support both non-terminal users and terminal users, i.e. it must effectively manage and transmit, in both directions, data from the voice and video subsystems as well as data from the AAU.
Furthermore, mobile communication for non-terminal users can be provided either by a single base station working independently or by multiple base stations in cooperation. In single-station operation, each base station serves the users within its own coverage through its local voice and video subsystems; in multi-station cooperation, several adjacent or nearby base stations jointly use all of their voice and video subsystems to serve the users in the coverage area, improving user service quality, the system's energy-efficiency ratio, and so on.
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The invention identifies a new application scenario in future mobile communication, that of the non-terminal user, and, for this scenario, proposes an MEC intelligent base station carrying voice, video and communication subsystems together with a corresponding system scheme, so as to support non-terminal users in future mobile communication.
As shown in fig. 1 to 6, an MEC intelligent base station system supporting a non-terminal user according to the present invention includes:
1) a voice subsystem, comprising: a microphone array for receiving voice signals; an ultrasonic transducer for transmitting voice signals; an ultrasonic carrier generation and modulation unit; and a voice signal processing unit covering baseband signal processing, speech recognition, natural language processing, speech synthesis, etc.
2) a video subsystem, comprising: one camera or a group of cameras for video data acquisition; a projection unit; and a video signal processing unit covering baseband signal processing, video compression, image recognition, etc.
3) a communication subsystem, comprising: an antenna radio frequency unit (AAU); a distributed unit (DU) for baseband signal processing; and a group of network interfaces for connecting the DU to the other subsystems.
4) a base station tower and an accompanying cabinet.
Wherein:
1) As shown in fig. 3, the ultrasonic transducer in the voice subsystem may be an ultrasonic directional transducer group, i.e. a group of individual ultrasonic directional transducers (fig. 3a), or an ultrasonic transducer array, i.e. an array composed of ultrasonic transducer elements (fig. 3b). The form and parameters of the transducer used, such as array shape, element spacing and element count, are chosen according to the system requirements and performance targets. If an ultrasonic directional transducer group is used, each transducer should also be provided with a corresponding mechanical adjustment device; specifically, a servo motor may be used.
2) The ultrasonic carrier generation and modulation unit of the voice subsystem can be integrated into the voice signal processing unit, or integrated with the ultrasonic transducer. The voice signal processing unit of the voice subsystem and the video signal processing unit of the video subsystem can share one embedded system board; both are essentially digital signal processing units, and on a shared board they exchange data through shared memory. Alternatively, the voice and video subsystems each use their own system board and exchange data through network interfaces (a minimal sketch of such a network-port exchange is given after this list). Specifically, the embedded system board comprises a central processing unit, storage and input/output interfaces; the central processing unit can be an Arm-architecture CPU, the storage includes memory and external storage, and the input/output interfaces include a group of RJ45 network ports for data exchange. The ultrasonic transducer and the microphone array of the voice subsystem are connected to the voice signal processing unit through these network ports, and the camera and the projection unit of the video subsystem are connected to the video signal processing unit through network ports.
3) The DU connects to the voice and video subsystems through a group of network interfaces. Specifically, as shown in fig. 2 and fig. 6, the embedded system board supports at least two network interfaces for exchanging data with the communication subsystem, used respectively for the uplink and downlink of voice data and for the uplink and downlink of video data.
4) The microphone array, the ultrasonic transducer, the camera, the projection unit, the antenna radio frequency unit and so on are placed on the base station tower body; the voice signal processing unit, the video signal processing unit and the DU are arranged in the matching cabinet. The AAU on the tower body is connected to the DU in the cabinet through optical fiber, the other modules are connected to the signal processing units in the cabinet through network interfaces, and the DU connects the voice subsystem, the video subsystem and so on through a group of network interfaces and accesses the core network through optical fiber.
5) The voice and video subsystems can be integrated with the communication subsystem of a base station to form an intelligent base station for outdoor communication scenarios, or integrated with an indoor micro base station to form an indoor intelligent micro base station for indoor communication scenarios.
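As an illustration of item 1) above, the following minimal Python sketch shows how the target angles for the mechanical adjustment device (servo motor) of a directional transducer could be computed from a user position. The coordinate frame, function names and example numbers are assumptions made only for illustration and are not part of the patent.

    import math

    def pointing_angles(user_xyz, transducer_xyz):
        """Azimuth/elevation (degrees) for steering a directional ultrasonic
        transducer toward a user; both positions are (x, y, z) in the same
        local base-station frame, in meters."""
        dx = user_xyz[0] - transducer_xyz[0]
        dy = user_xyz[1] - transducer_xyz[1]
        dz = user_xyz[2] - transducer_xyz[2]
        azimuth = math.degrees(math.atan2(dy, dx))
        elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
        return azimuth, elevation

    # Example: user at about 8 m range, 1.6 m ear height; transducer mounted at 6 m on the tower.
    az, el = pointing_angles((8.0, 2.0, 1.6), (0.0, 0.0, 6.0))
    # az and el would then be sent to the servo motor as target angles.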
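As an illustration of item 2) above, the following minimal sketch shows shared-memory data exchange between the video and voice signal processing running on one embedded system board, using Python's standard multiprocessing.shared_memory. The record layout, block name and slot count are illustrative assumptions, not part of the patent.

    import struct
    from multiprocessing import shared_memory

    # One fixed-size record per user slot: user_id (uint32), x, y, z (float32), little-endian.
    RECORD = struct.Struct("<Ifff")
    SHM_NAME = "user_positions"   # illustrative name

    def create_position_block(n_users=8):
        """Run once at startup (e.g. by the video process) to create the shared block."""
        return shared_memory.SharedMemory(name=SHM_NAME, create=True,
                                          size=RECORD.size * n_users)

    def write_user(shm, slot, user_id, xyz):
        """Video side: publish the latest ID and position for one user slot."""
        RECORD.pack_into(shm.buf, slot * RECORD.size, user_id, *xyz)

    def read_user(slot):
        """Voice side: attach to the block and read one user slot."""
        shm = shared_memory.SharedMemory(name=SHM_NAME)
        try:
            return RECORD.unpack_from(shm.buf, slot * RECORD.size)
        finally:
            shm.close()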
As shown in figs. 4, 5 and 6, the present invention further provides a system solution supporting non-terminal users based on the intelligent base station. The solution completes the access and data transmission of non-terminal users based on the cooperation of the voice, video and communication subsystems, including: a camera (group) in the video subsystem captures video data of the non-terminal user in real time and inputs it into the video signal processing unit for face detection and recognition, so as to determine the user ID and real-time position; the projection unit projects in front of the user, according to the user's position, for video interaction with the user; the microphone array of the voice subsystem generates a synthesized beam pointing to the user for the user's voice input; the ultrasonic transducer of the voice subsystem generates ultrasonic beams directed to the user for voice broadcasting and voice information output; and the voice and video subsystems are connected to the local DU through network interfaces, completing the exchange of local audio/video data and of other user data or control data. The specific system processing flow is as follows:
1) The video subsystem contains one or more cameras forming a camera group; according to the coverage range and usage requirements of the intelligent base station, real-time video monitoring of the whole coverage area can be completed jointly, either by a single base station working independently or by multiple base stations cooperating. The video signal processing unit must support processing of the video signals collected by one or more cameras, including: preprocessing the video data acquired by the cameras; then, according to the system state and control information, either compressing the acquired uplink video data and transmitting it through the network interface to the DU and on to the remote end, or performing face detection, recognition and positioning so as to determine the ID and position of the non-terminal user and passing this user information to the voice subsystem through the shared memory (an illustrative detection sketch is given after item 5 below). If a single base station with multiple cameras, or multi-base-station cooperation, is used, map construction and mapping are supported so as to avoid repeated detection of the same target by multiple cameras.
2) After the video signal processing unit obtains the position information of each user, it computes the position to be projected and, according to the system state and control information, selects the local interactive video or the downlink video data as the projected video data. The projection unit then projects the video data directly into the air in front of the user, according to the projection position and the video data output by the video signal processing unit, without any additional medium (an illustrative placement sketch is given after item 5 below). Further, the projection may employ 3D holographic projection technology.
3) The microphone array of the voice subsystem feeds the acquired signals into the voice signal processing unit. The voice signal processing unit performs A/D conversion and band-pass filtering on each user's voice signal, filtering out noise outside the human voice spectrum, and then, based on the phased-array principle, computes the beamforming weighting coefficients from the users' positions and the microphone array's position, so as to generate one or more narrow synthesized beams directed to each user for directional acquisition of the users' voice signals (an illustrative beamforming sketch is given after item 5 below). Speech recognition is then performed: if the user's speech is a system control keyword, the natural language processing module and the speech synthesis module are called to generate synthesized speech and interact with the user; if it is ordinary voice data, the voice data are transmitted directly to the DU through the network interface and on to the remote end.
4) The voice signal processing unit selects, according to the system state and control information, either the output of the speech synthesis module or the downlink voice data as the voice data to be delivered to the users; the ultrasonic carrier generation unit then generates a local carrier signal, the ultrasonic modulation unit modulates the voice data onto the ultrasonic frequency band, and the ultrasonic transducer sends it to each user (an illustrative modulation sketch is given after item 5 below). If an ultrasonic directional transducer group is used, i.e. a transducer group consisting of single ultrasonic transducers, each transducer corresponds to one user in a one-to-one communication mode: according to the user's position, the corresponding transducer is rotated by the mechanical adjustment device (such as a servo motor) to point at the user. This mode has higher energy efficiency and suits scenarios with few users and limited power. If an ultrasonic transducer array is used, i.e. an array composed of ultrasonic transducer elements, then, based on the phased-array principle, the beamforming weighting coefficients are computed from the user positions and the transducer array position, so as to generate one or more narrow synthesized beams directed to each user, and the ultrasonic modulated signal is sent to each user through the array; this is a many-to-many communication mode in which the beams are steered electrically, without any mechanical adjustment device, and it suits large-scale user scenarios. Further, the ultrasonic signal can self-demodulate directly through its nonlinear interaction with air, yielding a sound signal in the audible band when it reaches the human ear; alternatively, the ultrasonic transducer simultaneously sends an ultrasonic carrier reference signal, and the original voice signal is demodulated after coherent superposition at the human ear, which increases the strength of the ultrasonic signal reaching the ear.
5) The voice subsystem and the video subsystem exchange data through shared memory or a network interface and are connected to the DU through network interfaces. The DU must support both non-terminal users and terminal users, that is, it must effectively manage and bidirectionally transmit the data from the voice and video subsystems as well as the data from the AAU. The DU accesses the core network through an optical fiber interface.
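As an illustration of flow item 1) above, the following minimal sketch performs face detection and coarse bearing estimation from one camera frame. OpenCV's Haar cascade detector and the field-of-view-based bearing formula are assumptions for illustration; the patent does not prescribe a particular detection or recognition algorithm.

    import cv2  # OpenCV; the specific detector is an assumption

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def detect_faces(frame, hfov_deg=90.0):
        """Return (bounding_box, azimuth_deg) for each detected face; the azimuth
        is a coarse bearing from the camera axis, derived from the horizontal
        pixel offset and the camera's horizontal field of view."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        h, w = gray.shape
        results = []
        for (x, y, bw, bh) in boxes:
            cx = x + bw / 2.0
            azimuth = (cx / w - 0.5) * hfov_deg
            results.append(((int(x), int(y), int(bw), int(bh)), azimuth))
        return results

    # Face recognition against enrolled users (to obtain the user ID) would follow,
    # e.g. with a face-embedding model; the patent leaves the method open.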
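As an illustration of flow item 2) above, a minimal sketch of choosing the in-air projection point in front of a user. The fixed stand-off distance and the facing-direction input (e.g. estimated from head pose by the video subsystem) are assumptions for illustration only.

    import numpy as np

    def projection_point(user_pos, facing_dir, standoff=1.0):
        """Return a point `standoff` meters in front of the user, at the user's
        current height, as the target position for the projection unit."""
        p = np.asarray(user_pos, dtype=float)
        d = np.asarray(facing_dir, dtype=float)
        d = d / np.linalg.norm(d)
        target = p + standoff * d
        target[2] = p[2]          # keep the projection at the user's height
        return target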
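As an illustration of flow item 3) above, a minimal sketch of the band-pass filtering and of narrowband delay-and-sum beamforming weights computed from the user position and the microphone positions. The sampling rate, speech band and single reference frequency are simplifying assumptions; a practical system would use broadband filter-and-sum or adaptive beamforming.

    import numpy as np
    from scipy.signal import butter, sosfilt

    FS = 16000     # sampling rate in Hz (illustrative)
    C = 343.0      # speed of sound in m/s

    def voiceband_filter(x, low=300.0, high=3400.0):
        """Band-pass the digitized microphone signal to the speech band."""
        sos = butter(4, [low, high], btype="bandpass", fs=FS, output="sos")
        return sosfilt(sos, x)

    def steering_weights(mic_xyz, user_xyz, f_ref=1000.0):
        """Delay-and-sum weights at reference frequency f_ref that phase-align
        the array toward the user. mic_xyz: (M, 3) positions; user_xyz: (3,)."""
        mic_xyz = np.asarray(mic_xyz, dtype=float)
        user_xyz = np.asarray(user_xyz, dtype=float)
        dists = np.linalg.norm(mic_xyz - user_xyz, axis=1)
        delays = (dists - dists.min()) / C            # relative propagation delays
        w = np.exp(-1j * 2 * np.pi * f_ref * delays)  # compensating phase per channel
        return w / len(w)

    # The narrowband beam output at f_ref is w.conj() @ X, where X holds the
    # per-channel complex spectra at f_ref.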
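As an illustration of flow item 4) above, a minimal sketch of modulating the voice data onto an ultrasonic carrier as simple amplitude modulation. The 40 kHz carrier, the output sampling rate and the modulation scheme are assumptions, since the patent fixes none of them; the resulting drive signal would then be beamformed or fed to the mechanically steered transducer.

    import numpy as np

    FS_US = 192000       # output sampling rate for the ultrasonic chain, Hz (illustrative)
    F_CARRIER = 40000.0  # ultrasonic carrier frequency, Hz (illustrative)

    def am_modulate(voice, fs_voice, mod_index=0.8):
        """Resample the voice signal to FS_US and amplitude-modulate it onto the carrier."""
        n_out = int(len(voice) * FS_US / fs_voice)
        t_in = np.arange(len(voice)) / fs_voice
        t_out = np.arange(n_out) / FS_US
        v = np.interp(t_out, t_in, voice / (np.max(np.abs(voice)) + 1e-12))
        carrier = np.sin(2 * np.pi * F_CARRIER * t_out)
        return (1.0 + mod_index * v) * carrier   # drive signal for the ultrasonic transducer(s)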
As shown in fig. 7, the mobile communication of non-terminal users can be completed with single base stations working independently or with multiple base stations cooperating. In independent operation, each base station serves the users within its own coverage area through its local voice and video subsystems; in multi-base-station cooperation, several adjacent or nearby base stations jointly use all of their voice and video subsystems to serve the users in the coverage area, improving user service quality, system energy efficiency and so on. The specific cooperation modes may include:
1) The cameras of the base stations jointly acquire user video information within the coverage area and construct a high-precision map of the area through map construction and mapping; then either the user images with the best quality are selected for subsequent recognition, or the video data acquired by the cameras are combined and compressed for transmission as uplink video data, and so on.
2) The microphone arrays of the base stations jointly form a larger distributed microphone array, and the whole system can adjust the beam forming coefficient of each microphone to obtain a beam with a narrower width, so that the accuracy of user voice data acquisition is improved.
3) The ultrasonic transducer arrays of the base stations can transmit the ultrasonic modulated signals simultaneously with adjusted phase differences, reducing the lobe width and obtaining a narrower synthesized beam (an illustrative phase-alignment sketch is given after item 4 below); alternatively, the base stations separately transmit the ultrasonic modulated signal and the ultrasonic carrier reference signal, which are coherently demodulated at the user's ear, increasing the signal strength.
4) Under the working mode of multi-base station cooperation, each base station carries out cooperative calculation and data exchange through a data exchange path between the base stations.
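As an illustration of cooperation mode 3) above, a minimal sketch of choosing per-base-station carrier phase offsets so that the ultrasonic carriers arrive in phase at the user's position. Free-space propagation, a shared time reference between base stations, and the 40 kHz carrier are assumptions made for illustration.

    import numpy as np

    C_SOUND = 343.0  # speed of sound, m/s

    def carrier_phase_offsets(station_xyz, user_xyz, f_carrier=40000.0):
        """Phase offsets (radians) per base station that make all ultrasonic
        carriers add coherently at the user position."""
        stations = np.asarray(station_xyz, dtype=float)   # (N, 3) transducer positions
        user = np.asarray(user_xyz, dtype=float)          # (3,) user position
        delays = np.linalg.norm(stations - user, axis=1) / C_SOUND
        # Advance the carrier of farther stations to compensate their longer delay.
        return (2 * np.pi * f_carrier * (delays - delays.min())) % (2 * np.pi)

    # Base station i then transmits sin(2*pi*f_carrier*t + offsets[i]), so the
    # carriers superpose in phase at the user, narrowing the lobe / raising the level.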
As shown in fig. 8, the mobile communication procedure supporting the non-terminal user according to the present invention includes:
1) When a user accesses the system, the intelligent base station first detects whether the user carries a terminal. If the user is a terminal user, the terminal access flow is started; if the user is a non-terminal user, the user is captured by a camera and face detection and recognition are performed to identify the user;
2) If the user is a terminal user, then after the terminal access succeeds the system asks whether the user activates the non-terminal communication mode; if it is activated, the user's current position is determined from the terminal position, and if not, the flow ends. If the user is a non-terminal user, then after the user's identity is determined the system asks whether the user activates the non-terminal communication mode; if it is activated, the user's current position is determined from the position captured by the camera, and if not, the flow ends.
3) The system tracks the user in real time in the video data collected by the camera (group) and updates the user's position and other information in real time; according to the real-time position information, the system computes the beamforming weighting coefficients of the microphone array and of the ultrasonic transducer array (or the direction in which the ultrasonic directional transducer must point), so as to generate narrow synthesized beams pointing at the user for real-time transmission and reception of voice signals; and the projection unit projects the video data in front of the user according to the real-time position information, completing real-time video interaction with the user.
The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto; any changes or substitutions that can readily be conceived by those skilled in the art within the technical scope disclosed herein fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
With the gradual increase of base station density and the further deployment of indoor micro base stations, it is gradually becoming possible for users to complete mobile communication directly through voice and video signals, without a mobile terminal such as a mobile phone.
In this terminal-free mobile communication scenario, the user interacts with the base station directly through voice and video, without a mobile terminal such as a mobile phone. The base station carries the corresponding voice and video systems, integrates them with modules such as the antenna, the radio-frequency front end and the baseband processing unit, and jointly completes functions such as user access and mobile communication.
For this application scenario of non-terminal users in future mobile communication, and based on mobile edge computing and audio/video signal acquisition and processing technologies, the invention provides an intelligent base station system carrying voice, video and communication subsystems and a corresponding communication method, so as to support non-terminal users in future mobile communication.

Claims (14)

1. An intelligent base station system supporting a non-terminal user, comprising:
1) a voice subsystem, comprising:
a. a microphone array for receiving voice signals;
b. an ultrasonic transducer for transmitting ultrasonic audio signals;
c. an ultrasonic carrier generation and modulation unit;
d. a voice signal processing unit comprising a baseband signal processing module, a speech recognition module, a natural language processing module and a speech synthesis module;
2) a video subsystem comprising:
a. one camera or a group of cameras for video data acquisition;
b. a projection unit for video interaction;
c. a video signal processing unit comprising a baseband signal processing module, a video compression module and an image recognition module;
3) a communication subsystem, comprising:
a. an antenna radio frequency unit (AAU);
b. a distributed unit (DU) of a baseband processing unit (BBU);
c. a set of network interfaces for connecting the DU to the other subsystems;
4) a base station tower and a matching cabinet, wherein:
a. the microphone array, the ultrasonic transducer, the camera, the projection unit and the antenna radio frequency unit (AAU) are arranged on the base station tower body;
b. the voice signal processing unit, the video signal processing unit and the DU are arranged in the matching cabinet;
the AAU on the base station tower body is connected to the DU in the matching cabinet, the microphone array and the ultrasonic transducer are connected to the voice signal processing unit in the matching cabinet, the camera and the projection unit are connected to the video signal processing unit in the matching cabinet, and the DU is connected simultaneously with the voice subsystem and the video subsystem and is connected to the core network;
the ultrasonic transducers in the voice subsystem are an ultrasonic directional transducer group, i.e. a transducer group consisting of single ultrasonic transducers, or an ultrasonic transducer array, i.e. an array consisting of a group of ultrasonic transducer elements; further, each ultrasonic transducer in the ultrasonic directional transducer group must be provided with a corresponding mechanical adjustment device, whereas the ultrasonic transducer array requires no mechanical adjustment and is electrically controlled, each transducer element in the array completing signal weighting and beam synthesis; further, the ultrasonic carrier generation and modulation unit of the voice subsystem is either integrated in the voice signal processing unit and placed in the matching cabinet, or integrated with the ultrasonic transducer to form an ultrasonic loudspeaker and placed on the base station tower body.
2. The intelligent base station system supporting the non-terminal users according to claim 1, wherein:
the voice signal processing unit and the video signal processing unit are embedded system boards, and are realized by sharing one embedded system board or respectively realized by using one embedded system board; further, if the voice subsystem and the video subsystem share one system board, data exchange is carried out in a mode of sharing a memory; if the voice subsystem and the video subsystem respectively use one system board, data exchange is carried out through interconnection of network ports; the ultrasonic transducer and the microphone array of the voice subsystem are connected to the voice signal processing unit through a net port, the camera and the projection unit of the video subsystem are connected to the video signal processing unit through the net port, the voice and video signal processing unit is connected to the DU through the net port, the AAU is connected to the DU through an optical fiber, and the DU is connected to the core network through the optical fiber.
3. The intelligent base station system supporting a non-terminal user according to claim 1 or 2, characterized in that: each intelligent base station deploys its own voice signal processing unit and video signal processing unit, or a plurality of intelligent base stations share one group of voice and video signal processing units for joint processing of the data.
4. The intelligent base station system supporting the non-terminal users according to claim 1, wherein: a voice subsystem and a video subsystem in the system are integrated with a communication subsystem of a base station to jointly form an intelligent base station for an outdoor communication scene; or a voice subsystem and a video subsystem in the system are integrated with the indoor micro base station to jointly form the indoor intelligent micro base station for an indoor communication scene.
5. A communication method of an intelligent base station system supporting a non-terminal user, which completes the access and data transmission of the non-terminal user based on the cooperation of the voice, video and communication subsystems of claim 1, characterized in that the method specifically comprises the following steps:
step one, a camera group in a video subsystem captures video data of a user without a terminal in real time and inputs the video data to a video signal processing unit for face detection and recognition so as to determine a user ID and a real-time position;
step two, the projection unit projects in front of the user according to the user's position, for video interaction with the user;
step three, a microphone array of the voice subsystem generates a synthesized beam pointing to the user for voice input of the user;
step four, the ultrasonic transducer of the voice subsystem generates ultrasonic beams pointing to the user for voice broadcasting and voice information output;
step five, the voice subsystem and the video subsystem are connected to the local DU through network interfaces, and the local audio and video data are exchanged with other user data or control data through the local DU.
6. The communication method of the intelligent base station system supporting the non-terminal user according to claim 5, wherein: one or more cameras in the video subsystem form a camera group, and video monitoring of the whole coverage area is completed jointly, with single base stations working independently or multiple base stations cooperating, according to the coverage range and usage requirements of the intelligent base station;
the video signal processing unit supports processing of the video signals collected by one or more cameras, including: performing face detection, recognition and positioning in the acquired video so as to determine the ID and position of the non-terminal user; map construction and mapping are also supported for usage scenarios with a single base station and multiple cameras or with multi-base-station cooperation;
the projection unit projects images or video directly into the air in front of the user, without any additional medium, according to the position of the non-terminal user; further, the projection employs 3D holographic projection.
7. The communication method of the intelligent base station system supporting the non-terminal user according to claim 5, wherein: the microphone array of the voice subsystem feeds the acquired signals into the voice signal processing unit, the voice signal processing unit performs A/D conversion and band-pass filtering on each user's signal, and then, based on the phased-array principle, computes the beamforming weighting coefficients from the users' positions and the position of the microphone array itself, so as to generate one or more narrow synthesized beams directed to the users for directional acquisition of the users' voice signals.
8. The communication method of the intelligent base station system supporting the non-terminal user according to claim 5, wherein: the voice signal processing unit feeds the obtained voice signals of the users into the speech recognition module for recognition, and, according to the recognition result, either calls the natural language processing module and the speech synthesis module to generate synthesized speech for interaction with the user, or transmits the voice data directly to the DU through the network interface for transmission to the remote end.
9. The communication method of the intelligent base station system supporting the non-terminal user according to claim 5, wherein: the ultrasonic transducer of the voice subsystem is an ultrasonic directional transducer group, i.e. a transducer group consisting of single ultrasonic transducers, or an ultrasonic transducer array, i.e. an array consisting of ultrasonic transducer elements; each ultrasonic transducer in the ultrasonic directional transducer group must be provided with a corresponding mechanical adjustment device, and, according to the user's position, the transducer's own position and an optimization rule, an ultrasonic transducer is selected and rotated by the mechanical adjustment device to point at the user and complete the audio signal output, each user being allocated one ultrasonic transducer for communication; the ultrasonic transducer array requires no mechanical adjustment, and, based on the phased-array principle, the beamforming weighting coefficients of the array formed by the ultrasonic transducer elements are computed from the user's position and the position of the array, so as to generate one or more narrow synthesized beams pointing at each user and deliver to the user the locally synthesized speech from the voice signal processing unit or the voice data transmitted from the remote end.
10. The communication method of the intelligent base station system supporting the non-terminal users according to claim 5 or 9, wherein:
before the ultrasonic transducer sends a voice signal, the method comprises the following steps: the ultrasonic carrier generation unit generates a local carrier signal, the local carrier signal and the voice signal pass through the ultrasonic modulation unit together, the voice signal is modulated to an ultrasonic frequency band, and then the voice signal is sent to each user through the ultrasonic transducer.
11. The communication method of the intelligent base station system supporting the non-terminal users according to claim 5 or 9, wherein: the ultrasonic signal is directly self-demodulated through the nonlinear action with air, and when reaching the human ear, a sound signal with the frequency falling within the frequency spectrum range heard by the human ear is obtained; or the ultrasonic transducer simultaneously sends ultrasonic carrier reference signals, and original voice signals are demodulated after coherent superposition at the human ear; further, the ultrasonic carrier reference signal is transmitted by the ultrasonic transducer of the local base station or transmitted by the ultrasonic transducers of other base stations, depending on the operation mode of the base station.
12. The communication method of the intelligent base station system supporting the non-terminal user according to claim 5, wherein: the voice subsystem and the video subsystem are connected to the DU through network interfaces, and the DU needs to support both non-end users and end users, that is, the DU needs to effectively manage and bi-directionally transmit data from the voice and video subsystems and data from the AAU.
13. The communication method of the intelligent base station system supporting the non-terminal user according to claim 5, wherein: the mobile communication of non-terminal users is completed with single base stations working independently or with multiple base stations cooperating, where in independent operation each base station serves the users within its own coverage area through its local voice and video subsystems, and in multi-base-station cooperation a plurality of adjacent or nearby base stations jointly use all of their voice and video subsystems to serve the users within the coverage area.
14. The communication method of the intelligent base station system supporting the non-terminal users according to claim 5 or 13, wherein:
the multi-base station cooperation comprises the following steps:
1) the cameras of a plurality of base stations jointly acquire user video information within the coverage area, a high-precision map of the area is constructed through map construction and mapping, and then either the user images with the best quality are selected for subsequent recognition, or the video data acquired by the cameras are combined and compressed for transmission as uplink video data, and so on;
2) the microphone arrays of a plurality of base stations jointly form a larger distributed microphone array, and the whole system adjusts the beam forming coefficient of each microphone so as to obtain a beam with narrower width, thereby improving the accuracy of user voice data acquisition;
3) the ultrasonic transducer arrays of the base stations simultaneously transmit ultrasonic modulation signals by adjusting phase differences, so that the lobe width is reduced, and a synthesized beam with narrower width is obtained; or the base stations respectively transmit ultrasonic wave modulation signals and ultrasonic wave carrier reference signals, and coherent demodulation is carried out at the ears of the user, so that the signal intensity is improved;
4) and each base station performs cooperative calculation and data exchange through a data exchange path between the base stations.