TWI622474B

TWI622474B - Robot system and control method thereof

Info

Publication number: TWI622474B
Application number: TW105120645A
Authority: TW
Inventors: 劉鑫; 魏達
Original assignee: 芋頭科技(杭州)有限公司
Priority date: 2015-06-30
Filing date: 2016-06-30
Publication date: 2018-05-01
Also published as: CN106325142A; WO2017000795A1; TW201700238A; HK1231576A1

Abstract

The invention provides a robot system and a control method thereof. The robot system comprises: a microphone array that receives a voice signal from a user; a binocular ranging camera that synchronously captures two channels of image signals; a processing module that processes the voice signal and the image signals to obtain a voice recognition signal and a rotation angle signal; and a motor rotation module that receives the rotation angle signal and rotates smoothly according to it. By using voice, the most natural mode of human-machine dialogue, the robot system frees people's hands in the home environment and makes the robot a high-quality companion.

Description

Robot system and control method thereof

The present invention relates to the field of intelligent robot technology, and in particular to a robot system and a control method thereof.

With the development of artificial intelligence and of computer software and hardware, robot technology has progressed from simple to advanced. First-generation robots are equipped with memory: a person demonstrates the requirements of a task so that the robot memorizes the operating procedure and its key points, and when it receives a playback command it autonomously reproduces the demonstrated actions. Second-generation robots are discretely programmed industrial robots equipped with small computers and sensors; they can sense external signals and "think", and they are more flexible and better able to adapt to environmental changes than first-generation robots. Third-generation robots are intelligent robots. They not only have the sensing functions and simple adaptive ability of second-generation robots, but can also fully recognize the work object and the working environment, and can automatically determine the appropriate work according to instructions given by a person and their own judgment. They are an advanced product of artificial intelligence and a current focus of robot development.

Existing robot systems are mostly used in industrial settings. The control of industrial robots suffers from rigid, scripted behavior, high programming difficulty, and unfriendly operating interfaces. In industrial settings the form of the robot is also limited to applications with strong machine-equipment characteristics, such as robot arms, which makes it difficult for robots to interact well with people. Existing robot systems and control technologies are also difficult to popularize and costly to develop. In addition, they are difficult to apply in the home: they cannot interact with family members in a friendly way, let alone provide them with high-quality, convenient services.

To address the above technical problems, the present application provides a robot system comprising: a microphone array that receives a voice signal from a user; a binocular ranging camera that synchronously captures two channels of image signals; a processing module, connected to the microphone array and to the binocular ranging camera, that receives and processes the voice signal and the image signals to generate a voice recognition signal and a rotation angle signal; and a motor rotation module, connected to the processing module, that receives the rotation angle signal and rotates smoothly according to it.

In a preferred embodiment, the processing module comprises: a DSP acceleration auxiliary processor, connected to the microphone array and to the binocular ranging camera, that receives the voice signal and the image signals, performs voice enhancement processing on the voice signal, and processes the image signals to obtain a ranging signal; and a robot main controller, connected to the DSP acceleration auxiliary processor through a USB interface, that receives the enhanced voice signal and the ranging signal, performs automatic gain processing on the voice signal according to the ranging signal to obtain the voice recognition signal, and localizes the voice signal to obtain the rotation angle signal.

In a preferred embodiment, the robot main controller is a multi-core ARM processor.

In a preferred embodiment, the robot system further comprises a cloud speech recognition engine that receives the voice recognition signal, interacts with the robot main controller, and returns the results of automatic speech recognition and natural-language semantic processing to the user.

In a preferred embodiment, the robot system further comprises a DLP internal projection display module, located inside the head of the robot, that projects the robot's interactive user interface onto the robot's face.

In a preferred embodiment, the robot main controller is connected to the DLP internal projection display module through an I2C serial control protocol and an HDMI interface.

In a preferred embodiment, the microphone array is a multi-channel microphone array.

The present invention also provides a robot control method for the robot system described above, comprising the steps of: S1: receiving the voice signal and the image signals and passing them to the DSP acceleration auxiliary processor; S2: the DSP acceleration auxiliary processor processes the voice signal and the image signals and passes the results to the robot main controller, which obtains the voice recognition signal and the rotation angle signal; S3: the cloud speech recognition engine receives the voice recognition signal and returns the results of automatic speech recognition and natural-language semantic processing to the user; S4: the rotation control module receives the rotation angle signal and rotates smoothly according to it.

In a preferred embodiment, step S1 comprises the steps of: S11: the multi-channel microphone array synchronously captures the voice signals from different directions and passes them to the DSP acceleration auxiliary processor; S12: the binocular ranging camera synchronously captures the two channels of image signals and passes them to the DSP acceleration auxiliary processor.

In a preferred embodiment, step S2 comprises the steps of: S21: the DSP acceleration auxiliary processor performs beamforming on the voice signal and performs voice enhancement processing on the noise and echo mixed into the voice signal; S22: the DSP acceleration auxiliary processor applies hardware parallel acceleration to the binocular ranging vision algorithm and processes the image signals to obtain the ranging signal; S23: the robot main controller performs automatic gain processing on the voice signal according to the ranging signal; S24: the robot main controller receives the voice signal to obtain the voice recognition signal, and localizes the voice signal to obtain the rotation angle signal.

In summary, with the above technical solution, this patent application describes a robot system and a control method thereof with the following beneficial effects. By using voice, the most natural mode of human-machine dialogue, the robot system frees people's hands in the home environment and makes the robot a high-quality companion. The binocular ranging camera automatically identifies household users and measures their distance, and the ranging signal makes automatic gain control of the voice signal intelligent, so that a user speaking at a natural volume from different distances in the home still gets highly accurate speech recognition. Localizing the voice signal and rotating smoothly with the motor rotation module automatically aims the multi-channel microphone array and the binocular ranging camera at the target, which shields noise and raises the signal-to-noise ratio of voice pickup. The DLP internal projection display module solves the problem of an unfriendly interactive user interface and gives the user a more convenient and more futuristic display experience.

Specific embodiments of the present invention are further described below with reference to the accompanying drawings.

Embodiment 1

As shown in FIG. 1, the present invention provides a robot system comprising: a microphone array, which synchronously captures the voice signals of multiple microphones and passes the captured voice signals in real time to the DSP acceleration auxiliary processor 3 through an I2S audio interface; a binocular ranging camera 2, which synchronously captures two channels of image signals and transmits them in real time to the DSP acceleration auxiliary processor 3 through a USB interface; the DSP acceleration auxiliary processor 3, which receives the voice signals and the image signals, accelerates the voice enhancement algorithm and the binocular ranging vision algorithm, performs voice enhancement processing on the voice signals and binocular vision processing on the image signals to obtain a ranging signal, and passes the results to the robot main controller 4; the robot main controller 4, connected to the DSP acceleration auxiliary processor 3 through a USB interface, which receives the processed voice signal and the ranging signal, performs automatic gain control on the voice signal according to the ranging signal, localizes the voice signal to obtain a rotation angle signal, and schedules and controls the robot system; and a cloud speech recognition engine 5, which interacts with the robot main controller 4 and returns accurate, personalized results of automatic speech recognition and natural-language semantic processing to the user, providing a hands-free product experience of far-field automatic speech recognition.

A motor rotation module 6 is seamlessly connected to the rotating structure of the robot system and rotates smoothly according to the rotation angle signal sent by the robot main controller 4. A DLP internal projection display module 7, located inside the robot's head, projects the robot's interactive user interface in high definition onto the robot's face, giving the user the feel of a futuristic mode of dialogue.
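The text does not say how "smooth rotation" is achieved. One possible reading is sketched below, assuming smoothstep easing between the current heading and the commanded angle. The function name, the step count, and the easing curve are all hypothetical choices, not taken from the patent.

```python
def smooth_rotation(start_deg, target_deg, steps):
    """Return intermediate headings from start_deg to target_deg.

    Uses smoothstep easing s(t) = 3t^2 - 2t^3, so the angular velocity is
    zero at both ends of the move. The easing curve is an assumption.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    span = target_deg - start_deg
    headings = []
    for i in range(steps + 1):
        t = i / steps                      # normalized time in [0, 1]
        eased = t * t * (3.0 - 2.0 * t)    # smoothstep
        headings.append(start_deg + span * eased)
    return headings
```

Each returned heading would be sent to the motor driver at a fixed interval; a real drive would also need to respect the motor's velocity and acceleration limits.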

The microphone array is a multi-channel microphone array 1 that can synchronously capture voice signals from different directions. The DSP acceleration auxiliary processor 3 uses a TI (Texas Instruments) high-performance floating-point digital signal processor platform, and the robot main controller 4 uses a multi-core ARM processor. The DSP acceleration auxiliary processor 3 receives the image signals, identifies the target user in them, and obtains the distance between the target user and the robot, that is, the ranging signal.
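The binocular ranging algorithm itself is not disclosed. As a minimal sketch, assuming a rectified stereo pair and the standard pinhole relation Z = f·B/d, the distance to the target user could be computed as follows. The function name and the example parameters (focal length, baseline) are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance in meters to a point seen by a rectified stereo pair.

    focal_px:     focal length in pixels (assumed known from calibration)
    baseline_m:   distance between the two camera centers in meters
    disparity_px: horizontal pixel shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

With a 700-pixel focal length and a 6 cm baseline, for example, a 21-pixel disparity corresponds to a user about 2 m away. Depth resolution degrades with distance, because a fixed one-pixel disparity error covers a larger depth interval when the disparity is small.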

The robot main controller 4 receives the high signal-to-noise-ratio voice signal enhanced by the DSP acceleration auxiliary processor 3, performs voice wake-up, sound-source direction finding, and local automatic speech recognition to obtain the voice recognition signal, and interacts with the cloud speech recognition engine 5, which returns accurate, personalized results of automatic speech recognition and natural-language semantic processing to the user. This provides a hands-free product experience of far-field automatic speech recognition. In addition, the robot main controller 4 receives, and updates in real time, the ranging signal extracted by the DSP acceleration auxiliary processor 3 using the binocular vision depth algorithm, and performs automatic gain control on the voice signal according to the ranging signal. As a result, at the longer distances typical of a home, the user can quickly wake the robot and interact with it through automatic speech recognition at a natural speaking volume from different distances.
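How the ranging signal maps to a gain is not specified. A minimal sketch follows, assuming free-field inverse-distance attenuation (about 6 dB of level lost per doubling of distance) and compensating relative to a reference distance. The function names, the 1 m reference, and the 24 dB cap are hypothetical.

```python
import math


def distance_gain_db(distance_m, ref_distance_m=1.0, max_gain_db=24.0):
    """Gain in dB that compensates inverse-distance level loss.

    No attenuation is applied to users closer than the reference distance,
    and the gain is capped so distant noise is not amplified without bound.
    """
    if distance_m <= 0 or ref_distance_m <= 0:
        raise ValueError("distances must be positive")
    gain_db = 20.0 * math.log10(distance_m / ref_distance_m)
    return max(0.0, min(max_gain_db, gain_db))


def apply_gain(samples, gain_db):
    """Scale samples in [-1.0, 1.0] by gain_db and hard-clip the result."""
    scale = 10.0 ** (gain_db / 20.0)
    return [max(-1.0, min(1.0, s * scale)) for s in samples]
```

A user at 2 m would receive roughly 6 dB of gain under this model. Real rooms are reverberant, so a practical controller would likely smooth the gain over time and combine it with a level-based automatic gain control.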

The robot main controller 4 is connected to the DLP internal projection display module 7 through an I2C serial control protocol and an HDMI interface to render the robot's facial expressions. It also processes the ranging signal obtained from the DSP acceleration auxiliary processor 3 to obtain the rotation angle signal. The motor rotation module 6 rotates according to the rotation angle signal to aim the multi-channel microphone array 1 at the speaking user, and beamforming suppresses noise and echo in the far-field environment, raising the signal-to-noise ratio of the voice signal and thereby the accuracy of speech recognition. At the same time, direction-finding rotation turns the robot's interface toward the user, which makes interaction between the robot system and the user more engaging.
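The beamforming method is named but not described. The simplest variant, delay-and-sum with integer sample delays, is sketched below as an illustration only. The function name and the assumption that per-channel arrival delays are already known are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its arrival delay and average the result.

    channels: list of equal-length sample lists, one per microphone
    delays:   arrival delay of the target at each microphone, in samples
              (non-negative integers, relative to the earliest microphone)
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    n = len(channels[0])
    if any(len(c) != n for c in channels):
        raise ValueError("all channels must have the same length")
    out = [0.0] * n
    for samples, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay          # read ahead to undo the arrival delay
            if 0 <= j < n:
                out[i] += samples[j]
    return [v / len(channels) for v in out]
```

Signals arriving from the steered direction add coherently after alignment, while sound from other directions adds with mismatched delays and is attenuated. That is why physically turning the array toward the speaker, as the embodiment does, helps the beamformer.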

By using voice, the most natural mode of human-machine dialogue, the robot system frees people's hands in the home environment and makes the robot a high-quality companion. The binocular ranging camera 2 automatically identifies household users and measures their distance, and the ranging signal makes automatic gain control of the voice signal intelligent, so that a user speaking at a natural volume from different distances in the home still gets highly accurate speech recognition. Localizing the voice signal and rotating smoothly with the motor rotation module 6 automatically aims the multi-channel microphone array 1 and the binocular ranging camera 2 at the target, which shields noise and raises the signal-to-noise ratio of voice pickup. The DLP internal projection display module 7 solves the problem of an unfriendly interactive user interface and gives the user a more convenient and more futuristic display experience.

Embodiment 2

Based on the robot system of Embodiment 1, this embodiment provides a robot control method. As shown in FIG. 2, it comprises the steps of: S1: capturing the voice signal and the image signals and transmitting both to the DSP acceleration auxiliary processor 3; S2: the DSP acceleration auxiliary processor 3 processes the voice signal and the image signals and passes the results to the robot main controller 4, which obtains the voice recognition signal and the rotation angle signal; S3: the cloud speech recognition engine 5 receives the voice recognition signal and returns the results of automatic speech recognition and natural-language semantic processing to the user; S4: the motor rotation module 6 receives the rotation angle signal and rotates smoothly according to it.
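The data flow of steps S1 to S4 can be sketched as a small orchestration class. This is only an illustration of the order in which signals are handed over. The class name, the callable signatures, and the choice to run S3 before S4 sequentially are assumptions; the text does not say whether S3 and S4 run in sequence or concurrently.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class ControlPipeline:
    """Hypothetical wiring of steps S1 to S4; every stage is injected."""
    capture: Callable[[], Tuple[Sequence, Sequence]]            # S1
    process: Callable[[Sequence, Sequence], Tuple[str, float]]  # S2
    recognize: Callable[[str], str]                              # S3
    rotate: Callable[[float], None]                              # S4

    def run_once(self) -> str:
        audio, images = self.capture()                        # S1: voice + two images
        recognition_signal, angle_deg = self.process(audio, images)  # S2
        reply = self.recognize(recognition_signal)            # S3: cloud ASR result
        self.rotate(angle_deg)                                # S4: turn toward speaker
        return reply
```

Injecting the stages as callables keeps the sketch independent of any particular DSP, cloud service, or motor driver, none of which are specified beyond their roles.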

As shown in FIG. 3, step S1 comprises: S11: the multi-channel microphone array 1 synchronously captures voice signals from different directions and passes them to the DSP acceleration auxiliary processor 3; S12: the binocular ranging camera 2 synchronously captures the two channels of image signals and passes them to the DSP acceleration auxiliary processor 3.

As shown in FIG. 4, step S2 comprises: S21: the DSP acceleration auxiliary processor 3 performs beamforming on the voice signal and performs voice enhancement processing on the noise and echo mixed into the voice signal; S22: the DSP acceleration auxiliary processor 3 applies hardware parallel acceleration to the binocular ranging vision algorithm and processes the image signals to obtain the ranging signal; S23: the robot main controller 4 performs automatic gain processing on the voice signal according to the ranging signal; S24: the robot main controller 4 receives the voice signal to obtain the voice recognition signal, and localizes the voice signal to obtain the rotation angle signal.
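Step S24 localizes the voice signal without saying how. One common approach, shown here purely as an assumption, estimates the time difference of arrival between two microphones by cross-correlation and converts it to a far-field bearing with sin(θ) = c·τ/d. The function names, the two-microphone simplification, and the 343 m/s speed of sound are hypothetical.

```python
import math


def estimate_lag(reference, delayed, max_lag):
    """Lag in samples of `delayed` relative to `reference` that maximizes
    their cross-correlation, searched over [-max_lag, max_lag]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += ref_sample * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_lag(lag_samples, sample_rate_hz, mic_spacing_m,
                     speed_of_sound=343.0):
    """Far-field bearing in degrees from broadside for a two-mic pair."""
    ratio = speed_of_sound * lag_samples / (sample_rate_hz * mic_spacing_m)
    ratio = max(-1.0, min(1.0, ratio))   # guard against noisy lag estimates
    return math.degrees(math.asin(ratio))
```

The resulting bearing could serve as the rotation angle signal. A multi-channel array would combine several microphone pairs to resolve the front-back ambiguity that a single pair cannot.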

In the robot control method, the voice signal and the image signals are acquired through the multi-channel microphone array 1 and the binocular ranging camera 2. The image signals are processed to obtain an identification and ranging signal, and the voice signal undergoes automatic gain processing according to that signal, so that a user speaking at a natural volume from different distances in the home still gets highly accurate speech recognition. The voice signal is localized, and the motor rotation module 6 rotates smoothly according to the localization result so that the multi-channel microphone array 1 and the binocular ranging camera 2 are aimed at the target, which shields noise and raises the signal-to-noise ratio of voice pickup.

The above are only preferred embodiments of the present invention and do not limit its embodiments or scope of protection. Those skilled in the art should recognize that any solution obtained through equivalent substitutions and obvious variations made using the description and drawings of the present invention falls within the scope of protection of the present invention.

1‧‧‧Multi-channel microphone array
2‧‧‧Binocular ranging camera
3‧‧‧DSP acceleration auxiliary processor
4‧‧‧Robot main controller
5‧‧‧Cloud speech recognition engine
6‧‧‧Motor rotation module
7‧‧‧DLP internal projection display module
S1-S‧‧‧Steps

FIG. 1 is a schematic structural diagram of a robot system according to the present invention; FIG. 2 is a first flowchart of a robot control method according to the present invention; FIG. 3 is a second flowchart of a robot control method according to the present invention; FIG. 4 is a third flowchart of a robot control method according to the present invention.

Claims

A robot system comprising: a microphone array for receiving a voice signal from a user; a binocular ranging camera for synchronously capturing two channels of image signals; a processing module, connected to the microphone array and to the binocular ranging camera, for receiving and processing the voice signal and the image signals to generate a voice recognition signal and a rotation angle signal, the processing module comprising: a DSP acceleration auxiliary processor, connected to the microphone array and to the binocular ranging camera, for receiving the voice signal and the image signals, performing voice enhancement processing on the voice signal, identifying a target user in the image signals, and processing the image signals to obtain the distance between the target user and the robot system as a ranging signal; and a robot main controller, connected to the DSP acceleration auxiliary processor through a USB interface, for receiving the enhanced voice signal and the ranging signal, performing automatic gain processing on the voice signal according to the ranging signal to obtain the voice recognition signal, and localizing the voice signal to obtain the rotation angle signal; and a motor rotation module, connected to the processing module, for receiving the rotation angle signal and rotating smoothly according to it so as to aim the microphone array and the binocular ranging camera at the speaking target.

The robot system of claim 1, wherein the robot main controller is a multi-core ARM processor.

The robot system of claim 1, wherein the robot system further comprises: a cloud speech recognition engine, connected to the robot main controller, that receives the voice recognition signal, interacts with the robot main controller, and returns the results of automatic speech recognition and natural-language semantic processing to the user.

The robot system of claim 2, wherein the robot system further comprises: a DLP internal projection display module, connected to the robot main controller and located inside the head of the robot system, that projects the robot's interactive user interface onto the robot's face.

The robot system of claim 4, wherein the robot main controller is connected to the DLP internal projection display module through an I2C serial control protocol and an HDMI interface.

The robot system of claim 1, wherein the microphone array is a multi-channel microphone array.

A robot control method for the robot system according to any one of claims 1 to 6, comprising the steps of: S1: receiving a voice signal and image signals and passing them to the DSP acceleration auxiliary processor; S2: the DSP acceleration auxiliary processor processes the voice signal and the image signals and passes the results to the robot main controller, which obtains a voice recognition signal and a rotation angle signal, wherein step S2 comprises the steps of: S21: the DSP acceleration auxiliary processor performs beamforming on the voice signal and performs voice enhancement processing on the noise and echo mixed into the voice signal; S22: the DSP acceleration auxiliary processor applies hardware parallel acceleration to the binocular ranging vision algorithm and processes the image signals to obtain a ranging signal; S23: the robot main controller performs automatic gain processing on the voice signal according to the ranging signal; S24: the robot main controller receives the voice signal to obtain the voice recognition signal, and localizes the voice signal to obtain the rotation angle signal; S3: the cloud speech recognition engine receives the voice recognition signal and returns the results of automatic speech recognition and natural-language semantic processing to the user; S4: the rotation control module receives the rotation angle signal and rotates smoothly according to it.

The robot control method according to claim 7, wherein step S1 comprises the steps of: S11: the multi-channel microphone array synchronously captures the voice signals from different directions and passes them to the DSP acceleration auxiliary processor; S12: the binocular ranging camera synchronously captures the two channels of image signals and passes them to the DSP acceleration auxiliary processor.