WO2020228349A1 - Virtual news anchor system based on air imaging and implementation method therefor - Google Patents

Virtual news anchor system based on air imaging and implementation method therefor

Publication number
WO2020228349A1
Authority
WO
WIPO (PCT)
Prior art keywords
news
signal
broadcast content
emotional state
input signal
Application number
PCT/CN2019/129947
Other languages
French (fr)
Chinese (zh)
Inventor
李新福
Original Assignee
广东康云科技有限公司
Application filed by 广东康云科技有限公司
Publication of WO2020228349A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/011 Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/015 Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output

Definitions

  • the invention relates to the technical field of virtual anchors, in particular to a virtual news anchor system based on air imaging and an implementation method thereof.
  • the virtual news anchor is a technology that broadcasts news by simulating the image of a real person through a display screen.
  • the virtual news anchor is displayed through a medium such as a liquid crystal display screen, and the virtual news anchor is limited to the plane where the display screen is located, lacks a three-dimensional effect, and appears rigid and not friendly enough.
  • the purpose of the present invention is to provide a virtual news anchor system based on air imaging and an implementation method thereof.
  • the embodiment of the present invention includes a virtual news anchor system based on air imaging, including:
  • a signal detection device for detecting an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
  • the control device is used to generate a three-dimensional model and search for broadcast content according to the input signal, and then generate an audio signal according to the broadcast content;
  • the display device is used to receive the three-dimensional model generated by the control device, and then display it in the air through air imaging;
  • the broadcasting device is used to receive the audio signal generated by the control device, and then play it through audio mode.
  • a local knowledge base is stored in the control device, and the knowledge contained in the local knowledge base is used to determine the correspondence between the question and the answer;
  • the air imaging-based virtual news anchor system also includes a server; the server is used to generate new knowledge and update the local knowledge base stored in the control device, so that when the control device receives the input signal, it can analyze the question contained in the input signal and retrieve the answer corresponding to the question from the local knowledge base, the server, and the Internet, in that order of priority, thereby generating an audio signal based on the answer.
  • the input signal includes a gesture signal
  • the control device includes:
  • a sign language recognition unit for recognizing sign language information from the gesture signal
  • the keyword extraction unit is used to extract keywords from the sign language information
  • the news information retrieval unit is used to retrieve and obtain news information according to the keywords
  • a broadcast content generating unit configured to generate broadcast content according to the news message retrieved by the news message retrieval unit
  • 3D model generating unit used to generate 3D model
  • the audio signal generating unit is configured to generate an audio signal according to the broadcast content.
  • the input signal includes a facial expression signal
  • the control device includes:
  • the facial expression recognition unit is used to obtain the facial expression and recognize the emotional state corresponding to the facial expression; the emotional state includes seriousness, joy, sadness, and excitement;
  • the news information retrieval unit is used to retrieve and obtain news information according to the emotional state
  • a broadcast content generating unit configured to generate broadcast content according to the news message retrieved by the news message retrieval unit
  • 3D model generating unit used to generate 3D model
  • the audio signal generating unit is configured to generate an audio signal according to the broadcast content.
  • the news message retrieval unit is specifically configured to:
  • the embodiment of the present invention also includes a method for implementing a virtual news anchor based on air imaging, including the following steps:
  • the input signal includes at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
  • the method for implementing a virtual news anchor based on air imaging further includes the following steps:
  • An audio signal is generated based on the answer.
  • the input signal includes a gesture signal
  • the step of generating a three-dimensional model and searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content specifically includes:
  • An audio signal is generated according to the broadcast content.
  • the input signal includes a facial expression signal
  • the step of generating a three-dimensional model and searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content specifically includes:
  • the emotional state includes seriousness, joy, sadness, and excitement;
  • An audio signal is generated according to the broadcast content.
  • the step of retrieving and obtaining news information according to the emotional state specifically includes:
  • the beneficial effect of the present invention is that the virtual news anchor system and its implementation method in the embodiment of the present invention display the three-dimensional model of the virtual news anchor by air imaging, so that the virtual news anchor appears in the air in three dimensions without the help of a display screen, which gives a striking visual experience. The mouth shape, eye movements and facial expressions of the virtual news anchor move in step with the news message played by the broadcasting device, which gives a strong sense of reality and simulates the broadcast effect of a real news anchor well, greatly enhancing the experience of using the virtual news anchor.
  • Fig. 1 is a structural block diagram of a specific implementation of the virtual news anchor system of the present invention;
  • Fig. 2 is a structural block diagram of another specific implementation of the virtual news anchor system of the present invention;
  • Fig. 3 is a flowchart of a specific implementation of the virtual news anchor implementation method of the present invention.
  • a virtual news anchor system based on air imaging in this embodiment includes:
  • a signal detection device for detecting an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
  • the control device is used to generate a three-dimensional model and search for broadcast content according to the input signal, and then generate an audio signal according to the broadcast content;
  • the display device is used to receive the three-dimensional model generated by the control device, and then display it in the air through air imaging;
  • the broadcasting device is used to receive the audio signal generated by the control device, and then play it through audio mode.
  • the signal detection device includes:
  • Somatosensory sensor used to obtain the input somatosensory signal
  • Gesture sensor used to obtain the input gesture signal
  • Eye tracker used to obtain the input eye movement signal
  • Touch module used to obtain the input touch signal
  • the voice collection module is used to obtain the input voice signal
  • Brain wave acquisition device for acquiring input brain wave signals
  • the camera is used to obtain the input image signal to recognize the user's facial expression.
  • the control device in this embodiment is a device with data storage capabilities and processing capabilities.
  • a personal computer can be used as the control device in this embodiment.
  • the control device is connected to the Internet, and can retrieve the resources to be obtained on the Internet.
  • Users can interact with the virtual news anchor system through body movements, gestures, eye movements, touch, speech, brain waves, and facial expressions.
  • the control device is used to generate a three-dimensional model and search for broadcast content according to the input signal, and then generate an audio signal according to the broadcast content.
  • the broadcast content refers to multimedia content capable of delivering current affairs news, such as a news release issued by a news agency.
  • the control device converts the text information of the broadcast content into an audio signal, and the audio signal drives the broadcasting device to emit sound, thereby reading out the broadcast content.
  • the control device converts the text information of the broadcast content to obtain a three-dimensional model.
  • the three-dimensional model can be expressed as a male or female human announcer, or as a cartoon character such as a cat, dog, cow, and chicken.
  • the three-dimensional model is variable: its mouth shape, eye movements and facial expressions change synchronously with the audio signal, so that they match the text corresponding to the sound produced from the audio signal.
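  • The synchronisation described above can be pictured as a timeline of mouth-shape keyframes derived from the text being read. The following is only a minimal sketch under stated assumptions: the `Keyframe` type, the vowel-to-mouth-shape table, and the fixed duration per word are all hypothetical and do not come from the patent, which does not specify how the animation is computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Keyframe:
    time_s: float  # when the mouth shape should be shown, in seconds
    mouth: str     # name of the mouth shape to apply to the 3D model


# Hypothetical mapping from the first vowel of a word to a mouth shape.
VOWEL_TO_MOUTH = {"a": "open", "e": "spread", "i": "spread", "o": "round", "u": "round"}


def mouth_for(word: str) -> str:
    """Pick a mouth shape from the first vowel in the word, else 'closed'."""
    for ch in word.lower():
        if ch in VOWEL_TO_MOUTH:
            return VOWEL_TO_MOUTH[ch]
    return "closed"


def build_keyframes(words: List[str], seconds_per_word: float = 0.4) -> List[Keyframe]:
    """Assume every word takes the same time to speak (a simplification)."""
    frames = [
        Keyframe(index * seconds_per_word, mouth_for(word))
        for index, word in enumerate(words)
    ]
    # Close the mouth once the last word has been spoken.
    frames.append(Keyframe(len(words) * seconds_per_word, "closed"))
    return frames
```

  • A real system would take per-phoneme timings from the text-to-speech stage rather than a fixed duration per word.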
  • the display device in this embodiment is an air imager. Referring to Fig. 1 or Fig. 2, the display device includes a driving unit and a projection unit.
  • the driving unit is used for receiving the three-dimensional model generated by the control device, and the projection unit is used for projecting into the air, under the driving of the driving unit, to display the three-dimensional model.
  • a corresponding decoding program and a driver program are installed on the driving unit; the decoding program decodes the received three-dimensional model generated by the control device, and the driver program drives the projection unit to project into the air according to the decoding result, so that the three-dimensional model is displayed.
  • the broadcasting device in this embodiment may be a loudspeaker unit. Referring to Fig. 1 or Fig. 2, the broadcasting device includes a power amplifier unit and a speaker.
  • the power amplifier unit is used for receiving and amplifying the audio signal generated by the control device, and the speaker is used for emitting corresponding sound effects under the driving of the power amplifier unit.
  • the power amplifier unit is equipped with a corresponding decoding circuit, an amplifying circuit and a noise reduction circuit; the decoding circuit decodes the received audio signal generated by the control device, the amplifying circuit amplifies the decoded result, and the noise reduction circuit reduces noise while the amplifying circuit is working.
  • the virtual news anchor system in this embodiment can display the three-dimensional model of the virtual news anchor through air imaging.
  • the displayed virtual news anchor is three-dimensional, and its mouth shape, eye movements, facial expressions, etc. can act in conjunction with the news messages played by the broadcasting device. This gives a strong sense of reality and simulates the broadcasting effect of a real news anchor well, thereby greatly enhancing the experience of using the virtual news anchor.
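  • The data flow between the four devices described in this embodiment can be sketched as follows. This is an illustrative wiring only: the class `VirtualAnchorSystem`, the `ControlOutput` type and the callable-based interfaces are assumptions made for the example, and the three-dimensional model and audio signal are represented by simple stand-in values.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ControlOutput:
    model_frames: List[str]  # stand-in for the generated three-dimensional model
    audio: bytes             # stand-in for the generated audio signal


class VirtualAnchorSystem:
    """Connects detection -> control -> (display, broadcast)."""

    def __init__(
        self,
        detect: Callable[[], str],
        control: Callable[[str], ControlOutput],
        display: Callable[[List[str]], None],
        broadcast: Callable[[bytes], None],
    ) -> None:
        self.detect = detect
        self.control = control
        self.display = display
        self.broadcast = broadcast

    def run_once(self) -> ControlOutput:
        signal = self.detect()             # signal detection device
        output = self.control(signal)      # control device
        self.display(output.model_frames)  # display device (air imaging)
        self.broadcast(output.audio)       # broadcasting device
        return output
```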
  • the air imaging-based virtual news anchor system also includes a server; the server is used to generate new knowledge through AI program training and learning and to update the local knowledge base, which is stored in the control device.
  • when the control device receives the input signal, it analyzes the question contained in the input signal through the voice recognition program and searches for the answer corresponding to the question in the priority order of the local knowledge base, the server and the Internet, thereby generating an audio signal based on the answer.
  • the control device first searches the local knowledge base. If the answer is found there, it generates an audio signal from that answer; if not, it searches the knowledge base stored on the server. If the answer is found on the server, it generates an audio signal from that answer; if not, it connects to a search engine such as Baidu for retrieval. In this way the virtual news anchor system can retrieve matching answers and answer users' questions intelligently.
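  • The priority-ordered lookup described above (local knowledge base, then server, then Internet) can be sketched as a simple fallback chain. This is a minimal sketch, not the patented implementation: the function `retrieve_answer`, the dictionary-backed knowledge bases and the placeholder `search_internet` are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# A source takes a question and returns an answer, or None when it has none.
AnswerSource = Callable[[str], Optional[str]]


def retrieve_answer(
    question: str,
    sources: Sequence[Tuple[str, AnswerSource]],
) -> Optional[Tuple[str, str]]:
    """Query each source in priority order; return (source_name, answer) for the first hit."""
    for name, lookup in sources:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return None


# Hypothetical stand-ins for the three tiers named in the text.
local_kb = {"what is air imaging": "Projecting an image so that it appears in mid-air."}
server_kb = {"who is the anchor": "A virtual three-dimensional news anchor."}


def search_internet(question: str) -> Optional[str]:
    # Placeholder: a real system would call a search engine here.
    return None


SOURCES = [
    ("local", local_kb.get),
    ("server", server_kb.get),
    ("internet", search_internet),
]
```

  • Keeping the tiers in an ordered list means the priority can be changed, or a tier added, without touching the lookup loop.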
  • the input signal includes a voice signal
  • the control device includes:
  • a voice recognition unit for recognizing voice information from the voice signal
  • the keyword extraction unit is used to extract keywords from the voice information
  • the news information retrieval unit is used to retrieve and obtain news information according to the keywords
  • a broadcast content generating unit configured to generate broadcast content according to the news message retrieved by the news message retrieval unit
  • 3D model generating unit used to generate 3D model
  • the audio signal generating unit is configured to generate an audio signal according to the broadcast content.
  • the voice recognition unit, keyword extraction unit, news information retrieval unit, broadcast content generation unit, three-dimensional model generation unit and audio signal generation unit are software modules with corresponding functions installed in the control device.
  • the voice recognition unit has a voice recognition program that can recognize the content contained in the voice signal. For example, some voice recognition programs can convert the voice signal into text.
  • the keyword extraction unit may extract keywords from the content of the voice signal.
  • the news information retrieval unit is connected to a news search engine on the Internet, searches using the keywords extracted by the keyword extraction unit, and obtains news information returned by the news search engine.
  • the broadcast content generating unit processes news messages by deleting commercial advertisements and other irrelevant messages, rearranging the sequence, and extracting key paragraphs, etc., so as to generate broadcast content.
  • the three-dimensional model generating unit is a three-dimensional modeling program, which can generate a three-dimensional model representing a male or female human announcer or a cartoon figure such as a cat, dog, cow or chicken.
  • the mouth shape, eye movements and facial expressions of the three-dimensional model change over time with the pronunciation of each word as the broadcast content is played.
  • the audio signal generating unit is a text-to-speech conversion program that can convert each text to obtain a corresponding audio signal.
  • the control device can recognize the voice signal and retrieve the corresponding news message from the Internet. Users can interact with virtual news anchors by voice to get the news they want.
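  • The keyword extraction and news retrieval steps of this embodiment can be sketched as below, taking the text produced by voice recognition as input. This is an illustrative sketch only: the stop-word list, the functions `extract_keywords` and `retrieve_news`, and the in-memory list of headlines standing in for a news search engine are all assumptions, not details from the patent.

```python
import re
from typing import Iterable, List

# Hypothetical stop-word list; a real system would use a proper NLP toolkit.
STOP_WORDS = {
    "the", "a", "an", "of", "to", "me", "about", "please", "tell",
    "news", "i", "want", "latest", "on", "in", "and",
}


def extract_keywords(recognized_text: str) -> List[str]:
    """Lower-case the text, split it into words, and drop stop words and repeats."""
    keywords: List[str] = []
    for word in re.findall(r"[a-z0-9]+", recognized_text.lower()):
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords


def retrieve_news(keywords: Iterable[str], articles: Iterable[str]) -> List[str]:
    """Return the articles that mention at least one keyword, most matches first."""
    keyword_list = list(keywords)
    scored = []
    for article in articles:
        text = article.lower()
        hits = sum(1 for keyword in keyword_list if keyword in text)
        if hits:
            scored.append((hits, article))
    # sorted() is stable, so articles with equal scores keep their input order.
    return [article for _, article in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```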
  • the input signal includes a gesture signal.
  • the control device includes:
  • a sign language recognition unit for recognizing sign language information from the gesture signal
  • the keyword extraction unit is used to extract keywords from the sign language information
  • the news information retrieval unit is used to retrieve and obtain news information according to the keywords
  • a broadcast content generating unit configured to generate broadcast content according to the news message retrieved by the news message retrieval unit
  • 3D model generating unit used to generate 3D model
  • the audio signal generating unit is configured to generate an audio signal according to the broadcast content.
  • the sign language recognition unit, keyword extraction unit, news message retrieval unit, broadcast content generation unit, three-dimensional model generation unit and audio signal generation unit are software modules with corresponding functions installed in the control device.
  • the sign language recognition unit has a sign language recognition program that can recognize the content contained in the gesture signal. For example, some sign language recognition programs can convert the gesture signal into text.
  • the keyword extraction unit may extract keywords from the content of the sign language signal.
  • the news information retrieval unit is connected to a news search engine on the Internet, searches using the keywords extracted by the keyword extraction unit, and obtains news information returned by the news search engine.
  • the broadcast content generating unit processes news messages by deleting commercial advertisements and other irrelevant messages, rearranging the sequence, and extracting key paragraphs, etc., to generate broadcast content.
  • the three-dimensional model generating unit is a three-dimensional modeling program, which can generate a three-dimensional model representing a male or female human announcer or a cartoon figure such as a cat, dog, cow or chicken.
  • the mouth shape, eye movements and facial expressions of the three-dimensional model change over time with the pronunciation of each word as the broadcast content is played.
  • the audio signal generating unit is a text-to-speech conversion program that can convert each text to obtain a corresponding audio signal.
  • the control device can recognize sign language signals and retrieve corresponding news messages from the Internet.
  • the virtual news anchor system in this embodiment is very friendly to people who have difficulty speaking. They only need to make sign language gestures to the virtual news anchor system, and the system can recognize the sign language signal and retrieve the corresponding news messages from the Internet, so that people who have difficulty speaking can also enjoy the convenience brought by the virtual news anchor system of the present invention.
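  • The broadcast content generating step used in this embodiment (deleting advertisements and other irrelevant messages, then extracting key paragraphs) can be sketched as a filter followed by a keyword-based ranking. This is a sketch under assumptions: the advertisement markers, the scoring rule and the function name `generate_broadcast_content` are hypothetical, and the selected paragraphs are kept in their original order rather than rearranged.

```python
from typing import List

# Hypothetical markers used to spot commercial content.
AD_MARKERS = ("advertisement", "sponsored", "buy now")


def generate_broadcast_content(
    paragraphs: List[str],
    keywords: List[str],
    max_paragraphs: int = 3,
) -> List[str]:
    """Drop ad-like paragraphs, then keep the paragraphs that mention the keywords most."""
    kept = [
        p for p in paragraphs
        if not any(marker in p.lower() for marker in AD_MARKERS)
    ]

    def score(paragraph: str) -> int:
        text = paragraph.lower()
        return sum(text.count(keyword.lower()) for keyword in keywords)

    # Rank paragraph indices by score, keep the best ones, restore reading order.
    best = sorted(range(len(kept)), key=lambda i: score(kept[i]), reverse=True)[:max_paragraphs]
    return [kept[i] for i in sorted(best)]
```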
  • the input signal includes a facial expression signal
  • the control device includes:
  • the facial expression recognition unit is used to obtain the facial expression and recognize the emotional state corresponding to the facial expression; the emotional state includes seriousness, joy, sadness, and excitement;
  • the news information retrieval unit is used to retrieve and obtain news information according to the emotional state
  • a broadcast content generating unit configured to generate broadcast content according to the news message retrieved by the news message retrieval unit
  • 3D model generating unit used to generate 3D model
  • the audio signal generating unit is configured to generate an audio signal according to the broadcast content.
  • the facial expression recognition unit, news information retrieval unit, broadcast content generation unit, three-dimensional model generation unit and audio signal generation unit are software modules with corresponding functions installed in the control device.
  • the facial expression recognition unit has a facial expression recognition program that can recognize the current emotional state of seriousness, joy, sadness, excitement, etc. corresponding to the facial expression of the user.
  • the news message retrieval unit is connected to a news search engine on the Internet, searches for news messages of corresponding classifications according to the emotional state of the user, and obtains news messages returned by the news search engine.
  • the broadcast content generating unit processes news messages by deleting commercial advertisements and other irrelevant messages, rearranging the sequence, and extracting key paragraphs, etc., so as to generate broadcast content.
  • the three-dimensional model generating unit is a three-dimensional modeling program, which can generate a three-dimensional model representing a male or female human announcer or a cartoon figure such as a cat, dog, cow or chicken.
  • the mouth shape, eye movements and facial expressions of the three-dimensional model change over time with the pronunciation of each word as the broadcast content is played.
  • the audio signal generating unit contains a text-to-speech conversion program, which can convert each text to obtain a corresponding audio signal.
  • the news information retrieval unit is specifically configured to:
  • when the user's facial expression reflects a serious emotional state, the user is currently in a more rational frame of mind, and political and economic news will be better received; when the facial expression reflects joy, the user is currently in a more emotional frame of mind, and news about people's livelihood will strengthen the user's love of life; when the facial expression reflects sadness, the user needs psychological relief, and entertainment news will provide it more effectively.
  • the entertainment news includes news related to TV series, movies, celebrities and humor; when the user's facial expression reflects an excited emotional state, the user needs to calm down quickly, and the user can then watch humanities news related to history, art and society.
  • the virtual news anchor system in this embodiment can actively push different types of news messages according to different emotional states of users, thereby providing a better news experience.
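  • The correspondence between emotional state and news category described above amounts to a lookup table. A minimal sketch follows; the four states and four categories come from the text, while the `Emotion` enum, the `NEWS_CATEGORY` table and the function `category_for` are names chosen for this example.

```python
from enum import Enum


class Emotion(Enum):
    SERIOUS = "serious"
    JOY = "joy"
    SAD = "sad"
    EXCITED = "excited"


# Emotional state -> news category, as described in the embodiment.
NEWS_CATEGORY = {
    Emotion.SERIOUS: "politics and economics",
    Emotion.JOY: "people's livelihood",
    Emotion.SAD: "entertainment",
    Emotion.EXCITED: "humanities",
}


def category_for(emotion: Emotion) -> str:
    """Return the news category to search for, given the recognized emotional state."""
    return NEWS_CATEGORY[emotion]
```

  • The returned category string would then be passed to the news search engine as the search classification.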
  • control device further includes:
  • the artificial intelligence unit is used to use artificial intelligence to train the three-dimensional model generated by the three-dimensional model generating unit.
  • the artificial intelligence unit trains the three-dimensional model through tools such as convolutional neural networks, which can make the mouth shape, eye movements and facial expressions of the three-dimensional model more natural and can give the three-dimensional model different clothing, accessories and makeup styles, providing a better and more realistic news experience.
  • a method for implementing a virtual news anchor based on air imaging includes the following steps:
  • the input signal includes at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
  • the method for implementing a virtual news anchor based on air imaging further includes the following steps:
  • the steps S5-S8 can be executed by the server.
  • the server is used to generate new knowledge and update the local knowledge base through the training and learning of the AI program, and the local knowledge base is stored in the control device.
  • when the control device receives the input signal, it analyzes the question contained in the input signal through the voice recognition program and searches for the answer corresponding to the question in the priority order of the local knowledge base, the server and the Internet, thereby generating an audio signal based on the answer.
  • the control device first searches the local knowledge base. If the answer is found there, it generates an audio signal from that answer; if not, it searches the knowledge base stored on the server. If the answer is found on the server, it generates an audio signal from that answer; if not, it connects to a search engine such as Baidu for retrieval. In this way the virtual news anchor system can retrieve matching answers and answer users' questions intelligently.
  • the input signal includes a gesture signal
  • S206A Generate an audio signal according to the broadcast content.
  • the input signal includes a facial expression signal
  • S205B Generate an audio signal according to the broadcast content.
  • step S202B specifically includes:
  • the step of generating a three-dimensional model and searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content, namely step S2, specifically further includes:
  • the embodiments of the present invention can be realized or implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer-readable memory.
  • the method can be implemented in a computer program using standard programming techniques-including a non-transitory computer readable storage medium configured with a computer program, where the storage medium so configured allows the computer to operate in a specific and predefined manner-according to the specific
  • Each program can be implemented in a high-level process or object-oriented programming language to communicate with the computer system. However, if necessary, the program can be implemented in assembly or machine language. In any case, the language can be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
  • the method can be implemented on any type of suitable computing platform, including but not limited to a personal computer, a minicomputer, a mainframe, a workstation, a networked or distributed computing environment, a separate or integrated computer platform, or a platform in communication with charged particle tools or other imaging devices.
  • a suitable computing platform including but not limited to a personal computer, a mini computer, a main frame, a workstation, a network or a distributed computing environment, a separate or integrated computer Platform, or communication with charged particle tools or other imaging devices, etc.
  • Aspects of the present invention can be implemented by machine-readable codes stored on non-transitory storage media or devices, whether removable or integrated into a computing platform, such as hard disks, optical reading and/or writing storage media, RAM, ROM, etc., so that they can be read by a programmable computer, and when the storage medium or device is read by the computer, it can be used to configure and operate the computer to perform the processes described herein.
  • machine-readable code or part thereof, can be transmitted through a wired or wireless network.
  • a medium includes instructions or programs that implement the steps described above in combination with a microprocessor or other data processor
  • the invention described herein includes these and other different types of non-transitory computer-readable storage media.
  • the present invention also includes the computer itself.
  • a computer program can be applied to input data to perform the functions described herein, thereby converting the input data to generate output data that is stored in non-volatile memory.
  • the output information can also be applied to one or more output devices such as displays.
  • the converted data represents physical and tangible objects, including specific visual depictions of physical and tangible objects generated on the display.

Abstract

A virtual news anchor system based on air imaging and an implementation method therefor. The system comprises a signal detection device, a control device, a display device and a broadcasting device, wherein the signal detection device is used for detecting an input signal; the control device is used for generating a three-dimensional model and searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content; the display device is used for receiving the three-dimensional model generated by the control device and then displaying it in the air by means of air imaging; and the broadcasting device is used for playing the audio signal. A three-dimensional model of a virtual news anchor is displayed by means of air imaging, so that the virtual news anchor appears in the air in three dimensions without the help of a display screen. This better simulates the broadcasting effect of a real news anchor and thereby greatly enhances the experience of using the virtual news anchor.

Description

A virtual news anchor system based on air imaging and an implementation method therefor
Technical field
The present invention relates to the technical field of virtual anchors, and in particular to a virtual news anchor system based on air imaging and an implementation method therefor.
Background art
A virtual news anchor is a technology that broadcasts news by simulating the image of a real person on a display screen or similar medium. In the prior art, the virtual news anchor is displayed on a medium such as a liquid crystal display screen. The virtual news anchor is therefore confined to the plane of the display screen, lacks a three-dimensional effect, and appears stiff and not sufficiently approachable.
Summary of the invention
In order to solve the above technical problem, the object of the present invention is to provide a virtual news anchor system based on air imaging and an implementation method therefor.
In one aspect, an embodiment of the present invention includes a virtual news anchor system based on air imaging, comprising:
a signal detection device for detecting an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal and a facial expression signal;
a control device for generating a three-dimensional model, searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content;
a display device for receiving the three-dimensional model generated by the control device and then displaying it in the air by means of air imaging;
a broadcasting device for receiving the audio signal generated by the control device and then playing it as audio.
Further, a local knowledge base is stored in the control device, and the knowledge contained in the local knowledge base is used to determine the correspondence between questions and answers; the virtual news anchor system based on air imaging further includes a server, the server being used to generate new knowledge and to update the local knowledge base stored in the control device, so that, upon receiving an input signal, the control device parses out the question contained in the input signal and retrieves the answer corresponding to the question in the priority order of the local knowledge base, the server and the Internet, thereby generating an audio signal according to the answer.
Further, the input signal includes a gesture signal, and the control device includes:
a sign language recognition unit for recognizing sign language information from the gesture signal;
a keyword extraction unit for extracting keywords from the sign language information;
a news message retrieval unit for retrieving and obtaining news messages according to the keywords;
a broadcast content generating unit for generating broadcast content according to the news messages retrieved by the news message retrieval unit;
a three-dimensional model generating unit for generating a three-dimensional model;
an audio signal generating unit for generating an audio signal according to the broadcast content.
Further, the input signal includes a facial expression signal, and the control device includes:
a facial expression recognition unit for acquiring the facial expression and recognizing the emotional state corresponding to the facial expression, the emotional state including seriousness, joy, sadness and excitement;
a news message retrieval unit for retrieving and obtaining news messages according to the emotional state;
a broadcast content generating unit for generating broadcast content according to the news messages retrieved by the news message retrieval unit;
a three-dimensional model generating unit for generating a three-dimensional model;
an audio signal generating unit for generating an audio signal according to the broadcast content.
Further, the news message retrieval unit is specifically configured to:
when the emotional state is seriousness, retrieve and obtain political and economic news messages;
when the emotional state is joy, retrieve and obtain livelihood news messages;
when the emotional state is sadness, retrieve and obtain entertainment news messages;
when the emotional state is excitement, retrieve and obtain humanities news messages.
In another aspect, an embodiment of the present invention further includes a method for implementing a virtual news anchor based on air imaging, comprising the following steps:
detecting an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal and a facial expression signal;
generating a three-dimensional model, searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content;
receiving the three-dimensional model generated by the control device and then displaying it in the air by means of air imaging;
receiving the audio signal generated by the control device and then playing it as audio.
Further, the method for implementing a virtual news anchor based on air imaging further comprises the following steps:
generating new knowledge and updating the local knowledge base, the knowledge being used to determine the correspondence between questions and answers;
upon receiving an input signal, parsing out the question contained in the input signal;
retrieving the answer corresponding to the question in the priority order of the local knowledge base, the server and the Internet;
generating an audio signal according to the answer.
Further, the input signal includes a gesture signal, and the step of generating a three-dimensional model, searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content specifically includes:
recognizing sign language information from the gesture signal;
extracting keywords from the sign language information;
retrieving and obtaining news messages according to the keywords;
generating broadcast content according to the news messages retrieved by the news message retrieval unit;
generating a three-dimensional model according to the broadcast content;
generating an audio signal according to the broadcast content.
Further, the input signal includes a facial expression signal, and the step of generating a three-dimensional model, searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content specifically includes:
acquiring the facial expression and recognizing the emotional state corresponding to the facial expression, the emotional state including seriousness, joy, sadness and excitement;
retrieving and obtaining news messages according to the emotional state;
generating broadcast content according to the news messages retrieved by the news message retrieval unit;
generating a three-dimensional model according to the broadcast content;
generating an audio signal according to the broadcast content.
Further, the step of retrieving and obtaining news messages according to the emotional state specifically includes:
when the emotional state is seriousness, retrieving and obtaining political and economic news messages;
when the emotional state is joy, retrieving and obtaining livelihood news messages;
when the emotional state is sadness, retrieving and obtaining entertainment news messages;
when the emotional state is excitement, retrieving and obtaining humanities news messages.
The beneficial effects of the present invention are as follows: the virtual news anchor system and its implementation method in the embodiments of the present invention can display the three-dimensional model of the virtual news anchor by means of air imaging, so that the virtual news anchor is shown in three dimensions in the air without the help of a display screen, which can deliver a striking visual experience. In addition, the mouth shape, eye movements and facial expressions of the virtual news anchor can move in coordination with the news messages played by the broadcasting device. This gives a strong sense of realism and simulates the broadcasting effect of a real news anchor well, thereby greatly enhancing the experience of using the virtual news anchor.
Description of the drawings
FIG. 1 is a structural block diagram of one specific embodiment of the virtual news anchor system of the present invention;
FIG. 2 is a structural block diagram of another specific embodiment of the virtual news anchor system of the present invention;
FIG. 3 is a flowchart of one specific embodiment of the virtual news anchor implementation method of the present invention.
Detailed description
Embodiment 1
In this embodiment, a virtual news anchor system based on air imaging, referring to FIG. 1 or FIG. 2, includes:
a signal detection device for detecting an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal and a facial expression signal;
a control device for generating a three-dimensional model, searching for broadcast content according to the input signal, and then generating an audio signal according to the broadcast content;
a display device for receiving the three-dimensional model generated by the control device and then displaying it in the air by means of air imaging;
a broadcasting device for receiving the audio signal generated by the control device and then playing it as audio.
In this embodiment, referring to FIG. 1 or FIG. 2, the signal detection device includes:
a somatosensory sensor for acquiring the input somatosensory signal;
a gesture sensor for acquiring the input gesture signal;
an eye tracker for acquiring the input eye movement signal;
a touch module for acquiring the input touch signal;
a voice collection module for acquiring the input voice signal;
a brain wave acquisition device for acquiring the input brain wave signal;
a camera for acquiring an input image signal, from which the user's facial expression is recognized.
The control device in this embodiment is a device with data storage and processing capabilities; for example, a personal computer can be used as the control device in this embodiment. The control device is connected to the Internet and can retrieve the required resources from the Internet.
The user can interact with the virtual news anchor system through body movements, gestures, eye movements, touch, speech, brain waves and facial expressions.
The control device is used to generate a three-dimensional model, search for broadcast content according to the input signal, and then generate an audio signal according to the broadcast content. The broadcast content refers to multimedia content capable of conveying current news, for example a news release issued by a news agency. The control device converts the text of the broadcast content into an audio signal, and this audio signal can drive the broadcasting device to produce sound, thereby reading out the broadcast content. The control device also derives the three-dimensional model from the text of the broadcast content. The three-dimensional model may take the form of a male or female human announcer, or of a cartoon figure such as a cat, dog, cow or chicken. The three-dimensional model is variable: its mouth shape, eye movements and facial expressions change in synchronization with the audio signal, so that they match the text corresponding to the sound produced from the audio signal.
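The synchronization between the spoken text and the model's mouth shape can be illustrated with a timeline that pairs each character of the broadcast text with a mouth-shape key. This is only a minimal sketch under assumptions that are not in the source: a fixed duration per character, a hypothetical `MOUTH_SHAPES` table with made-up shape names, and hypothetical helpers `mouth_shape_for` and `build_timeline`. A real system would more likely derive the shapes from the phonemes produced by the text-to-speech step.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical table mapping a character to a mouth-shape name (assumption).
MOUTH_SHAPES = {"a": "open", "o": "round", "m": "closed"}


@dataclass
class Keyframe:
    start: float  # seconds from the beginning of the audio
    char: str     # character being pronounced
    mouth: str    # mouth shape the three-dimensional model should show


def mouth_shape_for(char: str) -> str:
    """Return the assumed mouth shape for a character, 'neutral' if unknown."""
    return MOUTH_SHAPES.get(char.lower(), "neutral")


def build_timeline(text: str, seconds_per_char: float = 0.2) -> List[Keyframe]:
    """Pair every non-space character with a start time and a mouth shape."""
    timeline = []
    t = 0.0
    for ch in text:
        if not ch.isspace():
            timeline.append(Keyframe(round(t, 3), ch, mouth_shape_for(ch)))
        t += seconds_per_char  # whitespace still consumes time, as a pause
    return timeline
```

A renderer could step through the returned keyframes while the audio plays and switch the model's mouth shape at each `start` time.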
The display device in this embodiment is an air imager. Referring to FIG. 1 or FIG. 2, the display device includes a driving unit and a projection unit. The driving unit is used to receive the three-dimensional model generated by the control device, and the projection unit is used to project into the air, driven by the driving unit, so as to display the three-dimensional model. A corresponding decoding program and driver program are installed on the driving unit: the decoding program decodes the three-dimensional model received from the control device, and the driver program drives the projection unit according to the decoding result to project into the air and thereby display the three-dimensional model.
The broadcasting device in this embodiment may be a sound system. Referring to FIG. 1 or FIG. 2, the broadcasting device includes a power amplifier unit and a loudspeaker. The power amplifier unit is used to receive and amplify the audio signal generated by the control device, and the loudspeaker is used to produce the corresponding sound, driven by the power amplifier unit. The power amplifier unit is fitted with a corresponding decoding circuit, amplifying circuit and noise reduction circuit: the decoding circuit decodes the audio signal received from the control device, the amplifying circuit amplifies the decoded result, and the noise reduction circuit reduces the noise produced while the amplifying circuit is operating.
The virtual news anchor system in this embodiment can display the three-dimensional model of the virtual news anchor by means of air imaging. The displayed virtual news anchor is three-dimensional, and its mouth shape, eye movements and facial expressions can move in coordination with the news messages played by the broadcasting device. This gives a strong sense of realism and simulates the broadcasting effect of a real news anchor well, thereby greatly enhancing the experience of using the virtual news anchor.
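The way the four devices cooperate can be sketched as a single processing cycle: the signal detection device produces an input, the control device turns it into a model and an audio signal, and the display and broadcasting devices consume those outputs. The sketch below is illustrative only. The names `ControlOutput` and `run_once` are hypothetical, and plain strings stand in for the real three-dimensional model data and audio signal.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ControlOutput:
    model: str  # stand-in for the three-dimensional model data (assumption)
    audio: str  # stand-in for the audio signal (assumption)


def run_once(
    detect: Callable[[], str],
    control: Callable[[str], ControlOutput],
    display: Callable[[str], None],
    play: Callable[[str], None],
) -> ControlOutput:
    """Run one detect -> control -> display/play cycle."""
    signal = detect()          # signal detection device
    output = control(signal)   # control device
    display(output.model)      # display device (air imaging)
    play(output.audio)         # broadcasting device
    return output
```

Passing the devices in as callables keeps the cycle independent of any particular sensor, projector or loudspeaker, which matches the way the embodiment treats them as separate, replaceable devices.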
Further, a local knowledge base is stored in the control device, and the knowledge contained in the local knowledge base is used to determine the correspondence between various questions and answers, so that the corresponding answer can be found for a given question. Referring to FIG. 1 and FIG. 2, the virtual news anchor system based on air imaging further includes a server, which generates new knowledge through the training and learning of an AI program and updates the local knowledge base stored in the control device. Upon receiving an input signal, the control device uses a speech recognition program to parse out the question contained in the input signal, and retrieves the answer corresponding to the question in the priority order of the local knowledge base, the server and the Internet, thereby generating an audio signal according to the answer. That is, the control device first searches the local knowledge base; if an answer is found there, the audio signal is generated from that answer. If no answer is found in the local knowledge base, the control device searches the knowledge base stored on the server; if an answer is found there, the audio signal is generated from that answer. If no answer is found in the knowledge base stored on the server either, the control device connects to a search engine such as Baidu to search, so that the virtual news anchor system can retrieve a matching answer and provide intelligent answers to the questions raised by the user.
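The priority order described above (local knowledge base, then server, then Internet search) amounts to a fallback chain. The following is a minimal sketch, assuming each tier can be modeled as a function that returns an answer or `None`. The names `Lookup`, `dict_lookup` and `retrieve_answer` are hypothetical, and dictionaries stand in for the real knowledge bases.

```python
from typing import Callable, Dict, Optional, Sequence

# A tier is any function that maps a question to an answer, or None on a miss.
Lookup = Callable[[str], Optional[str]]


def dict_lookup(knowledge: Dict[str, str]) -> Lookup:
    """Wrap a question-to-answer dict as a lookup that returns None on a miss."""
    return lambda question: knowledge.get(question)


def retrieve_answer(question: str, tiers: Sequence[Lookup]) -> Optional[str]:
    """Try each tier in priority order and return the first answer found."""
    for lookup in tiers:
        answer = lookup(question)
        if answer is not None:
            return answer
    return None  # no tier produced an answer
```

In use, the tiers would be passed in the order local knowledge base, server knowledge base, search engine, so that a slower tier is consulted only when every faster tier has missed.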
As a further preferred embodiment, the input signal includes a voice signal, and the control device includes:
a voice recognition unit for recognizing voice information from the voice signal;
a keyword extraction unit for extracting keywords from the voice information;
a news message retrieval unit for retrieving and obtaining news messages according to the keywords;
a broadcast content generating unit for generating broadcast content according to the news messages retrieved by the news message retrieval unit;
a three-dimensional model generating unit for generating a three-dimensional model;
an audio signal generating unit for generating an audio signal according to the broadcast content.
The voice recognition unit, keyword extraction unit, news message retrieval unit, broadcast content generating unit, three-dimensional model generating unit and audio signal generating unit are software modules with the corresponding functions installed in the control device. The voice recognition unit has a voice recognition program that can recognize the content contained in the voice signal; for example, some voice recognition programs can convert the voice signal into text. The keyword extraction unit can extract keywords from the content of the voice signal. The news message retrieval unit is connected to a news search engine on the Internet, searches with the keywords extracted by the keyword extraction unit, and obtains the news messages returned by the news search engine. The broadcast content generating unit processes the news messages by deleting commercial advertisements and other irrelevant messages, reordering the content and extracting key paragraphs, thereby generating the broadcast content. The three-dimensional model generating unit is a three-dimensional modeling program that can generate a three-dimensional model in the form of a male or female human announcer or of a cartoon figure such as a cat, dog, cow or chicken; the mouth shape, eye movements and facial expressions of the three-dimensional model change with the pronunciation of each character as the broadcast content is played in sequence. The audio signal generating unit contains a text-to-speech conversion program that can convert each character into the corresponding audio signal.
Through the voice recognition unit, keyword extraction unit, news message retrieval unit, broadcast content generating unit, three-dimensional model generating unit and audio signal generating unit, the control device can recognize the voice signal and retrieve the corresponding news messages from the Internet. The user can interact with the virtual news anchor by voice to obtain the desired news messages.
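The processing performed by the broadcast content generating unit (deleting advertisements and irrelevant messages, reordering, and extracting key paragraphs) can be sketched as a simple filter-and-rank step. The sketch makes several assumptions that are not in the source: advertisements are detected with a hypothetical `AD_MARKERS` list, relevance is measured by counting keyword occurrences, and the function name `generate_broadcast_content` and its `max_paragraphs` parameter are illustrative.

```python
from typing import List, Sequence

# Hypothetical phrases used to spot advertising paragraphs (assumption).
AD_MARKERS = ("advertisement", "sponsored", "click here")


def generate_broadcast_content(
    paragraphs: Sequence[str],
    keywords: Sequence[str],
    max_paragraphs: int = 3,
) -> str:
    """Drop ad paragraphs, rank the rest by keyword hits, keep the top ones."""
    # Step 1: delete advertisements and similar irrelevant paragraphs.
    cleaned: List[str] = [
        p for p in paragraphs
        if not any(marker in p.lower() for marker in AD_MARKERS)
    ]

    def score(paragraph: str) -> int:
        low = paragraph.lower()
        return sum(low.count(k.lower()) for k in keywords)

    # Step 2: reorder so the most relevant paragraphs come first.
    ranked = sorted(cleaned, key=score, reverse=True)
    # Step 3: extract the key paragraphs, skipping those with no keyword hit.
    key_paragraphs = [p for p in ranked[:max_paragraphs] if score(p) > 0]
    return "\n".join(key_paragraphs)
```

The returned text is what the audio signal generating unit would then convert to speech. Substring counting is a deliberately crude relevance measure and would need to be replaced for real news text.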
As a further preferred embodiment, the input signal includes a gesture signal, and, referring to FIG. 1, the control device includes:
a sign language recognition unit for recognizing sign language information from the gesture signal;
a keyword extraction unit for extracting keywords from the sign language information;
a news message retrieval unit for retrieving and obtaining news messages according to the keywords;
a broadcast content generating unit for generating broadcast content according to the news messages retrieved by the news message retrieval unit;
a three-dimensional model generating unit for generating a three-dimensional model;
an audio signal generating unit for generating an audio signal according to the broadcast content.
The sign language recognition unit, keyword extraction unit, news message retrieval unit, broadcast content generating unit, three-dimensional model generating unit and audio signal generating unit are software modules with the corresponding functions installed in the control device. The sign language recognition unit has a sign language recognition program that can recognize the content contained in the gesture signal; for example, some sign language recognition programs can convert the gesture signal into text. The keyword extraction unit can extract keywords from the content of the sign language signal. The news message retrieval unit is connected to a news search engine on the Internet, searches with the keywords extracted by the keyword extraction unit, and obtains the news messages returned by the news search engine. The broadcast content generating unit processes the news messages by deleting commercial advertisements and other irrelevant messages, reordering the content and extracting key paragraphs, thereby generating the broadcast content. The three-dimensional model generating unit is a three-dimensional modeling program that can generate a three-dimensional model in the form of a male or female human announcer or of a cartoon figure such as a cat, dog, cow or chicken; the mouth shape, eye movements and facial expressions of the three-dimensional model change with the pronunciation of each character as the broadcast content is played in sequence. The audio signal generating unit contains a text-to-speech conversion program that can convert each character into the corresponding audio signal.
Through the sign language recognition unit, keyword extraction unit, news message retrieval unit, broadcast content generating unit, three-dimensional model generating unit and audio signal generating unit, the control device can recognize sign language signals and retrieve the corresponding news messages from the Internet. The virtual news anchor system in this embodiment is well suited to people who have difficulty speaking: they only need to make sign language gestures to the system, and the system recognizes the sign language signal and retrieves the corresponding news messages from the Internet, so that people who have difficulty speaking can also enjoy the convenience of the virtual news anchor system of the present invention.
As a further preferred embodiment, referring to FIG. 2, the input signal includes a facial expression signal, and the control device includes:
a facial expression recognition unit for acquiring the facial expression and recognizing the emotional state corresponding to the facial expression, the emotional state including seriousness, joy, sadness and excitement;
a news message retrieval unit for retrieving and obtaining news messages according to the emotional state;
a broadcast content generating unit for generating broadcast content according to the news messages retrieved by the news message retrieval unit;
a three-dimensional model generating unit for generating a three-dimensional model;
an audio signal generating unit for generating an audio signal according to the broadcast content.
The facial expression recognition unit, news message retrieval unit, broadcast content generating unit, three-dimensional model generating unit and audio signal generating unit are software modules with the corresponding functions installed in the control device.
The facial expression recognition unit has a facial expression recognition program that can recognize the emotional state, such as seriousness, joy, sadness or excitement, currently corresponding to the user's facial expression. The news message retrieval unit is connected to a news search engine on the Internet, searches for news messages of the corresponding category according to the user's emotional state, and obtains the news messages returned by the news search engine. The broadcast content generating unit processes the news messages by deleting commercial advertisements and other irrelevant messages, reordering the content and extracting key paragraphs, thereby generating the broadcast content. The three-dimensional model generating unit is a three-dimensional modeling program that can generate a three-dimensional model in the form of a male or female human announcer or of a cartoon figure such as a cat, dog, cow or chicken; the mouth shape, eye movements and facial expressions of the three-dimensional model change with the pronunciation of each character as the broadcast content is played in sequence. The audio signal generating unit contains a text-to-speech conversion program that can convert each character into the corresponding audio signal.
As a further preferred embodiment, the news message retrieval unit is specifically configured to:
when the emotional state is seriousness, retrieve and obtain political and economic news;
when the emotional state is joy, retrieve and obtain news on people's livelihood;
when the emotional state is sadness, retrieve and obtain entertainment news;
when the emotional state is excitement, retrieve and obtain humanities news.
When the user's facial expression shows seriousness, the user is taken to be in a rational frame of mind, and political and economic news is expected to be received better. When it shows joy, the user is taken to be in a more emotional frame of mind, and news on people's livelihood is expected to strengthen the user's enthusiasm for life. When it shows sadness, the user is taken to need psychological treatment, and entertainment news, meaning news about TV series, films, celebrities, and comedy, is expected to have a better therapeutic effect. When it shows excitement, the user is taken to need to calm down quickly, and can be shown humanities news related to history, art, and society.
With the news message retrieval unit configured in this way, the virtual news anchor system of this embodiment can actively push different types of news according to the user's emotional state, thereby providing a better news experience.
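The four-way rule above amounts to a fixed lookup table. A minimal sketch, assuming the recognised emotional state is already available as an enum value; the names `Emotion`, `NEWS_CATEGORY_BY_EMOTION`, and `category_for` are illustrative and do not come from the patent.

```python
from enum import Enum


class Emotion(Enum):
    SERIOUSNESS = "seriousness"
    JOY = "joy"
    SADNESS = "sadness"
    EXCITEMENT = "excitement"


# Emotional state -> news category, as listed in the description.
NEWS_CATEGORY_BY_EMOTION = {
    Emotion.SERIOUSNESS: "politics and economics",
    Emotion.JOY: "people's livelihood",
    Emotion.SADNESS: "entertainment",
    Emotion.EXCITEMENT: "humanities",
}


def category_for(emotion: Emotion) -> str:
    """Return the news category to search for the given emotional state."""
    return NEWS_CATEGORY_BY_EMOTION[emotion]
```

Keeping the rule in a table rather than in branching code makes it easy to see that every listed state has exactly one category.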
As a further preferred embodiment, referring to FIG. 1 or FIG. 2, the control device further includes:
An artificial intelligence unit, configured to use artificial intelligence to train the three-dimensional model generated by the three-dimensional model generating unit.
The artificial intelligence unit trains the three-dimensional model with tools such as convolutional neural networks. This can make the model's mouth shape, eye movements, and facial expressions more natural, and allows the model's clothing, accessories, and makeup to be changed, providing a better and more realistic news experience.
Embodiment 2
This embodiment provides a method for implementing a virtual news anchor based on air imaging which, referring to FIG. 3, includes the following steps:
S1. Detect an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
S2. Generate a three-dimensional model and search for broadcast content according to the input signal, then generate an audio signal according to the broadcast content;
S3. Receive the three-dimensional model generated by the control device, then display it in the air by means of air imaging;
S4. Receive the audio signal generated by the control device, then play it as audio.
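The flow of steps S1 to S4 can be sketched as four pluggable callables run in order. This is a hypothetical arrangement for illustration only: the `AnchorPipeline` class, its field names, and the use of a string for the model and bytes for the audio are assumptions, not the interfaces of the devices described.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class AnchorPipeline:
    detect_signal: Callable[[], str]              # S1: signal detection device
    generate: Callable[[str], Tuple[str, bytes]]  # S2: control device, returns (model, audio)
    show_model: Callable[[str], None]             # S3: air-imaging display device
    play_audio: Callable[[bytes], None]           # S4: audio playback device

    def run_once(self) -> None:
        """Run one detect -> generate -> display -> play cycle."""
        signal = self.detect_signal()
        model, audio = self.generate(signal)
        self.show_model(model)
        self.play_audio(audio)
```

Passing the stages in as callables keeps the control logic independent of any particular sensor, display, or speaker.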
As a further preferred embodiment, the method for implementing a virtual news anchor based on air imaging further includes the following steps:
S5. Generate knowledge and update the local knowledge base; the knowledge is used to determine the correspondence between questions and answers;
S6. On receiving an input signal, parse out the question contained in the input signal;
S7. Retrieve the answer corresponding to the question in the priority order of local knowledge base, server, and Internet;
S8. Generate an audio signal according to the answer.
Steps S5-S8 can be performed by means of a server. The server generates new knowledge through the training and learning of an AI program and updates the local knowledge base, which is stored in the control device. On receiving an input signal, the control device uses a speech recognition program to parse out the question contained in the signal, retrieves the corresponding answer in the priority order of local knowledge base, server, and Internet, and generates an audio signal from that answer. In other words, the control device first searches the local knowledge base and, if an answer is found there, generates the audio signal from it. If no answer is found locally, it searches the knowledge base stored on the server and, if an answer is found there, generates the audio signal from it. If no answer is found on the server either, it connects to a search engine such as Baidu. In this way the virtual news anchor system can retrieve a matching answer and respond intelligently to the user's question.
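The priority order described above (local knowledge base, then server, then Internet) is a simple fallback chain. A minimal sketch, assuming each source can be modelled as a function that returns an answer string or `None`; the `find_answer` name and the `AnswerSource` type are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# A source maps a question to an answer, or to None when it has no match.
AnswerSource = Callable[[str], Optional[str]]


def find_answer(
    question: str,
    sources: Iterable[Tuple[str, AnswerSource]],
) -> Optional[Tuple[str, str]]:
    """Query sources in priority order; return (source name, answer) for the first hit."""
    for name, lookup in sources:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return None
```

For example, with dictionaries standing in for the two knowledge bases, `find_answer(q, [("local", local_kb.get), ("server", server_kb.get), ("internet", search)])` only reaches the search function when both dictionaries miss.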
As a further preferred embodiment, the input signal includes a gesture signal, and the step of generating a three-dimensional model and searching for broadcast content according to the input signal, then generating an audio signal according to the broadcast content, namely step S2, specifically includes:
S201A. Recognize sign language information from the gesture signal;
S202A. Extract keywords from the sign language information;
S203A. Retrieve and obtain news messages according to the keywords;
S204A. Generate broadcast content according to the news messages retrieved by the news message retrieval unit;
S205A. Generate a three-dimensional model according to the broadcast content;
S206A. Generate an audio signal according to the broadcast content.
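Step S202A, keyword extraction, could look like the following once the sign language information has been rendered as text. This is a deliberately simple stand-in: the patent does not name an extraction technique, so the stop-word filter, the `STOP_WORDS` set, and the `extract_keywords` function are all assumptions made for illustration.

```python
from typing import List

# Hypothetical stop-word list; a real system would use a fuller, language-specific one.
STOP_WORDS = {"i", "want", "to", "the", "a", "about", "news", "see", "please"}


def extract_keywords(sign_language_text: str) -> List[str]:
    """Return the distinct non-stop-words of the text, in their original order."""
    seen = set()
    keywords: List[str] = []
    for word in sign_language_text.lower().split():
        token = word.strip(".,!?")
        if token and token not in STOP_WORDS and token not in seen:
            seen.add(token)
            keywords.append(token)
    return keywords
```

The resulting keywords can be joined into a query string (for example with `" ".join(...)`) and passed to the news search in step S203A.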
As a further preferred embodiment, the input signal includes a facial expression signal, and the step of generating a three-dimensional model and searching for broadcast content according to the input signal, then generating an audio signal according to the broadcast content, namely step S2, specifically includes:
S201B. Acquire the facial expression and identify the emotional state corresponding to it; the emotional state includes seriousness, joy, sadness, and excitement;
S202B. Retrieve and obtain news messages according to the emotional state;
S203B. Generate broadcast content according to the news messages retrieved by the news message retrieval unit;
S204B. Generate a three-dimensional model according to the broadcast content;
S205B. Generate an audio signal according to the broadcast content.
As a further preferred embodiment, the step of retrieving and obtaining news messages according to the emotional state, namely step S202B, specifically includes:
S20201. When the emotional state is seriousness, retrieve and obtain political and economic news;
S20202. When the emotional state is joy, retrieve and obtain news on people's livelihood;
S20203. When the emotional state is sadness, retrieve and obtain entertainment news;
S20204. When the emotional state is excitement, retrieve and obtain humanities news.
As a further preferred embodiment, the step of generating a three-dimensional model and searching for broadcast content according to the input signal, then generating an audio signal according to the broadcast content, namely step S2, specifically further includes:
S207. Use artificial intelligence to train the three-dimensional model generated by the three-dimensional model generating unit.
The steps in this embodiment can be implemented with the corresponding devices or units of Embodiment 1 and achieve the same beneficial effects as Embodiment 1. Because each device or unit in Embodiment 1 performs its function by carrying out the steps of this embodiment, those steps have already been described in detail in Embodiment 1 and are not repeated here.
It should be recognized that the embodiments of the present invention can be realized or implemented by computer hardware, by a combination of hardware and software, or by computer instructions stored in a non-transitory computer-readable memory. The method can be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, in accordance with the methods and drawings described in the specific embodiments. Each program can be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. If desired, however, the program can be implemented in assembly or machine language. In any case, the language can be a compiled or an interpreted language. Furthermore, the program can be run on an application-specific integrated circuit programmed for this purpose.
In addition, the operations of the processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variants and/or combinations thereof) can be performed under the control of one or more computer systems configured with executable instructions, and can be implemented as code (for example, executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or by a combination thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method can be implemented on any suitable type of computing platform, including but not limited to a personal computer, minicomputer, mainframe, workstation, networked or distributed computing environment, or a separate or integrated computer platform, or one in communication with a charged-particle tool or other imaging device. Aspects of the present invention can be implemented as machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into the computing platform, such as a hard disk, optical read and/or write storage medium, RAM, or ROM, so that it can be read by a programmable computer; when the storage medium or device is read by the computer, it can be used to configure and operate the computer to perform the processes described herein. In addition, the machine-readable code, or portions of it, can be transmitted over a wired or wireless network. The invention described herein includes these and other types of non-transitory computer-readable storage media when such media contain instructions or programs that, in combination with a microprocessor or other data processor, implement the steps described above. The invention also includes the computer itself when programmed according to the methods and techniques described herein.
A computer program can be applied to input data to perform the functions described herein, thereby transforming the input data to generate output data that is stored in non-volatile memory. The output information can also be applied to one or more output devices such as a display. In a preferred embodiment of the present invention, the transformed data represents physical and tangible objects, including specific visual depictions of the physical and tangible objects produced on the display.
The above is a detailed description of preferred implementations of the present invention, but the invention is not limited to the described embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (10)

  1. A virtual news anchor system based on air imaging, characterized in that it comprises:
    a signal detection device, configured to detect an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
    a control device, configured to generate a three-dimensional model and search for broadcast content according to the input signal, then generate an audio signal according to the broadcast content;
    a display device, configured to receive the three-dimensional model generated by the control device, then display it in the air by means of air imaging;
    a sound playback device, configured to receive the audio signal generated by the control device, then play it as audio.
  2. The virtual news anchor system based on air imaging according to claim 1, characterized in that a local knowledge base is stored in the control device, and the knowledge contained in the local knowledge base is used to determine the correspondence between questions and answers; the system further comprises a server, configured to generate new knowledge and update the local knowledge base stored in the control device, so that the control device, on receiving an input signal, parses out the question contained in the input signal and retrieves the answer corresponding to the question in the priority order of local knowledge base, server, and Internet, thereby generating an audio signal according to the answer.
  3. The virtual news anchor system based on air imaging according to claim 1, characterized in that the input signal includes a gesture signal, and the control device comprises:
    a sign language recognition unit, configured to recognize sign language information from the gesture signal;
    a keyword extraction unit, configured to extract keywords from the sign language information;
    a news message retrieval unit, configured to retrieve and obtain news messages according to the keywords;
    a broadcast content generating unit, configured to generate broadcast content according to the news messages retrieved by the news message retrieval unit;
    a three-dimensional model generating unit, configured to generate a three-dimensional model;
    an audio signal generating unit, configured to generate an audio signal according to the broadcast content.
  4. The virtual news anchor system based on air imaging according to claim 1, characterized in that the input signal includes a facial expression signal, and the control device comprises:
    a facial expression recognition unit, configured to acquire the facial expression and identify the emotional state corresponding to it; the emotional state includes seriousness, joy, sadness, and excitement;
    a news message retrieval unit, configured to retrieve and obtain news messages according to the emotional state;
    a broadcast content generating unit, configured to generate broadcast content according to the news messages retrieved by the news message retrieval unit;
    a three-dimensional model generating unit, configured to generate a three-dimensional model;
    an audio signal generating unit, configured to generate an audio signal according to the broadcast content.
  5. The virtual news anchor system based on air imaging according to claim 4, characterized in that the news message retrieval unit is specifically configured to:
    when the emotional state is seriousness, retrieve and obtain political and economic news;
    when the emotional state is joy, retrieve and obtain news on people's livelihood;
    when the emotional state is sadness, retrieve and obtain entertainment news;
    when the emotional state is excitement, retrieve and obtain humanities news.
  6. A method for implementing a virtual news anchor based on air imaging, characterized in that it comprises the following steps:
    detecting an input signal, the input signal including at least one of a gesture signal, a somatosensory signal, a brain wave signal, an eye movement signal, a voice signal, a touch signal, and a facial expression signal;
    generating a three-dimensional model and searching for broadcast content according to the input signal, then generating an audio signal according to the broadcast content;
    receiving the three-dimensional model generated by the control device, then displaying it in the air by means of air imaging;
    receiving the audio signal generated by the control device, then playing it as audio.
  7. The method for implementing a virtual news anchor based on air imaging according to claim 6, characterized in that it further comprises the following steps:
    generating new knowledge and updating the local knowledge base; the knowledge is used to determine the correspondence between questions and answers;
    on receiving an input signal, parsing out the question contained in the input signal;
    retrieving the answer corresponding to the question in the priority order of local knowledge base, server, and Internet;
    generating an audio signal according to the answer.
  8. The method for implementing a virtual news anchor based on air imaging according to claim 6, characterized in that the input signal includes a gesture signal, and the step of generating a three-dimensional model and searching for broadcast content according to the input signal, then generating an audio signal according to the broadcast content, specifically comprises:
    recognizing sign language information from the gesture signal;
    extracting keywords from the sign language information;
    retrieving and obtaining news messages according to the keywords;
    generating broadcast content according to the news messages retrieved by the news message retrieval unit;
    generating a three-dimensional model according to the broadcast content;
    generating an audio signal according to the broadcast content.
  9. The method for implementing a virtual news anchor based on air imaging according to claim 6, characterized in that the input signal includes a facial expression signal, and the step of generating a three-dimensional model and searching for broadcast content according to the input signal, then generating an audio signal according to the broadcast content, specifically comprises:
    acquiring the facial expression and identifying the emotional state corresponding to it; the emotional state includes seriousness, joy, sadness, and excitement;
    retrieving and obtaining news messages according to the emotional state;
    generating broadcast content according to the news messages retrieved by the news message retrieval unit;
    generating a three-dimensional model according to the broadcast content;
    generating an audio signal according to the broadcast content.
  10. The method for implementing a virtual news anchor based on air imaging according to claim 9, characterized in that the step of retrieving and obtaining news messages according to the emotional state specifically comprises:
    when the emotional state is seriousness, retrieving and obtaining political and economic news;
    when the emotional state is joy, retrieving and obtaining news on people's livelihood;
    when the emotional state is sadness, retrieving and obtaining entertainment news;
    when the emotional state is excitement, retrieving and obtaining humanities news.
PCT/CN2019/129947 2019-05-14 2019-12-30 Virtual news anchor system based on air imaging and implementation method therefor WO2020228349A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910396513.1 2019-05-14
CN201910396513.1A CN110309470A (en) 2019-05-14 2019-05-14 A kind of virtual news main broadcaster system and its implementation based on air imaging

Publications (1)

Publication Number Publication Date
WO2020228349A1 true WO2020228349A1 (en) 2020-11-19

Family

ID=68074725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/129947 WO2020228349A1 (en) 2019-05-14 2019-12-30 Virtual news anchor system based on air imaging and implementation method therefor

Country Status (2)

Country Link
CN (1) CN110309470A (en)
WO (1) WO2020228349A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309470A (en) * 2019-05-14 2019-10-08 广东康云科技有限公司 A kind of virtual news main broadcaster system and its implementation based on air imaging
CN111243626B (en) * 2019-12-30 2022-12-09 清华大学 Method and system for generating speaking video
CN115426553A (en) * 2021-05-12 2022-12-02 海信集团控股股份有限公司 Intelligent sound box and display method thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104090465A (en) * 2014-06-17 2014-10-08 福建水立方三维数字科技有限公司 Three-dimensional interactive projection imaging method
CN108255292A (en) * 2017-12-06 2018-07-06 上海永微信息科技有限公司 Air imaging interaction systems, method, control device and storage medium
CN108537574A (en) * 2018-03-20 2018-09-14 广东康云多维视觉智能科技有限公司 A kind of 3- D ads display systems and method
CN109085966A (en) * 2018-06-15 2018-12-25 广东康云多维视觉智能科技有限公司 A kind of three-dimensional display system and method based on cloud computing
CN110309470A (en) * 2019-05-14 2019-10-08 广东康云科技有限公司 A kind of virtual news main broadcaster system and its implementation based on air imaging

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105446953A (en) * 2015-11-10 2016-03-30 深圳狗尾草智能科技有限公司 Intelligent robot and virtual 3D interactive system and method
CN105632251B (en) * 2016-01-20 2018-04-20 华中师范大学 3D virtual teacher system and method with phonetic function
CN106959839A (en) * 2017-03-22 2017-07-18 北京光年无限科技有限公司 A kind of human-computer interaction device and method
CN207181987U (en) * 2017-07-24 2018-04-03 中山市博林树投资管理有限公司 A kind of virtual artificial intelligence companion
CN107797663A (en) * 2017-10-26 2018-03-13 北京光年无限科技有限公司 Multi-modal interaction processing method and system based on visual human
CN109410297A (en) * 2018-09-14 2019-03-01 重庆爱奇艺智能科技有限公司 It is a kind of for generating the method and apparatus of avatar image
CN109241924A (en) * 2018-09-18 2019-01-18 宁波众鑫网络科技股份有限公司 Multi-platform information interaction system Internet-based

Also Published As

Publication number Publication date
CN110309470A (en) 2019-10-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19928457; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19928457; Country of ref document: EP; Kind code of ref document: A1)