CN112506405B - Artificial intelligent voice large screen command method based on Internet supervision field - Google Patents

Info

Publication number
CN112506405B
CN112506405B CN202011396329.6A CN202011396329A CN112506405B CN 112506405 B CN112506405 B CN 112506405B CN 202011396329 A CN202011396329 A CN 202011396329A CN 112506405 B CN112506405 B CN 112506405B
Authority
CN
China
Prior art keywords
text
algorithm
user
analysis
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011396329.6A
Other languages
Chinese (zh)
Other versions
CN112506405A (en)
Inventor
刘磊
侯居永
栾丽丽
陈兆亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd filed Critical Inspur Cloud Information Technology Co Ltd
Priority to CN202011396329.6A priority Critical patent/CN112506405B/en
Publication of CN112506405A publication Critical patent/CN112506405A/en
Application granted granted Critical
Publication of CN112506405B publication Critical patent/CN112506405B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A30/00 Adapting or protecting infrastructure or their operation
    • Y02A30/60 Planning or developing urban green infrastructure

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Probability & Statistics with Applications (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides an artificial-intelligence voice command method for large-screen displays in the Internet supervision field, and belongs to the field of Internet plus supervision.

Description

Artificial intelligent voice large screen command method based on Internet supervision field
Technical Field
The invention relates to the technical field of Internet plus supervision, and in particular to an artificial-intelligence voice command method for large-screen displays in the Internet supervision field.
Background
During operation, the system accumulates a large amount of service data. A data warehouse model is established to analyze this data in depth from multiple angles, fully mining the information hidden in it and providing a scientific basis for leadership decision-making. By aggregating supervision data of all kinds and relying on intelligent, applicable data analysis models and tools, supervision data and supervision results are classified along multiple dimensions such as supervision industry, supervision area, and supervision field. Big data analysis algorithms such as regression analysis, cluster analysis, heat analysis, and classification analysis are used to analyze and display supervision risks from the perspectives of supervision objects, supervision matters, and supervision behaviors, forming multi-dimensional statistical analysis of supervision work by region, department, and so on. With big data, governments can manage cities more precisely and benefit the public, achieving standardized supervision, accurate supervision, joint supervision, and supervision of supervision. The data can also be analyzed to give early risk warnings and to drive business decisions.
Existing systems focus only on multi-dimensional analysis and processing of the data and on the display effect of the visualization system, and often neglect interaction with the user: they emphasize display at the expense of interaction.
Disclosure of Invention
To solve the above technical problem, the invention provides an artificial-intelligence voice command method for large-screen displays in the Internet supervision field.
The method applies real-time speech-to-text, natural language processing, and semantic analysis technologies to large-screen visual analysis in the Internet supervision field. It aims to improve the traditional mode of interaction between the user and the system, improve the user experience, and give the system a more intelligent input-processing mode so that it is more user-friendly.
The technical scheme of the invention is as follows:
An artificial-intelligence voice command method for large-screen displays in the Internet supervision field:
The method recognizes user speech and converts it to text in real time, extracts keywords with a natural language processing algorithm to obtain the main information the user wants to express, processes that information with a back-end algorithm, and transmits it to the front end over the WebSocket protocol. The front end then calls a page control method that simulates a user click and displays the requested information, so that voice replaces the user's click operations.
Further,
converting the user voice into text information in real time:
Real-time conversion of spoken audio to text is based on a deep full-sequence convolutional neural network framework. For a complete audio file within 60 seconds, a long-lived connection is established between the application and the voice transcription engine, and the audio stream data is converted to text data in real time.
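The streaming step can be pictured as cutting raw audio into short frames and pushing them over the already-open connection. The following is only a sketch under stated assumptions: the 40 ms frame length, the 16 kHz 16-bit mono defaults, the empty end-of-stream marker, and the `send` callback (a stand-in for writing to the connection) are illustrative choices, not details taken from the patent or from any particular transcription engine.

```python
from typing import Callable, Iterator


def frame_bytes(sample_rate_hz: int, frame_ms: int,
                sample_width_bytes: int = 2, channels: int = 1) -> int:
    """Number of raw PCM bytes that cover `frame_ms` milliseconds of audio."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * sample_width_bytes * channels


def iter_frames(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield successive fixed-size chunks; the last chunk may be shorter."""
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


def stream(pcm: bytes, send: Callable[[bytes], None],
           sample_rate_hz: int = 16000, frame_ms: int = 40) -> int:
    """Push audio frame by frame through `send`, then an end marker.

    `send` is a hypothetical stand-in for writing to the long-lived
    connection. Returns the number of audio frames sent.
    """
    size = frame_bytes(sample_rate_hz, frame_ms)
    count = 0
    for frame in iter_frames(pcm, size):
        send(frame)
        count += 1
    send(b"")  # hypothetical end-of-stream marker (an assumption)
    return count
```

Keeping the connection open and sending small frames lets the engine return partial text while the user is still speaking, instead of waiting for the whole recording.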
Semantic analysis and word segmentation are performed on the text information to obtain entry (term) vectors, and a deep learning algorithm learns iteratively using a keyword corpus from the Internet supervision field as the training data.
Vocabulary from the Internet supervision field is uploaded to the recognition engine. When these terms appear in the audio stream being transcribed, the engine can recognize them, which improves recognition accuracy for specialized vocabulary; training can be iterated for continuous optimization.
Further,
natural language processing and semantic analysis:
After the user's voice input has been converted to text, natural language analysis and understanding services are provided on the basis of a corpus from the Internet supervision field, using natural language processing, collaborative crowdsourcing, machine learning, and neural network technologies. The text is analyzed with NLP techniques including word segmentation, part-of-speech tagging, person name recognition, place name recognition, organization name recognition, time expression recognition, syntactic dependency analysis, automatic summarization, text similarity, text classification, sentiment analysis, and keyword extraction.
The core algorithms involved include the K-shortest-path word segmentation algorithm, the hidden Markov model (HMM), Dijkstra's shortest-path algorithm, TF-IDF (term frequency-inverse document frequency), TextRank, the Word2Vec (W2V) word vector model, the conditional random field (CRF) model, and the neural-network-based FastText text classification model.
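Of the algorithms listed, TF-IDF is the simplest to illustrate for keyword extraction. The sketch below is a minimal version under stated assumptions: the smoothed IDF formula, the function names, and the toy corpus are illustrative, and the patent does not say how TF-IDF is combined with the other algorithms.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Score each distinct token of `doc_tokens` against `corpus`.

    `corpus` is a list of token lists. The smoothed IDF used here,
    log((1 + N) / (1 + df)) + 1, is one common variant (an assumption).
    """
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = (count / total) * idf
    return scores


def top_keywords(doc_tokens, corpus, k=2):
    """Return the k highest-scoring tokens; ties break alphabetically."""
    scores = tf_idf(doc_tokens, corpus)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy corpus standing in for a domain corpus (hypothetical data).
corpus = [
    ["show", "risk", "warning"],
    ["show", "enterprise", "list"],
    ["show", "inspection", "records"],
]
utterance = ["show", "risk", "warning", "map"]
keywords = top_keywords(utterance, corpus, k=3)
```

A word such as "show", which appears in every corpus entry, receives the lowest score, so the words that distinguish the request rank first.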
Further,
a connection between the server and the client is established based on the WebSocket protocol; the generated instruction information is transmitted to the front-end script, which calls the relevant method to simulate a click operation and complete the page and function switch.
The server transmits an instruction to the client:
Using WebSocket, the server invokes the client's listener method and transmits the keywords generated in the second step to the client. The client calls different JavaScript methods according to the parameters received, replacing the user's click action and completing the switch of the displayed page.
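On the server side, the extracted keywords have to be turned into a message the front end can act on. A minimal sketch, assuming a JSON text message and a hypothetical keyword-to-action table; the field names `keyword`, `action`, and `target` and the route entries are invented for illustration and do not come from the patent.

```python
import json

# Hypothetical mapping from recognized keywords to front-end actions.
ROUTES = {
    "risk warning": {"action": "switchPage", "target": "risk-warning"},
    "enterprise": {"action": "switchPage", "target": "enterprise-overview"},
}


def build_instruction(keywords):
    """Return the JSON text for the first keyword with a known route.

    Returns None when no keyword matches, so the caller can skip
    sending anything over the WebSocket connection.
    """
    for keyword in keywords:
        route = ROUTES.get(keyword)
        if route is not None:
            return json.dumps({"keyword": keyword, **route},
                              ensure_ascii=False)
    return None
```

The returned string would then be written to the open WebSocket connection as a text frame; the transport itself is left out here to keep the sketch free of third-party dependencies.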
1) deep learning technology improves the speech-to-text recognition rate;
2) a corpus specific to the Internet supervision field improves the accuracy of text processing;
3) keywords are extracted through natural language processing and semantic analysis;
4) through WebSocket technology, the server transmits instructions to the client, triggering actions and switching functions.
The invention has the following advantages:
The invention improves interactivity between the large visualization screen and the user in the Internet supervision field, further improving the user experience, while making the system more intelligent and enriching its usage scenarios.
Drawings
FIG. 1 is a schematic workflow diagram of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
By adding a voice command control function, the invention can greatly increase user engagement and improve the user experience. It applies real-time speech-to-text to an Internet supervision visual analysis system and applies natural language processing, semantic analysis, and related techniques to the transcribed text, so that keywords are accurately identified and extracted and the corresponding front-end processing method is called, converting the transcribed text into operations on the page. The interaction becomes more intelligent and friendly.
Combining the actual business conditions of Internet + supervision, the standard requirements, and the actual usage scenarios of the system, and complying with the relevant laws and regulations, the invention uses artificial-intelligence natural language processing to provide a new interaction mode in which the user's voice replaces click operations.
The technical scheme is as follows:
1. Converting user voice into text information in real time:
Real-time conversion of spoken audio to text is based on a deep full-sequence convolutional neural network framework. For a complete audio file within 60 seconds, a long-lived connection is established between the application and the voice transcription engine, and the audio stream data is converted to text data in real time. The user can also upload specialized vocabulary from the Internet supervision field to the recognition engine; when these terms appear in the audio stream being transcribed, the engine can recognize them, which improves recognition accuracy for specialized vocabulary, and training can be iterated for continuous optimization. The system supports audio formats such as wav, pcm, and mp3, single-channel audio streams at sampling rates of 8 kHz and 16 kHz, and a sample precision of 16 bits.
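The stated audio constraints (single channel, 8 kHz or 16 kHz, 16-bit samples) can be checked before a file is handed to the transcription engine. The following sketch covers only the wav case, using Python's standard `wave` module; the function names are hypothetical, and pcm and mp3 input would need separate handling that is not shown.

```python
import io
import wave

ALLOWED_RATES = (8000, 16000)  # sampling rates named in the description


def check_wav(stream):
    """Return a list of problems; an empty list means the file conforms."""
    problems = []
    with wave.open(stream, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("expected a single channel")
        if wav.getframerate() not in ALLOWED_RATES:
            problems.append("expected an 8 kHz or 16 kHz sampling rate")
        if wav.getsampwidth() != 2:
            problems.append("expected 16-bit samples")
    return problems


def make_wav(rate, channels=1, width=2, frames=160):
    """Build an in-memory wav file of silence (helper for the example)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * frames * channels * width)
    buf.seek(0)
    return buf
```

Calling `check_wav(make_wav(16000))` yields an empty list, while a 44.1 kHz stereo file is reported with two problems.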
2. Natural language processing and semantic analysis:
After the user's voice input has been converted to text, personalized, integrated, intelligent, and diversified natural language analysis and understanding services are provided on the basis of a corpus specific to the Internet supervision field, using natural language processing, collaborative crowdsourcing, machine learning, neural network, and related technologies. Efficient and accurate analysis of the text is provided with NLP techniques such as word segmentation, part-of-speech tagging, person name recognition, place name recognition, organization name recognition, time expression recognition, syntactic dependency analysis, automatic summarization, text similarity, text classification, sentiment analysis, and keyword extraction. The core algorithms involved include the K-shortest-path word segmentation algorithm, the hidden Markov model (HMM), Dijkstra's shortest-path algorithm, TF-IDF (term frequency-inverse document frequency), TextRank, the Word2Vec (W2V) word vector model, the conditional random field (CRF) model, the neural-network-based FastText text classification model, and others.
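Shortest-path word segmentation treats character offsets as graph nodes and candidate words as edges, then picks the cheapest path from the start to the end of the text. The sketch below uses Dijkstra's algorithm with unit costs, so it finds the segmentation with the fewest words. It is an illustration under assumptions: the lexicon, the penalty for unknown characters, and the use of a single shortest path (rather than the K shortest paths the text mentions) are choices made here, not details from the patent.

```python
import heapq


def segment(text, lexicon):
    """Split `text` into the fewest lexicon words using Dijkstra.

    Nodes are character offsets 0..len(text). An edge i -> j exists when
    text[i:j] is in the lexicon (cost 1) or when j == i + 1 (cost 2, a
    fallback for characters the lexicon does not cover).
    """
    n = len(text)
    max_len = max((len(w) for w in lexicon), default=1)
    dist = [float("inf")] * (n + 1)
    prev = [None] * (n + 1)
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale queue entry
        if i == n:
            break  # reached the end of the text
        for j in range(i + 1, min(n, i + max_len) + 1):
            piece = text[i:j]
            if piece in lexicon:
                cost = 1.0
            elif j == i + 1:
                cost = 2.0
            else:
                continue
            if d + cost < dist[j]:
                dist[j] = d + cost
                prev[j] = i
                heapq.heappush(heap, (d + cost, j))
    # Walk the predecessor links back from the end to recover the words.
    words = []
    j = n
    while j > 0:
        i = prev[j]
        words.append(text[i:j])
        j = i
    return words[::-1]


# Hypothetical lexicon; unspaced Latin text stands in for Chinese input.
LEXICON = {"risk", "warning", "map", "war", "ring"}
result = segment("riskwarningmap", LEXICON)
```

Replacing the unit costs with negative log word probabilities would turn the same search into a most-probable segmentation, which is one common way such a graph is weighted.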
3. The server transmits an instruction to the client:
Using WebSocket, the server invokes the client's listener method and transmits the keywords generated in the second step to the client. The client calls different JavaScript methods according to the parameters received, replacing the user's click action and completing the switch of the displayed page.
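The client-side behavior described here amounts to a dispatch table: each incoming message names an action, and the listener calls the handler registered for it. In the patent the handlers are JavaScript methods in the page; the sketch below models the same dispatch in Python only to keep the examples in one language. The JSON layout and the `action` and `target` field names are assumptions.

```python
import json


def make_dispatcher(handlers):
    """Build a message listener from a mapping of action name to handler.

    The returned function parses one JSON text message, looks up the
    handler for its "action" field (a hypothetical field name), and
    calls it with the "target" field. It returns True when a handler
    ran and False when the action is unknown.
    """
    def on_message(raw):
        message = json.loads(raw)
        handler = handlers.get(message.get("action"))
        if handler is None:
            return False
        handler(message.get("target"))
        return True

    return on_message


# Usage: the handler records the page it was asked to show, standing in
# for a page control method that simulates the click.
shown = []
on_message = make_dispatcher({"switchPage": shown.append})
handled = on_message('{"action": "switchPage", "target": "risk-warning"}')
```

Unknown actions are ignored rather than raising, so a new server-side instruction does not break an older page.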
The above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. An artificial intelligent voice large screen command method based on the Internet supervision field, characterized in that
the method comprises recognizing user speech and converting it to text information in real time, extracting keywords through a natural language processing algorithm to obtain the main information the user wants to express, processing the main information through a back-end algorithm, transmitting it to the front end through the WebSocket protocol, and calling a page control method to simulate a user click operation and display the information the user wants, so that voice replaces the user's click operation;
converting the user voice into text information in real time:
the real-time conversion of spoken audio to text is based on a deep full-sequence convolutional neural network framework, and, for a complete audio file within 60 seconds, audio stream data is converted to text data in real time by establishing a long-lived connection between the application and a voice transcription engine;
semantic analysis and word segmentation are performed on the text information to obtain entry (term) vectors, and a deep learning algorithm learns iteratively using a keyword corpus from the Internet supervision field as the training data;
vocabulary from the Internet supervision field is uploaded to the recognition engine, and when the vocabulary appears in the audio stream being transcribed, the engine can recognize it, so that recognition accuracy for specialized vocabulary is improved and iterative training and continuous optimization can be realized;
natural language processing and semantic analysis:
after the user's voice input has been converted to text information, natural language analysis and understanding services are provided on the basis of a corpus from the Internet supervision field, using natural language processing, collaborative crowdsourcing, machine learning, and neural network technologies, and the text information is analyzed with NLP techniques including word segmentation, part-of-speech tagging, person name recognition, place name recognition, organization name recognition, time expression recognition, syntactic dependency analysis, automatic summarization, text similarity, text classification, sentiment analysis, and keyword extraction.
2. The method of claim 1,
the core algorithms involved include the K-shortest-path word segmentation algorithm, the hidden Markov model (HMM), Dijkstra's shortest-path algorithm, TF-IDF (term frequency-inverse document frequency), TextRank, the Word2Vec (W2V) word vector model, the conditional random field (CRF) model, and the neural-network-based FastText text classification model.
3. The method of claim 1,
a connection between the server and the client is established based on the WebSocket protocol, the generated instruction information is transmitted to the front-end script, and the relevant method is called to simulate a click operation and complete the page and function switch.
4. The method of claim 3,
the server transmits an instruction to the client:
using WebSocket technology, the server invokes the client's listener method and transmits the keywords generated in the second step to the client, and the client calls different JavaScript methods according to the parameters received, so as to replace the user's click action and complete the switch of the displayed page.
CN202011396329.6A 2020-12-03 2020-12-03 Artificial intelligent voice large screen command method based on Internet supervision field Active CN112506405B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011396329.6A CN112506405B (en) 2020-12-03 2020-12-03 Artificial intelligent voice large screen command method based on Internet supervision field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011396329.6A CN112506405B (en) 2020-12-03 2020-12-03 Artificial intelligent voice large screen command method based on Internet supervision field

Publications (2)

Publication Number Publication Date
CN112506405A CN112506405A (en) 2021-03-16
CN112506405B CN112506405B (en) 2022-05-31

Family

ID=74969518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011396329.6A Active CN112506405B (en) 2020-12-03 2020-12-03 Artificial intelligent voice large screen command method based on Internet supervision field

Country Status (1)

Country Link
CN (1) CN112506405B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453637B (en) * 2023-03-20 2023-11-07 杭州市卫生健康事业发展中心 Health data management method and system based on regional big data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629246B (en) * 2012-02-10 2017-06-27 百纳(武汉)信息技术有限公司 Recognize the server and browser voice command identification method of browser voice command
KR102238535B1 (en) * 2014-10-01 2021-04-09 엘지전자 주식회사 Mobile terminal and method for controlling the same
CN108986797B (en) * 2018-08-06 2021-07-06 中国科学技术大学 Voice theme recognition method and system
CN109947993B (en) * 2019-03-14 2022-10-21 阿波罗智联(北京)科技有限公司 Plot skipping method and device based on voice recognition and computer equipment
CN111597308A (en) * 2020-05-19 2020-08-28 中国电子科技集团公司第二十八研究所 Knowledge graph-based voice question-answering system and application method thereof

Also Published As

Publication number Publication date
CN112506405A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant