CN112506405B - Artificial intelligent voice large screen command method based on Internet supervision field - Google Patents
- Publication number
- CN112506405B (application number CN202011396329.6A)
- Authority
- CN
- China
- Prior art keywords
- text
- algorithm
- user
- analysis
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A30/00—Adapting or protecting infrastructure or their operation
- Y02A30/60—Planning or developing urban green infrastructure
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Evolutionary Computation (AREA)
- Signal Processing (AREA)
- Probability & Statistics with Applications (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention provides an artificial intelligence voice command method for large screens in the field of Internet supervision, which belongs to the field of Internet Plus supervision.
Description
Technical Field
The invention relates to the technical field of Internet Plus supervision, and in particular to an artificial intelligence voice command method for large screens in the field of Internet supervision.
Background
As the system operates, it accumulates a large amount of service data. By building a data warehouse model, this data can be analyzed in depth from multiple angles, so that hidden information is fully mined and a scientific basis is provided for leadership decision-making. By aggregating supervision data of all kinds and relying on intelligent, adaptable data analysis models and tools, supervision data and supervision results are classified along multiple dimensions such as supervision industry, supervision area and supervision field. Big data analysis algorithms such as regression analysis, cluster analysis, heat analysis and classification analysis are used to analyze and display supervision risks from the perspectives of supervision objects, supervision matters and supervision behaviors, producing multi-dimensional statistical analyses of supervision work by region, department and so on. With big data, governments can manage cities more precisely, benefit the public, and achieve standardized supervision, precise supervision, joint supervision and supervision of supervision; the data can also be analyzed to give early risk warnings and drive business decisions.
Existing systems focus only on the multi-dimensional analysis and processing of data and on the display effect of the visualization system, and often neglect interaction with the user: they emphasize display and underweight interaction.
Disclosure of Invention
To solve the above technical problem, the invention provides an artificial intelligence voice command method for large screens in the field of Internet supervision.
By applying real-time speech-to-text, natural language processing and semantic analysis technologies to the visual large-screen analysis method used in Internet supervision, the invention aims to improve the traditional mode of interaction between the user and the system, improve the user experience, add a more intelligent input processing mode, and make the system more user-friendly.
The technical scheme of the invention is as follows:
An artificial intelligence voice large-screen command method based on the field of Internet supervision:
The user's speech is recognized and converted into text in real time; keywords are extracted by a natural language processing algorithm to obtain the main information the user wants to express; this information is processed by a back-end algorithm and transmitted to the front end over the WebSocket protocol; and a page control method is called to simulate the user's click operation and display the information the user wants, so that voice replaces the user's click operations.
Further,
converting user speech into text in real time:
The real-time conversion of speech audio into text is based on a deep full-sequence convolutional neural network framework. By establishing a long-lived connection between the application and the speech transcription engine, audio stream data is converted into text data in real time, for complete audio files of up to 60 seconds.
Semantic analysis and word segmentation are performed on the text to obtain term vectors, and a deep learning algorithm learns iteratively using a keyword corpus from the Internet supervision field as training data.
Vocabulary specific to the Internet supervision field is uploaded to the recognition engine. When these terms appear in the audio stream being transcribed, the engine can recognize them, which improves recognition accuracy for specialized vocabulary and allows iterative training and continuous optimization.
Further,
natural language processing and semantic analysis:
After the user's voice input has been converted into text, natural language analysis and understanding services are provided on the basis of corpora from the Internet supervision field, using natural language processing, collaborative crowdsourcing, machine learning and neural network technologies. The text is analyzed with NLP techniques including word segmentation, part-of-speech tagging, person name recognition, place name recognition, organization name recognition, time expression recognition, syntactic dependency analysis, automatic summarization, text similarity, text classification, sentiment analysis and keyword extraction.
The core algorithms involved include a K-shortest-path (K-short) word segmentation algorithm, the hidden Markov model (HMM), Dijkstra's shortest-path algorithm, TF-IDF (term frequency-inverse document frequency), the TextRank algorithm, the Word2Vec (W2V) word vector model, the conditional random field (CRF) model, and the neural-network-based FastText text classification model.
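As an illustration of one of the listed algorithms, the following is a minimal sketch of TF-IDF keyword ranking using only the Python standard library. The function name `tf_idf_keywords`, the smoothed IDF formula, and the tiny English corpus are assumptions made for this example; the patent does not specify how TF-IDF is configured or combined with the other algorithms.

```python
import math
from collections import Counter


def tf_idf_keywords(doc_tokens, corpus_tokens, top_k=3):
    """Rank the tokens of one document by TF-IDF against a small corpus."""
    n_docs = len(corpus_tokens)
    # Document frequency: number of corpus documents containing each term.
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))
    tf = Counter(doc_tokens)
    scores = {}
    for term, count in tf.items():
        # Smoothed IDF (an assumption) so unseen terms never divide by zero.
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
        scores[term] = (count / len(doc_tokens)) * idf
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Hypothetical, already-segmented voice commands standing in for a domain corpus.
corpus = [
    ["show", "the", "risk", "warning", "page"],
    ["show", "the", "supervision", "statistics", "page"],
    ["open", "the", "department", "ranking", "page"],
]
command = ["show", "the", "risk", "warning", "page"]
print(tf_idf_keywords(command, corpus, top_k=2))
```

Terms that occur in every command ("the", "page") receive the lowest weight, so the terms that distinguish this command from the others rank first.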
Further,
a connection between the server and the client is established over the WebSocket protocol, the generated instruction information is transmitted to the front-end script, and the relevant method is called to simulate a click operation and complete the switching of pages and functions.
The server transmits an instruction to the client:
Using WebSocket technology, the server invokes the client's listener method and sends the keywords generated in the second step to the client; the client calls different JavaScript methods according to the parameters received, thereby replacing the user's click action and switching the displayed page.
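The server side of this step can be sketched as building a small JSON text message from the extracted keywords, which would then be sent as a WebSocket text frame. The sketch below is written under stated assumptions: the table `KEYWORD_ACTIONS`, the field names `keyword`, `action` and `target`, and the function `build_instruction` are all hypothetical, and the actual WebSocket transport is omitted because the patent does not name a library or message format.

```python
import json

# Hypothetical mapping from recognized keywords to front-end actions.
KEYWORD_ACTIONS = {
    "risk warning": {"action": "switch_page", "target": "risk-warning"},
    "supervision statistics": {"action": "switch_page", "target": "statistics"},
}


def build_instruction(keywords):
    """Return a JSON message for the first keyword with a known action, else None."""
    for keyword in keywords:
        action = KEYWORD_ACTIONS.get(keyword.lower())
        if action is not None:
            # sort_keys keeps the serialized message stable across runs.
            return json.dumps({"keyword": keyword, **action}, sort_keys=True)
    return None


print(build_instruction(["hello", "Risk Warning"]))
```

Returning `None` for unrecognized keywords lets the caller decide whether to ignore the utterance or ask the user to repeat it.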
1) Deep learning technology improves the speech-to-text recognition rate;
2) a corpus specific to the Internet supervision field improves the accuracy of text processing;
3) keywords are extracted through natural language processing and semantic analysis;
4) through WebSocket technology, the server transmits instructions to the client, triggering actions and switching functions.
The invention has the following advantages:
The invention improves the interactivity between the visual large screen and the user in the Internet supervision field, which further improves the user experience; it also makes the system more intelligent and enriches its usage scenarios.
Drawings
FIG. 1 is a schematic workflow diagram of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described below with reference to the drawings. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.
By adding a voice command control function, the invention can greatly increase user engagement and improve the user experience. It applies real-time speech-to-text to an Internet supervision visual analysis system and applies natural language processing, semantic analysis and related technologies to the transcribed text, so that keywords are accurately identified and extracted and the corresponding front-end handling method is called, converting the transcribed text into operations on the page. The interaction becomes more intelligent and friendly.
Taking into account the actual Internet Plus supervision business, the applicable standards and the system's real usage scenarios, and in compliance with relevant laws and regulations, the invention provides a new mode of interaction, based on artificial intelligence natural language processing technology, in which the user's voice replaces click operations.
The technical scheme is as follows:
1. Converting user speech into text in real time:
The real-time conversion of speech audio into text is based on a deep full-sequence convolutional neural network framework. By establishing a long-lived connection between the application and the speech transcription engine, audio stream data is converted into text data in real time, for complete audio files of up to 60 seconds. In addition, the user can upload vocabulary specific to the Internet supervision field to the recognition engine; when these terms appear in the audio stream being transcribed, the engine can recognize them, which improves recognition accuracy for specialized vocabulary and allows iterative training and continuous optimization. The system supports audio formats such as wav, pcm and mp3, single-channel audio streams with sampling rates of 8 kHz and 16 kHz, and a sample precision of 16 bits.
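The audio constraints listed above can be checked before a file is sent for transcription. The following is a minimal sketch using Python's standard `wave` module; it covers WAV input only (pcm and mp3 would need other handling), and the names `check_wav` and `make_silence`, as well as reading "within 60 seconds" as a maximum duration, are assumptions made for this example.

```python
import io
import wave

ALLOWED_RATES = {8000, 16000}  # 8 kHz and 16 kHz sampling rates
MAX_SECONDS = 60               # assumed reading of "within 60 seconds"


def check_wav(fileobj):
    """Return a list of problems; an empty list means the WAV data fits the limits."""
    problems = []
    with wave.open(fileobj, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("audio must be single-channel")
        if wav.getframerate() not in ALLOWED_RATES:
            problems.append("sampling rate must be 8 kHz or 16 kHz")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bits
            problems.append("sample precision must be 16 bits")
        if wav.getnframes() / wav.getframerate() > MAX_SECONDS:
            problems.append("audio must not exceed 60 seconds")
    return problems


def make_silence(rate, channels=1, seconds=1):
    """Build an in-memory 16-bit WAV file of silence, used here as test input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * channels * seconds)
    buf.seek(0)
    return buf


print(check_wav(make_silence(16000)))              # conforming input
print(check_wav(make_silence(44100, channels=2)))  # stereo, unsupported rate
```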
2. Natural language processing and semantic analysis:
After the user's voice input has been converted into text, personalized, integrated, intelligent and diversified natural language analysis and understanding services are provided on the basis of corpora specific to the Internet supervision field, using natural language processing, collaborative crowdsourcing, machine learning, neural network and related technologies. Efficient and accurate analysis of the text is provided by NLP techniques including word segmentation, part-of-speech tagging, person name recognition, place name recognition, organization name recognition, time expression recognition, syntactic dependency analysis, automatic summarization, text similarity, text classification, sentiment analysis and keyword extraction. The core algorithms involved include a K-shortest-path (K-short) word segmentation algorithm, the hidden Markov model (HMM), Dijkstra's shortest-path algorithm, TF-IDF (term frequency-inverse document frequency), the TextRank algorithm, the Word2Vec (W2V) word vector model, the conditional random field (CRF) model, the neural-network-based FastText text classification model, and others.
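To show how a shortest-path algorithm can drive word segmentation, the sketch below treats each character boundary of an unspaced string as a graph node, adds an edge for every dictionary word, and runs Dijkstra's algorithm to find the cheapest path from start to end. It is a simplified illustration, not the patent's implementation: it finds only the single best path (a K-shortest-path segmenter would keep several candidates), and the function `segment`, the frequency table, the edge cost of negative log frequency, and the fixed cost for unknown characters are all assumptions.

```python
import heapq
import math


def segment(text, word_freq, unknown_cost=20.0, max_len=10):
    """Split text into words along the cheapest path through the word lattice."""
    total = sum(word_freq.values())
    n = len(text)
    dist = [math.inf] * (n + 1)   # best known cost to reach each boundary
    prev = [None] * (n + 1)       # previous boundary on the best path
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue              # stale queue entry
        if i == n:
            break                 # end of text reached with minimal cost
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = text[i:j]
            if word in word_freq:
                cost = -math.log(word_freq[word] / total)  # always >= 0
            elif j == i + 1:
                cost = unknown_cost  # single-character fallback keeps the graph connected
            else:
                continue
            if d + cost < dist[j]:
                dist[j] = d + cost
                prev[j] = i
                heapq.heappush(heap, (d + cost, j))
    # Walk back from the end to recover the words on the best path.
    words = []
    j = n
    while j > 0:
        i = prev[j]
        words.append(text[i:j])
        j = i
    return words[::-1]


# Hypothetical word frequencies; a real system would use a domain dictionary.
freq = {"risk": 50, "warning": 40, "war": 30, "page": 60}
print(segment("riskwarningpage", freq))
```

Because every edge cost is non-negative, Dijkstra's algorithm is applicable; frequent words are cheap, so the path through "warning" beats the path through "war" followed by unknown characters.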
3. The server transmits an instruction to the client:
Using WebSocket technology, the server invokes the client's listener method and sends the keywords generated in the second step to the client; the client calls different JavaScript methods according to the parameters received, thereby replacing the user's click action and switching the displayed page.
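The client-side behavior described here, calling a different method depending on the received parameter, is essentially a dispatch table. The sketch below models it in Python for consistency with the other examples, although the patent's client runs JavaScript in the browser. The class `BigScreen`, its methods, the `keyword` message field and the function `handle_message` are hypothetical stand-ins for the real page-control methods.

```python
import json


class BigScreen:
    """Hypothetical stand-in for the page; records which view is shown."""

    def __init__(self):
        self.current_view = "home"

    def show_risk_warning(self):
        self.current_view = "risk-warning"

    def show_statistics(self):
        self.current_view = "statistics"


def handle_message(screen, raw_message):
    """Dispatch one received message to the matching page method.

    Returns True if a handler ran, False if the keyword was not recognized.
    """
    handlers = {
        "risk warning": screen.show_risk_warning,
        "supervision statistics": screen.show_statistics,
    }
    keyword = json.loads(raw_message).get("keyword", "").lower()
    handler = handlers.get(keyword)
    if handler is None:
        return False
    handler()  # plays the role of the simulated click
    return True


screen = BigScreen()
handle_message(screen, '{"keyword": "Risk Warning"}')
print(screen.current_view)
```

Keeping the keyword-to-method mapping in one table means a new voice command only requires a new entry, not a change to the message-handling logic.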
The above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (4)
1. An artificial intelligent voice large screen command method based on the internet supervision field is characterized in that,
the method comprises: recognizing the user's speech and converting it into text in real time; extracting keywords through a natural language processing algorithm to obtain the main information the user wants to express; processing the main information through a back-end algorithm and transmitting it to the front end over the WebSocket protocol; and calling a page control method to simulate the user's click operation and display the information the user wants, so that voice replaces the user's click operations;
converting the user voice into text information in real time:
the real-time conversion of the audio language into the text is based on a deep full-sequence convolution neural network framework, and audio stream data is converted into text data in real time by establishing long connection between application and a voice transcription engine for a complete audio file within 60 seconds;
performing semantic analysis and word segmentation on text information to obtain entry vectors, and performing iterative learning by using a keyword corpus in the Internet supervision field as a training object through a deep learning algorithm;
the vocabulary in the internet supervision field is uploaded to a recognition engine, and when the vocabulary appears in the audio stream to be transcribed, the vocabulary can be recognized by the engine, so that the recognition accuracy of the professional vocabulary is improved, and iterative training and continuous optimization can be realized;
natural language processing and semantic analysis:
after the voice input of a user is converted into text information through processing, natural language analysis and understanding services are provided according to linguistic data in the internet supervision field and by means of a natural language processing technology, a collaborative crowdsourcing technology, a machine learning technology and a neural network technology, and analysis services are provided for the text information by means of processing technologies of word segmentation part of speech tagging, person name recognition, place name recognition, organization name recognition, time noun recognition, syntax dependence analysis, automatic summarization, text similarity, text classification, emotion analysis and keyword extraction of NLP.
2. The method of claim 1,
the core algorithms involved comprise a K-shortest-path (K-short) word segmentation algorithm, the hidden Markov model (HMM), Dijkstra's shortest-path algorithm, TF-IDF (term frequency-inverse document frequency), the TextRank algorithm, the Word2Vec (W2V) word vector model, the conditional random field (CRF) model, and the neural-network-based FastText text classification model.
3. The method of claim 1,
establishing connection between a server and a client based on a WebSocket protocol, transmitting generated instruction information to a front-end script, calling a related method to realize simulated click operation, and completing page and function switching.
4. The method of claim 3,
the server transmits an instruction to the client:
using WebSocket technology, the server invokes the client's listener method and sends the keywords generated in the second step to the client, and the client calls different JavaScript methods according to the parameters received, so as to replace the user's click action and switch the displayed page.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011396329.6A CN112506405B (en) | 2020-12-03 | 2020-12-03 | Artificial intelligent voice large screen command method based on Internet supervision field |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011396329.6A CN112506405B (en) | 2020-12-03 | 2020-12-03 | Artificial intelligent voice large screen command method based on Internet supervision field |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112506405A (en) | 2021-03-16 |
CN112506405B (en) | 2022-05-31 |
Family
ID=74969518
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011396329.6A Active CN112506405B (en) | 2020-12-03 | 2020-12-03 | Artificial intelligent voice large screen command method based on Internet supervision field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112506405B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116453637B (en) * | 2023-03-20 | 2023-11-07 | 杭州市卫生健康事业发展中心 | Health data management method and system based on regional big data |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629246B (en) * | 2012-02-10 | 2017-06-27 | 百纳(武汉)信息技术有限公司 | Recognize the server and browser voice command identification method of browser voice command |
KR102238535B1 (en) * | 2014-10-01 | 2021-04-09 | 엘지전자 주식회사 | Mobile terminal and method for controlling the same |
CN108986797B (en) * | 2018-08-06 | 2021-07-06 | 中国科学技术大学 | Voice theme recognition method and system |
CN109947993B (en) * | 2019-03-14 | 2022-10-21 | 阿波罗智联(北京)科技有限公司 | Plot skipping method and device based on voice recognition and computer equipment |
CN111597308A (en) * | 2020-05-19 | 2020-08-28 | 中国电子科技集团公司第二十八研究所 | Knowledge graph-based voice question-answering system and application method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN112506405A (en) | 2021-03-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210157990A1 (en) | System and Method for Estimation of Interlocutor Intents and Goals in Turn-Based Electronic Conversational Flow | |
CN106446045B (en) | User portrait construction method and system based on dialogue interaction | |
CN111666380A (en) | Intelligent calling method, device, equipment and medium | |
CN110853649A (en) | Label extraction method, system, device and medium based on intelligent voice technology | |
CN110717018A (en) | Industrial equipment fault maintenance question-answering system based on knowledge graph | |
CN112201228A (en) | Multimode semantic recognition service access method based on artificial intelligence | |
CN111651996A (en) | Abstract generation method and device, electronic equipment and storage medium | |
CN111324727A (en) | User intention recognition method, device, equipment and readable storage medium | |
KR20200130400A (en) | Voice-based search for digital content on the network | |
CN111046148A (en) | Intelligent interaction system and intelligent customer service robot | |
CN114254158A (en) | Video generation method and device, and neural network training method and device | |
CN116431806A (en) | Natural language understanding method and refrigerator | |
CN113505606B (en) | Training information acquisition method and device, electronic equipment and storage medium | |
CN112506405B (en) | Artificial intelligent voice large screen command method based on Internet supervision field | |
CN116450799B (en) | Intelligent dialogue method and equipment applied to traffic management service | |
US20210264812A1 (en) | Language learning system and method | |
CN116881730A (en) | Chat scene matching system, method, equipment and storage medium based on context | |
US11989514B2 (en) | Identifying high effort statements for call center summaries | |
Röpke et al. | Training a Speech-to-Text Model for Dutch on the Corpus Gesproken Nederlands. | |
CN116186258A (en) | Text classification method, equipment and storage medium based on multi-mode knowledge graph | |
KR20230116143A (en) | Counseling Type Classification System | |
CN112836517A (en) | Method for processing mining risk signal based on natural language | |
CN113094471A (en) | Interactive data processing method and device | |
CN110890097A (en) | Voice processing method and device, computer storage medium and electronic equipment | |
CN116049385B (en) | Method, device, equipment and platform for generating information and create industry research report |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||