CN109582965B - Distributed platform construction method and system of semantic analysis engine - Google Patents


Info

Publication number
CN109582965B
Authority
CN
China
Prior art keywords
word
data
training
model
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811456181.3A
Other languages
Chinese (zh)
Other versions
CN109582965A (en)
Inventor
高岚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
Priority to CN201811456181.3A priority Critical patent/CN109582965B/en
Publication of CN109582965A publication Critical patent/CN109582965A/en
Application granted granted Critical
Publication of CN109582965B publication Critical patent/CN109582965B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of big data processing, and in particular to a distributed platform architecture method and system for a semantic analysis engine. The method handles ever-growing data processing volumes and reduces maintenance and update costs. The distributed platform architecture method comprises the following steps: receiving input user statement data; performing offline training on the user statement data; and analyzing the user statement data in real time to obtain a semantic result. The real-time parsing system parses user sentences in real time: it segments each sentence and extracts its intention and lexical features in order to understand what the user actually means. The offline system trains the word segmentation and intention extraction models that the real-time system needs.

Description

Distributed platform construction method and system of semantic analysis engine
Technical Field
The invention relates to the technical field of big data processing, in particular to a distributed platform architecture method and a distributed platform architecture system of a semantic analysis engine.
Background
The rapid development of artificial intelligence (AI) is making everyday devices such as televisions, mobile phones, and speakers increasingly intelligent. Voice interaction is an important capability of such devices, and semantic analysis is the key technology within it that lets a machine understand human language. Every intelligent device with a voice interaction function therefore needs semantic analysis. When such a product is deployed to a production environment, an important consideration is the volume of data it will have to analyze. When the volume is small, an offline, standalone semantic analysis engine can be installed directly on the device; when the volume is large, a big data platform is needed to ensure a good user experience.
A standalone platform is inconvenient to maintain and update; this can be addressed by collecting users' voice data centrally for analysis, processing, and feedback. As the amount of user data grows sharply, however, a big data platform becomes necessary. Distributed big data processing architectures are now widely applied in many fields because they can process huge amounts of data at much higher speed. It is therefore necessary to apply a distributed big data processing architecture to voice interaction technology as well.
Disclosure of Invention
The invention aims to provide a distributed platform architecture method and system for a semantic analysis engine that use a distributed big data processing architecture to handle larger data volumes and to reduce maintenance and update costs to a certain extent.
In a first aspect, the invention discloses a distributed platform architecture method for a semantic analysis engine, comprising the following steps:
receiving input user statement data;
performing off-line training on the user statement data; and
analyzing the user statement data in real time to obtain a semantic result;
preferably, the process of training the user sentence data offline includes:
storing the input user statement data on a distributed system to generate training data, converting the training data into a distributed data set so that it is partitioned, training on the partitioned data in a word segmentation format to obtain a CRF word segmentation model, and training on the part of speech of each word to obtain a CRF part-of-speech model;
segmenting all the training data into words and computing a d-dimensional vector for each word of the segmented data by an unsupervised method to obtain word vectors, thereby generating a word vector model;
building a bidirectional encoder-decoder model based on a neural network and feeding the word vectors into it to learn the intention of input sentences, while segmenting all the training data, converting the segmented user data into a resilient distributed dataset, and feeding it to the bidirectional encoder-decoder model to check whether the intention is accurate, thereby training an intention extraction model; and
providing a near-synonym and/or a label for each word of the word vectors by querying a standard dictionary, to generate a labeled near-synonym network.
Preferably, the process of analyzing the user statement data in real time includes:
calling the pre-trained CRF word segmentation model to segment the user statement data into a plurality of words, and then calling the pre-trained CRF part-of-speech model to label each resulting word with its part of speech;
looking up every part-of-speech-labeled word in the labeled near-synonym network obtained by pre-training, each word being combined with its labeled part of speech to find all labels associated with that word;
meanwhile, calling the pre-trained intention extraction model to analyze all the part-of-speech-labeled words and obtain the possible intention information of the current user sentence; and
combining the intention information with the label information to obtain the final semantic result.
The real-time parsing system parses user sentences in real time: it segments each sentence and extracts its intention and lexical features in order to understand what the user actually means. The offline system trains the word segmentation and intention extraction models that the real-time system needs. Together, they handle ever-growing data processing volumes and reduce maintenance and update costs.
The second aspect of the present invention discloses a distributed platform architecture system of a semantic analysis engine, comprising:
the offline training system is configured to receive input user statement data and perform offline training on the user statement data; and
the real-time analysis system is configured to receive input user statement data and analyze the user statement data in real time to obtain a semantic result.
Preferably, the offline training system is configured to:
store the input user statement data on a distributed system to generate training data, convert the training data into a distributed data set so that it is partitioned, train on the partitioned data in a word segmentation format to obtain a CRF word segmentation model, and train on the part of speech of each word to obtain a CRF part-of-speech model;
segment all the training data into words and compute a d-dimensional vector for each word of the segmented data by an unsupervised method to obtain word vectors, thereby generating a word vector model;
build a bidirectional encoder-decoder model based on a neural network and feed the word vectors into it to learn the intention of input sentences, while segmenting all the training data, converting the segmented user data into a resilient distributed dataset, and feeding it to the bidirectional encoder-decoder model to check whether the intention is accurate, thereby training an intention extraction model; and
provide a near-synonym and/or a label for each word of the word vectors by querying a standard dictionary, to generate a labeled near-synonym network.
Preferably, the real-time parsing system is configured to:
call the pre-trained CRF word segmentation model to segment the user statement data into a plurality of words, and then call the pre-trained CRF part-of-speech model to label each resulting word with its part of speech;
look up every part-of-speech-labeled word in the labeled near-synonym network obtained by pre-training, each word being combined with its labeled part of speech to find all labels associated with that word;
meanwhile, call the pre-trained intention extraction model to analyze all the part-of-speech-labeled words and obtain the possible intention information of the current user sentence; and
combine the intention information with the label information to obtain the final semantic result.
The invention has the beneficial effects that:
the real-time parsing system parses user sentences in real time: it segments each sentence and extracts its intention and lexical features in order to understand what the user actually means. The offline system trains the word segmentation and intention extraction models that the real-time system needs. Together, they handle ever-growing data processing volumes and reduce maintenance and update costs.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a distributed system of semantic analysis engines according to an embodiment of the invention.
Detailed Description
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples thereof.
The technical solutions of the embodiments of the present invention will be described below with reference to the accompanying drawings.
In a first aspect of the present disclosure, a distributed platform architecture method for a semantic analysis engine is provided. As shown in FIG. 1, the processing flow connected by solid lines in the lower part is the real-time analysis framework of the semantic analysis engine, and the processing flow connected by dotted lines in the upper part is its offline model training and data processing framework. The method comprises the following steps: receiving input user statement data; performing offline training on the user statement data; and analyzing the user statement data in real time to obtain a semantic result.
Wherein the process of performing offline training on the user sentence data comprises:
Using CRF-Spark with the Spark-based L-BFGS algorithm, the input user statement data is stored on HDFS (a distributed file system) to generate training data, and the training data is converted into a distributed data set so that it is partitioned; for example, the textFile function converts it into an RDD that is processed in parallel on the cluster. The partitioned training data is then trained in a word segmentation format to obtain a CRF word segmentation model. The training data format can be customized; the segmentation format is (character, B/I/E/S), where B, I, E, and S mark the beginning, middle, and end of a word and a single-character word, respectively. For example, for the segmented text 人民网 / 1月1日 / 讯:
character  tag
人  B
民  I
网  E
1  B
月  I
1  I
日  E
讯  S
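The (character, B/I/E/S) rows above can be produced mechanically from text that is already segmented. The following is a minimal sketch, not the CRF-Spark implementation: the function names `bies_tags` and `to_training_rows` are hypothetical, and the sample words assume that the example rows correspond to the segmented text 人民网 / 1月1日 / 讯.

```python
def bies_tags(word):
    """Return (character, tag) pairs for one segmented word.

    B = beginning, I = middle, E = end, S = single-character word.
    """
    if len(word) == 1:
        return [(word, "S")]
    middle = [(ch, "I") for ch in word[1:-1]]
    return [(word[0], "B")] + middle + [(word[-1], "E")]


def to_training_rows(words):
    """Flatten a list of segmented words into training rows."""
    rows = []
    for word in words:
        if word:  # skip empty strings defensively
            rows.extend(bies_tags(word))
    return rows


# Assumed sample: the segmented text behind the example rows.
for ch, tag in to_training_rows(["人民网", "1月1日", "讯"]):
    print(ch, tag)
```

Each printed line is one training row; a CRF trainer would consume rows of this shape together with feature templates, which are outside the scope of this sketch.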
During training, the train function is used to train on the data, and once the model is trained, the save function is called to store it at a fixed location. The partitioned training data is likewise trained on the part of speech of each word to obtain a CRF part-of-speech model. The part-of-speech format is (word, part of speech), where the parts of speech include adjective (adj, written a in the example below), noun (n), verb (v), and so on. For example:
word  part of speech
迈向  v
充满  v
希望  n
的  u
新  a
世纪  n
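Rows in the (word, part of speech) format are often derived from an annotated corpus line. The sketch below assumes a `word/tag` token layout separated by spaces, which the text does not specify; the function name `parse_pos_line` and the sample line are hypothetical.

```python
def parse_pos_line(line):
    """Split an annotated line such as '新/a 世纪/n' into (word, tag) rows.

    Assumption: tokens are separated by whitespace and each token ends
    with '/<tag>'. The last slash is used so words containing '/' survive.
    """
    rows = []
    for token in line.split():
        word, sep, tag = token.rpartition("/")
        if not sep or not word or not tag:
            raise ValueError(f"token is not in word/tag form: {token!r}")
        rows.append((word, tag))
    return rows


# Assumed sample line in the word/tag layout.
for word, tag in parse_pos_line("迈向/v 充满/v 希望/n 的/u 新/a 世纪/n"):
    print(word, tag)
```

Rejecting malformed tokens early keeps bad rows out of the training set instead of letting them surface later as mislabeled words.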
All the training data is segmented into words, and a d-dimensional vector is computed for each word of the segmented data by an unsupervised method to obtain word vectors, thereby generating a word vector model. A bidirectional LSTM encoder-decoder model is then built on a Spark-based LSTM neural network, and the word vectors are fed into it to learn the intention of input sentences. At the same time, all the training data is segmented, and the segmented user data is converted into a resilient distributed dataset: for example, the textFile function converts it into RDD data, which is wrapped as DataSet objects to form the final training data, RDD<DataSet>. This data is fed to the bidirectional encoder-decoder model and trained with the train function to check whether the intention is accurate, thereby training the intention extraction model.
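The text does not name the unsupervised method used to compute the d-dimensional vectors. As an illustration only, the sketch below substitutes hashed co-occurrence counts for whatever method the inventors used: each word's vector counts the context words around it, folded into d buckets. The function name `word_vectors` and the parameters `d` and `window` are assumptions.

```python
import zlib
from collections import defaultdict


def word_vectors(sentences, d=8, window=2):
    """Build a d-dimensional count vector per word from segmented sentences.

    Stand-in technique (not from the patent): every context word within
    `window` positions is hashed into one of d buckets and counted.
    zlib.crc32 is used so that bucket assignment is stable across runs.
    """
    vectors = defaultdict(lambda: [0.0] * d)
    for words in sentences:
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    bucket = zlib.crc32(words[j].encode("utf-8")) % d
                    vectors[word][bucket] += 1.0
    return dict(vectors)


# Words that appear in the same contexts receive the same vector.
vecs = word_vectors([["play", "movie"], ["watch", "movie"]])
print(vecs["play"] == vecs["watch"])
```

No labels are needed, which is what makes the approach unsupervised; a production system would more likely use a trained embedding model, but the input (segmented sentences) and output (one fixed-length vector per word) have the same shape.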
A standard dictionary is then queried to provide a near-synonym and/or a label for each word of the word vectors, generating a labeled near-synonym network. For example:
word  label  near-synonym
play  intent:play  play
watch  intent:play  play
check  intent:search  search
search  intent:search  search
come  intent:recommend  recommend
recommend  intent:recommend  recommend
download  intent:download  download
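A table of this kind maps naturally onto a dictionary keyed by word. The sketch below is illustrative only: the name `SYNONYM_NETWORK`, the function `lookup`, and the English keys are assumptions standing in for the original vocabulary.

```python
# Hypothetical in-memory form of the labeled near-synonym network:
# word -> (label, canonical near-synonym).
SYNONYM_NETWORK = {
    "play": ("intent:play", "play"),
    "watch": ("intent:play", "play"),
    "check": ("intent:search", "search"),
    "search": ("intent:search", "search"),
    "come": ("intent:recommend", "recommend"),
    "recommend": ("intent:recommend", "recommend"),
    "download": ("intent:download", "download"),
}


def lookup(word):
    """Return (label, near-synonym) for a word, or None if it is unknown."""
    return SYNONYM_NETWORK.get(word)


print(lookup("watch"))
print(lookup("sing"))
```

Mapping several surface words to one canonical near-synonym lets downstream logic handle a single form per intent instead of every variant a user might say.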
The user statement data is analyzed in real time as follows.
A user statement is posted from the terminal, for example "What is the weather like today".
A service based on the Spring Boot framework parses the user statement.
The pre-trained CRF word segmentation model is called to segment the user statement data into a plurality of words, for example:
"today / weather / how".
The pre-trained CRF part-of-speech model is then called to label each resulting word with its part of speech, for example:
"today: t (time word), weather: n (noun), how: ry (interrogative pronoun)".
Every part-of-speech-labeled word is looked up in the labeled near-synonym network obtained by pre-training; each word is combined with its labeled part of speech to find all labels associated with it, for example:
"today: day-0, weather: weather".
Meanwhile, the pre-trained intention extraction model is called to analyze all the part-of-speech-labeled words and obtain the possible intention information of the current user sentence, for example:
"intention: query, field: weather".
Finally, the intention information and the label information are combined to obtain the final semantic result, for example:
text: What is the weather like today
[Figures BDA0001887781630000051 and BDA0001887781630000061: the remainder of the semantic result appears as images in the original publication.]
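The real-time steps above can be strung together as a small pipeline. The sketch below is a toy under stated assumptions: the trained CRF and intention models are replaced by lookup tables and a single rule, the input is pre-split on spaces, and the layout of the result dictionary is a guess because the original result is only available as an image. All names (`POS_MODEL`, `LABEL_NETWORK`, `parse`, and so on) are hypothetical.

```python
# Stand-ins for the trained models (assumptions, not the real models).
POS_MODEL = {"today": "t", "weather": "n", "how": "ry"}
LABEL_NETWORK = {("today", "t"): "day-0", ("weather", "n"): "weather"}


def segment(text):
    """Stand-in for the CRF word segmentation model: split on spaces."""
    return text.split()


def tag_pos(words):
    """Stand-in for the CRF part-of-speech model ('x' marks unknown)."""
    return [(word, POS_MODEL.get(word, "x")) for word in words]


def find_labels(tagged):
    """Look up each (word, part of speech) pair in the label network."""
    return {w: LABEL_NETWORK[(w, p)] for w, p in tagged if (w, p) in LABEL_NETWORK}


def extract_intent(tagged):
    """Stand-in for the intention extraction model: one hand-written rule."""
    if any(word == "weather" for word, _ in tagged):
        return {"intention": "query", "field": "weather"}
    return {"intention": "unknown", "field": "unknown"}


def parse(text):
    """Combine intention and label information into one semantic result."""
    tagged = tag_pos(segment(text))
    result = {"text": text}
    result.update(extract_intent(tagged))
    result["labels"] = find_labels(tagged)
    return result


print(parse("today weather how"))
```

Label lookup and intention extraction both depend only on the tagged words, so they can run side by side, which matches the "meanwhile" in the description.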
The real-time parsing system parses user sentences in real time: it segments each sentence and extracts its intention and lexical features in order to understand what the user actually means. The offline system trains the word segmentation and intention extraction models that the real-time system needs. Together, they handle ever-growing data processing volumes and reduce maintenance and update costs.
The second aspect of the present invention discloses a distributed platform architecture system of a semantic analysis engine, comprising:
the offline training system is configured to receive input user statement data and perform offline training on the user statement data; and
the real-time analysis system is configured to receive input user statement data and analyze the user statement data in real time to obtain a semantic result.
Preferably, the offline training system is configured to:
store the input user statement data on a distributed system to generate training data, convert the training data into a distributed data set so that it is partitioned, train on the partitioned data in a word segmentation format to obtain a CRF word segmentation model, and train on the part of speech of each word to obtain a CRF part-of-speech model;
segment all the training data into words and compute a d-dimensional vector for each word of the segmented data by an unsupervised method to obtain word vectors, thereby generating a word vector model;
build a bidirectional encoder-decoder model based on a neural network and feed the word vectors into it to learn the intention of input sentences, while segmenting all the training data, converting the segmented user data into a resilient distributed dataset, and feeding it to the bidirectional encoder-decoder model to check whether the intention is accurate, thereby training an intention extraction model; and
provide a near-synonym and/or a label for each word of the word vectors by querying a standard dictionary, to generate a labeled near-synonym network.
Preferably, the real-time parsing system is configured to:
call the pre-trained CRF word segmentation model to segment the user statement data into a plurality of words, and then call the pre-trained CRF part-of-speech model to label each resulting word with its part of speech;
look up every part-of-speech-labeled word in the labeled near-synonym network obtained by pre-training, each word being combined with its labeled part of speech to find all labels associated with that word;
meanwhile, call the pre-trained intention extraction model to analyze all the part-of-speech-labeled words and obtain the possible intention information of the current user sentence; and
combine the intention information with the label information to obtain the final semantic result.
The detailed working example has already been described for the corresponding method and is not repeated here.
Although the present invention has been described herein with reference to the illustrated embodiments thereof, which are intended to be preferred embodiments of the present invention, it is to be understood that the invention is not limited thereto, and that numerous other modifications and embodiments can be devised by those skilled in the art that will fall within the spirit and scope of the principles of this disclosure.

Claims (2)

1. A distributed platform architecture method of a semantic analysis engine is characterized by comprising the following steps:
receiving input user statement data;
performing off-line training on the user statement data; and
analyzing the user statement data in real time to obtain a semantic result;
wherein the process of performing offline training on the user sentence data comprises:
storing the input user statement data on a distributed system to generate training data, converting the training data into a distributed data set so that it is partitioned, training on the partitioned data in a word segmentation format to obtain a CRF word segmentation model, and training on the part of speech of each word to obtain a CRF part-of-speech model;
segmenting all the training data into words and computing a d-dimensional vector for each word of the segmented data by an unsupervised method to obtain word vectors, thereby generating a word vector model;
building a bidirectional encoder-decoder model based on a neural network and feeding the word vectors into it to learn the intention of input sentences, while segmenting all the training data, converting the segmented user data into a resilient distributed dataset, and feeding it to the bidirectional encoder-decoder model to check whether the intention is accurate, thereby training an intention extraction model; and
providing a near-synonym and/or a label for each word of the word vectors by querying a standard dictionary, to generate a labeled near-synonym network;
the process of analyzing the user statement data in real time comprises the following steps:
calling the pre-trained CRF word segmentation model to segment the user statement data into a plurality of words, and then calling the pre-trained CRF part-of-speech model to label each resulting word with its part of speech;
looking up every part-of-speech-labeled word in the labeled near-synonym network obtained by pre-training, each word being combined with its labeled part of speech to find all labels associated with that word;
meanwhile, calling the pre-trained intention extraction model to analyze all the part-of-speech-labeled words and obtain the possible intention information of the current user sentence; and
combining the intention information with the label information to obtain the final semantic result.
2. A distributed platform architecture system for a semantic analysis engine, comprising:
the offline training system is configured to receive input user statement data and perform offline training on the user statement data; and
the real-time analysis system is configured to receive input user statement data and analyze the user statement data in real time to obtain a semantic result;
the offline training system is configured to:
store the input user statement data on a distributed system to generate training data, convert the training data into a distributed data set so that it is partitioned, train on the partitioned data in a word segmentation format to obtain a CRF word segmentation model, and train on the part of speech of each word to obtain a CRF part-of-speech model;
segment all the training data into words and compute a d-dimensional vector for each word of the segmented data by an unsupervised method to obtain word vectors, thereby generating a word vector model;
build a bidirectional encoder-decoder model based on a neural network and feed the word vectors into it to learn the intention of input sentences, while segmenting all the training data, converting the segmented user data into a resilient distributed dataset, and feeding it to the bidirectional encoder-decoder model to check whether the intention is accurate, thereby training an intention extraction model; and
provide a near-synonym and/or a label for each word of the word vectors by querying a standard dictionary, to generate a labeled near-synonym network;
the real-time analysis system is configured to:
call the pre-trained CRF word segmentation model to segment the user statement data into a plurality of words, and then call the pre-trained CRF part-of-speech model to label each resulting word with its part of speech;
look up every part-of-speech-labeled word in the labeled near-synonym network obtained by pre-training, each word being combined with its labeled part of speech to find all labels associated with that word;
meanwhile, call the pre-trained intention extraction model to analyze all the part-of-speech-labeled words and obtain the possible intention information of the current user sentence; and
combine the intention information with the label information to obtain the final semantic result.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811456181.3A CN109582965B (en) 2018-11-30 2018-11-30 Distributed platform construction method and system of semantic analysis engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811456181.3A CN109582965B (en) 2018-11-30 2018-11-30 Distributed platform construction method and system of semantic analysis engine

Publications (2)

Publication Number Publication Date
CN109582965A CN109582965A (en) 2019-04-05
CN109582965B true CN109582965B (en) 2022-03-01

Family

ID=65926589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811456181.3A Active CN109582965B (en) 2018-11-30 2018-11-30 Distributed platform construction method and system of semantic analysis engine

Country Status (1)

Country Link
CN (1) CN109582965B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933039A (en) * 2015-06-04 2015-09-23 中国科学院新疆理化技术研究所 Entity link system for language lacking resources
CN107423288A (en) * 2017-07-05 2017-12-01 达而观信息科技(上海)有限公司 A kind of Chinese automatic word-cut and method based on unsupervised learning
CN107464568A (en) * 2017-09-25 2017-12-12 四川长虹电器股份有限公司 Text-independent speaker recognition method and system based on three-dimensional convolutional neural network
CN107861944A (en) * 2017-10-24 2018-03-30 广东亿迅科技有限公司 A kind of text label extracting method and device based on Word2Vec
CN107894981A (en) * 2017-12-13 2018-04-10 武汉烽火普天信息技术有限公司 A kind of automatic abstracting method of case semantic feature
CN107943860A (en) * 2017-11-08 2018-04-20 北京奇艺世纪科技有限公司 The recognition methods and device that the training method of model, text are intended to
CN108388553A (en) * 2017-12-28 2018-08-10 广州索答信息科技有限公司 Talk with method, electronic equipment and the conversational system towards kitchen of disambiguation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9239828B2 (en) * 2013-12-05 2016-01-19 Microsoft Technology Licensing, Llc Recurrent conditional random fields

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
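Improved Bags-of-Words Algorithm for Scene Recognition; Gang L. et al.; Physics Procedia; 20121231; Vol. 24; 1255-1261 *
Using recurrent neural networks for slot filling in spoken language understanding; Mesnil G. et al.; IEEE/ACM Transactions on Audio, Speech, and Language Processing; 20141225; Vol. 23 (No. 3); 530-539 *
Information extraction based on e-commerce data and user behavior; 甘骏; China Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology; 20170215 (No. 02); I138-4391 *
Text content analysis based on word association relations; 林宇航; China Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology; 20131115 (No. 11); I138-1045 *
Research on Chinese text classification based on semantic similarity; 李晓军; China Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology; 20180415 (No. 04); I138-3522 *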

Also Published As

Publication number Publication date
CN109582965A (en) 2019-04-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant