WO2020192237A1 - Method, apparatus, system and storage medium for artificial-intelligence-based semantic recognition - Google Patents

Method, apparatus, system and storage medium for artificial-intelligence-based semantic recognition

Info

Publication number
WO2020192237A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
sentence
text information
vector
semantic
Prior art date
Application number
PCT/CN2020/070175
Other languages
English (en)
French (fr)
Inventor
高丽丽
李超
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东尚科信息技术有限公司 and 北京京东世纪贸易有限公司
Publication of WO2020192237A1 publication Critical patent/WO2020192237A1/zh

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F40/00 Handling natural language data
                    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
                    • G06F40/258 Heading extraction; Automatic titling; Numbering
                    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
                    • G06F40/30 Semantic analysis
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N3/00 Computing arrangements based on biological models
                    • G06N3/04 Architecture, e.g. interconnection topology
                    • G06N3/045 Combinations of networks
            • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
                • G06Q30/00 Commerce
                    • G06Q30/016 After-sales
                    • G06Q30/0282 Rating or review of business operators or products
                    • G06Q30/0601 Electronic shopping [e-shopping]

Definitions

  • This application relates to the field of computer technology, and in particular to a method, apparatus, system and storage medium for semantic recognition based on artificial intelligence.
  • Semantic recognition of sentences can be applied in a variety of scenarios. One of the most widespread is handling complaint information in e-commerce and determining the corresponding liability information; the following uses this scenario to explain how the semantics of sentences are recognized in practice.
  • Electronic commerce is a business activity centered on commodity exchange and carried out by means of information network technology.
  • E-commerce service providers sell goods over the Internet, so that people can shop online directly and conveniently from home.
  • E-commerce service providers and customers often communicate over the Internet about various service-related matters.
  • Customers often evaluate each aspect of the e-commerce process.
  • In this process, complaint information may also be generated and sent to the e-commerce service provider.
  • After receiving a complaint, the e-commerce service provider analyzes the complaint information and determines the corresponding liability information, thereby completing the e-commerce process.
  • The embodiments of the present application provide a method, apparatus, system and storage medium for semantic recognition based on artificial intelligence, so as to improve the accuracy of semantic recognition of long texts.
  • A semantic recognition method based on artificial intelligence includes: passing each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence; obtaining a text information vector of the long text from the text information of each sentence, where each vector element is the text information of one sentence; inputting the text information vector into a configured second neural network, which outputs a vector representing the logical relationships between the text information of the sentences; and
  • classifying that vector to identify the semantics of the long text.
  • A semantic recognition system based on artificial intelligence comprises a first convolutional neural network unit, a second neural network unit, and a classification unit, wherein:
  • the first convolutional neural network unit is configured to receive each sentence of the long text and input each sentence into the first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence;
  • the second neural network unit is configured to receive the text information of each sentence and obtain the text information vector of the long text from it, where each vector element is the text information of one sentence; the obtained text information vector is input into the second neural network, which outputs a vector representing the logical relationships between the text information of the sentences;
  • the classification unit is configured to classify the vector to recognize the semantics of the long text.
  • An apparatus for semantic recognition includes:
  • a memory; and a processor coupled to the memory, the processor being configured to execute any of the foregoing semantic recognition methods based on instructions stored in the memory.
  • A non-volatile computer-readable storage medium has a computer program stored thereon; when the program is executed by a processor, any of the foregoing semantic recognition methods is carried out.
  • As can be seen from the above, the embodiments of the present application first pass each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence; next, a text information vector of the long text is obtained from the text information of each sentence, where each vector element is the text information of one sentence; the obtained text information vector is input into a configured second neural network, which outputs a vector representing the logical relationships between the text information of the sentences; finally, the vector is classified to recognize the semantics of the long text.
  • Furthermore, the embodiments of the present application can be applied to the e-commerce scenario of handling complaint information and determining the corresponding liability information. Because the embodiments take into account the logical relationships between the text information of the sentences of the long text when determining semantic information, rather than deciding from the text information of each sentence in isolation, the accuracy of semantic recognition for long texts is improved.
  • Fig. 1 is a flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the application;
  • Fig. 2 is another flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the application;
  • Fig. 3 is a further flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the application;
  • Fig. 4 is an overall framework diagram of the method for determining complaint information in an e-commerce system provided by an embodiment of the application;
  • Fig. 5 is a schematic diagram of the process of recognizing the text information of each sentence in an embodiment of the application;
  • Fig. 6 is a schematic diagram of the LSTM calculation logic in an embodiment of the application;
  • Fig. 7 is a block diagram of BiLSTM data processing provided by an embodiment of the application;
  • Figs. 8A-8C are schematic diagrams of the system structure of artificial-intelligence-based semantic recognition provided by an embodiment of the application;
  • Fig. 9 is a schematic structural diagram of an artificial-intelligence-based semantic recognition apparatus provided by an embodiment of the application.
  • At present, e-commerce service providers mainly use a combination of manual work and machine-learning algorithms to handle complaints and determine the corresponding liability information.
  • Specifically, the received complaint information is spliced together to form a long text; a configured machine-learning algorithm is then applied to classify the long text and obtain the corresponding liability information, thereby performing liability determination.
  • To overcome the inaccuracy of that approach, the embodiments of the present application first pass each sentence of the long text through the first convolutional neural network to obtain the text information of each sentence; next, a text information vector of the long text is obtained from the text information of each sentence, where each vector element is the text information of one sentence; the obtained text information vector is input into the second neural network, which outputs a vector representing the logical relationships between the text information of the sentences; finally, the vector is classified to identify the semantics of the long text. Because the embodiments take into account the logical relationships between the text information of the sentences when determining semantic information, rather than deciding from the text information of each sentence in isolation, the accuracy of semantic recognition is improved.
  • Artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • A neural network is a mathematical or computational model, mainly comprising an input layer, hidden layers and an output layer.
  • A convolutional neural network (CNN) is a class of neural networks that includes convolution calculations and has a deep structure.
  • The embodiments of the present application can be applied to the e-commerce scenario of handling complaint information and determining the corresponding liability information.
  • The second neural network may be a bidirectional long short-term memory network (BiLSTM) or a gated recurrent unit (GRU).
  • In the e-commerce scenario of handling complaint information and determining liability, when determining the liability information the embodiment of the application not only uses the first convolutional neural network to determine the text information of each sentence of the complaint information, but also uses the configured BiLSTM or GRU to identify the logical relationships between the text information of the sentences. The resulting liability information thus fully takes into account the logical relationships between the sentences of the complaint, rather than being determined from each sentence in isolation, which improves the accuracy of determining the liability information from the complaint information.
  • Figure 1 is a flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the application; the method can be executed by any computing device with data processing capabilities, for example a terminal device or a server.
  • The terminal device may be an intelligent terminal device such as a personal computer (PC) or laptop, or an intelligent mobile terminal device such as a smartphone or tablet computer.
  • The specific steps are as follows:
  • Step 101: Input each sentence of the long text into the configured first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence.
  • Step 102: Obtain the text information vector of the long text from the text information of each sentence, where each vector element is the text information of one sentence; input the obtained text information vector into the configured second neural network, which outputs a vector representing the logical relationships between the text information of the sentences.
  • Step 103: Classify the vector to identify the semantics of the long text.
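As an illustration only, the three steps above can be sketched in Python with placeholder stand-ins for the two networks and the classifier (the names `encode_sentence`, `sequence_model` and `classify` are hypothetical, not from the patent; real embodiments would use a TextCNN and a BiLSTM or GRU):

```python
def recognize_semantics(sentences, encode_sentence, sequence_model, classify):
    """Sketch of Steps 101-103: encode each sentence, model the
    sequence of sentence features, then classify the result."""
    # Step 101: per-sentence text information via the first network
    features = [encode_sentence(s) for s in sentences]
    # Step 102: one vector element per sentence, passed through the
    # second network to capture inter-sentence logical relationships
    vector = sequence_model(features)
    # Step 103: classify the vector to obtain the semantics
    return classify(vector)

# Toy stand-ins so the sketch runs end to end
encode = lambda s: [float(len(s))]
sequence = lambda feats: [x for f in feats for x in f]
label = lambda v: "long" if sum(v) > 10 else "short"
print(recognize_semantics(["hello there", "ok"], encode, sequence, label))
```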
  • Obtaining the text information of each sentence by passing each sentence of the long text through the configured first convolutional neural network includes:
  • segmenting each sentence of the long text into minimum semantic units, and inputting the minimum semantic units obtained by segmentation, in order, into the configured first convolutional neural network to obtain the text information of each sentence.
  • The minimum semantic unit in each sentence can be a Chinese character or a word, i.e. the smallest unit that carries meaning.
  • The first convolutional neural network includes a plurality of convolution units, and each convolution unit has a convolution kernel of a different size. As shown in Fig. 2, the text information of each sentence described in step 101 is obtained as follows:
  • S11: Combine the minimum semantic units, in their input order, using different configured unit length values to obtain minimum semantic unit combinations of different unit lengths.
  • S12: Input the obtained minimum semantic unit combinations of different unit lengths into different convolution units for processing, and concatenate the text information vectors output by the different convolution units to obtain the text information of each sentence.
  • The number of convolution units is three: a first convolution unit, a second convolution unit, and a third convolution unit.
  • Obtaining the minimum semantic unit combinations of different unit length values in step S11 comprises:
  • S110: Combine every two adjacent minimum semantic units, following the input order, to obtain minimum semantic unit combinations with a unit length of 2.
  • S112: Combine every three adjacent minimum semantic units, following the input order, to obtain minimum semantic unit combinations with a unit length of 3.
  • S114: Combine every four adjacent minimum semantic units, following the input order, to obtain minimum semantic unit combinations with a unit length of 4.
  • Inputting the obtained minimum semantic unit combinations of different unit lengths into different convolution units in step S12 proceeds as follows: the combinations with a unit length of 2 are input into the first convolution unit, the combinations with a unit length of 3 into the second convolution unit, and the combinations with a unit length of 4 into the third convolution unit.
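Steps S110-S114 amount to sliding windows of length 2, 3 and 4 over the ordered minimum semantic units. A minimal sketch (the helper name `ngram_combinations` is ours, not the patent's):

```python
def ngram_combinations(units, n):
    """Combine every n adjacent minimum semantic units,
    following their input order (steps S110/S112/S114)."""
    return [units[i:i + n] for i in range(len(units) - n + 1)]

units = ["w1", "w2", "w3", "w4"]
pairs = ngram_combinations(units, 2)    # unit length 2 -> first convolution unit
triples = ngram_combinations(units, 3)  # unit length 3 -> second convolution unit
quads = ngram_combinations(units, 4)    # unit length 4 -> third convolution unit
print(pairs)
```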
  • The configured second neural network includes two sub-neural networks. As shown in Fig. 3, the process in step 102 of outputting a vector representing the logical relationships between the text information of the sentences is:
  • the first sub-neural network reads and processes the vector elements of the text information vector of the long text from left to right, obtaining a vector representing the left-to-right logical relationships between the text information of the sentences;
  • the second sub-neural network reads and processes the vector elements of the text information vector of the long text from right to left, obtaining a vector representing the right-to-left logical relationships between the text information of the sentences.
  • The second neural network is a bidirectional long short-term memory network (BiLSTM) or a gated recurrent unit (GRU);
  • the first sub-neural network is a BiLSTM or GRU;
  • the second sub-neural network is a BiLSTM or GRU.
  • The classification in step 103 is:
  • the vector is mapped to the matching intent label, and the semantics of the long text are thereby recognized.
  • The classification is performed with a softmax method: the vector is mapped to the matching intent label so as to recognize the semantics.
  • Softmax is used in multi-class classification. It maps multiple output values into the (0, 1) interval, where they can be understood as probabilities, so that multi-class classification is performed and the corresponding semantics obtained.
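The mapping to the (0, 1) interval is the standard softmax function; a minimal sketch:

```python
import math

def softmax(logits):
    """Map raw output scores into the (0, 1) interval so they can be
    read as class probabilities (they sum to 1)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# e.g. scores for three candidate liability labels; the label with the
# largest probability is the predicted semantics
probs = softmax([2.0, 1.0, 0.1])
print(probs, probs.index(max(probs)))
```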
  • The complaint information in Table 1 is in the form of data generated during the e-commerce process.
  • Take the liability labels "broken" and "lost" as examples for analysis.
  • In the "broken" example, the status when the site first found the problem was "may be lost", but after the warehouse investigation the item turned out to be "damaged".
  • In the "lost" example, the status when the site first discovered the problem was "possibly lost"; after the warehouse investigation confirmed that the goods had indeed been transported to the site, the case could be determined as lost. It can be seen that the logical reasoning relationships between sentences are at the core of what the second neural network must capture.
  • Fig. 4 is an overall framework diagram of the method for determining complaint information in an e-commerce system provided by an embodiment of the application, taking a BiLSTM as the second neural network.
  • Complaint information is generally chat data, usually obtained through chat-data recognition. Therefore, when the complaint information is input into the first convolutional neural network, the chat data is input sentence by sentence in the order of the original dialogue, making sure that the order of the sentences matches the order in which the original chat data was generated.
  • Step 1: Prepare the data.
  • The chat data is input into the first convolutional neural network sentence by sentence in the order of the original conversation, ensuring that the order of the sentences matches the order of the original chat.
  • Step 2: Use the first convolutional neural network to process the input chat data and identify the text information of each sentence.
  • FIG. 5 is a schematic diagram of the process of recognizing the text information of each sentence in an embodiment of the application.
  • w1w2w3...wn is the specific input sentence; w1, w2, ..., wn respectively represent the minimum semantic units obtained by segmentation.
  • Three convolution kernels read each minimum semantic unit (for example, a Chinese character or word) of the sentence:
  • the convolution kernel on the left reads every two Chinese characters or words in turn for its convolution calculation;
  • the convolution kernel in the middle reads every three Chinese characters or words in turn for its convolution calculation;
  • the convolution kernel on the right reads every four Chinese characters or words in turn for its convolution calculation.
  • The data output by the convolution kernels, that is, the text vectors, are concatenated;
  • the output result is [Out_1, Out_2, ..., Out_n].
  • Step 3: Use the BiLSTM to learn the logical relationships between the text information of the sentences.
  • The BiLSTM is used to calculate the logical relationships between the text information of the sentences.
  • A recurrent neural network (RNN) is a neural network designed to process sequential data.
  • Gated RNNs have been proposed, and the LSTM is the best-known kind of gated RNN.
  • Leaky units allow an RNN to accumulate long-term dependencies between distant nodes by designing the weight coefficients of the connections; gated RNNs extend this idea by allowing those coefficients to change over time and allowing the network to forget information it has already accumulated.
  • The LSTM is such a gated RNN.
  • The ingenuity of the LSTM is that, by adding an input gate, a forget gate and an output gate, the weight of the self-loop is changed, so that with fixed model parameters the scale of integration can change dynamically over time, thereby avoiding the problem of vanishing or exploding gradients.
  • Fig. 6 is a schematic diagram of LSTM calculation logic in an embodiment of the application.
  • LSTM is used to perform model calculation for each sentence, and the specific calculation logic of LSTM is shown in FIG. 6.
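Fig. 6 is not reproduced here, but the LSTM cell it depicts is conventionally written with input, forget and output gates as follows (standard formulation, not quoted from the patent):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The forget gate f_t is what lets the accumulated cell state c_{t-1} be dynamically discarded, which is the "allowing the network to forget" behaviour described above.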
  • Figure 7 is a block diagram of the BiLSTM processing provided by an embodiment of this application.
  • The BiLSTM applies two LSTMs to the text information of each sentence obtained in Step 2, namely the first sub-neural network (LSTM1) and the second sub-neural network (LSTM2), which read the logical relationships between the text information of the sentences from two directions: LSTM1 reads the text information of the sentences from left to right, and LSTM2 reads it from right to left. Finally, the information from the two LSTM readings is concatenated into one vector, output as [can_1, can_2, ..., can_n].
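Structurally, the two-direction read and the final concatenation can be sketched as follows (the recurrent step here is a toy stand-in, not a real LSTM, and all names are ours):

```python
def bidirectional_read(sentence_vectors, step, dim):
    """Read the sequence left-to-right and right-to-left with the same
    kind of recurrent step, then concatenate the two hidden states at
    each position (the BiLSTM pattern of Fig. 7)."""
    def run(seq):
        h, states = [0.0] * dim, []
        for v in seq:
            h = step(h, v)
            states.append(h)
        return states

    forward = run(sentence_vectors)                   # LSTM1: left to right
    backward = run(list(reversed(sentence_vectors)))  # LSTM2: right to left
    backward.reverse()                                # realign with positions
    return [f + b for f, b in zip(forward, backward)]

# toy recurrent step: blend of accumulated history and current input
toy_step = lambda h, v: [0.5 * a + 0.5 * b for a, b in zip(h, v)]
out = bidirectional_read([[1.0], [2.0], [3.0]], toy_step, dim=1)
print(out)  # one concatenated (forward + backward) vector per sentence
```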
  • Step 4: Generate the liability information.
  • The matrix output in Step 3 is passed through a fully connected layer and then mapped to an intent label via softmax, that is, the corresponding liability classification label; the liability information is thus obtained.
  • FIG. 8A is a schematic structural diagram of a system for semantic recognition based on artificial intelligence provided by an embodiment of the application.
  • the system includes: a first convolutional neural network unit 801, a second neural network unit 802, and a classification unit 803, wherein,
  • The first convolutional neural network unit 801 is configured to receive each sentence of the long text and input each sentence into the first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence;
  • the second neural network unit 802 is configured to receive the text information of each sentence and obtain the text information vector of the long text from it, where each vector element is the text information of one sentence; the obtained text information vector is input into the second neural network, which outputs a vector representing the logical relationships between the text information of the sentences;
  • the classification unit 803 is configured to classify the vector to recognize the semantics of the long text.
  • The first convolutional neural network unit 801 is further configured to segment each sentence into minimum semantic units, and to input the minimum semantic units obtained by segmentation, in order, into the configured first convolutional neural network to obtain the text information of each sentence.
  • The minimum semantic unit in each sentence can be a Chinese character or a word, i.e. the smallest unit that carries meaning.
  • The first convolutional neural network unit 801 further includes a combining unit 8011 and a text information acquisition unit 8012, wherein:
  • the combining unit 8011 is configured to combine the minimum semantic units, input in order, using the different configured unit length values to obtain minimum semantic unit combinations of different unit lengths;
  • the text information acquisition unit 8012 is configured to input the obtained minimum semantic unit combinations of different unit lengths into different convolution kernels for processing, and to concatenate the text information vectors output by the different convolution kernels to obtain the text information of each sentence.
  • The combining unit 8011 is further configured to combine every two, every three and every four adjacent minimum semantic units, following the input order, to obtain minimum semantic unit combinations with unit lengths of 2, 3 and 4 respectively.
  • The text information acquisition unit 8012 is further configured to input the combinations with a unit length of 2 to the first convolution kernel, the combinations with a unit length of 3 to the second convolution kernel, and the combinations with a unit length of 4 to the third convolution kernel for processing.
  • The second neural network unit 802 further includes a first sub-neural network unit 8021, a second sub-neural network unit 8022, and a vector acquisition unit 8023, wherein:
  • the first sub-neural network unit 8021 is configured to read and process the vector elements of the text information vector of the long text from left to right, obtaining the vector representing the left-to-right logical relationships between the text information of the sentences;
  • the second sub-neural network unit 8022 is configured to read and process the vector elements of the text information vector of the long text from right to left, obtaining the vector representing the right-to-left logical relationships between the text information of the sentences;
  • the vector acquisition unit 8023 obtains the vector representing the left-to-right logical relationships and the vector representing the right-to-left logical relationships between the text information of the sentences, and concatenates them to obtain a vector representing the logical relationships between the text information of the sentences.
  • the classification unit 803 is further configured to use a set classification method to map the vector to a matching intent tag to recognize the semantics of the long text.
  • The electronic device 900 includes a memory 906, a processor 902, a communication module 904, a user interface 910, and a communication bus 908 interconnecting these components.
  • The memory 906 may be high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random-access solid-state storage; or non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the user interface 910 may include one or more output devices 912, and one or more input devices 914.
  • the memory 906 stores a set of instructions executable by the processor 902, including programs for implementing the processing procedures in the foregoing embodiments, and the processor 902 implements the steps of the semantic recognition method when the program is executed.
  • the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the semantic recognition method described in any one of the above is implemented.
  • The embodiment of the application uses the first convolutional neural network, specifically a TextCNN, to identify the text information of each sentence of the complaint information, and uses a BiLSTM to identify the logical relationships between sentences; this effectively captures the logical reasoning relationships between sentences and thus yields more accurate semantic recognition.


Abstract

A method, system, apparatus and storage medium for semantic recognition based on artificial intelligence. The method first passes each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence (101); next, a text information vector of the long text is obtained from the text information of each sentence, where each vector element is the text information of one sentence; the text information vector is input into a configured second neural network, which outputs a vector representing the logical relationships between the text information of the sentences (102); finally, the vector is classified to identify the semantics of the long text (103).

Description

Method, apparatus, system and storage medium for artificial-intelligence-based semantic recognition
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on March 22, 2019, with application number 201910222540.7 and the title "A method and system for semantic recognition", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a method, apparatus, system and storage medium for semantic recognition based on artificial intelligence.
Background
With the development of artificial intelligence technology, semantic recognition of sentences has gradually developed and can be applied in a variety of scenarios. The most widespread application is handling complaint information in e-commerce and determining the corresponding liability information; the following uses this scenario to explain how the semantics of sentences are recognized in practice.
With the development of computer and Internet technology, electronic commerce has gradually developed. Electronic commerce is a business activity centered on commodity exchange and carried out by means of information network technology. As e-commerce has developed, e-commerce service providers offer services for selling goods over the Internet, so that people can shop online directly and conveniently from home. While e-commerce provides its various services, the e-commerce service provider and the customer often communicate over the Internet about various service-related matters. In this process, customers often evaluate each aspect of the e-commerce process, and complaint information may also be generated and sent to the e-commerce service provider. After receiving a complaint, the e-commerce service provider analyzes the complaint information and determines the corresponding liability information, thereby completing the e-commerce process.
Summary
The embodiments of the present application provide a method, apparatus, system and storage medium for semantic recognition based on artificial intelligence, so as to improve the accuracy of semantic recognition of long texts.
The embodiments of the present application are realized as follows.
A semantic recognition method based on artificial intelligence includes:
passing each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence;
obtaining a text information vector of the long text from the text information of each sentence, where each vector element is the text information of one sentence;
inputting the obtained text information vector into a configured second neural network, which outputs a vector representing the logical relationships between the text information of the sentences;
classifying the vector to identify the semantics of the long text.
A semantic recognition system based on artificial intelligence includes a first convolutional neural network unit, a second neural network unit, and a classification unit, wherein:
the first convolutional neural network unit is configured to receive each sentence of the long text and input each sentence into the first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence;
the second neural network unit is configured to receive the text information of each sentence and obtain the text information vector of the long text from it, where each vector element is the text information of one sentence; the obtained text information vector is input into the second neural network, which outputs a vector representing the logical relationships between the text information of the sentences;
the classification unit is configured to classify the vector to recognize the semantics of the long text.
An apparatus for semantic recognition includes:
a memory; and a processor coupled to the memory, the processor being configured to execute any of the foregoing semantic recognition methods based on instructions stored in the memory.
A non-volatile computer-readable storage medium has a computer program stored thereon; when the program is executed by a processor, any of the foregoing semantic recognition methods is carried out.
As can be seen from the above, the embodiments of the present application first pass each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence; next, a text information vector of the long text is obtained from the text information of each sentence, where each vector element is the text information of one sentence; the obtained text information vector is input into a configured second neural network, which outputs a vector representing the logical relationships between the text information of the sentences; finally, the vector is classified to recognize the semantics of the long text. Furthermore, the embodiments of the present application can be applied to the e-commerce scenario of handling complaint information and determining the corresponding liability information. Because the embodiments take into account the logical relationships between the text information of the sentences of the long text when determining semantic information, rather than deciding from the text information of each sentence in isolation, the accuracy of semantic recognition for long texts is improved.
Brief Description of the Drawings
Figure 1 is a flowchart of an artificial-intelligence-based semantic recognition method provided by an embodiment of the present application;
Figure 2 is another flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the present application;
Figure 3 is yet another flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the present application;
Figure 4 is an overall framework diagram of a method for processing complaint information in an e-commerce system provided by an embodiment of the present application;
Figure 5 is a schematic diagram of the process of recognizing the text information of each sentence in an embodiment of the present application;
Figure 6 is a schematic diagram of the LSTM computation logic in an embodiment of the present application;
Figure 7 is a block diagram of the BiLSTM data-processing process provided by an embodiment of the present application;
Figures 8A-8C are schematic structural diagrams of an artificial-intelligence-based semantic recognition system provided by embodiments of the present application;
Figure 9 is a schematic structural diagram of an artificial-intelligence-based semantic recognition apparatus provided by an embodiment of the present application.
Detailed Description
To make the purpose, technical solution, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments.
Currently, when e-commerce service providers process complaint information and determine the corresponding liability information, they mainly combine manual work with machine learning algorithms. Specifically, the received complaint information is concatenated into a long text, a configured machine learning algorithm then classifies the long text, and the resulting class is taken as the corresponding liability information, thereby settling the liability question.
Although this approach can determine liability information for complaints to some extent, a major problem is that it considers only the classification of the assembled long text and ignores the logical relationships between the sentences within the complaint, so the resulting liability information is inaccurate. In other words, for sentence-level semantic recognition, considering only the classification of the assembled long text while ignoring inter-sentence logic ultimately makes the semantic recognition inaccurate.
It can be seen that the cause of inaccurate semantic recognition, and in particular of inaccurate liability information in the e-commerce complaint-handling scenario, is that the logical relationships between sentences are not considered; only the classification of the assembled long text is. To overcome this defect, embodiments of the present application first pass each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence; next obtain a text-information vector of the long text from the per-sentence text information, where each vector element is the text information of one sentence, input the resulting text-information vector into a configured second neural network, and output a vector representing the logical relationships among the sentences' text information; finally, classify that vector to recognize the semantics of the long text. Because the semantics are determined with the logical relationships among the sentences' text information taken into account, rather than from each sentence's text information in isolation, the accuracy of semantic recognition is improved.
Artificial intelligence (AI) comprises theories, methods, techniques, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
A neural network is a mathematical or computational model consisting mainly of an input layer, hidden layers, and an output layer. A convolutional neural network (CNN) is a class of deep neural networks that includes convolution computations.
Further, embodiments of the present application can be applied to the e-commerce scenario of processing complaint information and determining the corresponding liability information.
Here, the second neural network may be a bidirectional long short-term memory network (BiLSTM) or a gated recurrent unit (GRU).
In the e-commerce complaint-handling and liability-determination scenario, when determining liability information, embodiments of the present application not only use the configured first convolutional neural network to determine the text information of each sentence in the complaint, but also use the configured BiLSTM or GRU to recognize the logical relationships among the sentences' text information. The resulting liability information thus fully accounts for the logical relationships among the sentences of the complaint, rather than being determined from each sentence's text information in isolation, which improves the accuracy of determining liability information from complaint information.
Figure 1 is a flowchart of the artificial-intelligence-based semantic recognition method provided by an embodiment of the present application. The method may be executed by any computing device with data-processing capability, for example a terminal device or a server; the terminal device may be an intelligent terminal such as a personal computer (PC) or a laptop, or an intelligent mobile terminal such as a smartphone or a tablet. The specific steps are:
Step 101: input each sentence of the long text into a configured first convolutional neural network to obtain the text information of each sentence, where the text information represents the semantic features of the sentence;
Step 102: obtain a text-information vector of the long text from the text information of each sentence, where each vector element is the text information of one sentence; input the resulting text-information vector into a configured second neural network, and output a vector representing the logical relationships among the sentences' text information;
Step 103: classify the vector to recognize the semantics of the long text.
In some embodiments, passing each sentence of the long text through the configured first convolutional neural network to obtain the text information of each sentence includes:
segmenting each sentence of the long text into minimal semantic units, and inputting the resulting minimal semantic units in order into the configured first convolutional neural network to obtain the text information of each sentence.
Here, the minimal semantic unit in each sentence may be a single Chinese character or a single word; it is the smallest unit of a character or word that carries meaning.
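As an illustration of this segmentation step, the sketch below is a simplified helper of our own, not the patent's implementation: Chinese text is split into single characters, and space-delimited text into words. The names `is_cjk` and `segment` are assumptions for the example.

```python
def is_cjk(ch: str) -> bool:
    """Rough test for a Chinese character by Unicode CJK range."""
    return "\u4e00" <= ch <= "\u9fff"

def segment(sentence: str) -> list[str]:
    """Split a sentence into minimal semantic units: single characters
    for Chinese text, whitespace-separated words otherwise."""
    if any(is_cjk(ch) for ch in sentence):
        # Keep CJK characters one by one; drop whitespace.
        return [ch for ch in sentence if not ch.isspace()]
    return sentence.split()

# The units are then fed to the first convolutional network in order.
print(segment("包裹可能丢失"))         # ['包', '裹', '可', '能', '丢', '失']
print(segment("package may be lost"))  # ['package', 'may', 'be', 'lost']
```

The per-unit output order matters: the convolution windows in the next step are built over adjacent units, so the segmentation must preserve the original reading order.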
In some embodiments, the first convolutional neural network includes multiple convolution units, each with a convolution kernel of a different size. As shown in Figure 2, obtaining the text information of each sentence in step 101 is:
S11: combining the minimal semantic units, input in order, using different configured unit-length values, to obtain minimal-semantic-unit combinations with different unit lengths;
S12: inputting the minimal-semantic-unit combinations of different unit lengths into different convolution units for processing, and concatenating the text-information vectors output by the different convolution units to obtain the text information of each sentence.
In some embodiments, there are three convolution units: a first convolution unit, a second convolution unit, and a third convolution unit.
As shown in Figure 2, obtaining the minimal-semantic-unit combinations of different unit lengths in step S11 is:
S110: combining every two adjacent minimal semantic units in their input order, obtaining combinations of unit length 2;
S112: combining every three adjacent minimal semantic units in their input order, obtaining combinations of unit length 3;
S114: combining every four adjacent minimal semantic units in their input order, obtaining combinations of unit length 4.
In some embodiments, as shown in Figure 2, inputting the minimal-semantic-unit combinations of different unit lengths into different convolution units in step S12 is:
S120: inputting the combinations of unit length 2 into the first convolution unit for processing;
S122: inputting the combinations of unit length 3 into the second convolution unit for processing;
S124: inputting the combinations of unit length 4 into the third convolution unit for processing.
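Steps S110 through S124 can be sketched in plain Python. This is a toy illustration under our own assumptions: `ngrams` builds the unit-length-2/3/4 combinations, and `toy_conv_unit` merely stands in for a convolution unit (a real TextCNN applies learned kernels over embedding matrices and max-pools the feature maps).

```python
def ngrams(units, n):
    """Group every n adjacent minimal semantic units, preserving input
    order (the unit-length-n combinations of steps S110-S114)."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]

def toy_conv_unit(windows, weight):
    """Stand-in for one convolution unit: score each window with a fixed
    made-up weight, then max-pool over the windows."""
    scores = [weight * sum(len(u) for u in window) for window in windows]
    return max(scores) if scores else 0.0

units = ["快", "递", "破", "损", "了"]
# Route each window size to its own convolution unit (steps S120-S124),
# then concatenate the per-unit outputs into the sentence's text vector.
sentence_vector = [
    toy_conv_unit(ngrams(units, 2), weight=0.5),  # first conv unit
    toy_conv_unit(ngrams(units, 3), weight=0.3),  # second conv unit
    toy_conv_unit(ngrams(units, 4), weight=0.2),  # third conv unit
]
print(sentence_vector)
```

The final concatenation is what makes the sentence vector sensitive to 2-, 3-, and 4-unit patterns at the same time, which is the point of using kernels of several sizes.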
In some embodiments, the configured second neural network includes two sub-neural-networks. As shown in Figure 3, the process in step 102 of outputting the vector representing the logical relationships among the sentences' text information is:
S21: the first sub-neural-network reads the vector elements of the long text's text-information vector from left to right and processes them, obtaining a vector representing the left-to-right logical relationships among the sentences' text information;
S22: the second sub-neural-network reads the vector elements of the long text's text-information vector from right to left and processes them, obtaining a vector representing the right-to-left logical relationships among the sentences' text information;
S23: the vector representing the left-to-right logical relationships and the vector representing the right-to-left logical relationships are concatenated, obtaining the vector representing the logical relationships among the sentences' text information.
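A minimal sketch of S21 through S23, with a toy recurrent cell and made-up fixed weights standing in for the LSTM/GRU sub-networks (the helper names and scalar sentence values are our own):

```python
import math

def recurrent_read(vectors, direction=1):
    """Read per-sentence values left-to-right (direction=1) or
    right-to-left (direction=-1) with a toy recurrent cell; each
    hidden state mixes the previous state with the current input."""
    seq = vectors if direction == 1 else list(reversed(vectors))
    h, states = 0.0, []
    for x in seq:
        h = math.tanh(0.5 * h + 0.5 * x)  # fixed toy weights, not learned
        states.append(h)
    # Restore original order so both directions align per sentence.
    return states if direction == 1 else list(reversed(states))

sentence_infos = [0.2, -0.4, 0.9]  # one scalar "text information" per sentence
left_to_right = recurrent_read(sentence_infos, direction=1)   # S21
right_to_left = recurrent_read(sentence_infos, direction=-1)  # S22
# S23: concatenate the two directional readings position by position.
logic_vectors = [(l, r) for l, r in zip(left_to_right, right_to_left)]
print(logic_vectors)
```

Because each position carries both a forward and a backward state after concatenation, every sentence's representation reflects what came before it and what comes after it.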
In some embodiments, the second neural network is a bidirectional long short-term memory network BiLSTM or a gated recurrent unit GRU;
the first sub-neural-network is a BiLSTM or a GRU;
the second sub-neural-network is a BiLSTM or a GRU.
In some embodiments, the classification in step 103 is:
using a configured classification method to map the vector to a matching intent label, thereby recognizing the semantics of the long text.
In some embodiments, the classification is performed by the softmax (multi-class) method, which maps the vector to a matching intent label and thereby recognizes the semantics. The softmax method is used in multi-class classification: it maps multiple output values into the interval (0, 1), where they can be interpreted as probabilities, thus performing the multi-way classification and yielding the corresponding semantics.
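The softmax mapping can be shown in a few lines of plain Python; the liability labels and raw scores below are invented purely for illustration, not taken from the patent:

```python
import math

def softmax(logits):
    """Map raw scores into (0, 1) so they can be read as probabilities;
    subtracting the max keeps exp() numerically stable."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical intent/liability labels and raw scores from a final
# fully connected layer.
labels = ["damaged", "lost", "delayed"]
probs = softmax([2.0, 0.5, -1.0])
predicted = labels[probs.index(max(probs))]
print(predicted)  # damaged
```

Each probability lies in (0, 1) and the probabilities sum to 1, which is exactly the "map into (0, 1) and read as probabilities" behavior described above.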
The following gives a complete description of an embodiment applied to the e-commerce scenario of processing complaint information and determining the corresponding liability information; here the sentences are the complaint information, and the semantics to be recognized are the liability information.
Assume the liability information corresponding to the complaint information (shown as labels in the table) is as given in Table 1.
[Table 1: complaint information and corresponding liability labels; published as images PCTCN2020070175-appb-000001 and PCTCN2020070175-appb-000002 in the original document]
Table 1
The complaint information in Table 1 is the form of data produced during e-commerce. Taking the liability labels "damaged" and "lost" as examples: for the "damaged" data, when the station first discovers the problem the status is "possibly lost", and after the warehouse investigation it is actually "damaged"; for the "lost" data, when the station first discovers the problem the status is also "possibly lost", and after the warehouse investigation finds that there is indeed a record of the item being transported to the station, it can be confirmed as lost. Clearly, the logical inference relationships between sentences are the core reason a second neural network needs to be constructed.
As shown in Figure 4, Figure 4 is the overall framework diagram of the method for processing complaint information in an e-commerce system provided by an embodiment of the present application, using BiLSTM as the second neural network by way of example.
First, the configured first convolutional neural network (for example, the text convolutional neural network textCNN shown in Figure 4) reads each sentence of the complaint information and outputs each sentence's text information Out_1, Out_2, ..., Out_n; then the per-sentence text information is fed as input into the configured second neural network BiLSTM for processing; finally, the BiLSTM outputs are concatenated and classified by softmax, yielding the corresponding liability information, that is, the corresponding liability label.
In an embodiment of the present application, the complaint information is generally chat data and is usually recognized from chat data. Therefore, when the complaint information is input into the first convolutional neural network, the input may be the chat data itself: the chat messages are fed into the first convolutional neural network one by one in the order in which the original conversation took place, ensuring that the sentence order matches the order in which the chat data was originally produced.
The whole process of this embodiment is:
Step 1: prepare the data.
Feed the chat data into the first convolutional neural network one message at a time, in the order of the original conversation, ensuring that the sentence order matches the order in which the chat was originally produced.
Step 2: process the input chat data with the first convolutional neural network and recognize the text information of each sentence. Figure 5 is a schematic diagram of this process. As shown in Figure 5, w1w2w3...wn is the input sentence, where w1, w2, ..., wn are the minimal semantic units obtained by segmentation. Three convolution kernels then read the minimal semantic units (for example, characters or words) of the sentence: the left kernel slides over the data two characters or words at a time and performs convolution, the middle kernel three at a time, and the right kernel four at a time. Finally, the data output by each kernel, i.e., the text vectors, are concatenated to construct the sentence's text information, which is in fact a vector representing that text information; the output is [Out_1, Out_2, ..., Out_n].
Step 3: use BiLSTM to learn the logical relationships among the sentences' text information.
Here, BiLSTM is used to compute the logical relationships over the per-sentence text information. Specifically, recurrent neural networks (RNNs) have been criticized for the vanishing-gradient problem; gated RNNs were proposed on this basis, and LSTM is the best-known gated RNN. Leaky units in RNNs allow the network to accumulate long-term dependencies between distant nodes through designed connection weights; gated RNNs extend this idea by allowing those weights to change at different time steps and allowing the network to forget information it has already accumulated. LSTM is such a gated RNN: its ingenuity lies in adding an input gate, a forget gate, and an output gate, making the self-recurrent weight variable. With the model parameters fixed, the integration scale at different time steps can then change dynamically, avoiding vanishing or exploding gradients.
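The gate arithmetic described above can be sketched for a single scalar LSTM step. The shared toy weights are our own simplification for brevity; a real LSTM learns separate weight matrices and biases for each gate.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w=0.5, u=0.5, b=0.0):
    """One scalar LSTM step. The input, forget, and output gates
    modulate how the cell state c is updated and exposed, which is
    what lets the effective self-recurrent weight vary over time
    instead of uniformly vanishing or exploding."""
    i = sigmoid(w * x + u * h_prev + b)    # input gate
    f = sigmoid(w * x + u * h_prev + b)    # forget gate
    o = sigmoid(w * x + u * h_prev + b)    # output gate
    g = math.tanh(w * x + u * h_prev + b)  # candidate cell value
    c = f * c_prev + i * g                 # gated cell-state update
    h = o * math.tanh(c)                   # gated output
    return h, c

h, c = 0.0, 0.0
for x in [0.3, -0.2, 0.8]:  # a toy input sequence
    h, c = lstm_step(x, h, c)
print(h, c)
```

The forget gate f scaling c_prev is the "variable self-recurrent weight": when f is near 1 the cell state is carried forward almost unchanged, and when f is near 0 accumulated information is discarded.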
Figure 6 is a schematic diagram of the LSTM computation logic in an embodiment of the present application. In this embodiment, LSTM is used to compute the model over each sentence, with the specific computation logic shown in Figure 6. Figure 7 is a block diagram of the BiLSTM data-processing process provided by this embodiment. As shown in Figure 7, BiLSTM applies two LSTMs to the per-sentence text information obtained in Step 2, namely a first sub-neural-network (LSTM1) and a second sub-neural-network (LSTM2), which read the logical relationships among the sentences' text information from the two directions: LSTM1 reads the sentences' text information from left to right, and LSTM2 reads it from right to left. Finally, the information read by each LSTM is concatenated into one vector, and the output is [can_1, can_2, ..., can_n].
Step 4: generate the liability information.
After fully connecting the matrix output by Step 3, map it through softmax to the intent labels; this yields the corresponding liability classification label, i.e., the liability information.
Experiments show that with the solution provided by this embodiment, after the complaint information has been processed by the two neural networks, the accuracy of the resulting liability information rises to 94%.
Figure 8A is a schematic structural diagram of the artificial-intelligence-based semantic recognition system provided by an embodiment of the present application. The system includes a first convolutional neural network unit 801, a second neural network unit 802, and a classification unit 803, where:
the first convolutional neural network unit 801 is configured to receive each sentence of the long text and input each sentence into the first convolutional neural network to obtain each sentence's text information, where the text information represents the sentence's semantic features;
the second neural network unit 802 is configured to receive each sentence's text information, obtain the long text's text-information vector from the per-sentence text information, where each vector element is the text information of one sentence, input the resulting text-information vector into the second neural network, and output the vector representing the logical relationships among the sentences' text information;
the classification unit 803 is configured to classify the vector to recognize the semantics of the long text.
In some embodiments, the first convolutional neural network unit 801 is further configured to: segment each sentence into minimal semantic units and input the resulting minimal semantic units in order into the configured first convolutional neural network, obtaining each sentence's text information.
The minimal semantic unit in each sentence may be a single Chinese character or a single word; it is the smallest unit of a character or word that carries meaning.
In some embodiments, as shown in Figure 8B, the first convolutional neural network unit 801 further includes a combination unit 8011 and a text-information acquisition unit 8012, where:
the combination unit 8011 is configured to combine the minimal semantic units, input in order, using the different configured unit-length values, obtaining minimal-semantic-unit combinations with different unit lengths;
the text-information acquisition unit 8012 is configured to input the minimal-semantic-unit combinations of different unit lengths into different convolution kernels for processing, and concatenate the text-information vectors output by the different kernels to obtain each sentence's text information.
In some embodiments, the combination unit 8011 is further configured to:
combine every two adjacent minimal semantic units in their input order, obtaining combinations of unit length 2;
combine every three adjacent minimal semantic units in their input order, obtaining combinations of unit length 3;
combine every four adjacent minimal semantic units in their input order, obtaining combinations of unit length 4.
The text-information acquisition unit 8012 is further configured to:
input the combinations of unit length 2 into the first convolution kernel for processing;
input the combinations of unit length 3 into the second convolution kernel for processing;
input the combinations of unit length 4 into the third convolution kernel for processing.
In some embodiments, as shown in Figure 8C, the second neural network unit 802 further includes a first sub-neural-network unit 8021, a second sub-neural-network unit 8022, and a vector acquisition unit 8023, where:
the first sub-neural-network unit 8021 is configured to read the vector elements of the long text's text-information vector from left to right and process them, obtaining a vector representing the left-to-right logical relationships among the sentences' text information;
the second sub-neural-network unit 8022 is configured to read the vector elements of the long text's text-information vector from right to left and process them, obtaining a vector representing the right-to-left logical relationships among the sentences' text information;
the vector acquisition unit 8023 is configured to concatenate the vector representing the left-to-right logical relationships and the vector representing the right-to-left logical relationships, obtaining the vector representing the logical relationships among the sentences' text information.
In some embodiments, the classification unit 803 is further configured to map the vector, using the configured classification method, to the matching intent label, recognizing the semantics of the long text.
An embodiment of the present application also provides an artificial-intelligence-based semantic recognition apparatus. As shown in Figure 9, the electronic device 900 includes a memory 906, a processor 902, a communication module 904, a user interface 910, and a communication bus 908 interconnecting these components.
The memory 906 may be high-speed random-access memory such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices; or non-volatile memory such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
The user interface 910 may include one or more output devices 912 and one or more input devices 914.
The memory 906 stores an instruction set executable by the processor 902, including the programs implementing the processing flows of the above embodiments; when the processor 902 executes the programs, the steps of the semantic recognition method are implemented.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements any one of the semantic recognition methods described above.
Embodiments of the present application use a first convolutional neural network, specifically a text convolutional neural network (TextCNN), to recognize the text information within each sentence of the complaint information, and use BiLSTM to recognize the logic between sentences. This effectively captures the inference-logic relationships between sentences and thus yields more accurate semantic recognition.
The above are only preferred embodiments of the present application and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (16)

  1. An artificial-intelligence-based semantic recognition method, comprising:
    passing each sentence of a long text through a configured first convolutional neural network to obtain the text information of each sentence, wherein the text information represents the semantic features of the sentence;
    obtaining a text-information vector of the long text from the text information of each sentence, wherein each vector element is the text information of one sentence;
    inputting the resulting text-information vector into a configured second neural network, and outputting a vector representing the logical relationships among the sentences' text information; and
    classifying the vector to recognize the semantics of the long text.
  2. The method of claim 1, wherein passing each sentence of the long text through the configured first convolutional neural network to obtain the text information of each sentence comprises:
    segmenting each sentence of the long text into minimal semantic units, and passing the resulting minimal semantic units in order through the configured first convolutional neural network to obtain the text information of each sentence.
  3. The method of claim 2, wherein the first convolutional neural network comprises a plurality of convolution units, each with a convolution kernel of a different size, and obtaining the text information of each sentence is:
    combining the minimal semantic units, input in order, using different configured unit-length values, to obtain minimal-semantic-unit combinations with different unit lengths; and
    inputting the minimal-semantic-unit combinations of different unit lengths into different convolution units for processing, and concatenating the text-information vectors output by the different convolution units to obtain the text information of each sentence.
  4. The method of claim 3, wherein the plurality of convolution units is three units, namely a first convolution unit, a second convolution unit, and a third convolution unit;
    obtaining the minimal-semantic-unit combinations of different unit lengths is:
    combining every two adjacent minimal semantic units in their input order, obtaining combinations of unit length 2;
    combining every three adjacent minimal semantic units in their input order, obtaining combinations of unit length 3;
    combining every four adjacent minimal semantic units in their input order, obtaining combinations of unit length 4; and
    inputting the minimal-semantic-unit combinations of different unit lengths into different convolution units is:
    inputting the combinations of unit length 2 into the first convolution unit for processing;
    inputting the combinations of unit length 3 into the second convolution unit for processing;
    inputting the combinations of unit length 4 into the third convolution unit for processing.
  5. The method of claim 1, wherein the configured second neural network comprises two sub-neural-networks, and outputting the vector representing the logical relationships among the sentences' text information is:
    reading, by a first sub-neural-network, the vector elements of the long text's text-information vector from left to right and processing them to obtain a vector representing the left-to-right logical relationships among the sentences' text information;
    reading, by a second sub-neural-network, the vector elements of the long text's text-information vector from right to left and processing them to obtain a vector representing the right-to-left logical relationships among the sentences' text information; and
    concatenating the vector representing the left-to-right logical relationships and the vector representing the right-to-left logical relationships to obtain the vector representing the logical relationships among the sentences' text information.
  6. The method of claim 5, wherein the second neural network is a bidirectional long short-term memory network BiLSTM or a gated recurrent unit GRU;
    the first sub-neural-network is a BiLSTM or a GRU; and
    the second sub-neural-network is a BiLSTM or a GRU.
  7. The method of claim 1, wherein the classifying is:
    mapping the vector, using a configured classification method, to a matching intent label, thereby recognizing the semantics of the long text.
  8. An artificial-intelligence-based semantic recognition system, comprising a first convolutional neural network unit, a second neural network unit, and a classification unit, wherein:
    the first convolutional neural network unit is configured to receive each sentence of a long text and input each sentence into a first convolutional neural network to obtain the text information of each sentence, wherein the text information represents the semantic features of the sentence;
    the second neural network unit is configured to receive the text information of each sentence, obtain a text-information vector of the long text from the text information of each sentence, wherein each vector element is the text information of one sentence, input the resulting text-information vector into a second neural network, and output a vector representing the logical relationships among the sentences' text information; and
    the classification unit is configured to classify the vector to recognize the semantics of the long text.
  9. The semantic recognition system of claim 8, wherein the first convolutional neural network unit is further configured to:
    segment each sentence into minimal semantic units, and input the resulting minimal semantic units in order into the configured first convolutional neural network to obtain the text information of each sentence.
  10. The semantic recognition system of claim 9, wherein the first convolutional neural network unit further comprises a combination unit and a text-information acquisition unit, the text-information acquisition unit containing convolution kernels of different sizes;
    the combination unit is configured to combine the minimal semantic units, input in order, using different configured unit-length values, obtaining minimal-semantic-unit combinations with different unit lengths; and
    the text-information acquisition unit is configured to input the minimal-semantic-unit combinations of different unit lengths into different convolution kernels for processing, and concatenate the text-information vectors output by the different kernels to obtain the text information of each sentence.
  11. The semantic recognition system of claim 10, wherein the combination unit is further configured to:
    combine every two adjacent minimal semantic units in their input order, obtaining combinations of unit length 2;
    combine every three adjacent minimal semantic units in their input order, obtaining combinations of unit length 3;
    combine every four adjacent minimal semantic units in their input order, obtaining combinations of unit length 4; and
    the text-information acquisition unit is further configured to:
    input the combinations of unit length 2 into the first convolution kernel for processing;
    input the combinations of unit length 3 into the second convolution kernel for processing;
    input the combinations of unit length 4 into the third convolution kernel for processing.
  12. The semantic recognition system of claim 8, wherein the second neural network unit further comprises:
    a first sub-neural-network unit configured to read the vector elements of the long text's text-information vector from left to right and process them, obtaining a vector representing the left-to-right logical relationships among the sentences' text information;
    a second sub-neural-network unit configured to read the vector elements of the long text's text-information vector from right to left and process them, obtaining a vector representing the right-to-left logical relationships among the sentences' text information; and
    a vector acquisition unit configured to concatenate the vector representing the left-to-right logical relationships and the vector representing the right-to-left logical relationships, obtaining the vector representing the logical relationships among the sentences' text information.
  13. The semantic recognition system of claim 12, wherein the second neural network is a bidirectional long short-term memory network BiLSTM or a gated recurrent unit GRU;
    the first sub-neural-network unit is a BiLSTM or a GRU; and
    the second sub-neural-network unit is a BiLSTM or a GRU.
  14. The semantic recognition system of claim 8, wherein the classification unit is further configured to map the vector, using a configured classification method, to a matching intent label, thereby recognizing the semantics of the long text.
  15. A semantic recognition apparatus, comprising:
    a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the semantic recognition method of any one of claims 1-7.
  16. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the semantic recognition method of any one of claims 1-7.
PCT/CN2020/070175 2019-03-22 2020-01-03 Artificial-intelligence-based semantic recognition method, apparatus, system, and storage medium WO2020192237A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910222540.7 2019-03-22
CN201910222540.7A CN111737971A (zh) 2019-03-22 2019-03-22 Semantic recognition method and system

Publications (1)

Publication Number Publication Date
WO2020192237A1 true WO2020192237A1 (zh) 2020-10-01

Family

ID=72608500

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070175 WO2020192237A1 (zh) 2019-03-22 2020-01-03 Artificial-intelligence-based semantic recognition method, apparatus, system, and storage medium

Country Status (2)

Country Link
CN (1) CN111737971A (zh)
WO (1) WO2020192237A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107315737A (zh) * 2017-07-04 2017-11-03 北京奇艺世纪科技有限公司 一种语义逻辑处理方法及系统
CN108846017A (zh) * 2018-05-07 2018-11-20 国家计算机网络与信息安全管理中心 基于Bi-GRU和字向量的大规模新闻文本的端到端分类方法
CN108932226A (zh) * 2018-05-29 2018-12-04 华东师范大学 一种对无标点文本添加标点符号的方法
CN109165384A (zh) * 2018-08-23 2019-01-08 成都四方伟业软件股份有限公司 一种命名实体识别方法及装置


Also Published As

Publication number Publication date
CN111737971A (zh) 2020-10-02

Similar Documents

Publication Publication Date Title
WO2020073507A1 (zh) 一种文本分类方法及终端
US20220188521A1 (en) Artificial intelligence-based named entity recognition method and apparatus, and electronic device
US9807473B2 (en) Jointly modeling embedding and translation to bridge video and language
CN108874776B (zh) 一种垃圾文本的识别方法及装置
US11436414B2 (en) Device and text representation method applied to sentence embedding
CN110377759B (zh) 事件关系图谱构建方法及装置
CN111178458B (zh) 分类模型的训练、对象分类方法及装置
CN110321845B (zh) 一种从视频中提取表情包的方法、装置及电子设备
CN112732911A (zh) 基于语义识别的话术推荐方法、装置、设备及存储介质
CN110377733B (zh) 一种基于文本的情绪识别方法、终端设备及介质
CN112214601B (zh) 一种社交短文本情感分类方法、装置及存储介质
CN108960574A (zh) 问答的质量确定方法、装置、服务器和存储介质
WO2023005386A1 (zh) 模型训练方法和装置
CN110598869B (zh) 基于序列模型的分类方法、装置、电子设备
CN112488214A (zh) 一种图像情感分析方法以及相关装置
CN112995414B (zh) 基于语音通话的行为质检方法、装置、设备及存储介质
CN111831826A (zh) 跨领域的文本分类模型的训练方法、分类方法以及装置
CN110913354A (zh) 短信分类方法、装置及电子设备
CN110717019A (zh) 问答处理方法、问答系统、电子设备及介质
EP3812919A1 (en) Methods, apparatuses, and systems for data mapping
CN116226785A (zh) 目标对象识别方法、多模态识别模型的训练方法和装置
CN114817538A (zh) 文本分类模型的训练方法、文本分类方法及相关设备
CN112131506A (zh) 一种网页分类方法、终端设备及存储介质
WO2020192237A1 (zh) 基于人工智能的语义识别的方法、装置系统及存储介质
CN116680401A (zh) 文档处理方法、文档处理装置、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20779252

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 18.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20779252

Country of ref document: EP

Kind code of ref document: A1