WO2023227030A1 - Intent recognition method and apparatus, storage medium, and electronic device - Google Patents

Intent recognition method and apparatus, storage medium, and electronic device

Info

Publication number
WO2023227030A1
Authority
WO
WIPO (PCT)
Prior art keywords
intention
information
feature text
target
identified
Application number
PCT/CN2023/096071
Other languages
English (en)
French (fr)
Inventor
吕田田
吴艳芹
张乐
袁晶晶
郭蓉蓉
Original Assignee
中国电信股份有限公司 (China Telecom Corporation Limited)
Application filed by 中国电信股份有限公司 (China Telecom Corporation Limited)
Publication of WO2023227030A1 (published in Chinese)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 Semantic analysis

Definitions

  • the present disclosure relates to the technical field of natural language processing, and in particular, to an intention recognition method, device, storage medium and electronic device.
  • the present disclosure provides an intention recognition method, device, storage medium and electronic device.
  • According to an aspect of the present disclosure, an intent recognition method includes: obtaining information to be recognized; determining, from multiple preset domains, the target domain corresponding to the information to be recognized; determining the joint similarity between a first feature text of the information to be recognized and an intent, based on the depth information of the intent of the target domain in an intent knowledge graph and the semantic distance between the first feature text and the intent; and determining, based on the joint similarity, the target intent corresponding to the information to be recognized.
  • According to an aspect of the present disclosure, an intent recognition device includes: an information acquisition module for acquiring information to be recognized; a target domain determination module for determining, from multiple preset domains, the target domain corresponding to the information to be recognized; a joint similarity determination module for determining the joint similarity between a first feature text of the information to be recognized and an intent, based on the depth information of the intent of the target domain in an intent knowledge graph and the semantic distance between the first feature text and the intent; and a target intent determination module for determining, based on the joint similarity, the target intent corresponding to the information to be recognized.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the above method is implemented.
  • an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the above method by executing the executable instructions.
  • Figure 1 shows a schematic diagram of an application scenario of an intention recognition method in this exemplary embodiment
  • Figure 2 shows a flow chart of an intention identification method in this exemplary embodiment
  • Figure 3 shows a schematic diagram of an intention knowledge graph constructed based on a customer complaint 5G business scenario in an intention identification method in this exemplary embodiment
  • Figure 4 shows a flow chart of an intention identification method in this exemplary embodiment
  • Figure 5 shows a flow chart of an intention identification method in this exemplary embodiment
  • Figure 6 shows a flow chart of an intention identification method in this exemplary embodiment
  • Figure 7 shows a flow chart of an intention identification method in this exemplary embodiment
  • Figure 8 shows a flow chart of an intention identification method in this exemplary embodiment
  • Figure 9 shows a schematic structural diagram of an intention recognition device in this exemplary embodiment
  • FIG. 10 shows a schematic structural diagram of an electronic device in this exemplary embodiment.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments may, however, be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concepts of the example embodiments to those skilled in the art.
  • the described features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided to provide a thorough understanding of embodiments of the disclosure.
  • Those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on.
  • well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the disclosure.
  • In the related art, intent recognition of user needs relies on annotation information and is not fully applicable to unstructured intent recognition, so user needs cannot be accurately met; therefore, to better meet user needs, users' requirements must be accurately identified.
  • embodiments of the present disclosure provide an intent recognition method.
  • the information to be identified is obtained;
  • the target field corresponding to the information to be identified is determined from multiple preset fields;
  • Thirdly, the joint similarity between the first feature text of the information to be recognized and an intent is determined, based on the depth information of the intent of the target domain in the intent knowledge graph and the semantic distance between the first feature text and the intent; finally, based on the joint similarity, the target intent corresponding to the information to be recognized is determined. In this way, because both the semantic distance and the depth information in the intent knowledge graph are taken into account, the target intent can be accurately determined even when multiple similar intents exist for the same information to be recognized.
  • the intention identification method provided by the embodiment of the present disclosure is applied to the operator operation management system 100.
  • The operator operation management system 100 at least includes: a network layer 101, an acquisition and control layer 102, a resource management layer 103, a service layer 104, a business layer 105, and an intent layer 106.
  • The network layer 101 collects the user's input information; the acquisition and control layer 102 obtains that input from the network layer 101; the resource management layer 103 manages the resources involved in the services supported by the system 100 (such as marketing resources and cloud network resources); the service layer 104 opens different services to the business layer 105 and presents intent execution results to users; the business layer 105 provides internal and external open capabilities; and the intent layer 106 constructs the intent knowledge graph and performs intent recognition on the user's input.
  • In possible implementations, the operator operation management system 100 is applied to customer service scenarios, complaint scenarios, or human-computer interaction scenarios such as business management; this is not limited here.
  • the intention recognition method includes the following steps 201 to 204:
  • Step 201 Obtain information to be identified.
  • The information to be recognized can be in different formats: structured information (such as tables and databases expressed in a fixed format), unstructured information (such as text, audio, video, and pictures), or semi-structured information, which lies between the two and can be understood as information obtained after partial structural changes to structured information.
  • the information to be identified can be information input by the user.
  • The information to be recognized can be obtained through the following process: receiving the information input by the user through a front-end page or a system-provided interface, and sending it to the backend for data processing.
  • Step 202 Determine the target field corresponding to the information to be identified from multiple preset fields.
  • The multiple preset domains can be determined according to the application scenario. For example, in customer complaint business scenarios they can be determined according to business types; specifically, when the business types include wireless home entertainment, smart home, and so on, wireless home entertainment and smart home are determined as preset domains.
  • the determination of the target domain can be achieved through a classifier.
  • A Naive Bayes classifier can be used. If the information to be recognized is structured, the Naive Bayes classifier can be applied directly to determine the target domain. If the information is unstructured, it is first segmented into words; secondly, the backbone of the segmentation result is extracted to obtain a central word set; thirdly, the Naive Bayes classifier classifies the central word set, and the target domain is determined from the classification result. If the information to be recognized is audio, it is converted to text before word segmentation.
  • Step 203 Based on the depth information of the intent of the target domain in the intent knowledge graph and the semantic distance between the first feature text of the information to be recognized and the intent, determine the joint similarity between the first feature text and the intent.
  • The intent knowledge graph can be constructed in advance according to the application scenario; in a possible implementation, an intent knowledge graph from a database can be used directly. Construction can be achieved by performing knowledge extraction, knowledge fusion, knowledge processing, and knowledge updating on the data of the application scenario; Figure 3 is a schematic diagram of an intent knowledge graph constructed for a customer complaint 5G business scenario.
  • An intent can be understood as the meaning represented by an entity-relationship-entity triple in the intent knowledge graph. The depth information of an intent in the intent knowledge graph is determined from the levels of the entities in its triple. For example, for the "HD video - stuttering" triple in Figure 3, the entity "HD video" is at level 2 and the entity "stuttering" is at level 3; the average of the two entity levels, 2.5, is taken as the depth information of the "HD video - stuttering" triple.
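The depth rule above can be sketched in a few lines. This is a hedged illustration, not code from the patent: the entity names and levels follow the Figure 3 example, and the dictionary of levels is an assumed stand-in for a real graph lookup.

```python
# Hedged sketch: depth information of an intent triple as the average
# level of its two entities in the intent knowledge graph. The names
# and levels below are illustrative, following the "HD video (level 2)
# - stuttering (level 3)" example from Figure 3.
entity_level = {
    "5G business": 1,   # assumed root of the example graph
    "HD video": 2,
    "stuttering": 3,
}

def triple_depth(head: str, tail: str, levels: dict) -> float:
    """Average of the two entity levels, used as the triple's depth info."""
    return (levels[head] + levels[tail]) / 2

print(triple_depth("HD video", "stuttering", entity_level))  # 2.5
```

With these assumed levels, the depth of the "HD video - stuttering" triple comes out to 2.5, matching the worked example in the text.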
  • the first feature text can be understood as the keywords of the information to be identified, which can be obtained by keyword extraction of the information to be identified.
  • Step 204 Based on the joint similarity between the first feature text and the intention, determine the target intention corresponding to the information to be recognized.
  • The intent recognition method first obtains the information to be recognized; secondly, it determines the target domain corresponding to the information from multiple preset domains; thirdly, it determines the joint similarity between the first feature text of the information and an intent, based on the depth information of the intent of the target domain in the intent knowledge graph and the semantic distance between the first feature text and the intent; finally, based on the joint similarity, it determines the target intent corresponding to the information to be recognized. In this way, because the semantic distance and the depth information in the intent knowledge graph are considered together, the target intent can be accurately determined even when multiple similar intents exist for the same information to be recognized.
  • the above step 202 determines the target field corresponding to the information to be identified from multiple preset fields, including the following steps 401 to 402:
  • Step 401 Extract the second feature text set of the information to be identified.
  • The second feature text set can correspond to the above-mentioned central word set; in a possible implementation, it can be obtained by extracting the backbone of the information to be recognized. For example, backbone extraction on the information "The loading time of Game A is getting longer" yields the second feature text set: "Game A", "long loading time".
  • Step 402 Perform domain prediction on each second feature text in the second feature text set, and determine the target domain corresponding to the information to be recognized based on the prediction result of each second feature text.
  • The Naive Bayes classifier can be used to perform domain prediction on each second feature text in the second feature text set. For example, if the second feature text set includes "Game A" and "long loading time", the Naive Bayes classifier performs domain prediction on each of these second feature texts, and the target domain corresponding to the information to be recognized is then determined from the prediction results.
  • The embodiment of the present disclosure extracts the second feature text set of the information to be recognized, performs domain prediction on each second feature text in the set, and determines the target domain from the prediction results, thereby implementing domain classification of the information to be recognized.
  • In a possible implementation, the above step 402 of performing domain prediction on each second feature text in the second feature text set and determining the target domain corresponding to the information to be recognized from the prediction results includes the following steps 501 to 503:
  • Step 501 Determine the association probability between each second feature text and each of the multiple preset domains.
  • The second feature texts can be processed in batches; specifically, the second feature text set can be vectorized first and then computed with the Naive Bayes algorithm, as shown in the following formulas (1) and (2):
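The bodies of formulas (1) and (2) did not survive extraction. The following is an assumed reconstruction only, based on the standard multinomial Naive Bayes posterior and decision rule for a vectorized second feature text x = (x_1, ..., x_n) over preset domains y_j:

```latex
% Assumed reconstruction of formulas (1) and (2): standard Naive Bayes
% posterior over domains y_j for feature vector x, and the decision rule.
P(y_j \mid x) \propto P(y_j) \prod_{i=1}^{n} P(x_i \mid y_j) \qquad (1)

\hat{y} = \operatorname*{arg\,max}_{y_j} \, P(y_j) \prod_{i=1}^{n} P(x_i \mid y_j) \qquad (2)
```

Here (1) gives the association probability between a second feature text and each preset domain, and (2) selects the most probable domain, consistent with steps 501 and 502.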
  • Step 502 Use the preset domain with the highest association probability as the domain corresponding to the second feature text, to obtain the domains corresponding to the second feature text set.
  • the domain corresponding to each second feature text can be obtained.
  • In a possible implementation, the second feature texts x_1 to x_n all correspond to the domain y_1, where y_1 represents the wireless home entertainment domain.
  • Step 503 Use the field with the most occurrences among the fields corresponding to the second feature text set as the target field corresponding to the information to be identified.
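Steps 501 to 503 above amount to a per-text argmax followed by a majority vote. A hedged sketch, with an illustrative probability table (the numbers are assumptions, not values from the patent):

```python
from collections import Counter

# Hedged sketch of steps 501-503: each second feature text gets the preset
# domain with the highest association probability (step 502), then the most
# frequent domain across the set becomes the target domain (step 503).
# The association probabilities below are illustrative placeholders.
assoc_prob = {
    "Game A":            {"wireless home entertainment": 0.7, "smart home": 0.3},
    "long loading time": {"wireless home entertainment": 0.6, "smart home": 0.4},
}

def target_domain(feature_texts, probs):
    per_text = [max(probs[t], key=probs[t].get) for t in feature_texts]  # step 502
    return Counter(per_text).most_common(1)[0][0]                        # step 503

print(target_domain(["Game A", "long loading time"], assoc_prob))
# wireless home entertainment
```

In a real system the probability table would come from formula (1) over the vectorized central word set rather than being hard-coded.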
  • In a possible implementation, the above step 203 of determining the joint similarity between the first feature text and the intent, based on the depth information of the intent of the target domain in the intent knowledge graph and the semantic distance between the first feature text of the information to be recognized and the intent, includes the following steps 601 to 603:
  • Step 601 Determine the semantic distance between the first feature text of the information to be recognized and the intention.
  • The first feature text of the information to be recognized may be determined first; in a possible implementation, the first feature text may be a keyword obtained by keyword extraction from the information to be recognized.
  • Term frequency is computed as TF(t) = n_t / Σ_k n_k, where n_t represents the number of times word t appears in the document and Σ_k n_k represents the total number of word occurrences in the document.
  • TF-IDF(t) = TF(t) × IDF(t) (5)
  • Semantic similarity can be determined based on the semantic distance, as shown in the following formula (6): Sim(n1, n2) = α / (Dist(n1, n2) + α), where n1 is the keyword, n2 is the intent in the intent knowledge graph, Dist(n1, n2) is the semantic distance between them, and α is an adjustable parameter that equals the semantic distance at which the semantic similarity is 0.5.
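The TF computation and the similarity form implied by the text around formula (6) (a similarity that equals 0.5 exactly when the distance equals the adjustable parameter) can be sketched as follows; the token list and the alpha value are illustrative assumptions:

```python
# Hedged sketch: term frequency per the TF definition above, and a
# semantic-similarity form sim = alpha / (dist + alpha), chosen because it
# equals 0.5 exactly when dist == alpha, matching the stated property of
# the adjustable parameter. All concrete numbers are illustrative.
def tf(term: str, doc_tokens: list) -> float:
    """n_t / sum_k n_k: occurrences of term over total word occurrences."""
    return doc_tokens.count(term) / len(doc_tokens)

def semantic_similarity(dist: float, alpha: float = 2.0) -> float:
    """Similarity decays with distance; sim == 0.5 when dist == alpha."""
    return alpha / (dist + alpha)

tokens = ["watching", "dramas", "blurry", "dramas"]
print(tf("dramas", tokens))           # 0.5
print(semantic_similarity(2.0, 2.0))  # 0.5  (dist equals alpha)
```

Note that the similarity is 1.0 at distance 0 and falls monotonically, so ranking intents by similarity is equivalent to ranking them by distance.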
  • Step 602 Determine the minimum depth of the intention and the first feature text based on the depth information of the intention in the target domain in the intention knowledge graph.
  • The depth information can be the distance between the node of the intent and the node of the target domain, determined from the level of the entity-relationship-entity triple representing the intent in the intent knowledge graph, where each entity in the triple can be regarded as a node.
  • The minimum depth can be understood as the smaller of the depth information of the intent and that of the first feature text, and can be expressed as d_min = min(d_n1, d_n2), where d_n1 represents the depth information of the first feature text and d_n2 represents the depth information of the intent.
  • Step 603 Based on the semantic distance and the minimum depth, determine the joint similarity between the first feature text and the intention.
  • The joint similarity between the first feature text and the intent is determined based on the semantic distance and the minimum depth. Owing to the introduction of depth, for the same first feature text, even when similar intents exist at different depths, the joint similarity, and thus the intent of the first feature text, can be accurately determined.
  • the intention recognition method first determines the semantic distance between the first feature text of the information to be recognized and the intention; secondly, based on the intention in the target field in the intention knowledge graph, Depth information in the intention knowledge graph is used to determine the minimum depth between the intention and the first feature text; finally, based on the semantic distance and the minimum depth, the distance between the first feature text and the intention is determined. Joint similarity; in this way, for the same first feature text, when similar intentions exist at different depths, the intention of the first feature text can be accurately determined.
  • In a possible implementation, step 603 of determining the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth includes the following steps 701 to 703:
  • Step 701 Determine a first operation result based on preset parameters and the minimum depth.
  • Step 702 Weight the first operation result and the semantic distance to determine the second operation result.
  • Step 703 Determine the joint similarity based on the ratio of the first operation result and the second operation result.
  • In formula (7), Sim(n1, n2) = β · d(n1, n2) / (β · d(n1, n2) + Dist(n1, n2)), where β is the preset parameter, d(n1, n2) is the minimum depth of keyword n1 and intent n2, and Dist(n1, n2) is the semantic distance between keyword n1 and intent n2; generally, β is set based on empirical data in the range [1.2, 1.8].
  • the intention recognition method provided by the embodiment of the present disclosure first determines the first operation result based on preset parameters and the minimum depth; secondly, weights the first operation result and the semantic distance to determine the second operation result; Finally, the joint similarity is determined based on the ratio of the first operation result and the second operation result; in this way, the joint similarity that measures similar intentions in the intention knowledge graph can be determined by combining the semantic distance and the minimum depth.
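Steps 701 to 703 above can be sketched directly: the first result combines the preset parameter with the minimum depth, the second result weights in the semantic distance, and their ratio is the joint similarity. The exact weighting in formula (7) did not survive extraction, so the form below (first result = beta times minimum depth, second result = first result plus distance) is an assumption consistent with those steps:

```python
# Hedged sketch of steps 701-703. The weighting is an assumed form:
#   first  = beta * d_min                (step 701)
#   second = first + dist               (step 702, weighted with distance)
#   joint  = first / second             (step 703, ratio)
# beta is the preset parameter, empirically in [1.2, 1.8].
def joint_similarity(d_min: float, dist: float, beta: float = 1.5) -> float:
    first = beta * d_min
    second = first + dist
    return first / second

# At equal semantic distance, the deeper (more specific) intent scores higher:
print(joint_similarity(d_min=3.0, dist=1.0) > joint_similarity(d_min=2.0, dist=1.0))
# True
```

This captures the point of the surrounding text: depth breaks ties that semantic distance alone cannot.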
  • In a possible implementation, the above step 203 of determining the joint similarity between the first feature text and the intent, based on the depth information of the intent of the target domain in the intent knowledge graph and the semantic distance between the first feature text of the information to be recognized and the intent, also includes the following steps 801 to 803:
  • Step 801 Determine the semantic distance between the first feature text of the information to be recognized and the intention.
  • Step 802 In the case where the semantic distance between two or more intentions and the first feature text is less than a first preset threshold, use the two or more intentions as candidate intentions.
  • the first preset threshold is determined through empirical data.
  • The case where the semantic distance between each of two or more intents and the first feature text is less than the first preset threshold can be understood as a situation in which the intent of the first feature text cannot be determined from semantic distance alone; the intents whose semantic distance is smaller than the first preset threshold are then taken as candidate intents.
  • For example, if the first feature text "watching dramas, blurry" has two intents, "Ultra HD video - stuttering" and "AR/VR - stuttering", whose semantic distances are smaller than the first preset threshold, then both are taken as candidate intents.
  • Step 803 Based on the depth information of the candidate intent in the intent knowledge graph and the semantic distance between the first feature text and the candidate intent, determine the joint similarity between the first feature text and the candidate intent.
  • the joint similarity between the first feature text and the candidate intent is determined with reference to formula (7).
  • The difference is that the intent in formula (7) can be any intent in the intent knowledge graph, whereas the intent in this step is any of the candidate intents.
  • The intent recognition method first determines the semantic distance between the first feature text of the information to be recognized and the intents; secondly, when the semantic distance between each of two or more intents and the first feature text is less than the first preset threshold, those intents are taken as candidate intents; finally, the joint similarity between the first feature text and each candidate intent is determined based on the depth information of the candidate intent in the intent knowledge graph and the semantic distance between the first feature text and the candidate intent. In this way, when the intent of the first feature text cannot be determined through semantic distance alone, combining semantic distance and minimum depth allows the intent to be accurately determined among the candidate intents.
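Steps 801 to 803, followed by the selection in step 204, can be sketched end to end. All the distances, depths, thresholds, and the joint-similarity weighting below are illustrative assumptions; the intent names follow the "watching dramas, blurry" example:

```python
# Hedged end-to-end sketch of steps 801-803 plus step 204: keep intents whose
# semantic distance to the first feature text is below the first threshold,
# then pick the candidate with the highest joint similarity. The similarity
# form and all numbers are illustrative assumptions.
def joint_similarity(d_min, dist, beta=1.5):
    return beta * d_min / (beta * d_min + dist)

intents = {  # intent -> (semantic distance, minimum depth); assumed values
    "Ultra HD video - stuttering": (0.8, 3.0),
    "AR/VR - stuttering":          (0.9, 2.0),
    "smart lock - offline":        (4.0, 2.0),
}
first_threshold = 1.0

# Step 802: two intents fall below the threshold, so both become candidates.
candidates = {k: v for k, v in intents.items() if v[0] < first_threshold}

# Steps 803 / 204: joint similarity breaks the tie that distance alone cannot.
target = max(candidates,
             key=lambda k: joint_similarity(candidates[k][1], candidates[k][0]))
print(target)  # Ultra HD video - stuttering
```

Note how "smart lock - offline" is filtered out by distance first, so depth only has to arbitrate between genuinely close candidates.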
  • In a possible implementation, the above step 204 of determining the target intent corresponding to the information to be recognized based on the joint similarity between the first feature text and the intent includes: taking the intent whose joint similarity with the first feature text is greater than a second preset threshold as the target intent corresponding to the information to be recognized.
  • the second preset threshold can be determined through empirical data.
  • the intention with the largest joint similarity can be used as the target intention corresponding to the information to be identified.
  • The target intent corresponding to the information to be recognized can also be executed. Specifically, the target intent can be sent to the system development state (offline model training) and running state (online operation) for strategy analysis and decision-making; the execution results of the target intent are then fed back to the user through the system interface.
  • The intent whose joint similarity with the first feature text is greater than the second preset threshold is taken as the target intent corresponding to the information to be recognized; in this way, when two or more similar intents exist for the same information to be recognized, the target intent is accurately determined through the joint similarity.
  • an intention recognition device 900 in an embodiment of the present disclosure.
  • Figure 9 shows a schematic architecture diagram of an intention recognition device 900.
  • the intention recognition device 900 includes: an information acquisition module 901, a target domain determination module 902, a joint similarity determination module 903 and a target intention determination module 904, wherein:
  • Information acquisition module 901, used to acquire the information to be recognized;
  • the target area determination module 902 is used to determine the target area corresponding to the information to be identified from multiple preset areas;
  • The joint similarity determination module 903 is used to determine the joint similarity between the first feature text of the information to be recognized and the intent, based on the depth information of the intent of the target domain in the intent knowledge graph and the semantic distance between the first feature text and the intent;
  • the target intention determination module 904 is configured to determine the target intention corresponding to the information to be recognized based on the joint similarity between the first feature text and the intention.
  • the target domain determination module 902 is specifically configured to extract a second feature text set of the information to be identified; and perform domain prediction for each second feature text in the second feature text set. , and determine the target field corresponding to the information to be identified based on the prediction result of each second feature text.
  • The target domain determination module 902 is further configured to determine the association probability between each second feature text and each of the multiple preset domains; take the preset domain with the highest association probability as the domain corresponding to the second feature text, to obtain the domains corresponding to the second feature text set; and take the most frequently occurring domain among them as the target domain corresponding to the information to be recognized.
  • The joint similarity determination module 903 is specifically used to determine the semantic distance between the first feature text of the information to be recognized and the intent; determine the minimum depth of the intent and the first feature text based on the depth information of the intent of the target domain in the intent knowledge graph; and determine the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth.
  • the joint similarity determination module 903 is further configured to determine a first operation result based on preset parameters and the minimum depth; and weight the first operation result and the semantic distance. , determine the second operation result; determine the joint similarity based on the ratio of the first operation result and the second operation result.
  • the joint similarity determination module 903 is specifically used to determine the semantic distance between the first feature text of the information to be identified and the intention; when there are two or more intentions, When the semantic distance to the first feature text is less than the first preset threshold, the two or more intentions are used as candidate intentions; based on the depth information of the candidate intentions in the intention knowledge graph, and the semantic distance between the first feature text and the candidate intention, and determine the joint similarity between the first feature text and the candidate intention.
  • the target intention determination module 904 is specifically configured to use the intention whose joint similarity with the first feature text is greater than a second preset threshold as the intention corresponding to the information to be identified. Goal intention.
  • Exemplary embodiments of the present disclosure also provide a computer-readable storage medium, which can be implemented in the form of a program product including program code.
  • when the program product runs on an electronic device, the program code causes the electronic device to perform the steps described in the "Exemplary Methods" section of this specification above according to various exemplary embodiments of the present disclosure.
  • the program product may be implemented as a portable compact disc read-only memory (CD-ROM) including the program code, and may run on an electronic device such as a personal computer.
  • however, the program product of the present disclosure is not limited thereto.
  • a readable storage medium may be any tangible medium containing or storing a program that may be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may take the form of one or more readable media in any combination.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar.
  • the program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • an exemplary embodiment of the present disclosure also provides an electronic device 1000, which can be a backend server of an information platform.
  • the electronic device 1000 will be described below with reference to FIG. 10 . It should be understood that the electronic device 1000 shown in FIG. 10 is only an example and should not bring any limitations to the functions and scope of use of the embodiments of the present disclosure.
  • electronic device 1000 is embodied in the form of a general computing device.
  • the components of the electronic device 1000 may include, but are not limited to: at least one processing unit 1010, at least one storage unit 1020, and a bus 1030 connecting different system components (including the storage unit 1020 and the processing unit 1010).
  • the storage unit stores program code, and the program code can be executed by the processing unit 1010, so that the processing unit 1010 performs the steps according to various exemplary embodiments of the present invention described in the "Exemplary Method" section of this specification.
  • the processing unit 1010 may perform the method steps shown in FIG. 2 and the like.
  • the storage unit 1020 may include a volatile storage unit, such as a random access storage unit (RAM) 1021 and/or a cache storage unit 1022, and may further include a read-only storage unit (ROM) 1023.
  • Storage unit 1020 may also include a program/utility 1024 having a set of (at least one) program modules 1025, including but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
  • Bus 1030 may include a data bus, an address bus, and a control bus.
  • Electronic device 1000 may also communicate with one or more external devices 2000 (eg, keyboard, pointing device, Bluetooth device, etc.), which communication may occur through input/output (I/O) interface 1040.
  • Electronic device 1000 may also communicate with one or more networks (eg, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through network adapter 1050.
  • network adapter 1050 communicates with other modules of electronic device 1000 via bus 1030.
  • other hardware and/or software modules may be used in conjunction with electronic device 1000, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
  • any step of the above intent recognition method can be implemented when the stored program code is executed.
  • although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory.
  • the features and functions of two or more modules or units described above may be embodied in one module or unit.
  • conversely, the features and functions of one module or unit described above can be further divided into and embodied by multiple modules or units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)

Abstract

Provided are an intent recognition method and apparatus, a storage medium, and an electronic device. The intent recognition method includes: obtaining information to be identified (S201); determining, from a plurality of preset domains, a target domain corresponding to the information to be identified (S202); determining a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph, based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent (S203); and determining a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent (S204). Because both the semantic distance and the depth information of the intent knowledge graph are taken into account, the target intent corresponding to a given piece of information to be identified can be determined accurately even when multiple similar intents exist. (FIG. 2)

Description

Intent recognition method and apparatus, storage medium, and electronic device
This application claims priority to the Chinese patent application No. 202210570098.9, filed on May 24, 2022 and entitled "Intent recognition method and apparatus, storage medium, and electronic device", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of natural language processing, and in particular to an intent recognition method and apparatus, a storage medium, and an electronic device.
Background
With the continuous development of networks and the ongoing maturation of 5G, operators' networks are becoming increasingly complex and service requirements increasingly diverse, which keeps raising the complexity of operations management.
In the related art, user intent cannot be identified accurately; therefore, in order to better satisfy user needs, user intent needs to be identified accurately.
Summary
The present disclosure provides an intent recognition method and apparatus, a storage medium, and an electronic device.
According to a first aspect of the present disclosure, an intent recognition method is provided, the method including: obtaining information to be identified; determining, from a plurality of preset domains, a target domain corresponding to the information to be identified; determining a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph, based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent; and determining a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent.
According to a second aspect of the present disclosure, an intent recognition apparatus is provided, the apparatus including: an information obtaining module configured to obtain information to be identified; a target domain determination module configured to determine, from a plurality of preset domains, a target domain corresponding to the information to be identified; a joint similarity determination module configured to determine a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph, based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent; and a target intent determination module configured to determine a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent.
According to a third aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the above method.
According to a fourth aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the above method by executing the executable instructions.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an application scenario of an intent recognition method according to an exemplary embodiment;
FIG. 2 is a flowchart of an intent recognition method according to an exemplary embodiment;
FIG. 3 is a schematic diagram of an intent knowledge graph built for a 5G customer-complaint service scenario in an intent recognition method according to an exemplary embodiment;
FIG. 4 is a flowchart of an intent recognition method according to an exemplary embodiment;
FIG. 5 is a flowchart of an intent recognition method according to an exemplary embodiment;
FIG. 6 is a flowchart of an intent recognition method according to an exemplary embodiment;
FIG. 7 is a flowchart of an intent recognition method according to an exemplary embodiment;
FIG. 8 is a flowchart of an intent recognition method according to an exemplary embodiment;
FIG. 9 is a schematic structural diagram of an intent recognition apparatus according to an exemplary embodiment;
FIG. 10 is a schematic structural diagram of an electronic device according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be implemented in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concepts of the exemplary embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, apparatuses, steps, and the like employed instead. In other instances, well-known technical solutions are not shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and repeated descriptions thereof are omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor apparatuses and/or microcontroller apparatuses.
The flowcharts shown in the drawings are merely exemplary and need not include all steps. For example, some steps may be further decomposed, while others may be merged or partially merged, so the actual execution order may change depending on the actual situation.
In the related art, intent recognition of user needs relies on annotation information and is not fully applicable to unstructured user intents, so user needs cannot be satisfied accurately; therefore, in order to better satisfy user needs, intent recognition of user needs must be performed accurately.
In view of the above problems, an embodiment of the present disclosure provides an intent recognition method. First, information to be identified is obtained; second, a target domain corresponding to the information to be identified is determined from a plurality of preset domains; third, a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph is determined based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent; finally, a target intent corresponding to the information to be identified is determined based on the joint similarity between the first feature text and the intent. Because both the semantic distance and the depth information of the intent knowledge graph are taken into account, the target intent corresponding to a given piece of information to be identified can be determined accurately even when multiple similar intents exist.
The application environment of the intent recognition method provided by the embodiments of the present disclosure is briefly introduced below:
Referring to FIG. 1, the intent recognition method provided by the embodiments of the present disclosure is applied to an operator operations management system 100, which includes at least: a network layer 101, a collection and control layer 102, a resource management layer 103, a service layer 104, a business layer 105, and an intent layer 106.
The network layer 101 collects user input information; the collection and control layer 102 obtains the user input information from the network layer 101; the resource management layer 103 manages the resources involved in the services supported by the operator operations management system 100 (for example, marketing resources and cloud-network resources); the service layer 104 opens different services to the business layer 105 and presents intent execution results to the user; the business layer 105 provides capabilities opened both internally and externally; the intent layer 106 builds the intent knowledge graph and performs intent recognition on the user input information.
In one possible implementation, the operator operations management system 100 is applied to a customer service scenario; in another, to a complaint scenario; in yet another, to human-computer interaction scenarios such as service handling, which is not limited here.
The following description takes the above intent layer 106 as the execution subject and applies the intent recognition method to the intent layer 106 to determine user intent as an example. Referring to FIG. 2, the intent recognition method provided by the embodiments of the present disclosure includes the following steps 201 to 204:
Step 201: Obtain information to be identified.
The information to be identified may be in different formats, such as structured, semi-structured, or unstructured information. Structured information may be information represented in a fixed format, such as tables or databases; unstructured information may be text, audio, video, pictures, and the like; semi-structured information lies between the two and can be understood as information obtained after making substantial structural changes to structured information.
The information to be identified may be information input by a user. In one possible implementation, it is obtained as follows: the information to be identified, input by the user through a front-end page or an interface provided by the system, is received and sent to the back end for data processing.
Step 202: Determine, from a plurality of preset domains, the target domain corresponding to the information to be identified.
The plurality of preset domains may be determined according to the application scenario. For example, in a customer complaint scenario they may be determined by service type; specifically, where the service types include wireless home entertainment, smart home, and so on, wireless home entertainment and smart home are taken as the preset domains.
The target domain may be determined by a classifier; in one possible implementation, by a naive Bayes classifier. Further, if the information to be identified is structured, the naive Bayes classifier can be applied directly to determine its target domain. If the information to be identified is unstructured, it is first segmented into words; trunk extraction is then performed on the segmentation result to obtain a set of center words; the naive Bayes classifier is then applied to the center-word set, and the target domain corresponding to the information to be identified is determined from the classification result. When the information to be identified is audio, the audio is converted to text before word segmentation.
Step 203: Determine the joint similarity between the first feature text of the information to be identified and an intent under the target domain in the intent knowledge graph, based on the depth information of the intent in the intent knowledge graph and the semantic distance between the first feature text and the intent.
In one possible implementation, before step 203, the intent knowledge graph may be built in advance for the application scenario; in another, an intent knowledge graph stored in a database may be used directly. The intent knowledge graph can be built by performing knowledge extraction, knowledge fusion, knowledge processing, and knowledge updating on data of the application scenario. FIG. 3 is a schematic diagram of an intent knowledge graph built for a 5G customer-complaint service scenario. Here, an intent can be understood as the meaning represented by an entity-relation-entity triple in the intent knowledge graph, and the depth information of an intent in the graph is determined by the level of its corresponding entity-relation-entity triple. For example, for the HD video-stutter triple in the intent knowledge graph of FIG. 3, the entity "HD video" is at level 2 and the entity "stutter" at level 3; the average level of the two entities can then be taken as the level (depth information) of the HD video-stutter triple, i.e., 2.5 is taken as the depth information of the HD video-stutter triple.
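The triple-depth convention in the example above (HD video at level 2, stutter at level 3, depth 2.5) reduces to a simple average of the two entity levels; a minimal sketch:

```python
def triple_depth(head_level: int, tail_level: int) -> float:
    """Depth of an entity-relation-entity triple: mean of its two entity levels."""
    return (head_level + tail_level) / 2
```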
The first feature text can be understood as keywords of the information to be identified, obtained by performing keyword extraction on that information.
Because entity-relation-entity triples at different depths may represent similar intents, introducing depth information and determining the joint similarity from both the depth information and the semantic distance makes it possible to identify the user's need more accurately among similar intents.
Step 204: Determine the target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent.
With the intent recognition method provided by the embodiments of the present disclosure, first, information to be identified is obtained; second, the target domain corresponding to the information to be identified is determined from a plurality of preset domains; third, the joint similarity between the first feature text of the information to be identified and an intent is determined based on the depth information of the intent under the target domain in the intent knowledge graph and the semantic distance between the first feature text and the intent; finally, the target intent corresponding to the information to be identified is determined based on the joint similarity. Because both the semantic distance and the depth information of the intent knowledge graph are taken into account, the target intent corresponding to a given piece of information to be identified can be determined accurately even when multiple similar intents exist.
Referring to FIG. 4, in an optional embodiment of the present disclosure, step 202 of determining, from a plurality of preset domains, the target domain corresponding to the information to be identified includes the following steps 401 and 402:
Step 401: Extract a second feature text set from the information to be identified.
The second feature text set may correspond to the above center-word set. In one possible implementation, it is obtained by performing trunk extraction on the information to be identified. For example, trunk extraction on the information "the loading time of game A has become longer" yields the second feature text set: game A, long loading time.
Step 402: Perform domain prediction on each second feature text in the second feature text set, and determine the target domain corresponding to the information to be identified based on the prediction result for each second feature text.
Domain prediction for each second feature text in the set may be performed by a naive Bayes classifier. For example, if the second feature text set includes "game A" and "long loading time", the naive Bayes classifier performs domain prediction on the second feature text "game A" and the second feature text "long loading time" separately, and the target domain corresponding to the information to be identified is then determined from the prediction result for each second feature text.
In the embodiments of the present disclosure, by extracting the second feature text set from the information to be identified, performing domain prediction on each second feature text in the set, and determining the target domain corresponding to the information to be identified based on each prediction result, the target domain can be determined, achieving domain classification of the information to be identified.
Referring to FIG. 5, in an optional embodiment of the present disclosure, step 402 of performing domain prediction on each second feature text in the second feature text set and determining the target domain based on each prediction result includes the following steps 501 to 503:
Step 501: Determine the association probability between each second feature text and each of the plurality of preset domains.
The association probability between each second feature text and each of the plurality of preset domains may be determined by the naive Bayes algorithm. Here, the association probability can be understood as the probability that the second feature text belongs to the preset domain; for example, when the second feature text is x_n and the preset domains are Y = {y_1, y_2, ..., y_m}, the association probability is the probability that x_n belongs to a domain y_i, which can be denoted P(x_n | y_i).
In one possible implementation, when the second feature text set is X = {x_1, x_2, ..., x_n} and the preset domains are Y = {y_1, y_2, ..., y_m}, the second feature texts can be processed in batch; specifically, the second feature text set is first vectorized and then computed with the naive Bayes algorithm, as shown in formulas (1) and (2):
P(y_i | X) = P(y_i) ∏_{k=1}^{n} P(x_k | y_i) / P(X)   (1);
y = argmax_{y_i ∈ Y} P(y_i) ∏_{k=1}^{n} P(x_k | y_i)   (2);
Step 502: Take the preset domain with the highest association probability as the domain corresponding to the second feature text, obtaining the domains corresponding to the second feature text set.
When the second feature text set is X = {x_1, x_2, ..., x_n} and the preset domains are Y = {y_1, y_2, ..., y_m}, performing domain prediction on the second feature texts x_1, x_2, ..., x_n yields the domain corresponding to each second feature text. For example, as shown in Table 1, second feature texts x_1 through x_n all correspond to domain y_1, where y_1 denotes the wireless home entertainment domain.
Table 1 (table content not reproduced in this text)
Step 503: Take the domain occurring most frequently among the domains corresponding to the second feature text set as the target domain corresponding to the information to be identified.
Based on the method of FIG. 5, first, the association probability between each second feature text and each of the plurality of preset domains is determined; second, the preset domain with the highest association probability is taken as the domain corresponding to the second feature text, obtaining the domains corresponding to the second feature text set; finally, the domain occurring most frequently among them is taken as the target domain corresponding to the information to be identified. In this way, when multiple second feature texts are extracted from the information to be identified, the information can be classified accurately and its target domain determined.
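Step 503 above amounts to a majority vote over the per-keyword domain predictions; a minimal sketch (the domain labels are placeholders):

```python
from collections import Counter

def target_domain(per_keyword_domains):
    """Return the domain predicted most often across the second feature texts."""
    return Counter(per_keyword_domains).most_common(1)[0][0]
```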
Referring to FIG. 6, in an optional embodiment of the present disclosure, step 203 of determining the joint similarity between the first feature text and the intent, based on the depth information of the intent under the target domain in the intent knowledge graph and the semantic distance between the first feature text of the information to be identified and the intent, includes the following steps 601 to 603:
Step 601: Determine the semantic distance between the first feature text of the information to be identified and the intent.
Before performing step 601, the first feature text of the information to be identified may first be determined. In one possible implementation, the first feature text may be keywords, obtained by performing keyword extraction on the information to be identified; further, keyword extraction may be performed with TF-IDF, computed as shown in formulas (3) to (5):
TF(t) = n_t / Σ_k n_k   (3);
where n_t is the number of occurrences of the term t in the document, and Σ_k n_k is the total number of occurrences of all terms in the document.
IDF(t) = log(N / n)   (4);
where N is the total number of documents, and n is the number of documents containing the term t.
TF-IDF = TF(t) × IDF(t)   (5);
The semantic similarity can be determined from the semantic distance, computed as shown in formula (6):
Sim(n_1, n_2) = α / (Dis(n_1, n_2) + α)   (6);
where n_1 is a keyword; n_2 is an intent in the intent knowledge graph; Dis(n_1, n_2) is the semantic distance; and α, an adjustable parameter, denotes the semantic distance at which the semantic similarity equals 0.5.
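Formulas (3) to (6) can be sketched directly in code; the toy corpus in the usage below is an assumption, and `alpha` is the adjustable parameter at which the similarity equals 0.5:

```python
import math

def tf(term, doc):
    # formula (3): occurrences of t over total term occurrences in the document
    return doc.count(term) / len(doc)

def idf(term, docs):
    # formula (4): log of total documents over documents containing t
    # (call only for terms that occur in at least one document)
    n = sum(1 for d in docs if term in d)
    return math.log(len(docs) / n)

def tf_idf(term, doc, docs):
    # formula (5): product of the two factors above
    return tf(term, doc) * idf(term, docs)

def semantic_similarity(distance, alpha):
    # formula (6): equals 0.5 exactly when distance == alpha
    return alpha / (distance + alpha)
```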
As shown in Table 2, keyword extraction is performed on texts text1 to text6; the extracted keywords are listed in the table below:
Table 2 (table content not reproduced in this text)
As shown in Table 3, semantic similarities are computed between the keywords extracted from text1 to text6 and the intents in the intent knowledge graph; the results are listed in the table below:
Table 3 (table content not reproduced in this text)
Step 602: Determine the minimum depth of the intent and the first feature text based on the depth information of the intent under the target domain in the intent knowledge graph.
The depth information may be the distance between the node of the intent and the node of the target domain, determined from the level of the entity-relation-entity triple representing the intent in the intent knowledge graph, where an entity in the triple can be regarded as a node.
The minimum depth can be understood as the smaller of the depth information of the intent and that of the first feature text; it is therefore determined as d_min = min(d_{n_1}, d_{n_2}), where d_{n_1} denotes the depth information of the first feature text and d_{n_2} denotes the depth information of the intent.
Step 603: Determine the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth.
Here, the joint similarity between the first feature text and the intent is determined by combining the semantic distance and the minimum depth. Because depth is introduced, the intent of a given first feature text can be determined accurately even when similar intents exist at different depths.
With the intent recognition method provided by the embodiments of the present disclosure, first, the semantic distance between the first feature text of the information to be identified and the intent is determined; second, the minimum depth of the intent and the first feature text is determined based on the depth information of the intent under the target domain in the intent knowledge graph; finally, the joint similarity between the first feature text and the intent is determined based on the semantic distance and the minimum depth. In this way, the intent of a given first feature text can be determined accurately even when similar intents exist at different depths.
Referring to FIG. 7, in an optional embodiment of the present disclosure, step 603 of determining the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth includes the following steps 701 to 703:
Step 701: Determine a first operation result based on a preset parameter and the minimum depth.
Step 702: Weight the first operation result and the semantic distance to determine a second operation result.
Step 703: Determine the joint similarity based on the ratio of the first operation result to the second operation result.
The joint similarity is computed as shown in formula (7):
Sim'(n_1, n_2) = β · d_min(n_1, n_2) / (Dis(n_1, n_2) + β · d_min(n_1, n_2))   (7);
where β is the preset parameter; d_min(n_1, n_2) is the minimum depth of the keyword n_1 and the intent n_2; and Dis(n_1, n_2) is the semantic distance between the keyword n_1 and the intent n_2. Generally, β is set based on empirical data, in the range [1.2, 1.8].
The numerator of formula (7) can be understood as the first operation result; the denominator of formula (7) can be understood as the second operation result.
With the intent recognition method provided by the embodiments of the present disclosure, first, the first operation result is determined based on the preset parameter and the minimum depth; second, the first operation result and the semantic distance are weighted to determine the second operation result; finally, the joint similarity is determined based on the ratio of the first operation result to the second operation result. In this way, a joint similarity for measuring similar intents in the intent knowledge graph can be determined by combining the semantic distance and the minimum depth.
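Steps 701 to 703 can be combined into one function; the unit weights in the denominator are an assumption where the weighting is left open, and the default β = 1.5 simply sits inside the stated empirical range [1.2, 1.8]:

```python
def joint_similarity(distance, depth_keyword, depth_intent, beta=1.5):
    """Joint similarity: beta * d_min / (Dis + beta * d_min)."""
    d_min = min(depth_keyword, depth_intent)  # minimum depth (step 602)
    first = beta * d_min                      # first operation result (step 701)
    second = first + distance                 # second operation result (step 702, unit weights assumed)
    return first / second                     # ratio (step 703)
```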
Referring to FIG. 8, in an optional embodiment of the present disclosure, step 203 of determining the joint similarity between the first feature text and the intent, based on the depth information of the intent under the target domain in the intent knowledge graph and the semantic distance between the first feature text of the information to be identified and the intent, further includes the following steps 801 to 803:
Step 801: Determine the semantic distance between the first feature text of the information to be identified and the intent.
The semantic distance is computed as the Dis(n_1, n_2) term in formula (6) or formula (7) above.
Step 802: When the semantic distances between two or more intents and the first feature text are less than a first preset threshold, take the two or more intents as candidate intents.
The first preset threshold is determined from empirical data. The case where two or more intents have semantic distances to the first feature text below the first preset threshold can be understood as the case where the intent of the first feature text cannot be determined from the semantic distance alone; the intents below the first preset threshold are then taken as candidate intents. For example, if, after computing the semantic distances, the first feature text "watching shows, blurry" has two intents with semantic distances below the first preset threshold, "ultra-HD video-stutter" and "AR/VR-stutter", then the intents "ultra-HD video-stutter" and "AR/VR-stutter" are taken as candidate intents.
Step 803: Determine the joint similarity between the first feature text and the candidate intent, based on the depth information of the candidate intent in the intent knowledge graph and the semantic distance between the first feature text and the candidate intent.
Here, the joint similarity between the first feature text and a candidate intent is determined with reference to formula (7); the difference is that the intent in formula (7) can be any intent in the intent knowledge graph, whereas the intent in this step is any one of the candidate intents.
With the intent recognition method provided by the embodiments of the present disclosure, first, the semantic distance between the first feature text of the information to be identified and the intent is determined; second, when two or more intents have semantic distances to the first feature text below the first preset threshold, the two or more intents are taken as candidate intents; finally, the joint similarity between the first feature text and each candidate intent is determined based on the depth information of the candidate intent in the intent knowledge graph and the semantic distance between the first feature text and the candidate intent. In this way, when the intent of the first feature text cannot be determined from the semantic distance alone, combining the semantic distance and the minimum depth makes it possible to determine the intent of the first feature text accurately among the candidate intents.
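The fallback in steps 801 to 803 — re-ranking by joint similarity only when two or more intents tie under the semantic-distance threshold — can be sketched as follows; the dictionaries, threshold values, and intent names in the test are illustrative assumptions:

```python
def resolve_intent(distances, joint_sims, first_threshold):
    """Pick an intent for one first feature text.

    distances / joint_sims: dict mapping intent -> score for that feature text.
    If two or more intents fall below the semantic-distance threshold, re-rank
    those candidates by joint similarity (steps 802-803); otherwise keep the
    nearest intent by semantic distance alone.
    """
    candidates = [i for i, d in distances.items() if d < first_threshold]
    if len(candidates) >= 2:
        return max(candidates, key=lambda i: joint_sims[i])
    return min(distances, key=distances.get)
```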
In an optional embodiment of the present disclosure, step 204 of determining the target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent includes: taking an intent whose joint similarity with the first feature text is greater than a second preset threshold as the target intent corresponding to the information to be identified.
The second preset threshold may be determined from empirical data.
In some embodiments, the intent with the highest joint similarity may be taken as the target intent corresponding to the information to be identified.
After this step, the target intent corresponding to the information to be identified may also be executed. Specifically, the target intent may be delivered to the system's development state (offline model training) and running state (online execution) for execution strategy analysis and decision-making; the execution result of the target intent is then fed back to the user through the system interface.
In the embodiments of the present disclosure, by taking an intent whose joint similarity with the first feature text is greater than the second preset threshold as the target intent corresponding to the information to be identified, the target intent can be determined accurately via the joint similarity even when two or more similar intents (candidate intents) exist for the same information to be identified.
Referring to FIG. 9, to implement the above intent recognition method, an embodiment of the present disclosure provides an intent recognition apparatus 900. FIG. 9 shows a schematic architecture diagram of the intent recognition apparatus 900, which includes: an information obtaining module 901, a target domain determination module 902, a joint similarity determination module 903, and a target intent determination module 904, wherein:
the information obtaining module 901 is configured to obtain information to be identified;
the target domain determination module 902 is configured to determine, from a plurality of preset domains, a target domain corresponding to the information to be identified;
the joint similarity determination module 903 is configured to determine a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph, based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent;
the target intent determination module 904 is configured to determine a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent.
In an optional embodiment, the target domain determination module 902 is specifically configured to extract a second feature text set from the information to be identified; and perform domain prediction on each second feature text in the second feature text set, determining the target domain corresponding to the information to be identified based on the prediction result for each second feature text.
In an optional embodiment, the target domain determination module 902 is further configured to determine the association probability between each second feature text and each of the plurality of preset domains; take the preset domain with the highest association probability as the domain corresponding to the second feature text, obtaining the domains corresponding to the second feature text set; and take the domain occurring most frequently among the domains corresponding to the second feature text set as the target domain corresponding to the information to be identified.
In an optional embodiment, the joint similarity determination module 903 is specifically configured to determine the semantic distance between the first feature text of the information to be identified and the intent; determine the minimum depth of the intent and the first feature text based on the depth information of the intent under the target domain in the intent knowledge graph; and determine the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth.
In an optional embodiment, the joint similarity determination module 903 is further configured to determine a first operation result based on a preset parameter and the minimum depth; weight the first operation result and the semantic distance to determine a second operation result; and determine the joint similarity based on the ratio of the first operation result to the second operation result.
In an optional embodiment, the joint similarity determination module 903 is specifically configured to determine the semantic distance between the first feature text of the information to be identified and the intent; when two or more intents have semantic distances to the first feature text below a first preset threshold, take the two or more intents as candidate intents; and determine the joint similarity between the first feature text and each candidate intent based on the depth information of the candidate intent in the intent knowledge graph and the semantic distance between the first feature text and the candidate intent.
In an optional embodiment, the target intent determination module 904 is specifically configured to take an intent whose joint similarity with the first feature text is greater than a second preset threshold as the target intent corresponding to the information to be identified.
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium, which may be implemented in the form of a program product including program code. When the program product runs on an electronic device, the program code causes the electronic device to perform the steps described in the "Exemplary Methods" section of this specification above according to various exemplary embodiments of the present disclosure. In one embodiment, the program product may be implemented as a portable compact disc read-only memory (CD-ROM) including the program code, and may run on an electronic device such as a personal computer. However, the program product of the present disclosure is not limited thereto; in this document, a readable storage medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.
The program product may take the form of one or more readable media in any combination. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
Program code contained on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider). In the embodiments of the present disclosure, when the program code stored in the computer-readable storage medium is executed, any step of the above intent recognition method can be implemented.
Referring to FIG. 10, an exemplary embodiment of the present disclosure also provides an electronic device 1000, which may be a back-end server of an information platform. The electronic device 1000 is described below with reference to FIG. 10. It should be understood that the electronic device 1000 shown in FIG. 10 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 10, the electronic device 1000 is embodied in the form of a general-purpose computing device. The components of the electronic device 1000 may include, but are not limited to: at least one processing unit 1010, at least one storage unit 1020, and a bus 1030 connecting different system components (including the storage unit 1020 and the processing unit 1010).
The storage unit stores program code, which can be executed by the processing unit 1010 so that the processing unit 1010 performs the steps described in the "Exemplary Methods" section of this specification above according to various exemplary embodiments of the present invention. For example, the processing unit 1010 may perform the method steps shown in FIG. 2, among others.
The storage unit 1020 may include a volatile storage unit, such as a random access storage unit (RAM) 1021 and/or a cache storage unit 1022, and may further include a read-only storage unit (ROM) 1023.
The storage unit 1020 may also include a program/utility 1024 having a set of (at least one) program modules 1025, including but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
The bus 1030 may include a data bus, an address bus, and a control bus.
The electronic device 1000 may also communicate with one or more external devices 2000 (for example, a keyboard, a pointing device, a Bluetooth device, and the like) through an input/output (I/O) interface 1040. The electronic device 1000 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 1050. As shown in the figure, the network adapter 1050 communicates with the other modules of the electronic device 1000 via the bus 1030. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 1000, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In the embodiments of the present disclosure, when the program code stored in the electronic device is executed, any step of the above intent recognition method can be implemented.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory. In fact, according to the exemplary embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided into and embodied by multiple modules or units.
Those skilled in the art will understand that aspects of the present disclosure may be implemented as a system, a method, or a program product. Therefore, aspects of the present disclosure may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, and the like), or an embodiment combining hardware and software, which may be collectively referred to herein as a "circuit", "module", or "system". Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

  1. An intent recognition method, comprising:
    obtaining information to be identified;
    determining, from a plurality of preset domains, a target domain corresponding to the information to be identified;
    determining a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph, based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent;
    determining a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent.
  2. The intent recognition method according to claim 1, wherein the determining, from a plurality of preset domains, a target domain corresponding to the information to be identified comprises:
    extracting a second feature text set from the information to be identified;
    performing domain prediction on each second feature text in the second feature text set, and determining the target domain corresponding to the information to be identified based on a prediction result for each second feature text.
  3. The intent recognition method according to claim 2, wherein the performing domain prediction on each second feature text in the second feature text set and determining the target domain corresponding to the information to be identified based on a prediction result for each second feature text comprises:
    determining an association probability between each second feature text and each of the plurality of preset domains;
    taking the preset domain with the highest association probability as the domain corresponding to the second feature text, to obtain domains corresponding to the second feature text set;
    taking the domain occurring most frequently among the domains corresponding to the second feature text set as the target domain corresponding to the information to be identified.
  4. The intent recognition method according to claim 1, wherein the determining a joint similarity between the first feature text and the intent, based on depth information of the intent under the target domain in the intent knowledge graph and a semantic distance between the first feature text of the information to be identified and the intent, comprises:
    determining the semantic distance between the first feature text of the information to be identified and the intent;
    determining a minimum depth of the intent and the first feature text based on the depth information of the intent under the target domain in the intent knowledge graph;
    determining the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth.
  5. The intent recognition method according to claim 4, wherein the determining the joint similarity between the first feature text and the intent based on the semantic distance and the minimum depth comprises:
    determining a first operation result based on a preset parameter and the minimum depth;
    weighting the first operation result and the semantic distance to determine a second operation result;
    determining the joint similarity based on a ratio of the first operation result to the second operation result.
  6. The intent recognition method according to claim 1, wherein the determining a joint similarity between the first feature text and the intent, based on depth information of the intent under the target domain in the intent knowledge graph and a semantic distance between the first feature text of the information to be identified and the intent, comprises:
    determining the semantic distance between the first feature text of the information to be identified and the intent;
    when semantic distances between two or more intents and the first feature text are less than a first preset threshold, taking the two or more intents as candidate intents;
    determining a joint similarity between the first feature text and the candidate intent based on depth information of the candidate intent in the intent knowledge graph and the semantic distance between the first feature text and the candidate intent.
  7. The intent recognition method according to any one of claims 1 to 6, wherein the determining a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent comprises:
    taking an intent whose joint similarity with the first feature text is greater than a second preset threshold as the target intent corresponding to the information to be identified.
  8. An intent recognition apparatus, comprising:
    an information obtaining module configured to obtain information to be identified;
    a target domain determination module configured to determine, from a plurality of preset domains, a target domain corresponding to the information to be identified;
    a joint similarity determination module configured to determine a joint similarity between a first feature text of the information to be identified and an intent under the target domain in an intent knowledge graph, based on depth information of the intent in the intent knowledge graph and a semantic distance between the first feature text and the intent;
    a target intent determination module configured to determine a target intent corresponding to the information to be identified based on the joint similarity between the first feature text and the intent.
  9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.
  10. An electronic device, comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to perform the method according to any one of claims 1 to 7 by executing the executable instructions.
PCT/CN2023/096071 2022-05-24 2023-05-24 Intent recognition method and apparatus, storage medium and electronic device WO2023227030A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210570098.9 2022-05-24
CN202210570098.9A CN117151107A (zh) 2022-05-24 Intent recognition method and apparatus, storage medium and electronic device

Publications (1)

Publication Number Publication Date
WO2023227030A1 true WO2023227030A1 (zh) 2023-11-30

Family

ID=88906783

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/096071 WO2023227030A1 (zh) 2022-05-24 2023-05-24 一种意图识别方法、装置、存储介质和电子设备

Country Status (2)

Country Link
CN (1) CN117151107A (zh)
WO (1) WO2023227030A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150286709A1 (en) * 2014-04-02 2015-10-08 Samsung Electronics Co., Ltd. Method and system for retrieving information from knowledge-based assistive network to assist users intent
CN111291156A (zh) * 2020-01-21 2020-06-16 同方知网(北京)技术有限公司 一种基于知识图谱的问答意图识别方法
CN111737430A (zh) * 2020-06-16 2020-10-02 北京百度网讯科技有限公司 实体链接方法、装置、设备以及存储介质
CN112560505A (zh) * 2020-12-09 2021-03-26 北京百度网讯科技有限公司 一种对话意图的识别方法、装置、电子设备及存储介质
CN112905774A (zh) * 2021-02-22 2021-06-04 武汉市聚联科软件有限公司 一种基于事理图谱的人机对话深度意图理解方法
CN113127626A (zh) * 2021-04-22 2021-07-16 广联达科技股份有限公司 基于知识图谱的推荐方法、装置、设备及可读存储介质


Also Published As

Publication number Publication date
CN117151107A (zh) 2023-12-01

Similar Documents

Publication Publication Date Title
US10553202B2 (en) Method, apparatus, and system for conflict detection and resolution for competing intent classifiers in modular conversation system
US10664505B2 (en) Method for deducing entity relationships across corpora using cluster based dictionary vocabulary lexicon
US10586155B2 (en) Clarification of submitted questions in a question and answer system
CN107992585B (zh) 通用标签挖掘方法、装置、服务器及介质
US9311823B2 (en) Caching natural language questions and results in a question and answer system
CN106960030B (zh) 基于人工智能的推送信息方法及装置
KR20210104571A (ko) 멀티 모달리티를 기반으로 하는 주제 분류 방법, 장치, 기기 및 저장 매체
US20150310862A1 (en) Deep learning for semantic parsing including semantic utterance classification
US20120259801A1 (en) Transfer of learning for query classification
WO2018045646A1 (zh) 基于人工智能的人机交互方法和装置
US20140207716A1 (en) Natural language processing method and system
US11086941B2 (en) Generating suggestions for extending documents
CN113806588B (zh) 搜索视频的方法和装置
US9092673B2 (en) Computing visual and textual summaries for tagged image collections
US20200167613A1 (en) Image analysis enhanced related item decision
CN114528588A (zh) 跨模态隐私语义表征方法、装置、设备及存储介质
US10572601B2 (en) Unsupervised template extraction
WO2020052059A1 (zh) 用于生成信息的方法和装置
US9600687B2 (en) Cognitive digital security assistant utilizing security statements to control personal data access
CN114298007A (zh) 一种文本相似度确定方法、装置、设备及介质
US20230112385A1 (en) Method of obtaining event information, electronic device, and storage medium
WO2023227030A1 (zh) Intent recognition method and apparatus, storage medium and electronic device
CN111368036B (zh) 用于搜索信息的方法和装置
JP2020016960A (ja) 推定装置、推定方法及び推定プログラム
CN116010607A (zh) 文本聚类方法、装置、计算机系统及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23811100

Country of ref document: EP

Kind code of ref document: A1