WO2023206703A1 - Event slot extraction method and apparatus, storage medium and electronic apparatus - Google Patents

Event slot extraction method and apparatus, storage medium and electronic apparatus

Info

Publication number
WO2023206703A1
Authority
WO
WIPO (PCT)
Prior art keywords
candidate
verb
words
target
relationship
Prior art date
Application number
PCT/CN2022/096436
Other languages
French (fr)
Chinese (zh)
Inventor
苑春明
Original Assignee
青岛海尔科技有限公司
海尔智家股份有限公司
Priority date
Filing date
Publication date
Application filed by 青岛海尔科技有限公司, 海尔智家股份有限公司
Publication of WO2023206703A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • The present disclosure relates to the field of communications, and specifically to an event slot extraction method and device, a storage medium, and an electronic device.
  • In the field of intelligent question answering, user intentions can be understood through slot extraction.
  • The slots involved can include slot information such as time, events, singers, and songs.
  • Compared with other slots, event slots have richer content, which makes their extraction more complex.
  • The dictionary matching method is usually used to extract the event slot from a statement: words in the dictionary are matched against the statement, and the slot in the statement that contains a word matching the dictionary is determined as the event slot.
  • The event slot extraction method in the related art therefore suffers from low extraction accuracy, because event slots are difficult to enumerate exhaustively.
  • Embodiments of the present disclosure provide an event slot extraction method and device, a storage medium, and an electronic device, to at least solve the problem that event slot extraction methods in the related art have low extraction accuracy because event slots are difficult to enumerate exhaustively.
  • According to one aspect, a method for extracting event slots is provided, including: obtaining a target control statement of the event slot to be extracted, wherein the target control statement includes a plurality of words; performing syntactic analysis on the target control statement to determine a group of candidate words from the plurality of words, wherein the number of candidate words in the group is smaller than the number of words in the plurality of words; determining a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated words of that candidate word, wherein there is a specified syntactic relationship between the target word and the associated words of the target word; and determining the target word and the associated words of the target word as the target event slot of the target control statement.
  • According to another aspect, an event slot extraction device is provided, including: an acquisition unit configured to acquire a target control statement of the event slot to be extracted, wherein the target control statement includes a plurality of words; an analysis unit configured to perform syntactic analysis on the target control statement and determine a group of candidate words from the plurality of words, wherein the number of candidate words in the group is smaller than the number of words in the plurality of words; a first determination unit configured to determine a target word from the group of candidate words based on the syntactic relationship between each candidate word in the group and the associated words of that candidate word, wherein there is a specified syntactic relationship between the target word and the associated words of the target word; and a second determination unit configured to determine the target word and the associated words of the target word as the target event slot of the target control statement.
  • According to another aspect, a computer-readable storage medium is provided, which stores a computer program, wherein the computer program is configured to execute the above event slot extraction method when run.
  • According to another aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above event slot extraction method through the computer program.
  • In the embodiments of the present disclosure, the candidate words in the control statement are extracted first, and then the words belonging to the event slot, together with their associated words, are selected based on the syntactic relationship between the candidate words and their associated words.
  • Specifically, the target control statement of the event slot to be extracted is obtained, wherein the target control statement includes multiple words; syntactic analysis is performed on the target control statement to determine a group of candidate words from the multiple words, wherein the number of candidate words in the group is smaller than the number of words in the multiple words; the target word is determined from the group of candidate words according to the syntactic relationship between each candidate word and its associated words, wherein there is a specified syntactic relationship between the target word and the associated words of the target word; and the target word and the associated words of the target word are determined as the target event slot of the target control statement.
  • Because the candidate words in the control statement are obtained, words that can serve as event slots are selected according to the syntactic structure formed by the candidate words and their associated words, and the selected words and their associated words are determined as the extracted event slots, the part-of-speech, syntactic, and sentence-structure features are all taken into account. Event slots can thus be extracted accurately based on syntactic structure, which improves the accuracy of event slot extraction and solves the problem that event slot extraction methods in the related art have low accuracy because event slots are difficult to enumerate exhaustively.
  • Figure 1 is a schematic diagram of the hardware environment of an optional event slot extraction method according to an embodiment of the present disclosure
  • Figure 2 is a schematic flowchart of an optional event slot extraction method according to an embodiment of the present disclosure
  • Figure 3 is a schematic diagram of an optional statement structure according to an embodiment of the present disclosure.
  • Figure 4 is a schematic diagram of another optional statement structure according to an embodiment of the present disclosure.
  • Figure 5 is a schematic diagram of yet another optional statement structure according to an embodiment of the present disclosure.
  • Figure 6 is a schematic diagram of yet another optional statement structure according to an embodiment of the present disclosure.
  • Figure 7 is a schematic diagram of yet another optional statement structure according to an embodiment of the present disclosure.
  • Figure 8 is a schematic flowchart of another optional event slot extraction method according to an embodiment of the present disclosure.
  • Figure 9 is a structural block diagram of an optional event slot extraction device according to an embodiment of the present disclosure.
  • Figure 10 is a structural block diagram of an optional electronic device according to an embodiment of the present disclosure.
  • a method for extracting event slots is provided.
  • The event slot extraction method can be applied to whole-house intelligent digital control scenarios such as the smart home, the smart home device ecosystem, and the intelligent residence (Intelligence House) ecosystem.
  • the above event slot extraction method can be applied to the hardware environment composed of the terminal 102 and the server 104 as shown in Figure 1 .
  • the server 104 is connected to the terminal 102 through the network and can be used to provide services (such as application services, etc.) for the terminal or the client installed on the terminal.
  • Cloud computing and/or edge computing services can be configured on the server or independently of the server, to provide data computing services for the server 104.
  • the above-mentioned network may include but is not limited to at least one of the following: wired network, wireless network.
  • the above-mentioned wired network may include but is not limited to at least one of the following: wide area network, metropolitan area network, and local area network.
  • The above-mentioned wireless network may include at least one of the following: WIFI (Wireless Fidelity), Bluetooth.
  • The terminal 102 may be, but is not limited to, a PC, a mobile phone, a tablet, a smart air conditioner, a smart hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, a smart projection device, a smart TV, a smart clothes drying rack, smart curtains, smart audio and video equipment, smart sockets, smart audio equipment, smart speakers, smart fresh air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, a smart sweeping robot, a smart window cleaning robot, a smart mopping robot, smart air purification equipment, smart steamers, smart microwave ovens, smart kitchen appliances, smart purifiers, smart water dispensers, smart door locks, etc.
  • the event slot extraction method in the embodiment of the present disclosure may be executed by the server 104, may be executed by the terminal 102, or may be executed jointly by the server 104 and the terminal 102. Wherein, the terminal 102 may also perform the event slot extraction method of the embodiment of the present disclosure by a client installed thereon.
  • Figure 2 is a schematic flowchart of an optional event slot extraction method according to an embodiment of the present disclosure. As shown in Figure 2, the process of this method may include the following steps:
  • Step S202 Obtain the target control statement of the event slot to be extracted, where the target control statement includes multiple words.
  • The event slot extraction method in this embodiment can be applied to the scenario of extracting event slots from control statements in the smart home field, and can be applied to systems with human-computer interaction functions, such as voice-based intelligent question answering systems.
  • the user inputs a control statement, and the machine extracts the slot of the control statement, thereby analyzing the user's intention and responding to different user intentions.
  • the slots involved may include slot information such as time, events, singers, and songs.
  • time slots can be extracted using matching rules defined by regular expressions.
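The regular-expression approach to time slots can be illustrated with a minimal sketch. The pattern below is an assumption made for illustration only (the source does not give its rules); it covers a small family of English phrasings such as "at 8 o'clock tomorrow morning", and the names `TIME_PATTERN` and `extract_time_slot` are hypothetical.

```python
import re

# Hypothetical matching rule for time slots; the patent does not specify its rules.
TIME_PATTERN = re.compile(
    r"(?:at\s+)?\d{1,2}\s+o'clock"               # e.g. "at 8 o'clock"
    r"(?:\s+(?:today|tomorrow))?"                # optional day word
    r"(?:\s+(?:morning|afternoon|evening))?",    # optional part of day
    re.IGNORECASE,
)


def extract_time_slot(statement):
    """Return the first time expression matched by the rule, or None."""
    match = TIME_PATTERN.search(statement)
    return match.group(0) if match else None


statement = "Remind me to go to the gym to play basketball at 8 o'clock tomorrow morning"
print(extract_time_slot(statement))
```

Time expressions follow a small, closed set of patterns, which is why a rule of this kind is workable for time slots but not for open-ended event slots.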
  • Singer and song slots can be extracted by maintaining a data dictionary and matching lyrics and music data.
  • Event slots are richer in content, which makes their extraction more complex.
  • For event slots (for example, event slot extraction in the schedule field), the dictionary matching method is usually used to extract event slots from statements.
  • However, because event slots (for example, event slots in the schedule field) are difficult to enumerate exhaustively, the accuracy of this method is low.
  • In addition, if the dictionary library is too large, it will affect the running speed of the system to a certain extent, and the strong matching strategy greatly reduces generalization.
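The dictionary matching method described above can be sketched as follows. The dictionary entries and the function name `match_event_slots` are hypothetical; the point is only to show why a statement whose event is not in the dictionary yields no slot.

```python
# Hypothetical event dictionary; a real one would need to enumerate every event phrase.
EVENT_DICTIONARY = {"play basketball", "go to school", "have a meeting"}


def match_event_slots(statement, dictionary):
    """Return every dictionary entry that occurs verbatim in the statement."""
    lowered = statement.lower()
    return sorted(entry for entry in dictionary if entry in lowered)


print(match_event_slots("Remind me to play basketball at 8 o'clock", EVENT_DICTIONARY))
print(match_event_slots("Remind me to go out to play", EVENT_DICTIONARY))
```

The second call finds nothing because "go out to play" was never added to the dictionary, which is the enumeration problem the text describes. The cost of each lookup also grows with the size of the dictionary.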
  • algorithm models can also be used to automatically extract event slots, and the model can be trained with large-scale data to automatically extract slots.
  • this method is only suitable for slot extraction tasks with a limited number of slots, and has a narrow scope of application. It is more commonly used in tasks with a controllable number of slots, but the generalization controllability of this method is lower than that of the dictionary library. It can be seen that the method of extracting event slots based on dictionary matching and algorithm models cannot take into account the two aspects of generalization and accuracy.
  • For event slots such as "play/v basketball/n" and "go to school", part-of-speech rules can be formulated with the help of part-of-speech analysis, and all phrases in the sentence that match the part-of-speech rules are extracted as event slots.
  • For example, the syntactic structure of a verb plus a noun is considered an event slot, and this generalizes better than a dictionary library.
  • However, matching event slots with part-of-speech rules reduces the accuracy of event slot extraction. For example, because they do not comply with the verb-plus-noun rule, event slots such as "go out to play" and "go to the gym to play basketball" will not be extracted. For another example, "use/v song/n A/n" also satisfies the defined event slot rule, so the song slot is mistakenly extracted as an event slot. This adds a certain degree of difficulty to the semantic understanding of the question answering system.
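The verb-plus-noun rule and its two failure modes can be sketched as follows. The tagged inputs are hand-written assumptions rather than output from a real tagger, and `match_verb_noun` is a hypothetical name.

```python
def match_verb_noun(tagged):
    """Return every adjacent (verb, noun) pair in a list of (word, tag) tuples."""
    return [
        (first_word, second_word)
        for (first_word, first_tag), (second_word, second_tag) in zip(tagged, tagged[1:])
        if first_tag == "v" and second_tag == "n"
    ]


# Hand-tagged examples (assumed tags, for illustration only).
print(match_verb_noun([("play", "v"), ("basketball", "n")]))        # rule matches
print(match_verb_noun([("go", "v"), ("out", "v"), ("play", "v")]))   # event is missed
print(match_verb_noun([("use", "v"), ("song", "n"), ("A", "n")]))    # false positive
```

The second input is an event that the rule misses, and the third is a song slot that the rule wrongly accepts, matching the two problems named in the text.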
  • syntactic analysis can be introduced and matching rules can be formulated to assist in extracting event slots.
  • The control statements are parsed by considering syntactic and sentence-structure features to determine the corresponding event slot, which can improve the accuracy of event slot extraction.
  • the target server (an example of the above-mentioned server 104) can obtain the target control statement of the event slot to be extracted, and the target control statement includes multiple words.
  • the above process of obtaining the target control statement of the event slot to be extracted may include receiving the target control statement sent by the terminal device through a communication connection established with the terminal device.
  • the user can issue a target control statement to his terminal device through voice or other means, and the control statement can be used to control the terminal device to perform the target device operation.
  • the terminal device may send the obtained target control statement to the target server.
  • what the target server obtains may be a target voice instruction containing a control statement, etc., and extracts the target control statement contained therein by parsing the received target voice instruction.
  • For example, the smartphone (an example of the above-mentioned terminal device) may send the received voice command to the server, and the server can parse the voice command and extract the control statement.
  • the device operation indicated by the above target control instruction may be a device operation performed by the terminal device, or may be a device operation performed by controlling other devices (for example, smart home devices) other than the terminal device.
  • the above-mentioned equipment operation may be an operation of adjusting equipment parameters, or may be an operation of setting a schedule, etc.
  • The target control instruction may be an equipment control instruction such as "turn on the water heater tomorrow morning", or a schedule reminder instruction such as "remind me to go to the gym to play basketball at 8 o'clock tomorrow morning", which is not limited in this embodiment.
  • Step S204 Perform syntax analysis on the target control statement to determine a group of candidate words from multiple words, where the number of candidate words included in a group of candidate words is smaller than the number of words included in the multiple words.
  • the target server can perform syntax analysis on the target control statement and determine a group of candidate words from multiple words.
  • The number of candidate words included in the group of candidate words is less than the number of words included in the multiple words.
  • the extracted group of candidate words can be words with a verb part of speech in the target control sentence, or words with a noun part of speech.
  • If the group of candidate words is a group of candidate verbs, a candidate verb can be the core verb in the control statement, any verb in the control statement, a verb after the core verb, a verb before the core verb, a verb whose semantic weight for the control statement exceeds a weight threshold, or a verb located at a non-endpoint position of the target control statement, which is not limited in this embodiment.
  • The above process of syntactic analysis may use a syntactic analysis algorithm (which may be implemented with open source syntax analysis software) to extract the multiple words included in the target control statement (which may mean extracting the core words of the sentence), and to analyze the part of speech of each word and the relationships between the words, that is, to extract the part-of-speech and sentence-structure features of the sentence.
  • The target server may determine a group of candidate words based on the part of speech of each of the multiple words. Alternatively, it may first perform syntactic analysis on the target control statement to determine the core words, and then determine the words that have a specified syntactic relationship with the core words as candidate words, thereby obtaining a group of candidate words. Other methods of determining candidate words may also be used, which is not limited in this embodiment.
  • Step S206 Determine the target word from a group of candidate words based on the syntactic relationship between each candidate word in the group and the associated words of that candidate word, where there is a specified syntactic relationship between the target word and the associated words of the target word.
  • the target server may determine words associated with each candidate word in the set of candidate words.
  • The associated words of each candidate word may be the words adjacent to it, words that have a syntactic relationship with it, or words that have some other association with it, for example, the first word after the candidate word or the first noun after the candidate word, which is not limited in this embodiment.
  • The target server may determine the target word from a group of candidate words based on the syntactic relationship between each candidate word and its associated words. Optionally, a candidate word in the group whose syntactic relationship with its associated word is the specified syntactic relationship can be determined as the target word.
  • For example, take the specified syntactic relationship to be a verb-object relationship or a parallel relationship, and the associated word of each candidate word to be its adjacent word. The candidate words in the control statement "Remind me to go to the gym to play basketball with song A at 8 o'clock tomorrow morning" include "remind", "go", and "play". The word adjacent to "remind" is "me", and the syntactic relationship between the two is neither a verb-object relationship nor a parallel relationship, so "remind" is not a target word. The word adjacent to "go" is "gymnasium", and the syntactic relationship between the two is a verb-object relationship, so "go" is a target word. The word adjacent to "play" is "basketball", and the syntactic relationship between the two is also a verb-object relationship, so "play" is also a target word.
  • Step S208 Determine the target word and related words of the target word as the target event slot of the target control statement.
  • the target server may determine the target word and words associated with the target word as the target event slot of the target control statement.
  • each target word and associated words of each target word may be determined as the target event slot of the target control statement.
  • For example, if the server determines from the control statement "Remind me to go to the gym to play basketball with song A at 8 o'clock tomorrow morning" that the target words include "go" and "play", it can determine both "go to the gym" and "play basketball" as event slots of this control statement.
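Steps S204 to S208 can be sketched end to end on the example statement. Everything below is an illustrative assumption: the parse is written by hand in an LTP-style label scheme rather than produced by a parser, the time and song phrases are omitted, and `Token`, `SPECIFIED_RELATIONS`, and `extract_event_slots` are hypothetical names. The sketch uses the simplest variant described in the text: candidates are all verbs, and the associated word is the word immediately after the candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    index: int  # 1-based position in the statement
    word: str
    pos: str    # part-of-speech tag, e.g. "v", "n", "r"
    head: int   # index of the governing token, 0 for the root
    rel: str    # label of the dependency arc from the head to this token


SPECIFIED_RELATIONS = {"VOB", "COO"}  # verb-object and parallel relationships


def extract_event_slots(tokens):
    slots = []
    candidates = [t for t in tokens if t.pos == "v"]            # S204: candidate verbs
    for verb in candidates:
        position = verb.index  # indices are 1-based, so tokens[position] is the next word
        if position >= len(tokens):
            continue
        neighbour = tokens[position]
        # S206: keep the verb if its following word depends on it via a specified relation
        if neighbour.head == verb.index and neighbour.rel in SPECIFIED_RELATIONS:
            slots.append(f"{verb.word} {neighbour.word}")       # S208: verb + associated word
    return slots


# Hand-written parse (an assumption) of "remind me go gymnasium play basketball".
PARSE = [
    Token(1, "remind", "v", 0, "HED"),
    Token(2, "me", "r", 1, "DBL"),
    Token(3, "go", "v", 1, "VOB"),
    Token(4, "gymnasium", "n", 3, "VOB"),
    Token(5, "play", "v", 3, "COO"),
    Token(6, "basketball", "n", 5, "VOB"),
]

print(extract_event_slots(PARSE))
```

In this sketch "remind" is rejected because its following word "me" is attached by a DBL arc, while "go" and "play" are kept because their following words are attached by VOB arcs.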
  • Through the above steps S202 to S208, the target control statement of the event slot to be extracted is obtained, where the target control statement includes multiple words; the target control statement is syntactically analyzed to determine a group of candidate words from the multiple words, where the number of candidate words in the group is smaller than the number of words in the multiple words; the target word is determined from the group of candidate words based on the syntactic relationship between each candidate word and its associated words, where there is a specified syntactic relationship between the target word and the associated words of the target word; and the target word and the associated words of the target word are determined as the target event slot of the target control statement. This solves the problem that event slot extraction methods in the related art have low accuracy because event slots are difficult to enumerate exhaustively, and improves the accuracy of event slot extraction.
  • the target word is determined from a set of candidate words based on the syntactic relationship between each candidate word in the set of candidate words and associated words of each candidate word, including:
  • Target words include target verbs.
  • The target verb can first be determined from a group of candidate words, and the target event slot of the target control statement can then be determined according to the target verb.
  • The above process of determining the target verb may be: determining the target verb from a group of candidate verbs based on the syntactic relationship between each candidate verb in the group and the associated words of that candidate verb. Here, the group of candidate words includes the group of candidate verbs, and the target word includes the target verb.
  • The syntactic relationship may be the specified syntactic relationship or another syntactic relationship.
  • A candidate verb in the group whose syntactic relationship with its associated word is the specified syntactic relationship may be determined as the target verb.
  • For example, the candidate verbs include "remind", "go", and "play". The word adjacent to "remind" is "me", and the syntactic relationship between the two is neither a verb-object relationship nor a parallel relationship, so "remind" is not a target verb. The word adjacent to "go" is "gymnasium", and the syntactic relationship between the two is a verb-object relationship, so "go" is a target verb. The word adjacent to "play" is "basketball", and the syntactic relationship between the two is also a verb-object relationship, so "play" is also a target verb.
  • the target verb is determined from the control statement according to the syntactic relationship between each candidate verb in the control statement and the associated words of each candidate verb, which can improve the efficiency of determining the target verb.
  • syntactic analysis is performed on the target control statement to determine a set of candidate words from multiple words, including:
  • S21 perform syntactic analysis on the target control statement, and determine multiple words contained in the target control statement and the part-of-speech of each word in the multiple words;
  • S22 Determine each word whose part-of-speech is a verb among the plurality of words as a candidate verb, and obtain a set of candidate verbs.
  • the target server may perform syntax analysis on the target control statement, and determine multiple words included in the target control sentence and the part-of-speech of each word in the multiple words.
  • the target server may also determine a syntactic relationship between each word and words other than the word among the plurality of words. For example, the server can perform syntactic analysis on the control statement "Remind me to go to the gymnasium to play basketball with song A at 8 o'clock tomorrow morning” and obtain the part of speech of each word in the multiple words included in the control statement, as well as the relationship between each word and other words. syntactic relationship between them.
  • The above process of performing syntactic analysis on the target control statement and determining the multiple words it contains and the part of speech of each word may be: analyzing the target control statement with a part-of-speech tagging algorithm to obtain the multiple words contained in the target control statement and the part of speech of each of those words.
  • The part-of-speech tagging algorithm may include at least one of the following: sequence models such as the hidden Markov model (HMM); Markov models in a broad sense, for example, MEMM (Maximum Entropy Markov Model) and CRFs (Conditional Random Fields); and methods ranging from conventional machine learning classifiers (for example, SVM (Support Vector Machine)) to deep learning algorithms represented by RNN (Recurrent Neural Network). This is not limited in this embodiment.
  • nt: temporal noun
  • p: preposition
  • n: general noun
  • v: verb
  • r: pronoun
  • wp: punctuation
  • nl: location noun
  • nd: direction noun
  • HED: head (core word)
  • ATT: attribute (attributive relationship)
  • In the above statements: ADV (adverbial, adverbial structure), POB (preposition-object, preposition-object relationship), DBL (double, pivotal construction), VOB (verb-object, verb-object relationship), COO (coordinate, parallel relationship), SBV (
  • The target server can determine each word whose part of speech is a verb among the multiple words as a candidate verb, and obtain a set of candidate verbs; that is, the set of candidate verbs consists of all the verbs among the multiple words.
  • words whose part-of-speech is verb in the control statement are determined as candidate verbs, which can simplify the determination steps of candidate verbs and improve the efficiency of determining candidate verbs.
  • syntactic analysis is performed on the target control statement to determine a set of candidate words from multiple words, including:
  • a set of candidate verbs may be determined based on core verbs in the target control statement.
  • the target server can perform syntax analysis on the target control statement and determine the core verb of the target control instruction.
  • the above-mentioned core verbs are verbs that dominate other words in the target control instructions and are not dominated by other words.
  • The core verb is the head (HED) of the entire sentence, that is, the word pointed to by root.
  • the core verbs in the schedule field are generally "remind", "call”, etc.
  • The position of the core word in the sentence has reference value for event slot extraction. Extracting event slots based on the syntactic structure formed by the core verb and the words adjacent to it takes syntactic and sentence-structure features into account, which can improve the accuracy of event slot extraction.
  • The words pointed to by root in Figures 3, 4, 5, 6 and 7 are the core verbs of the corresponding control statements.
  • In the control statements shown in these figures, the core words are, in turn, "remind", "go", "remind", "remind" (as shown in Figure 6), and "remember".
  • the target server can determine each verb among the multiple words included in the target control statement that has a specified syntactic relationship with the core verb as a candidate verb, and obtain a set of candidate verbs.
  • The specified syntactic relationship may be a VOB relationship with the core verb or a COO relationship with the core verb, which is not limited in this embodiment.
  • the above group of candidate verbs may include multiple candidate verbs, or may include only one candidate verb, which is not limited in this embodiment.
  • The verb "go", which has a VOB relationship with "remind", can be determined as a candidate verb. Because, apart from the candidate verb "go", the control statement contains no other verb that has a VOB relationship or a COO relationship with the core verb, the set of candidate verbs for this control statement includes only the candidate verb "go".
  • the target server may look for verbs that have a specified syntactic relationship with any candidate verb in the set of candidate verbs.
  • the above-mentioned process of finding verbs that have a specified syntactic relationship with any candidate verb in a group of candidate verbs may be: finding verbs that have a VOB relationship or a COO relationship with any candidate verb in the group of candidate verbs. This is not limited in this embodiment.
  • the found verb can be added as a candidate verb to a set of candidate verbs.
  • the verbs that have a specified syntactic relationship with the core verb in the control statement are determined as candidate verbs, and the verbs that have a specified syntactic relationship with those candidate verbs are also determined as candidate verbs, which can improve the accuracy of determining candidate verbs and avoid missing candidate verbs.
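The candidate-verb collection described above can be sketched as a breadth-first expansion from the core verb. This is a minimal sketch, not the disclosed implementation: the `Token` record, the function name `collect_candidate_verbs`, the part-of-speech codes ("v", "n", "r") and the hand-written dependency parse are all assumptions made for illustration; only the labels HED, VOB, COO and DBL come from the text.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position of the word in the sentence
    text: str
    pos: str   # assumed part-of-speech code: "v" verb, "n" noun, "r" pronoun
    head: int  # index of the dominating word, 0 for the root
    rel: str   # dependency label, e.g. "HED", "VOB", "COO", "DBL"


SPECIFIED_RELATIONS = {"VOB", "COO"}


def collect_candidate_verbs(tokens):
    """Return the core verb and every verb reachable from it via VOB/COO arcs."""
    core = next(t for t in tokens if t.head == 0)  # word pointed to by root
    candidates, seen, queue = [], {core.idx}, deque([core])
    while queue:
        current = queue.popleft()
        for t in tokens:
            if (t.head == current.idx and t.rel in SPECIFIED_RELATIONS
                    and t.pos == "v" and t.idx not in seen):
                seen.add(t.idx)
                candidates.append(t)  # found verb joins the candidate set
                queue.append(t)       # and is itself searched in turn
    return core, candidates


# Hypothetical parse of "remind me to go to the gym to play basketball".
tokens = [
    Token(1, "remind", "v", 0, "HED"),
    Token(2, "me", "r", 1, "DBL"),
    Token(3, "go", "v", 1, "VOB"),
    Token(4, "gym", "n", 3, "VOB"),
    Token(5, "play", "v", 3, "COO"),
    Token(6, "basketball", "n", 5, "VOB"),
]
core, candidates = collect_candidate_verbs(tokens)
print(core.text, [t.text for t in candidates])
```

In this sketch "go" is found from the core verb and "play" is found only in the second round, from "go", which is the case the repeated search is meant to cover.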
  • the associated words of each candidate verb are adjacent words of each candidate verb, and the adjacent words of each candidate verb are words located after each candidate verb and adjacent to each candidate verb; determining the target verb from a set of candidate verbs based on the syntactic relationship between each candidate verb in the set of candidate verbs and the associated words of each candidate verb includes:
  • S41 Determine a candidate verb in a group of candidate verbs whose relationship with its adjacent word satisfies the first filtering condition as the target verb, where the first filtering condition includes that the syntactic relationship is the specified syntactic relationship.
  • the target server may determine a candidate verb among the group of candidate verbs that satisfies the first filtering condition between adjacent words as the target verb.
  • the associated words of each candidate verb are adjacent words of each candidate verb, and the adjacent words of each candidate verb are words located after each candidate verb and adjacent to each candidate verb.
  • the first filtering condition includes that the syntactic relationship is the specified syntactic relationship.
  • the candidate words that have a specified syntactic relationship with adjacent words in a group of candidate verbs are determined as target words, which can improve the accuracy and flexibility of determining the target words.
  • a candidate verb among a group of candidate verbs that satisfies the first filtering condition between adjacent words is determined as the target verb, including:
  • the above-mentioned first filtering condition may include but is not limited to one of the following: the syntactic relationship between the candidate verb and the adjacent word is a verb-object relationship, and the part of speech of the adjacent word is a noun; the syntactic relationship between the candidate verb and the adjacent word is a parallel relationship.
  • when the syntactic relationship between any candidate verb in a group of candidate verbs and the adjacent word of that candidate verb is a verb-object relationship, and the part of speech of the adjacent word is a noun, that candidate verb is determined as the target verb.
  • the above-mentioned specified syntactic relationship includes a verb-object relationship (ie, VOB relationship).
  • the above-mentioned first filtering condition also includes that the part-of-speech of the adjacent word is a noun.
  • any candidate verb can be determined as the target verb when the syntactic relationship between that candidate verb and its adjacent word is a parallel relationship, where the specified syntactic relationship includes a parallel relationship (i.e., COO relationship).
  • the target verb is determined from a group of candidate verbs according to the syntactic relationship between any candidate verb in the group and its adjacent word, and the part of speech of that adjacent word, which can improve the accuracy of determining the target verb, thereby improving the accuracy of the determined event slot.
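The first filtering condition can be illustrated with a small predicate over a pre-parsed sentence. This is a sketch under assumptions: the `Token` record, the name `satisfies_first_filter`, the part-of-speech codes and the example parse are hypothetical, and the sketch assumes the relationship is carried by the arc from the adjacent word to the candidate verb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position of the word in the sentence
    text: str
    pos: str   # assumed part-of-speech code: "v" verb, "n" noun, "r" pronoun
    head: int  # index of the dominating word, 0 for the root
    rel: str   # dependency label of the arc from head to this word


def satisfies_first_filter(verb, tokens):
    """Check the word right after the verb: VOB with a noun, or COO."""
    by_idx = {t.idx: t for t in tokens}
    adjacent = by_idx.get(verb.idx + 1)
    # Assumption: the adjacent word must depend directly on the verb.
    if adjacent is None or adjacent.head != verb.idx:
        return False
    if adjacent.rel == "VOB":
        return adjacent.pos == "n"  # verb-object relationship needs a noun
    return adjacent.rel == "COO"    # parallel relationship


# Hypothetical parse of "remind me to go to the gym to play basketball".
tokens = [
    Token(1, "remind", "v", 0, "HED"),
    Token(2, "me", "r", 1, "DBL"),
    Token(3, "go", "v", 1, "VOB"),
    Token(4, "gym", "n", 3, "VOB"),
    Token(5, "play", "v", 3, "COO"),
    Token(6, "basketball", "n", 5, "VOB"),
]
candidate_verbs = [tokens[2], tokens[4]]  # "go" and "play"
target_verbs = [v.text for v in candidate_verbs if satisfies_first_filter(v, tokens)]
print(target_verbs)
```

Here both "go" (followed by the noun "gym") and "play" (followed by the noun "basketball") pass, while "remind" would not, because the word after it is attached by a DBL arc.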
  • the associated words of each candidate verb are words that have a syntactic relationship with each candidate verb; determining the target verb from a set of candidate verbs according to the syntactic relationship between each candidate verb in the set of candidate verbs and the associated words of each candidate verb includes:
  • S61 Determine a candidate verb in a group of candidate verbs whose relationship with a word that has a syntactic relationship with it satisfies the second filtering condition as the target verb, where the second filtering condition includes that the syntactic relationship is the specified syntactic relationship.
  • the target server may determine a candidate verb among the group of candidate verbs that satisfies the second filtering condition with words that have a syntactic relationship as the target verb.
  • the associated words of each of the above candidate verbs are words that have a syntactic relationship with each candidate verb, and the second filtering condition includes that the syntactic relationship is a specified syntactic relationship.
  • a candidate verb in a group of candidate verbs whose syntactic relationship with a word that has a syntactic relationship with it is a VOB relationship or a COO relationship can be determined as the target verb.
  • a candidate word with a specified syntactic relationship between a group of candidate verbs and a word with a syntactic relationship is determined as the target word, which can improve the accuracy and flexibility of determining the target word.
  • determining, as the target verb, a candidate verb among a group of candidate verbs whose relationship with a word that has a syntactic relationship with it satisfies the second filtering condition includes:
  • when the syntactic relationship between any candidate verb in a group of candidate verbs and the word that has a syntactic relationship with that candidate verb is a verb-object relationship, and the part of speech of that word is a noun, that candidate verb is determined as the target verb, wherein the specified syntactic relationship includes a verb-object relationship, and the second filtering condition also includes that the part of speech of the word with the syntactic relationship is a noun;
  • the above-mentioned second filtering condition may include but is not limited to one of the following: the syntactic relationship between the candidate verb and the word that has a syntactic relationship with the candidate verb is a verb-object relationship, and the part of speech of that word is a noun; the syntactic relationship between the candidate verb and the word that has a syntactic relationship with the candidate verb is a parallel relationship.
  • when the syntactic relationship between any candidate verb in a set of candidate verbs and the word that has a syntactic relationship with that candidate verb is a verb-object relationship, and the part of speech of that word is a noun, that candidate verb is determined as the target verb. The above-mentioned specified syntactic relationship includes a verb-object relationship (i.e., VOB relationship), and the above-mentioned second filtering condition also includes that the part of speech of the word with the syntactic relationship is a noun.
  • when the syntactic relationship between any candidate verb and the word that has a syntactic relationship with that candidate verb is a parallel relationship, that candidate verb can be determined as the target verb, where the above-mentioned specified syntactic relationship includes a parallel relationship (i.e., COO relationship).
  • the associated words of the candidate verbs "remind" and "go out" are "I" and "walk", respectively. The syntactic relationship formed between the candidate verb "remind" and "I" is DBL, which is neither a verb-object relationship nor a parallel relationship, so the candidate verb "remind" is not the target verb; the syntactic relationship formed between the candidate verb "go out" and "walk" is a parallel relationship, so the candidate verb "go out" can be determined as the target verb.
  • the target verb is determined from a group of candidate verbs based on the syntactic relationship between any candidate verb in the group and the words that have a syntactic relationship with that candidate verb, and the part of speech of the associated words of that candidate verb, which can improve the accuracy of determining the target verb, thereby improving the accuracy of determining the event slot.
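The second filtering condition differs from the first in that the associated word does not have to be adjacent. The sketch below replays the "remind" / "go out" example from the text on a hand-written parse. The `Token` record, the name `satisfies_second_filter`, the part-of-speech codes and the arc directions are assumptions, and unlike the text, which names a single associated word per verb, the sketch checks every word that depends on the candidate verb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position of the word in the sentence
    text: str
    pos: str   # assumed part-of-speech code: "v" verb, "n" noun, "r" pronoun
    head: int  # index of the dominating word, 0 for the root
    rel: str   # dependency label of the arc from head to this word


def satisfies_second_filter(verb, tokens):
    """True if some dependent of the verb is a VOB noun or a COO word."""
    for word in tokens:
        if word.head != verb.idx:
            continue  # no direct syntactic relationship with this verb
        if word.rel == "VOB" and word.pos == "n":
            return True
        if word.rel == "COO":
            return True
    return False


# Hypothetical parse of the example: "remind" -DBL-> "I", "go out" -COO-> "walk".
tokens = [
    Token(1, "remind", "v", 0, "HED"),
    Token(2, "I", "r", 1, "DBL"),
    Token(3, "go out", "v", 1, "VOB"),
    Token(4, "walk", "v", 3, "COO"),
]
candidate_verbs = [tokens[0], tokens[2]]  # "remind" and "go out"
target_verbs = [v.text for v in candidate_verbs if satisfies_second_filter(v, tokens)]
print(target_verbs)
```

With this parse "remind" is rejected (its dependents are a DBL pronoun and a VOB word that is not a noun) and "go out" is accepted through its COO arc to "walk", matching the outcome described above.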
  • the target word and the associated words of the target word are determined as the target event slot of the target control statement, including:
  • S82 Determine the combination of event slots corresponding to each target word as the target event slot.
  • the target server can determine each target word and the associated words of each target word as the event slot corresponding to that target word, and determine the combination of the event slots corresponding to the target words as the target event slot.
  • the target words "go" and "play" and the words "gymnasium" and "basketball" associated with the target words can be determined as event slots. The final event slot obtained is "Go to the gym to play basketball" rather than a single "Play basketball", so that the slot extraction is more complete and comprehensive, and the user experience is better when the question and answer system broadcasts a reply generated by NLG (Natural Language Generation).
  • the above-mentioned process of determining the combination of event slots corresponding to each target word as the target event slot may be: when the event slots corresponding to the target words are adjacent, splice the adjacent event slots into one event slot, and then determine the spliced event slot as the target event slot.
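The splicing of adjacent event slots can be sketched as an interval merge over word positions. This is an illustrative sketch only: representing a slot as an inclusive `(start, end)` pair of 0-based word indices, the function name `merge_adjacent_slots`, and the segmented example sentence are assumptions rather than details given in the text.

```python
def merge_adjacent_slots(spans):
    """Splice slots whose word ranges touch or overlap into one slot."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            # The slot begins right after (or inside) the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical word segmentation of the control statement.
words = ["remind", "me", "go to", "the gym", "play", "basketball"]
# One slot per target verb: ("go to", "the gym") and ("play", "basketball").
slots = [(4, 5), (2, 3)]
target_slots = merge_adjacent_slots(slots)
slot_text = " ".join(words[target_slots[0][0]:target_slots[0][1] + 1])
print(target_slots, slot_text)
```

Slots separated by at least one word are left as separate entries, so a caller can decide how to treat a non-contiguous combination.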
  • the target server can determine the schedule reminder time corresponding to the target schedule from the target control statement. For example, when the control statement is "tomorrow at 8 a.m. remind me to go to the gym to play basketball with song A", 8 a.m. tomorrow can be determined as the schedule reminder time corresponding to the schedule.
  • the target server can construct a schedule reminder statement according to the target event slot.
  • the target reminder statement is used to remind the execution of the target schedule.
  • the above-mentioned process of constructing a schedule reminder sentence according to the target event slot may be: according to the target event slot, extract the words located in the target event slot from the target control sentence, and construct a schedule reminder sentence based on these words, which is not limited in this embodiment. After the schedule reminder sentence is constructed, when the target reminder time arrives, the target reminder sentence can be converted into the corresponding reminder voice for voice broadcast, where the target reminder time is the reminder time of the target schedule extracted from the target control sentence.
  • the combination of event slots corresponding to each target word is determined as the event slot, which can improve the accuracy of determining the event slot.
  • This optional example provides an optimization solution that uses syntactic analysis to extract slots for schedule items in the schedule field.
  • syntactic analysis is introduced to assist in extracting event slots in the schedule field.
  • the process of the event slot extraction method in this optional example may include the following steps:
  • Step S802 parse the sentence by calling a syntax analysis algorithm to obtain candidate verbs in the sentence.
  • Step S804 Determine the part-of-speech of adjacent words of the candidate verb and the structural relationship formed between the adjacent words and the candidate verb.
  • Step S806 When the syntactic relationship formed between the adjacent words and the candidate verb is a verb-object relationship or a parallel relationship, the slot where the candidate verb and the words adjacent to the candidate verb are located are determined as event slots.
  • if the syntactic structure composed of a verb and an immediately adjacent noun is a verb-object structure VOB (corresponding to the above verb-object relationship), and two VOB structures are also adjacent, the final event slot is the assembly of the two VOBs, for example, "Go to the gym to play basketball."
  • a parallel relationship COO consists of a verb and another verb immediately adjacent to it.
  • syntactic analysis technology is applied to the event slot extraction of complex sentences.
  • the position of the core verb in the sentence, the verb-object structure VOB in the sentence, and the parallel relationship COO can all provide certain auxiliary information for the extraction of event slots in the schedule field, making the slot extraction algorithm more robust and accurate.
  • the introduction of syntactic structure feature information enables more accurate extraction of event slots in complex slots such as schedule matters.
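Steps S802 to S806 of this optional example can be put together in one short sketch that starts from an already-parsed sentence. It is not the disclosed implementation: the parser itself is out of scope, so the parse is written by hand, and the `Token` record, the function name `extract_event_slots`, the part-of-speech codes and the choice of taking every verb as a candidate are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position of the word in the sentence
    text: str
    pos: str   # assumed part-of-speech code: "v" verb, "n" noun, "r" pronoun
    head: int  # index of the dominating word, 0 for the root
    rel: str   # dependency label of the arc from head to this word


def extract_event_slots(tokens):
    by_idx = {t.idx: t for t in tokens}
    # S802: take every verb found by the parse as a candidate verb.
    candidates = [t for t in tokens if t.pos == "v"]
    spans = []
    for verb in candidates:
        # S804: look at the word right after the verb and its relationship.
        adjacent = by_idx.get(verb.idx + 1)
        if adjacent is None or adjacent.head != verb.idx:
            continue
        # S806: keep verb + adjacent word for VOB (noun) or COO.
        if (adjacent.rel == "VOB" and adjacent.pos == "n") or adjacent.rel == "COO":
            spans.append((verb.idx, adjacent.idx))
    # Assemble adjacent VOB/COO structures into one event slot.
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return [" ".join(by_idx[i].text for i in range(s, e + 1)) for s, e in merged]


# Hypothetical parse of "remind me to go to the gym to play basketball".
tokens = [
    Token(1, "remind", "v", 0, "HED"),
    Token(2, "me", "r", 1, "DBL"),
    Token(3, "go to", "v", 1, "VOB"),
    Token(4, "the gym", "n", 3, "VOB"),
    Token(5, "play", "v", 3, "COO"),
    Token(6, "basketball", "n", 5, "VOB"),
]
event_slots = extract_event_slots(tokens)
print(event_slots)
```

The two verb-object structures sit next to each other, so they are assembled into a single slot instead of keeping only "play basketball".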
  • the method according to the above embodiments can be implemented by means of software plus the necessary general hardware platform. Of course, it can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present disclosure, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium (such as ROM (Read-Only Memory)/RAM (Random Access Memory), magnetic disk, or optical disk) and includes a number of instructions to make a terminal device (which can be a mobile phone, computer, server, or network device, etc.) execute the methods described in the various embodiments of the present disclosure.
  • FIG. 9 is a structural block diagram of an optional event slot extraction device according to an embodiment of the present disclosure. As shown in Figure 9, the device may include:
  • the acquisition unit 902 is configured to acquire the target control statement of the event slot to be extracted, where the target control statement includes multiple words;
  • the analysis unit 904 is connected to the acquisition unit 902 and is configured to perform syntactic analysis on the target control statement and determine a group of candidate words from a plurality of words, wherein the number of candidate words contained in the group of candidate words is less than the number of words contained in the plurality of words;
  • the first determination unit 906 is connected to the analysis unit 904 and is configured to determine the target word from a group of candidate words based on the syntactic relationship between each candidate word in the group of candidate words and the associated words of each candidate word, wherein there is a specified syntactic relationship between the target word and the associated words of the target word;
  • the second determination unit 908 is connected to the first determination unit 906 and is configured to determine the target word and related words of the target word as the target event slot of the target control statement.
  • the acquisition unit 902 in this embodiment can be configured to perform the above step S202, the analysis unit 904 in this embodiment can be configured to perform the above step S204, the first determination unit 906 in this embodiment can be configured to perform the above step S206, and the second determination unit 908 in this embodiment can be configured to perform the above step S208.
  • the target control sentence of the event slot to be extracted is obtained, where the target control sentence includes multiple words; the target control sentence is syntactically analyzed to determine a set of candidate words from the multiple words, where the number of candidate words contained in the set of candidate words is smaller than the number of words contained in the multiple words; based on the syntactic relationship between each candidate word in the set of candidate words and the associated words of each candidate word, the target word is determined from the set of candidate words, where there is a specified syntactic relationship between the target word and the associated words of the target word; and the target word and the associated words of the target word are determined as the target event slot of the target control statement. This solves the problem in the related technologies that event slot extraction has low accuracy due to the difficulty in exhaustively enumerating event slots, and improves the accuracy of event slot extraction.
  • the first determining unit includes:
  • the first determination module is configured to determine the target verb from a group of candidate verbs based on the syntactic relationship between each candidate verb in the group of candidate verbs and associated words of each candidate verb, where the group of candidate words includes A set of candidate verbs, the target word includes the target verb.
  • the analysis unit includes:
  • the second determination module is configured to perform syntactic analysis on the target control statement and determine multiple words contained in the target control statement and the part-of-speech of each word in the multiple words;
  • the third determination module is configured to determine each word whose part-of-speech is a verb among the plurality of words as a candidate verb, and obtain a group of candidate verbs.
  • the analysis unit includes:
  • the analysis module is configured to perform syntactic analysis on the target control statement and determine the core verb of the target control instruction
  • the fourth determination module is configured to determine each verb in multiple words that has a specified syntactic relationship with the core verb as a candidate verb, and obtain a set of candidate verbs;
  • a search module configured to search for verbs that have a specified syntactic relationship with any candidate verb in a set of candidate verbs
  • the add module is configured to add the found verb as a candidate verb to a set of candidate verbs when a verb having a specified syntactic relationship with any candidate verb is found.
  • the associated words of each candidate verb are adjacent words of each candidate verb, and the adjacent words of each candidate verb are words located after each candidate verb and adjacent to each candidate verb; the first determination module includes:
  • the first determination sub-module is configured to determine, from a group of candidate verbs, a candidate verb whose relationship with its adjacent word satisfies the first filtering condition as the target verb, where the first filtering condition includes that the syntactic relationship is the specified syntactic relationship.
  • the first determination sub-module includes:
  • the first determination subunit is configured to determine any candidate verb in a group of candidate verbs as the target verb when the syntactic relationship between that candidate verb and its adjacent word is a verb-object relationship and the part of speech of the adjacent word is a noun, where the specified syntactic relationship includes a verb-object relationship, and the first filtering condition also includes that the part of speech of the adjacent word is a noun;
  • the second determination subunit is configured to determine any candidate verb as the target verb when the syntactic relationship between that candidate verb and its adjacent word is a parallel relationship, wherein the specified syntactic relationship includes a parallel relationship.
  • the associated words of each candidate verb are words that have a syntactic relationship with each candidate verb; the first determination module includes:
  • the second determination submodule is configured to determine, from a group of candidate verbs, a candidate verb whose relationship with a word that has a syntactic relationship with it satisfies the second filtering condition as the target verb, where the second filtering condition includes that the syntactic relationship is the specified syntactic relationship.
  • the second determination sub-module includes:
  • the third determination subunit is configured to determine any candidate verb in a group of candidate verbs as the target verb when the syntactic relationship between that candidate verb and the word that has a syntactic relationship with it is a verb-object relationship and the part of speech of that word is a noun, where the specified syntactic relationship includes a verb-object relationship, and the second filtering condition also includes that the part of speech of the word with the syntactic relationship is a noun;
  • the fourth determination subunit is configured to determine any candidate verb as the target verb when the syntactic relationship between that candidate verb and a word that has a syntactic relationship with it is a parallel relationship, wherein the specified syntactic relationship includes a parallel relationship.
  • the second determining unit includes:
  • the fifth determination module is configured to determine each target word and associated words of each target word as event slots corresponding to each target word when there are multiple target words;
  • the sixth determination module is configured to determine the combination of event slots corresponding to each target word as the target event slot.
  • the above module as part of the device, can run in the hardware environment as shown in Figure 1, and can be implemented by software or hardware, where the hardware environment includes a network environment.
  • a storage medium is also provided.
  • the above storage medium can be used to store program code for executing any of the above event slot extraction methods in the embodiments of the present disclosure.
  • the above storage medium may be located on at least one network device among multiple network devices in the network shown in the above embodiment.
  • the storage medium is configured to store program codes for performing the following steps:
  • S1 Obtain the target control statement of the event slot to be extracted, where the target control statement includes multiple words;
  • S2 perform syntactic analysis on the target control statement, and determine a set of candidate words from multiple words, where the number of candidate words included in a set of candidate words is smaller than the number of words included in multiple words;
  • S3 Determine the target word from a set of candidate words based on the syntactic relationship between each candidate word in the set of candidate words and the associated words of each candidate word, where there is a specified syntactic relationship between the target word and the associated words of the target word;
  • S4 Determine the target word and related words of the target word as the target event slot of the target control statement.
  • the above-mentioned storage medium may include but is not limited to: U disk, ROM, RAM, mobile hard disk, magnetic disk or optical disk and other various media that can store program codes.
  • an electronic device for implementing the above event slot extraction method is also provided.
  • the electronic device may be a server, a terminal, or a combination thereof.
  • Figure 10 is a structural block diagram of an optional electronic device according to an embodiment of the present disclosure. As shown in Figure 10, it includes a processor 1002, a communication interface 1004, a memory 1006 and a communication bus 1008. The processor 1002, the communication interface 1004 and memory 1006 complete communication with each other through communication bus 1008, where,
  • Memory 1006 configured to store computer programs
  • the processor 1002 is configured to implement the following steps when executing the computer program stored on the memory 1006:
  • S1 Obtain the target control statement of the event slot to be extracted, where the target control statement includes multiple words;
  • S2 perform syntactic analysis on the target control statement, and determine a set of candidate words from multiple words, where the number of candidate words included in a set of candidate words is smaller than the number of words included in multiple words;
  • S3 Determine the target word from a set of candidate words based on the syntactic relationship between each candidate word in the set of candidate words and the associated words of each candidate word, where there is a specified syntactic relationship between the target word and the associated words of the target word;
  • S4 Determine the target word and related words of the target word as the target event slot of the target control statement.
  • the communication bus may be a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus, etc.
  • the communication bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in Figure 10, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the above-mentioned electronic device and other equipment.
  • the memory may include RAM or non-volatile memory, such as at least one disk memory.
  • the memory may also be at least one storage device located remotely from the aforementioned processor.
  • the memory 1006 may include, but is not limited to, the acquisition unit 902, the analysis unit 904, the first determination unit 906 and the second determination unit 908 in the event slot extraction device. In addition, it may also include but is not limited to other module units in the extraction device of the above-mentioned event slots, which will not be described again in this example.
  • the above-mentioned processor can be a general-purpose processor, which can include but is not limited to: CPU (Central Processing Unit), NP (Network Processor), etc.; it can also be a DSP (Digital Signal Processor), ASIC (Application Specific Integrated Circuit), FPGA (Field-Programmable Gate Array) or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
  • the device that implements the above event slot extraction method can be a terminal device, and the terminal device can be a smart phone (such as an Android phone, iOS phone, etc.), a tablet computer, a handheld computer, a mobile Internet device (Mobile Internet Device, MID), a PAD or another terminal device.
  • FIG. 10 does not limit the structure of the above-mentioned electronic device.
  • the electronic device may also include more or fewer components (such as network interfaces, display devices, etc.) than shown in FIG. 10 , or have a different configuration than that shown in FIG. 10 .
  • the program can be stored in a computer-readable storage medium, and the storage medium can include: flash disk, ROM, RAM, magnetic disk or optical disk, etc.
  • if the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they can be stored in the above computer-readable storage medium.
  • the technical solution of the present disclosure, in essence or in the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause one or more computer devices (which can be personal computers, servers or network devices, etc.) to execute all or part of the steps of the methods described in various embodiments of the present disclosure.
  • the disclosed client can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division.
  • multiple units or components may be combined or may be integrated into another system, or some features can be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the units or modules may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution provided in this embodiment.
  • each functional unit in various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Selective Calling Equipment (AREA)

Abstract

An event slot extraction method and apparatus, a storage medium and an electronic apparatus, relating to the technical field of smart home/home automation, the method comprising: acquiring a target control statement of an event slot to be extracted, the target control statement comprising a plurality of words; performing syntactic analysis on the target control statement, and determining a group of candidate words from among the plurality of words, the number of the candidate words included in the group of candidate words being smaller than the number of words included in the plurality of words; according to the syntactic relationship between each candidate word in the group of candidate words and an associated word of said candidate word, determining a target word from among the group of candidate words, a specified syntactic relationship being between the target word and the associated word of the target word; and determining the target word and the associated word of the target word as a target event slot of the target control statement.

Description

Event slot extraction method and apparatus, storage medium, and electronic apparatus
This disclosure claims priority to Chinese patent application No. 202210468880.X, filed with the China Patent Office on April 29, 2022 and entitled "Event slot extraction method and apparatus, storage medium, and electronic apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communications, and in particular to an event slot extraction method and apparatus, a storage medium, and an electronic apparatus.
Background
In intelligent question answering, user intent can be understood through slot extraction. Taking slot extraction in the schedule domain of a question answering system as an example, the slots involved may include time, event, singer, and song. Compared with other slots, event slots have richer content, which makes them more complex to extract.
In the related art, event slots are usually extracted from a statement by dictionary matching: the words in a dictionary are matched against the statement, and the slot containing a word that matches the dictionary is determined to be an event slot.
However, dictionary matching requires maintaining a data dictionary of event slots and extracting event slots through a strict matching strategy. Because event slots cannot be enumerated exhaustively, this approach can usually extract only high-frequency event slots and struggles with low-frequency ones, which reduces the accuracy of event slot extraction.
It can be seen that the event slot extraction methods in the related art suffer from low extraction accuracy because event slots are difficult to enumerate exhaustively.
Summary
Embodiments of the present disclosure provide an event slot extraction method and apparatus, a storage medium, and an electronic apparatus, to at least solve the problem in the related art that event slot extraction has low accuracy because event slots are difficult to enumerate exhaustively.
According to one aspect of the embodiments of the present disclosure, an event slot extraction method is provided, including: acquiring a target control statement from which an event slot is to be extracted, wherein the target control statement includes a plurality of words; performing syntactic analysis on the target control statement and determining a group of candidate words from the plurality of words, wherein the number of candidate words in the group is smaller than the number of words in the plurality of words; determining a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated word of that candidate word, wherein a specified syntactic relationship exists between the target word and the associated word of the target word; and determining the target word and the associated word of the target word as a target event slot of the target control statement.
According to another aspect of the embodiments of the present disclosure, an event slot extraction apparatus is further provided, including: an acquisition unit, configured to acquire a target control statement from which an event slot is to be extracted, wherein the target control statement includes a plurality of words; an analysis unit, configured to perform syntactic analysis on the target control statement and determine a group of candidate words from the plurality of words, wherein the number of candidate words in the group is smaller than the number of words in the plurality of words; a first determination unit, configured to determine a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated word of that candidate word, wherein a specified syntactic relationship exists between the target word and the associated word of the target word; and a second determination unit, configured to determine the target word and the associated word of the target word as a target event slot of the target control statement.
According to yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is further provided. The computer-readable storage medium stores a computer program, wherein the computer program is configured to perform the above event slot extraction method when run.
According to yet another aspect of the embodiments of the present disclosure, an electronic apparatus is further provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor performs the above event slot extraction method through the computer program.
In the embodiments of the present disclosure, candidate words are first extracted from the control statement, and the words belonging to an event slot, together with their associated words, are then selected according to the syntactic relationship between the candidate words and their associated words. Specifically: a target control statement from which an event slot is to be extracted is acquired, wherein the target control statement includes a plurality of words; syntactic analysis is performed on the target control statement and a group of candidate words is determined from the plurality of words, wherein the number of candidate words in the group is smaller than the number of words in the plurality of words; a target word is determined from the group of candidate words according to the syntactic relationship between each candidate word and its associated word, wherein a specified syntactic relationship exists between the target word and the associated word of the target word; and the target word and the associated word of the target word are determined as a target event slot of the target control statement. Because the control statement is syntactically analyzed to obtain candidate words, the words that can serve as event slots are selected according to the syntactic structure formed by each candidate word and its associated word, and the selected words and their associated words are determined as the extracted event slots, event slot extraction takes part of speech, syntax, and sentence structure into account and requires no dictionary of event slots. Event slots can therefore be extracted accurately on the basis of syntactic structure, which improves the accuracy of event slot extraction and solves the problem in the related art that extraction accuracy is low because event slots are difficult to enumerate exhaustively.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Figure 1 is a schematic diagram of the hardware environment of an optional event slot extraction method according to an embodiment of the present disclosure;
Figure 2 is a schematic flowchart of an optional event slot extraction method according to an embodiment of the present disclosure;
Figure 3 is a schematic diagram of an optional statement structure according to an embodiment of the present disclosure;
Figure 4 is a schematic diagram of another optional statement structure according to an embodiment of the present disclosure;
Figure 5 is a schematic diagram of yet another optional statement structure according to an embodiment of the present disclosure;
Figure 6 is a schematic diagram of yet another optional statement structure according to an embodiment of the present disclosure;
Figure 7 is a schematic diagram of yet another optional statement structure according to an embodiment of the present disclosure;
Figure 8 is a schematic flowchart of another optional event slot extraction method according to an embodiment of the present disclosure;
Figure 9 is a structural block diagram of an optional event slot extraction apparatus according to an embodiment of the present disclosure;
Figure 10 is a structural block diagram of an optional electronic apparatus according to an embodiment of the present disclosure.
Detailed Description
To enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present disclosure without creative effort shall fall within the scope of protection of the present disclosure.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present disclosure described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to one aspect of the embodiments of the present disclosure, an event slot extraction method is provided. The method can be applied to whole-house intelligent digital control scenarios such as the smart home, smart household device ecosystems, and intelligent house ecosystems. Optionally, in this embodiment, the method can be applied to the hardware environment formed by the terminal 102 and the server 104 shown in Figure 1. As shown in Figure 1, the server 104 is connected to the terminal 102 through a network and can provide services (such as application services) for the terminal or for a client installed on the terminal. Cloud computing and/or edge computing services can be configured on the server or independently of the server to provide data computing services for the server 104.
The network may include, but is not limited to, at least one of the following: a wired network and a wireless network. The wired network may include, but is not limited to, at least one of the following: a wide area network, a metropolitan area network, and a local area network. The wireless network may include, but is not limited to, at least one of the following: WIFI (Wireless Fidelity) and Bluetooth. The terminal 102 may be, but is not limited to, a PC, a mobile phone, a tablet computer, a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, a smart washing device, a smart dishwasher, a smart projection device, a smart TV, a smart clothes drying rack, a smart curtain, a smart audio-visual device, a smart socket, a smart stereo, a smart speaker, a smart fresh air device, a smart kitchen and bathroom device, a smart bathroom device, a smart sweeping robot, a smart window cleaning robot, a smart mopping robot, a smart air purification device, a smart steam oven, a smart microwave oven, a smart kitchen water heater, a smart purifier, a smart water dispenser, a smart door lock, and the like.
The event slot extraction method of the embodiments of the present disclosure may be performed by the server 104, by the terminal 102, or jointly by the server 104 and the terminal 102. When the terminal 102 performs the method, it may do so through a client installed on it.
Taking the case where the server 104 performs the event slot extraction method of this embodiment as an example, Figure 2 is a schematic flowchart of an optional event slot extraction method according to an embodiment of the present disclosure. As shown in Figure 2, the method may include the following steps:
Step S202: acquire a target control statement from which an event slot is to be extracted, wherein the target control statement includes a plurality of words.
The event slot extraction method in this embodiment can be applied to scenarios in the smart home field in which event slots are extracted from control statements, and to systems with human-computer interaction functions such as intelligent voice question answering systems. The user inputs a control statement, and the machine extracts slots from it to analyze the user's intent and respond to different intents. For example, for slot extraction in the schedule domain, the slots involved may include time, event, singer, and song.
Different slots can be extracted in different ways. For example, time slots can be extracted with matching rules defined by regular expressions, and singer and song slots can be extracted by maintaining a data dictionary and matching against lyrics and music data. Event slots have richer content, which makes them more complex to extract.
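The two simpler extraction styles mentioned above can be sketched as follows. This is a minimal illustration only: the regular expression, the song dictionary, and the function names are assumptions made for this example and are not part of the disclosure.

```python
import re

# Hypothetical pattern for illustration: matches clock times such as
# "8 a.m." or "8 o'clock". A real system would need far more rules.
TIME_PATTERN = re.compile(r"\b(\d{1,2})\s*(a\.m\.|p\.m\.|o'clock)")

# Hypothetical song dictionary that would have to be maintained by hand.
SONG_DICTIONARY = {"song A", "song B"}


def extract_time_slots(statement):
    """Rule-based extraction: return every substring matching the time pattern."""
    return [match.group(0) for match in TIME_PATTERN.finditer(statement)]


def extract_song_slots(statement):
    """Dictionary-based extraction: return every dictionary entry found in the statement."""
    return [song for song in sorted(SONG_DICTIONARY) if song in statement]


statement = "remind me at 8 a.m. tomorrow with song A to go to the gym"
time_slots = extract_time_slots(statement)
song_slots = extract_song_slots(statement)
```

Both functions work only for values that the pattern or the dictionary already anticipates, which is why neither style transfers well to open-ended event slots.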
To extract event slots (for example, in the schedule domain), dictionary matching is usually used. However, because event slots are difficult to enumerate exhaustively (as is the case in the schedule domain), extraction accuracy is low. Moreover, even if exhaustive enumeration were possible, an overly large dictionary would slow the system down to some extent. The strict matching strategy therefore also greatly reduces generalization.
Alternatively, an algorithmic model can be trained on large-scale data so that it extracts event slots automatically. However, this approach is suitable only for extraction tasks with a limited number of slots, so its scope of application is narrow. It is common in tasks where the number of slots is controllable, but its generalization is less controllable than that of a dictionary. Event slot extraction based on dictionary matching or on algorithmic models therefore cannot achieve both generalization and accuracy.
Considering that event slots such as "play\v basketball\n" and "go to\v school\n" have a verb plus noun structure, part-of-speech rules can be formulated with the help of part-of-speech analysis, and every phrase in a sentence that conforms to the rules can be extracted as an event slot. For example, a verb plus noun structure is treated as an event slot, which generalizes better than a dictionary.
However, matching on part-of-speech rules alone, without considering syntax and sentence structure, reduces the precision of event slot extraction. For example, event slots such as "go out\v play\v" and "go to\v gym\n play\v basketball\n" do not conform to the verb plus noun rule and are not extracted. Conversely, "use\v song\n A\n" satisfies the rule, so a song slot is mistakenly extracted as an event slot, which makes semantic understanding harder for the question answering system.
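The weakness of a pure part-of-speech rule can be reproduced with a few lines of code. The sketch below assumes hand-tagged input in the form of `(word, tag)` pairs with the tags `v` and `n` used above; the function name and the adjacent-pair reading of the verb plus noun rule are assumptions for illustration.

```python
def extract_by_pos_rule(tagged_words):
    """Treat every adjacent verb + noun pair as an event slot (part-of-speech rule only)."""
    slots = []
    for (first, first_tag), (second, second_tag) in zip(tagged_words, tagged_words[1:]):
        if first_tag == "v" and second_tag == "n":
            slots.append(f"{first} {second}")
    return slots


# A genuine event slot that the rule handles correctly.
hit = extract_by_pos_rule([("play", "v"), ("basketball", "n")])

# A verb + verb event slot that the rule misses entirely.
miss = extract_by_pos_rule([("go out", "v"), ("play", "v")])

# A song slot that the rule wrongly reports as an event slot.
false_positive = extract_by_pos_rule([("use", "v"), ("song", "n"), ("A", "n")])
```

The rule has no way to tell that "use song" introduces a song rather than an event, because the tags carry no information about how the words relate to the rest of the sentence.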
To improve the precision of event slot extraction, this embodiment introduces syntactic analysis and formulates matching rules to assist extraction. The control statement is parsed with its syntactic and sentence structure features taken into account, and the corresponding event slot is determined from the result.
The target server (an example of the above server 104) can acquire the target control statement from which an event slot is to be extracted, the target control statement including a plurality of words. The target control statement may be received from a terminal device over a communication connection established with that terminal device.
The user can issue the target control statement to the terminal device by voice or other means. The control statement can be used to control the terminal device to perform a target device operation. After acquiring the target control statement, the terminal device can send it to the target server. Optionally, the target server may instead acquire a target voice instruction that contains the control statement and parse the received voice instruction to extract the target control statement.
For example, after a smartphone (the above terminal device) receives the user's voice instruction "remind me at 8 a.m. tomorrow to go to the gym to play basketball", it can send the voice instruction to the server. The server can then parse the received voice instruction and extract the control statement from it.
It should be noted that the device operation indicated by the target control instruction may be performed by the terminal device itself, or may be performed by a device other than the terminal device (for example, a smart home device) under its control. The device operation may be an operation that adjusts device parameters or an operation that sets a schedule. Accordingly, the target control instruction may be a device control instruction such as "turn on the water heater tomorrow morning", or a schedule reminder instruction such as "remind me at 8 a.m. tomorrow to go to the gym to play basketball". This is not limited in this embodiment.
Step S204: perform syntactic analysis on the target control statement and determine a group of candidate words from the plurality of words, wherein the number of candidate words in the group is smaller than the number of words in the plurality of words.
In this embodiment, after acquiring the target control statement, the target server can perform syntactic analysis on it and determine a group of candidate words from the plurality of words, the number of candidate words in the group being smaller than the number of words in the plurality of words. The candidate words may be the words in the target control statement whose part of speech is verb, or the words whose part of speech is noun. When the group of candidate words is a group of candidate verbs, it may consist of the core verb of the control statement, all verbs in the control statement, the verbs after the core verb, the verbs before the core verb, the verbs whose semantic weight for the control statement exceeds a weight threshold, or the verbs located at non-endpoint positions of the target control statement. This is not limited in this embodiment.
The syntactic analysis of the target control statement may use a syntactic analysis algorithm (which may be implemented with open-source syntactic analysis software) to extract the plurality of words in the target control statement (for example, the core words of the sentence) and to analyze the part of speech of each word and the relationships between words, thereby obtaining the part-of-speech and sentence structure features of the sentence.
Optionally, after determining the plurality of words in the target control statement and their parts of speech, the target server may determine the group of candidate words according to the part of speech of each word. Alternatively, it may first perform syntactic analysis on the target control statement to determine the core word, and then determine the words that have a specified syntactic relationship with the core word as candidate words, thereby obtaining the group of candidate words. Other ways of determining the candidate words may also be used. This is not limited in this embodiment.
For example, as shown in Figure 3, when the control statement is "remind me with song A at 8 a.m. tomorrow to go to the gym to play basketball", the words "remind", "go to", and "play" in the statement may be determined as the group of candidate words.
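One of the options described for step S204, selecting candidates by part of speech, can be sketched as follows. The tokenization and tags are written by hand for this illustration and are assumptions: a real system would obtain them from a parser, and the word introducing the song is tagged here as a preposition (`p`) so that it is not treated as a verb.

```python
# Hand-tagged tokens for the example statement (assumed, not parser output).
TAGGED_STATEMENT = [
    ("tomorrow", "t"), ("8 a.m.", "t"), ("with", "p"), ("song A", "n"),
    ("remind", "v"), ("me", "r"), ("go to", "v"), ("gym", "n"),
    ("play", "v"), ("basketball", "n"),
]


def select_candidate_verbs(tagged_words):
    """Keep only the words tagged as verbs; these form the group of candidate words."""
    return [word for word, tag in tagged_words if tag == "v"]


candidates = select_candidate_verbs(TAGGED_STATEMENT)
```

The candidate group is a strict subset of the words in the statement, which matches the requirement that it contain fewer words than the statement itself.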
Step S206: determine a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated word of that candidate word, wherein a specified syntactic relationship exists between the target word and the associated word of the target word.
In this embodiment, after determining the group of candidate words from the plurality of words, the target server can determine the associated word of each candidate word in the group. The associated word of a candidate word may be a word adjacent to the candidate word, a word that has a syntactic relationship with the candidate word, or a word that has some other association with the candidate word. For example, it may be the first word after the candidate word, or the first noun after the candidate word. This is not limited in this embodiment.
In this embodiment, after determining the associated word of each candidate word, the target server can determine the target word from the group of candidate words according to the syntactic relationship between each candidate word and its associated word. Optionally, a candidate word whose syntactic relationship with its associated word is the specified syntactic relationship can be determined as a target word.
For example, suppose the specified syntactic relationship is a verb-object relationship or a coordinate relationship, and the associated word of each candidate word is the word adjacent to it. The candidate words in the control statement "remind me with song A at 8 a.m. tomorrow to go to the gym to play basketball" are "remind", "go to", and "play". The word adjacent to "remind" is "me", and the syntactic relationship between the two is neither a verb-object relationship nor a coordinate relationship, so "remind" is not a target word. The word adjacent to "go to" is "gym", and the syntactic relationship between the two is a verb-object relationship, so "go to" is a target word. The word adjacent to "play" is "basketball", and the syntactic relationship between the two is also a verb-object relationship, so "play" is also a target word.
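The selection in this example can be sketched over a small dependency structure. Everything below is an assumption made for illustration: the `Token` type, the relation names (`verb-object`, `coordinate`, `pivot`, and so on), and the hand-written parse, which stands in for the output of a real dependency parser. The associated word is taken to be the next word, as in the example.

```python
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    pos: str
    head: int  # index of the governing token, -1 for the root
    rel: str   # relation of this token to its head


# Assumed set of specified syntactic relationships.
SPECIFIED_RELATIONS = {"verb-object", "coordinate"}

# Hand-written parse of the example statement (not real parser output).
TOKENS = [
    Token("tomorrow", "t", 4, "adverbial"),
    Token("8 a.m.", "t", 4, "adverbial"),
    Token("with", "p", 4, "adverbial"),
    Token("song A", "n", 2, "preposition-object"),
    Token("remind", "v", -1, "root"),
    Token("me", "r", 4, "pivot"),
    Token("go to", "v", 4, "verb-object"),
    Token("gym", "n", 6, "verb-object"),
    Token("play", "v", 6, "coordinate"),
    Token("basketball", "n", 8, "verb-object"),
]


def find_target_indices(tokens, candidate_indices):
    """Keep candidates whose next word depends on them through a specified relation."""
    targets = []
    for i in candidate_indices:
        j = i + 1  # associated word: the adjacent following word
        if j < len(tokens) and tokens[j].head == i and tokens[j].rel in SPECIFIED_RELATIONS:
            targets.append(i)
    return targets


candidate_indices = [i for i, token in enumerate(TOKENS) if token.pos == "v"]
target_words = [TOKENS[i].word for i in find_target_indices(TOKENS, candidate_indices)]
```

With this parse, "remind" is rejected because "me" attaches to it through a relation outside the specified set, while "go to" and "play" are kept because each governs its neighbor as an object.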
Step S208: determine the target word and the associated word of the target word as a target event slot of the target control statement.
In this embodiment, after determining the target word, the target server can determine the target word and its associated word as a target event slot of the target control statement. Optionally, when there are multiple target words, each target word together with its associated word can be determined as a target event slot of the target control statement.
For example, when the server determines from the control statement "remind me with song A at 8 a.m. tomorrow to go to the gym to play basketball" that the target words are "go to" and "play", both "go to the gym" and "play basketball" can be determined as event slots of the control statement.
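Step S208 amounts to pairing each target word with its associated word. A minimal sketch, assuming the target words and their associated words have already been identified by index (the word list, the index pairs, and the function name are hypothetical):

```python
def build_event_slots(words, target_pairs):
    """Join each target word with its associated word into one event slot."""
    return [f"{words[target]} {words[associated]}" for target, associated in target_pairs]


words = ["remind", "me", "go to", "gym", "play", "basketball"]
# (index of target word, index of its associated word), assumed already determined.
target_pairs = [(2, 3), (4, 5)]
event_slots = build_event_slots(words, target_pairs)
```

When several target words are found, each pair yields its own event slot, so one control statement can carry more than one event.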
通过上述步骤,获取待提取事件槽位的目标控制语句,其中,目标控制语句中包括多个词语;对目标控制语句进行句法分析,从多个词语中确定出一组候选词,其中,一组候选词中包含的候选词的数量小于多个词语包含的词语的数量;根据一组候选词中的每个候选词与每个候选词的关联词语之间的句法关系,从一组候选词中确定出目标词语,其中,目标词语与目标词语的关联词语之间为指定句法关系;将目标词语、以及目标词语的关联词语,确定为目标控制语句的目标事件槽位,解决了相关技术中的事件槽位的提取方法存在由于难以穷举事件槽位所导致的事件槽位提取的准确性低的问题,提高了事件槽位提取的准确性。Through the above steps, the target control statement of the event slot to be extracted is obtained, where the target control statement includes multiple words; the target control statement is syntactically analyzed to determine a group of candidate words from the multiple words, wherein a group of The number of candidate words contained in a candidate word is smaller than the number of words contained in multiple words; based on the syntactic relationship between each candidate word in a set of candidate words and the associated words of each candidate word, from a set of candidate words The target word is determined, in which the target word and the associated words of the target word are designated syntactic relationships; the target word and the associated words of the target word are determined as the target event slot of the target control statement, solving the problems in related technologies The event slot extraction method has the problem of low accuracy in event slot extraction due to the difficulty in exhaustively enumerating event slots. This method improves the accuracy of event slot extraction.
In an exemplary embodiment, determining the target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated words of that candidate word includes:
S11, determining a target verb from a group of candidate verbs according to the syntactic relationship between each candidate verb in the group and the associated words of that candidate verb, where the group of candidate words includes the group of candidate verbs, and the target word includes the target verb.
Since an event slot in a control statement generally contains a verb, after syntactic analysis is performed on the target control statement and a group of candidate words is determined from the multiple words, the target verb may first be determined among the candidate words, and the target event slot of the target control statement may then be determined according to the target verb. Optionally, the process of determining the target verb may be: determining the target verb from a group of candidate verbs according to the syntactic relationship between each candidate verb and its associated words, where the group of candidate words includes the group of candidate verbs and the target word includes the target verb.
The above syntactic relationship may be the specified syntactic relationship or another syntactic relationship. Optionally, a candidate verb in the group whose syntactic relationship with its associated word is the specified syntactic relationship may be determined as the target verb.
For example, when the specified syntactic relationship is a verb-object relationship or a parallel relationship, and the associated word of each candidate verb is the word adjacent to it, the candidate verbs in the control statement "明天上午8点用歌曲A提醒我去体育馆打篮球" ("Use song A to remind me to go to the gymnasium to play basketball at 8 o'clock tomorrow morning") are "提醒" ("remind"), "去" ("go") and "打" ("play"). The adjacent word of "提醒" is "我" ("me"), and the syntactic relationship between the two is neither a verb-object relationship nor a parallel relationship, so "提醒" is not a target verb. The adjacent word of "去" is "体育馆" ("gymnasium"), and the two form a verb-object relationship, so "去" is a target verb. The adjacent word of "打" is "篮球" ("basketball"), and the two also form a verb-object relationship, so "打" is also a target verb.
Through this embodiment, the target verb is determined from the control statement according to the syntactic relationship between each candidate verb in the control statement and its associated words, which can improve the efficiency of determining the target verb.
In an exemplary embodiment, performing syntactic analysis on the target control statement and determining a group of candidate words from the multiple words includes:
S21, performing syntactic analysis on the target control statement to determine the multiple words contained in the target control statement and the part of speech of each of the multiple words;
S22, determining each word whose part of speech is verb among the multiple words as a candidate verb, to obtain a group of candidate verbs.
In this embodiment, the target server may perform syntactic analysis on the target control statement to determine the multiple words it contains and the part of speech of each word. Optionally, in addition to the part of speech of each word, the target server may also determine the syntactic relationship between each word and the other words among the multiple words. For example, the server may perform syntactic analysis on the control statement "明天上午8点拿歌曲A提醒我去体育馆打篮球" ("Use song A to remind me to go to the gymnasium to play basketball at 8 o'clock tomorrow morning") to obtain the part of speech of each word in the statement and the syntactic relationship between each word and the other words.
The above process of performing syntactic analysis on the target control statement to determine the multiple words it contains and the part of speech of each word may be: analyzing the target control statement with a part-of-speech tagging algorithm to obtain the multiple words and the part of speech of each word. The part-of-speech tagging algorithm may include at least one of the following: sequence models; the hidden Markov model (HMM); Markov models in a broad sense, for example, the maximum entropy Markov model (MEMM) and conditional random fields (CRF); conventional machine-learning classifiers, for example, the support vector machine (SVM); and deep learning algorithms represented by the recurrent neural network (RNN). This is not limited in this embodiment.
For example, as shown in Figures 3, 4, 5, 6 and 7, syntactic analysis may be performed on different control statements to determine the words contained in each control statement, the part of speech of each word, and the syntactic relationships between the words. The labels below the words indicate the part of speech of the corresponding word: nt (temporal noun), p (preposition), n (general noun), v (verb), r (pronoun), wp (punctuation), nl (location noun) and nd (direction noun). The labels above the control statement indicate the syntactic relationships between words: HED (head, the core word), ATT (attribute, attribute-head relationship), ADV (adverbial, adverbial-head structure), POB (preposition-object relationship), DBL (double, pivotal construction), VOB (verb-object relationship), COO (coordinate, parallel relationship), SBV (subject-verb relationship) and WP (punctuation). The arrows indicate the direction of the relationship between words.
After determining the multiple words contained in the target control statement and the part of speech of each word, the target server may determine each word whose part of speech is verb as a candidate verb, to obtain a group of candidate verbs; that is, the group of candidate verbs consists of all the verbs among the multiple words.
Through this embodiment, the words whose part of speech is verb in the control statement are determined as candidate verbs, which simplifies the steps for determining candidate verbs and improves the efficiency of determining them.
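By way of non-limiting illustration, steps S21 and S22 may be sketched in Python as follows. The `Token` type, the `candidate_verbs` function and the hand-tagged word list are assumptions introduced only for this example; in practice the segmentation and tags would come from a part-of-speech tagger, which is not shown here.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str  # surface form of the word
    pos: str   # part-of-speech tag, e.g. "v" for verb, "n" for general noun


# Hand-written tagging of the example control statement.
# This is illustrative data, NOT the output of a real tagger.
TAGGED = [
    Token("明天", "nt"), Token("上午", "nt"), Token("8点", "nt"),
    Token("用", "p"), Token("歌曲A", "n"), Token("提醒", "v"),
    Token("我", "r"), Token("去", "v"), Token("体育馆", "n"),
    Token("打", "v"), Token("篮球", "n"),
]


def candidate_verbs(tokens: List[Token]) -> List[str]:
    """S22: keep every word whose part of speech is verb."""
    return [t.word for t in tokens if t.pos == "v"]


print(candidate_verbs(TAGGED))  # ['提醒', '去', '打']
```

Under these assumptions the group of candidate verbs is simply every verb in the statement, which matches the candidates "提醒", "去" and "打" named in the example above.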
In an exemplary embodiment, performing syntactic analysis on the target control statement and determining a group of candidate words from the multiple words includes:
S31, performing syntactic analysis on the target control statement to determine the core verb of the target control statement;
S32, determining each verb among the multiple words that has the specified syntactic relationship with the core verb as a candidate verb, to obtain a group of candidate verbs;
S33, searching for a verb that has the specified syntactic relationship with any candidate verb in the group of candidate verbs;
S34, when a verb having the specified syntactic relationship with any candidate verb is found, adding the found verb to the group of candidate verbs as a candidate verb.
In this embodiment, the group of candidate verbs may be determined based on the core verb of the target control statement. The target server may perform syntactic analysis on the target control statement to determine its core verb. The core verb is the verb in the target control statement that governs other words and is not governed by any other word. The core verb is the head (HED) of the whole sentence, that is, the word that root points to; in the schedule domain, for example, the core verb is generally "提醒" ("remind"), "叫" ("call") or the like. The position of the core word in the sentence has a certain reference value for event slot extraction, and event slots are extracted based on the syntactic structure formed by the core verb and its adjacent words. Because syntactic and sentence-structure features are taken into account, the precision of event slot extraction can be improved.
For example, in Figures 3, 4, 5, 6 and 7, the word pointed to by root is the core verb of the corresponding control statement: "提醒" ("remind") in the control statements shown in Figures 3, 5 and 6, "去" ("go") in the control statement shown in Figure 4, and "记得" ("remember") in the control statement shown in Figure 7.
After determining the core verb of the target control statement, the target server may determine each verb, among the multiple words included in the target control statement, that has the specified syntactic relationship with the core verb as a candidate verb, to obtain a group of candidate verbs. The specified syntactic relationship with the core verb may be a VOB relationship or a COO relationship, which is not limited in this embodiment.
The group of candidate verbs may include multiple candidate verbs or only one candidate verb, which is not limited in this embodiment. For example, as shown in Figure 3, after "提醒" ("remind") is determined as the core word of the control statement, the verb "去" ("go"), which has a VOB relationship with "提醒", may be determined as a candidate verb. Since, apart from the candidate verb "去", the control statement contains no other verb that has a VOB or COO relationship with the core verb, the group of candidate verbs of this control statement includes only the one candidate verb "去".
After the group of candidate verbs is determined, the target server may search for a verb that has the specified syntactic relationship with any candidate verb in the group. Optionally, this may be: searching for a verb that has a VOB relationship or a COO relationship with any candidate verb in the group, which is not limited in this embodiment. When such a verb is found, it may be added to the group of candidate verbs as a candidate verb.
For example, as shown in Figure 3, after the candidate verb "去" ("go") is determined, since "打" ("play") has a COO relationship with the candidate verb "去", "打" may also be added to the group of candidate verbs, so that the group grows from one verb ("去") to two ("去" and "打"). As shown in Figure 4, after the candidate verb "去" is determined and the verbs "打" and "记得" ("remember"), which have a COO relationship with "提醒" ("remind"), are determined as the group of candidate verbs, since the verb "提醒" has a VOB relationship with the candidate verb "记得", "提醒" may also be added to the group, so that the group grows from two verbs ("打" and "记得") to three ("打", "记得" and "提醒").
Through this embodiment, the verbs that have the specified syntactic relationship with the core verb of the control statement are determined as candidate verbs, and the verbs that have the specified syntactic relationship with those candidate verbs are also determined as candidate verbs, which improves the accuracy of determining candidate verbs and avoids omitting candidate verbs.
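By way of non-limiting illustration, steps S31 to S34 may be sketched in Python as a repeated search outward from the core verb. The `Token` type, the `candidates_from_core` function and the dependency parse below are assumptions made for this example only: the parse is written by hand to agree with the relationships described for Figure 3 and is not the output of a real parser, and a relationship is taken to hold when the verb depends on the core verb or on a candidate verb.

```python
from typing import List, NamedTuple, Sequence


class Token(NamedTuple):
    word: str  # surface form
    pos: str   # part-of-speech tag ("v" = verb)
    head: int  # index of the governing word, -1 for the root
    rel: str   # relationship to the governing word (HED, VOB, COO, ...)


# Hand-written parse of "明天上午8点用歌曲A提醒我去体育馆打篮球" (illustrative only).
PARSE = [
    Token("明天", "nt", 1, "ATT"), Token("上午", "nt", 2, "ATT"),
    Token("8点", "nt", 5, "ADV"), Token("用", "p", 5, "ADV"),
    Token("歌曲A", "n", 3, "POB"), Token("提醒", "v", -1, "HED"),
    Token("我", "r", 5, "DBL"), Token("去", "v", 5, "VOB"),
    Token("体育馆", "n", 7, "VOB"), Token("打", "v", 7, "COO"),
    Token("篮球", "n", 9, "VOB"),
]


def candidates_from_core(tokens: List[Token],
                         specified: Sequence[str] = ("VOB", "COO")) -> List[str]:
    """Collect verbs linked to the core verb, directly or through other candidates."""
    core = next(i for i, t in enumerate(tokens) if t.rel == "HED")  # S31
    found = set()
    frontier = [core]
    while frontier:  # S32 on the first pass, then S33/S34 until nothing new is found
        governor = frontier.pop()
        for i, t in enumerate(tokens):
            if (t.head == governor and t.pos == "v"
                    and t.rel in specified and i not in found):
                found.add(i)
                frontier.append(i)
    return [tokens[i].word for i in sorted(found)]


print(candidates_from_core(PARSE))  # ['去', '打']
```

With this hand-written parse, "去" is found first through its VOB relationship with the core verb "提醒", and "打" is then added through its COO relationship with "去".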
In an exemplary embodiment, the associated word of each candidate verb is its adjacent word, that is, the word located immediately after the candidate verb; determining the target verb from the group of candidate verbs according to the syntactic relationship between each candidate verb in the group and the associated words of that candidate verb includes:
S41, determining, as the target verb, a candidate verb in the group of candidate verbs whose relationship with its adjacent word satisfies a first filtering condition, where the first filtering condition includes the syntactic relationship being the specified syntactic relationship.
In this embodiment, after the group of candidate verbs is determined, the target server may determine, as the target verb, a candidate verb in the group whose relationship with its adjacent word satisfies the first filtering condition.
Optionally, the associated word of each candidate verb is its adjacent word, that is, the word located immediately after the candidate verb, and the first filtering condition includes the syntactic relationship being the specified syntactic relationship.
For example, as shown in Figure 4, in the control statement "我明天要去打篮球,记得在明天早上八点钟提醒我" ("I am going to play basketball tomorrow; remember to remind me at eight o'clock tomorrow morning"), the associated words of the candidate verbs "打" ("play"), "记得" ("remember") and "提醒" ("remind") are the following words "篮球" ("basketball"), "在" ("at") and "我" ("me"), respectively, not the preceding words "去" ("go"), "," and "八点钟" ("eight o'clock").
Through this embodiment, the candidate verbs in the group that have the specified syntactic relationship with their adjacent words are determined as target words, which improves the accuracy and flexibility of determining the target words.
In an exemplary embodiment, determining, as the target verb, a candidate verb in the group of candidate verbs whose relationship with its adjacent word satisfies the first filtering condition includes:
S51, when the syntactic relationship between any candidate verb in the group of candidate verbs and the adjacent word of that candidate verb is a verb-object relationship, and the part of speech of the adjacent word is noun, determining that candidate verb as the target verb, where the specified syntactic relationship includes the verb-object relationship, and the first filtering condition further includes the part of speech of the adjacent word being noun;
S52, when the syntactic relationship between any candidate verb and the adjacent word of that candidate verb is a parallel relationship, determining that candidate verb as the target verb, where the specified syntactic relationship includes the parallel relationship.
In this embodiment, the first filtering condition may include, but is not limited to, one of the following: the syntactic relationship between the candidate verb and its adjacent word is a verb-object relationship, and the part of speech of the adjacent word is noun; or the syntactic relationship between the candidate verb and its adjacent word is a parallel relationship.
As an optional implementation, when the syntactic relationship between any candidate verb in the group and its adjacent word is a verb-object relationship (that is, a VOB relationship) and the part of speech of the adjacent word is noun, that candidate verb may be determined as the target verb. Here the specified syntactic relationship includes the verb-object relationship, and the first filtering condition further includes the part of speech of the adjacent word being noun.
For example, as shown in Figure 4, in the control statement "我明天要去打篮球,记得在明天早上八点钟提醒我" ("I am going to play basketball tomorrow; remember to remind me at eight o'clock tomorrow morning"), the adjacent words of the candidate verbs "打" ("play"), "记得" ("remember") and "提醒" ("remind") are "篮球" ("basketball"), "在" ("at") and "我" ("me"), respectively. The candidate verb "打" and "篮球" form a verb-object relationship, and "篮球" is a noun, so "打" may be determined as a target verb. There is no syntactic relationship between the candidate verb "记得" and "在", so "记得" is not a target verb. Although the candidate verb "提醒" and "我" form a verb-object relationship, "我" is a pronoun rather than a noun, so "提醒" is not a target verb either.
As another optional implementation, when the syntactic relationship between any candidate verb and its adjacent word is a parallel relationship (that is, a COO relationship), that candidate verb may be determined as the target verb, where the specified syntactic relationship includes the parallel relationship.
For example, as shown in Figure 7, in the control statement "记得在明天早上八点钟提醒我出去散步" ("Remember to remind me to go out for a walk at eight o'clock tomorrow morning"), the candidate verbs are "提醒" ("remind"), "出去" ("go out") and "散步" ("take a walk"), and the adjacent words of "提醒" and "出去" are "我" ("me") and "散步", respectively. The syntactic relationship between the candidate verb "提醒" and "我" is DBL, which is neither a verb-object relationship nor a parallel relationship, so "提醒" is not a target verb. The syntactic relationship between the candidate verb "出去" and "散步" is a parallel relationship, so "出去" may be determined as a target verb.
Through this embodiment, the target verb is determined from the group of candidate verbs according to the syntactic relationship between any candidate verb and its adjacent word and the part of speech of that adjacent word, which improves the accuracy of determining the target verb and, in turn, the accuracy of the determined event slot.
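By way of non-limiting illustration, the first filtering condition of steps S51 and S52 may be sketched in Python as follows. The `Token` type, the `targets_by_adjacency` function and the hand-written parse are assumptions made for this example only; the sketch further assumes that any tag beginning with "n" counts as a noun and that the relationship is read from the adjacent word when it depends on the candidate verb.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str  # surface form
    pos: str   # part-of-speech tag
    head: int  # index of the governing word, -1 for the root
    rel: str   # relationship to the governing word


# Hand-written parse of "明天上午8点用歌曲A提醒我去体育馆打篮球" (illustrative only).
PARSE = [
    Token("明天", "nt", 1, "ATT"), Token("上午", "nt", 2, "ATT"),
    Token("8点", "nt", 5, "ADV"), Token("用", "p", 5, "ADV"),
    Token("歌曲A", "n", 3, "POB"), Token("提醒", "v", -1, "HED"),
    Token("我", "r", 5, "DBL"), Token("去", "v", 5, "VOB"),
    Token("体育馆", "n", 7, "VOB"), Token("打", "v", 7, "COO"),
    Token("篮球", "n", 9, "VOB"),
]


def targets_by_adjacency(tokens: List[Token], candidates: List[int]) -> List[str]:
    """Keep candidate verbs whose immediately following word passes S51 or S52."""
    targets = []
    for i in candidates:
        j = i + 1  # the adjacent word is the word right after the candidate verb
        if j >= len(tokens) or tokens[j].head != i:
            continue  # no following word, or no relationship with the candidate
        nxt = tokens[j]
        if nxt.rel == "VOB" and nxt.pos.startswith("n"):  # S51: verb-object + noun
            targets.append(tokens[i].word)
        elif nxt.rel == "COO":                            # S52: parallel relationship
            targets.append(tokens[i].word)
    return targets


# Indices of the candidate verbs "提醒", "去" and "打" in PARSE.
print(targets_by_adjacency(PARSE, [5, 7, 9]))  # ['去', '打']
```

Here "提醒" is rejected because its adjacent word "我" is linked by DBL, while "去" and "打" are kept because "体育馆" and "篮球" are nouns in a verb-object relationship with them.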
In an exemplary embodiment, the associated word of each candidate verb is a word that has a syntactic relationship with that candidate verb; determining the target verb from the group of candidate verbs according to the syntactic relationship between each candidate verb in the group and the associated words of that candidate verb includes:
S61, determining, as the target verb, a candidate verb in the group of candidate verbs whose relationship with a word having a syntactic relationship with it satisfies a second filtering condition, where the second filtering condition includes the syntactic relationship being the specified syntactic relationship.
In this embodiment, after the group of candidate verbs is determined, the target server may determine, as the target verb, a candidate verb in the group whose relationship with a word having a syntactic relationship with it satisfies the second filtering condition.
Optionally, the associated word of each candidate verb is a word that has a syntactic relationship with that candidate verb, and the second filtering condition includes the syntactic relationship being the specified syntactic relationship.
For example, a candidate verb in the group whose syntactic relationship with a word related to it is a VOB relationship or a COO relationship may be determined as the target verb.
Through this embodiment, the candidate verbs in the group that have the specified syntactic relationship with the words syntactically related to them are determined as target words, which improves the accuracy and flexibility of determining the target words.
In an exemplary embodiment, determining, as the target verb, a candidate verb in the group of candidate verbs whose relationship with a word having a syntactic relationship with it satisfies the second filtering condition includes:
S71, when the syntactic relationship between any candidate verb in the group of candidate verbs and a word having a syntactic relationship with that candidate verb is a verb-object relationship, and the part of speech of that word is noun, determining that candidate verb as the target verb, where the specified syntactic relationship includes the verb-object relationship, and the second filtering condition further includes the part of speech of the word having the syntactic relationship being noun;
S72, when the syntactic relationship between any candidate verb and a word having a syntactic relationship with that candidate verb is a parallel relationship, determining that candidate verb as the target verb, where the specified syntactic relationship includes the parallel relationship.
In this embodiment, the second filtering condition may include, but is not limited to, one of the following: the syntactic relationship between the candidate verb and a word having a syntactic relationship with it is a verb-object relationship, and the part of speech of that word is noun; or the syntactic relationship between the candidate verb and a word having a syntactic relationship with it is a parallel relationship.
As an optional implementation, when the syntactic relationship between any candidate verb in the group and a word having a syntactic relationship with it is a verb-object relationship (that is, a VOB relationship) and the part of speech of that word is noun, that candidate verb may be determined as the target verb. Here the specified syntactic relationship includes the verb-object relationship, and the second filtering condition further includes the part of speech of the word having the syntactic relationship being noun.
For example, as shown in Figure 4, in the control statement "我明天要去打篮球,记得在明天早上八点钟提醒我" ("I am going to play basketball tomorrow; remember to remind me at eight o'clock tomorrow morning"), the associated words of the candidate verbs "去" ("go"), "打" ("play"), "记得" ("remember") and "提醒" ("remind") are "打", "篮球" ("basketball"), "提醒" and "我" ("me"), respectively. The candidate verb "去" has no word in a verb-object or parallel relationship with it other than "打", and "打" in turn has a verb-object relationship with "篮球"; therefore "去" is not a target verb. The candidate verb "打" has a verb-object relationship with "篮球", and "篮球" is a noun, so "打" may be determined as a target verb. The candidate verb "记得" has a parallel relationship with "提醒", but since "提醒", besides its relationship with "记得", also has a verb-object relationship with "我", "记得" is not a target verb. The candidate verb "提醒" has a verb-object relationship with "我", but "我" is a pronoun rather than a noun, so "提醒" is not a target verb either.
As another optional implementation, when the syntactic relationship between any candidate verb and a word having a syntactic relationship with it is a parallel relationship (that is, a COO relationship), that candidate verb may be determined as the target verb, where the specified syntactic relationship includes the parallel relationship.
For example, as shown in Figure 7, in the control statement "记得在明天早上八点钟提醒我出去散步" ("Remember to remind me to go out for a walk at eight o'clock tomorrow morning"), the candidate verbs are "提醒" ("remind"), "出去" ("go out") and "散步" ("take a walk"), and the associated words of "提醒" and "出去" are "我" ("me") and "散步", respectively. The syntactic relationship between the candidate verb "提醒" and "我" is DBL, which is neither a verb-object relationship nor a parallel relationship, so "提醒" is not a target verb. The syntactic relationship between the candidate verb "出去" and "散步" is a parallel relationship, so "出去" may be determined as a target verb.
It should be noted that a verb-noun structure such as "去\v体育馆\n" ("go to the gymnasium") or "打\v篮球\n" ("play basketball"), in which "\v" and "\n" mark the preceding word as a verb and a noun, is syntactically called a verb-object relationship (VOB), and two adjacent verb-object structures are syntactically in a parallel relationship (COO). With the help of this parallel structure between verb-object relationships, the correct event slot "去体育馆打篮球" ("go to the gymnasium to play basketball") can be extracted, so the event slot is extracted with higher precision.
Through this embodiment, the target verb is determined from the group of candidate verbs according to the syntactic relationship between any candidate verb and the words having a syntactic relationship with it, together with the part of speech of those associated words, which improves the accuracy of determining the target verb and, in turn, the accuracy of determining the event slot.
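By way of non-limiting illustration, the basic rule of steps S71 and S72 may be sketched in Python as follows. The `Token` type, the `targets_by_dependency` function and the hand-written parse are assumptions made for this example only. The parse agrees with the relationships stated for Figure 7 ("记得" as the core verb, DBL between "提醒" and "我", COO between "出去" and "散步"); the remaining links are guesses. The sketch considers only the words governed by a candidate verb and does not attempt to reproduce the additional exclusions described in the Figure 4 example.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str  # surface form
    pos: str   # part-of-speech tag
    head: int  # index of the governing word, -1 for the root
    rel: str   # relationship to the governing word


# Hand-written parse of "记得在明天早上八点钟提醒我出去散步" (illustrative only).
PARSE_FIG7 = [
    Token("记得", "v", -1, "HED"), Token("在", "p", 5, "ADV"),
    Token("明天", "nt", 3, "ATT"), Token("早上", "nt", 4, "ATT"),
    Token("八点钟", "nt", 1, "POB"), Token("提醒", "v", 0, "VOB"),
    Token("我", "r", 5, "DBL"), Token("出去", "v", 5, "VOB"),
    Token("散步", "v", 7, "COO"),
]


def targets_by_dependency(tokens: List[Token], candidates: List[int]) -> List[str]:
    """Keep candidate verbs that govern a noun object (S71) or a parallel word (S72)."""
    targets = []
    for i in candidates:
        dependents = [t for t in tokens if t.head == i]
        if any((d.rel == "VOB" and d.pos.startswith("n")) or d.rel == "COO"
               for d in dependents):
            targets.append(tokens[i].word)
    return targets


# Indices of the candidate verbs "提醒", "出去" and "散步" in PARSE_FIG7.
print(targets_by_dependency(PARSE_FIG7, [5, 7, 8]))  # ['出去']
```

Unlike the adjacency-based condition, this variant does not require the associated word to follow the verb immediately, so it can also apply when other words come between a verb and its object.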
In an exemplary embodiment, determining the target word and the associated words of the target word as the target event slot of the target control statement includes:
S81, when there are multiple target words, determining each target word and the associated words of each target word as the event slot corresponding to that target word;
S82, determining the combination of the event slots corresponding to the target words as the target event slot.
In this embodiment, when there are multiple target words, the target server can determine each target word and its associated words as the event slot corresponding to that target word, and determine the combination of the event slots corresponding to the target words as the target event slot.
For example, when the control statement is "Remind me to go to the gym to play basketball with song A at 8 o'clock tomorrow morning" (明天上午8点用歌曲A提醒我去体育馆打篮球), the target words "go" (去) and "play" (打) and their associated words "gym" (体育馆) and "basketball" (篮球) can be determined as event slots. The final event slot is "go to the gym to play basketball" rather than just "play basketball". The slot extraction is therefore more complete, which gives the user a better experience with the NLG (Natural Language Generation) broadcast replies of the question answering system.
The above process of determining the combination of the event slots corresponding to the target words as the target event slot may be: when the event slots corresponding to the target words are adjacent, splicing the adjacent event slots into one event slot, and then determining the spliced event slot as the target event slot.
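The splicing of adjacent event slots can be illustrated as a merge of neighbouring token spans. The sketch below is an assumption-laden illustration: the tokenization of the example sentence, the span representation `(start, end)` and the function name `merge_adjacent_slots` are all hypothetical.

```python
def merge_adjacent_slots(tokens, spans):
    """Splice event slots whose token spans touch or overlap.

    tokens: list of words of the control statement
    spans:  list of (start, end) inclusive token index ranges, one per slot
    Returns the text of each spliced slot, in sentence order.
    """
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            # The slot begins right after (or inside) the previous one: splice.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return ["".join(tokens[s:e + 1]) for s, e in merged]


# Assumed tokenization of the example sentence.
TOKENS = ["明天", "上午", "8点", "用", "歌曲A", "提醒", "我",
          "去", "体育馆", "打", "篮球"]

# Slot for "打" + "篮球" and slot for "去" + "体育馆", given out of order.
SLOT_SPANS = [(9, 10), (7, 8)]

print(merge_adjacent_slots(TOKENS, SLOT_SPANS))
```

With these inputs the two slots are adjacent, so they are spliced into the single slot "去体育馆打篮球". Slots separated by other words would stay separate.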
Optionally, when the target control statement is a schedule setting statement corresponding to a target schedule, the target server can determine the schedule reminder time corresponding to the target schedule from the target control statement. For example, when the control statement is "Remind me to go to the gym to play basketball with song A at 8 o'clock tomorrow morning", 8 o'clock tomorrow morning can be determined as the schedule reminder time corresponding to the schedule.
After determining the target event slot, the target server can construct a schedule reminder statement according to the target event slot, where the schedule reminder statement is used to remind the user to carry out the target schedule. The process of constructing the schedule reminder statement according to the target event slot may be: extracting, from the target control statement, the words located in the target event slot, and constructing the schedule reminder statement from these words; this is not limited in this embodiment. After the schedule reminder statement is constructed, it can be converted into the corresponding reminder speech and broadcast by voice when the target reminder time arrives. Here, the target reminder time is the reminder time of the target schedule extracted from the target control statement.
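As a rough illustration of this step, the following sketch builds a reminder statement from the words in the target event slot and checks whether the reminder time has arrived. The reminder template, the function names and the concrete date are assumptions made for the example; the source does not specify them, and speech synthesis is left out.

```python
from datetime import datetime


def build_reminder(tokens, slot_span):
    """Build a reminder statement from the words inside the target event slot.

    slot_span is an inclusive (start, end) token index range (assumed format).
    The "Reminder: " template is hypothetical.
    """
    start, end = slot_span
    event = "".join(tokens[start:end + 1])
    return "Reminder: " + event


def is_due(now, reminder_time):
    """True once the target reminder time has arrived."""
    return now >= reminder_time


# Assumed tokenization of the example control statement.
TOKENS = ["明天", "上午", "8点", "用", "歌曲A", "提醒", "我",
          "去", "体育馆", "打", "篮球"]

reminder_time = datetime(2022, 6, 2, 8, 0)  # illustrative date only
text = build_reminder(TOKENS, (7, 10))

if is_due(datetime(2022, 6, 2, 8, 0), reminder_time):
    print(text)  # a real system would hand this text to speech synthesis
```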
Through this embodiment, the combination of the event slots corresponding to the target words is determined as the event slot, which can improve the precision of determining the event slot.
The event slot extraction method in the embodiments of the present disclosure is explained below with reference to an optional example. This optional example provides an optimized scheme that uses syntactic analysis to extract slots for schedule items in the schedule domain. In the process of extracting event slots in the schedule domain of a question answering system, syntactic analysis is introduced to assist the extraction. The core verb of the sentence is extracted and used as a reference, and syntactic rule strategies based on the verb-object relationship between a verb and the immediately adjacent noun, and the parallel relationship between a verb and the immediately adjacent verb, are formulated for event slot extraction. This improves the precision of event slot extraction and the generalization of schedule item slot extraction.
With reference to Figure 8, the flow of the event slot extraction method in this optional example may include the following steps:
Step S802, parsing the sentence by calling a syntactic analysis algorithm to obtain the candidate verbs in the sentence.
Calling the syntactic analysis algorithm parses the syntactic structure of the sentence and yields the candidate verbs in that structure; the other words can fan out around the core verb.
Step S804, determining the part of speech of the words adjacent to a candidate verb and the structural relationship formed between the adjacent words and the candidate verb.
Step S806, when the syntactic relationship formed between the adjacent word and the candidate verb is a verb-object relationship or a parallel relationship, determining the slot occupied by the candidate verb and the word adjacent to it as an event slot.
When a verb and the noun immediately following it form the verb-object structure VOB (corresponding to the verb-object relationship above) and two such VOB structures are also adjacent, the final event slot is the assembly of the two VOBs, for example, "go to the gym to play basketball" (去体育馆打篮球). If the sentence contains no verb-object structure but contains a COO parallel relationship formed by a verb immediately adjacent to another verb, this is also regarded as a possible event slot, for example, "go out for a walk" (出去散步).
Through this optional example, syntactic analysis is applied to event slot extraction for complex sentences. The position of the core verb in the sentence, the verb-object structure VOB and the parallel relationship COO can all provide auxiliary information for extracting event slots in the schedule domain, which makes the slot extraction algorithm more robust and more precise. With syntactic structure features introduced, event slots can be extracted more precisely from complex slots such as schedule items.
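Steps S802 to S806 can be sketched end to end on a hand-annotated parse. The sketch assumes a dependency-style annotation in which each token records the index of its head and a relation label; the `Token` class, the tag values and the parses themselves are hypothetical and written by hand, so the call to a real syntactic analysis algorithm in S802 is replaced by fixed data.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str   # part of speech, e.g. "v" verb, "n" noun (assumed tag set)
    head: int  # index of the head token, -1 for the root
    rel: str   # relation to the head, e.g. "VOB", "COO", "DBL", "HED"


def extract_event_slots(tokens):
    """Apply the adjacent-word rule of S804/S806 and splice adjacent slots."""
    spans = []
    for i, tok in enumerate(tokens[:-1]):
        if tok.pos != "v":
            continue  # only candidate verbs are considered (S802)
        nxt = tokens[i + 1]
        if nxt.head != i:
            continue  # the adjacent word must attach to this verb (S804)
        if (nxt.rel == "VOB" and nxt.pos == "n") or nxt.rel == "COO":
            spans.append((i, i + 1))  # verb plus adjacent word form a slot (S806)
    merged = []
    for start, end in spans:  # spans are already in sentence order
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return ["".join(t.text for t in tokens[s:e + 1]) for s, e in merged]


# Hand-written parse of the basketball example (assumed, not parser output).
BASKETBALL = [
    Token("明天", "nt", 5, "ADV"), Token("上午", "nt", 5, "ADV"),
    Token("8点", "nt", 5, "ADV"), Token("用", "p", 5, "ADV"),
    Token("歌曲A", "n", 3, "POB"), Token("提醒", "v", -1, "HED"),
    Token("我", "r", 5, "DBL"), Token("去", "v", 5, "VOB"),
    Token("体育馆", "n", 7, "VOB"), Token("打", "v", 7, "COO"),
    Token("篮球", "n", 9, "VOB"),
]

# Hand-written parse of a verb-verb example with no verb-object structure.
WALK = [Token("出去", "v", -1, "HED"), Token("散步", "v", 0, "COO")]

print(extract_event_slots(BASKETBALL))
print(extract_event_slots(WALK))
```

Under these annotations the first sentence yields the single spliced slot "去体育馆打篮球" and the second yields "出去散步". With a real parser the labels and attachments may differ, so the rule would need to be checked against that parser's tag set.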
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should know that the present disclosure is not limited by the described order of actions, because according to the present disclosure some steps may be performed in other orders or simultaneously. In addition, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present disclosure.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and can of course also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present disclosure, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present disclosure.
According to another aspect of the embodiments of the present disclosure, an event slot extraction apparatus for implementing the above event slot extraction method is also provided. Figure 9 is a structural block diagram of an optional event slot extraction apparatus according to an embodiment of the present disclosure. As shown in Figure 9, the apparatus may include:
an acquisition unit 902, configured to acquire a target control statement from which an event slot is to be extracted, where the target control statement includes multiple words;
an analysis unit 904, connected to the acquisition unit 902 and configured to perform syntactic analysis on the target control statement and determine a group of candidate words from the multiple words, where the number of candidate words in the group of candidate words is smaller than the number of words in the multiple words;
a first determination unit 906, connected to the analysis unit 904 and configured to determine a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated words of that candidate word, where the target word and the associated words of the target word are in a specified syntactic relationship;
a second determination unit 908, connected to the first determination unit 906 and configured to determine the target word and the associated words of the target word as the target event slot of the target control statement.
It should be noted that the acquisition unit 902 in this embodiment may be configured to perform the above step S202, the analysis unit 904 may be configured to perform the above step S204, the first determination unit 906 may be configured to perform the above step S206, and the second determination unit 908 may be configured to perform the above step S208.
Through the above modules, a target control statement from which an event slot is to be extracted is acquired, where the target control statement includes multiple words; syntactic analysis is performed on the target control statement, and a group of candidate words is determined from the multiple words, where the number of candidate words in the group is smaller than the number of words in the multiple words; a target word is determined from the group of candidate words according to the syntactic relationship between each candidate word and its associated words, where the target word and its associated words are in a specified syntactic relationship; and the target word and its associated words are determined as the target event slot of the target control statement. This solves the problem in the related art that event slot extraction has low accuracy because event slots are difficult to enumerate exhaustively, and improves the accuracy of event slot extraction.
In an exemplary embodiment, the first determination unit includes:
a first determination module, configured to determine a target verb from a group of candidate verbs according to the syntactic relationship between each candidate verb in the group and the associated words of that candidate verb, where the group of candidate words includes the group of candidate verbs, and the target word includes the target verb.
In an exemplary embodiment, the analysis unit includes:
a second determination module, configured to perform syntactic analysis on the target control statement and determine the multiple words contained in the target control statement and the part of speech of each of the multiple words;
a third determination module, configured to determine each word among the multiple words whose part of speech is verb as one candidate verb, obtaining the group of candidate verbs.
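The work of the second and third determination modules amounts to a part-of-speech filter. A minimal sketch follows, assuming the sentence has already been segmented and tagged; the `(word, pos)` pair format, the tag "v" for verbs and the tagging of the example sentence are assumptions, not the output of any particular tagger.

```python
def candidate_verbs(tagged_words):
    """Keep every word whose part of speech is verb, in sentence order."""
    return [word for word, pos in tagged_words if pos == "v"]


# Assumed segmentation and tagging of the example control statement.
TAGGED = [
    ("明天", "nt"), ("上午", "nt"), ("8点", "nt"), ("用", "p"),
    ("歌曲A", "n"), ("提醒", "v"), ("我", "r"), ("去", "v"),
    ("体育馆", "n"), ("打", "v"), ("篮球", "n"),
]

verbs = candidate_verbs(TAGGED)
print(verbs)
print(len(verbs) < len(TAGGED))  # the candidate group is smaller than the word list
```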
In an exemplary embodiment, the analysis unit includes:
an analysis module, configured to perform syntactic analysis on the target control statement and determine the core verb of the target control statement;
a fourth determination module, configured to determine each verb among the multiple words that has the specified syntactic relationship with the core verb as one candidate verb, obtaining a group of candidate verbs;
a search module, configured to search for verbs that have the specified syntactic relationship with any candidate verb in the group of candidate verbs;
an adding module, configured to, when a verb having the specified syntactic relationship with any candidate verb is found, add the found verb to the group of candidate verbs as a candidate verb.
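The fourth determination module, the search module and the adding module together describe an iterative expansion that starts at the core verb. One way to sketch it is a breadth-first traversal over dependency arcs. The arc format `(head, dependent, label)`, the choice of VOB and COO as the specified relationships, the head-to-dependent direction, and the decision not to count the core verb itself as a candidate are all assumptions made for this illustration.

```python
from collections import deque

SPECIFIED = {"VOB", "COO"}  # assumed set of specified syntactic relationships


def expand_candidates(core_verb, arcs, verbs):
    """Collect verbs reachable from the core verb through specified relations.

    arcs:  list of (head, dependent, label) triples; words are assumed unique
    verbs: set of words whose part of speech is verb
    """
    candidates = []
    seen = {core_verb}
    queue = deque([core_verb])
    while queue:
        head = queue.popleft()
        for arc_head, dependent, label in arcs:
            if (arc_head == head and label in SPECIFIED
                    and dependent in verbs and dependent not in seen):
                seen.add(dependent)
                candidates.append(dependent)  # adding-module step
                queue.append(dependent)       # search again from the new verb
    return candidates


# Hand-written arcs for the basketball example (assumed, not parser output).
ARCS = [
    ("提醒", "我", "DBL"),
    ("提醒", "去", "VOB"),
    ("去", "体育馆", "VOB"),
    ("去", "打", "COO"),
    ("打", "篮球", "VOB"),
]
VERBS = {"提醒", "去", "打"}

print(expand_candidates("提醒", ARCS, VERBS))
```

Here "去" is found directly from the core verb "提醒", and "打" is then found from "去", which mirrors the search-and-add behaviour described for the last two modules.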
In an exemplary embodiment, the associated word of each candidate verb is the adjacent word of that candidate verb, and the adjacent word of a candidate verb is the word that is located after the candidate verb and adjacent to it; the first determination module includes:
a first determination sub-module, configured to determine, as the target verb, a candidate verb in the group of candidate verbs that satisfies a first filtering condition with respect to its adjacent word, where the first filtering condition includes that the syntactic relationship is the specified syntactic relationship.
In an exemplary embodiment, the first determination sub-module includes:
a first determination sub-unit, configured to determine any candidate verb in the group of candidate verbs as the target verb when the syntactic relationship between that candidate verb and its adjacent word is a verb-object relationship and the part of speech of the adjacent word is noun, where the specified syntactic relationship includes the verb-object relationship, and the first filtering condition further includes that the part of speech of the adjacent word is noun;
a second determination sub-unit, configured to determine any candidate verb as the target verb when the syntactic relationship between that candidate verb and its adjacent word is a parallel relationship, where the specified syntactic relationship includes the parallel relationship.
In an exemplary embodiment, the associated words of each candidate verb are the words that have a syntactic relationship with that candidate verb; the first determination module includes:
a second determination sub-module, configured to determine, as the target verb, a candidate verb in the group of candidate verbs that satisfies a second filtering condition with respect to a word that has a syntactic relationship with it, where the second filtering condition includes that the syntactic relationship is the specified syntactic relationship.
In an exemplary embodiment, the second determination sub-module includes:
a third determination sub-unit, configured to determine any candidate verb in the group of candidate verbs as the target verb when the syntactic relationship between that candidate verb and a word syntactically related to it is a verb-object relationship and the part of speech of that word is noun, where the specified syntactic relationship includes the verb-object relationship, and the second filtering condition further includes that the part of speech of the syntactically related word is noun;
a fourth determination sub-unit, configured to determine any candidate verb as the target verb when the syntactic relationship between that candidate verb and a word syntactically related to it is a parallel relationship, where the specified syntactic relationship includes the parallel relationship.
In an exemplary embodiment, the second determination unit includes:
a fifth determination module, configured to, when there are multiple target words, determine each target word and the associated words of each target word as the event slot corresponding to that target word;
a sixth determination module, configured to determine the combination of the event slots corresponding to the target words as the target event slot.
It should be noted here that the above modules implement the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in the above embodiments. It should also be noted that the above modules, as part of the apparatus, can run in the hardware environment shown in Figure 1 and can be implemented by software or by hardware, where the hardware environment includes a network environment.
According to yet another aspect of the embodiments of the present disclosure, a storage medium is also provided. Optionally, in this embodiment, the storage medium can be used to store program code for executing any one of the above event slot extraction methods in the embodiments of the present disclosure.
Optionally, in this embodiment, the storage medium may be located on at least one of multiple network devices in the network shown in the above embodiments.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1, acquiring a target control statement from which an event slot is to be extracted, where the target control statement includes multiple words;
S2, performing syntactic analysis on the target control statement and determining a group of candidate words from the multiple words, where the number of candidate words in the group of candidate words is smaller than the number of words in the multiple words;
S3, determining a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated words of that candidate word, where the target word and the associated words of the target word are in a specified syntactic relationship;
S4, determining the target word and the associated words of the target word as the target event slot of the target control statement.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments, which are not repeated here.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a ROM, a RAM, a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
According to yet another aspect of the embodiments of the present disclosure, an electronic apparatus for implementing the above event slot extraction method is also provided. The electronic apparatus may be a server, a terminal, or a combination thereof.
Figure 10 is a structural block diagram of an optional electronic apparatus according to an embodiment of the present disclosure. As shown in Figure 10, the electronic apparatus includes a processor 1002, a communication interface 1004, a memory 1006 and a communication bus 1008, where the processor 1002, the communication interface 1004 and the memory 1006 communicate with each other through the communication bus 1008, and where:
the memory 1006 is configured to store a computer program;
the processor 1002 is configured to implement the following steps when executing the computer program stored in the memory 1006:
S1, acquiring a target control statement from which an event slot is to be extracted, where the target control statement includes multiple words;
S2, performing syntactic analysis on the target control statement and determining a group of candidate words from the multiple words, where the number of candidate words in the group of candidate words is smaller than the number of words in the multiple words;
S3, determining a target word from the group of candidate words according to the syntactic relationship between each candidate word in the group and the associated words of that candidate word, where the target word and the associated words of the target word are in a specified syntactic relationship;
S4, determining the target word and the associated words of the target word as the target event slot of the target control statement.
Optionally, in this embodiment, the communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Figure 10, but this does not mean that there is only one bus or one type of bus. The communication interface is used for communication between the electronic apparatus and other devices.
The memory may include RAM, and may also include non-volatile memory, for example, at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
As an example, the memory 1006 may include, but is not limited to, the acquisition unit 902, the analysis unit 904, the first determination unit 906 and the second determination unit 908 of the above event slot extraction apparatus. It may further include, but is not limited to, other module units of the above event slot extraction apparatus, which are not described again in this example.
The processor may be a general-purpose processor, which may include, but is not limited to, a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments, which are not repeated here.
本领域普通技术人员可以理解,图10所示的结构仅为示意,实施上述事件槽位的提取方法的设备可以是终端设备,该终端设备可以是智能手机(如Android手机、iOS手机等)、平板电脑、掌上电脑以及移动互联网设备(Mobile Internet Devices,MID)、PAD等终端设备。图10其并不对上述电子装置的结构造成限定。例如,电子装置还可包括比图10中所示更多或者更少的组件(如网络接口、显示装置等),或者具有与图10所示的不同的配置。Those of ordinary skill in the art can understand that the structure shown in Figure 10 is only illustrative, and the device that implements the above event slot extraction method can be a terminal device, and the terminal device can be a smart phone (such as an Android phone, iOS phone, etc.), Tablet computers, handheld computers, and mobile Internet devices (Mobile Internet Devices, MID), PAD and other terminal devices. FIG. 10 does not limit the structure of the above-mentioned electronic device. For example, the electronic device may also include more or fewer components (such as network interfaces, display devices, etc.) than shown in FIG. 10 , or have a different configuration than that shown in FIG. 10 .
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令终端设备相关的硬件来完成,该程序可以存储于一计算机可读存储介质中,存储介质可以包括:闪存盘、ROM、RAM、磁盘或光盘等。Those of ordinary skill in the art can understand that all or part of the steps in the various methods of the above embodiments can be completed by instructing the hardware related to the terminal device through a program. The program can be stored in a computer-readable storage medium, and the storage medium can Including: flash disk, ROM, RAM, magnetic disk or optical disk, etc.
上述本公开实施例序号仅仅为了描述,不代表实施例的优劣。The above serial numbers of the embodiments of the present disclosure are only for description and do not represent the advantages and disadvantages of the embodiments.
上述实施例中的集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在上述计算机可读取的存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在存储介质中,包括若干指令用以使得一台或多台计算机设备(可为个人计算机、服务器或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they can be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure is essentially or contributes to the existing technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, It includes several instructions to cause one or more computer devices (which can be personal computers, servers or network devices, etc.) to execute all or part of the steps of the methods described in various embodiments of the present disclosure.
在本公开的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。In the above-mentioned embodiments of the present disclosure, each embodiment is described with its own emphasis. For parts that are not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.
在本公开所提供的几个实施例中,应该理解到,所揭露的客户端,可通过其它的方式实现。其中,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,单元或模块的间接耦合或通信连接,可以是电性或其它的形式。In the several embodiments provided by this disclosure, it should be understood that the disclosed client can be implemented in other ways. Among them, the device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or may be Integrated into another system, or some features can be ignored, or not implemented. On the other hand, the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the units or modules may be in electrical or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例中所提供的方案的目的。The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution provided in this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred embodiments of the present disclosure. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present disclosure, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present disclosure.

Claims (18)

  1. A method for extracting an event slot, comprising:
    acquiring a target control statement from which an event slot is to be extracted, wherein the target control statement comprises a plurality of words;
    performing syntactic analysis on the target control statement to determine a set of candidate words from the plurality of words, wherein the number of candidate words in the set of candidate words is less than the number of words in the plurality of words;
    determining a target word from the set of candidate words according to the syntactic relationship between each candidate word in the set of candidate words and an associated word of that candidate word, wherein the target word and the associated word of the target word are in a specified syntactic relationship;
    determining the target word and the associated word of the target word as a target event slot of the target control statement.
  2. The method according to claim 1, wherein the determining a target word from the set of candidate words according to the syntactic relationship between each candidate word in the set of candidate words and an associated word of that candidate word comprises:
    determining a target verb from a set of candidate verbs according to the syntactic relationship between each candidate verb in the set of candidate verbs and an associated word of that candidate verb, wherein the set of candidate words comprises the set of candidate verbs, and the target word comprises the target verb.
  3. The method according to claim 2, wherein the performing syntactic analysis on the target control statement to determine a set of candidate words from the plurality of words comprises:
    performing syntactic analysis on the target control statement to determine the plurality of words contained in the target control statement and the part of speech of each of the plurality of words;
    determining each word among the plurality of words whose part of speech is verb as a candidate verb, to obtain the set of candidate verbs.
  4. The method according to claim 2, wherein the performing syntactic analysis on the target control statement to determine a set of candidate words from the plurality of words comprises:
    performing syntactic analysis on the target control statement to determine a core verb of the target control instruction;
    determining each verb among the plurality of words that has the specified syntactic relationship with the core verb as a candidate verb, to obtain the set of candidate verbs;
    searching for a verb that has the specified syntactic relationship with any candidate verb in the set of candidate verbs;
    in a case where a verb having the specified syntactic relationship with said any candidate verb is found, adding the found verb to the set of candidate verbs as a candidate verb.
  5. The method according to claim 2, wherein the associated word of each candidate verb is an adjacent word of that candidate verb, the adjacent word of each candidate verb being the word located after and adjacent to that candidate verb;
    the determining a target verb from a set of candidate verbs according to the syntactic relationship between each candidate verb in the set of candidate verbs and an associated word of that candidate verb comprises:
    determining, as the target verb, a candidate verb in the set of candidate verbs that satisfies a first filtering condition with respect to its adjacent word, wherein the first filtering condition comprises that the syntactic relationship is the specified syntactic relationship.
  6. The method according to claim 5, wherein the determining, as the target verb, a candidate verb in the set of candidate verbs that satisfies a first filtering condition with respect to its adjacent word comprises:
    in a case where the syntactic relationship between any candidate verb in the set of candidate verbs and the adjacent word of said any candidate verb is a verb-object relationship, and the part of speech of the adjacent word of said any candidate verb is noun, determining said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the verb-object relationship, and the first filtering condition further comprises that the part of speech of the adjacent word is noun;
    in a case where the syntactic relationship between said any candidate verb and the adjacent word of said any candidate verb is a coordinate relationship, determining said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the coordinate relationship.
  7. The method according to claim 2, wherein the associated word of each candidate verb is a word that has a syntactic relationship with that candidate verb;
    the determining a target verb from a set of candidate verbs according to the syntactic relationship between each candidate verb in the set of candidate verbs and an associated word of that candidate verb comprises:
    determining, as the target verb, a candidate verb in the set of candidate verbs that satisfies a second filtering condition with respect to a word with which it has a syntactic relationship, wherein the second filtering condition comprises that the syntactic relationship is the specified syntactic relationship.
  8. The method according to claim 7, wherein the determining, as the target verb, a candidate verb in the set of candidate verbs that satisfies a second filtering condition with respect to a word with which it has a syntactic relationship comprises:
    in a case where the syntactic relationship between any candidate verb in the set of candidate verbs and a word having a syntactic relationship with said any candidate verb is a verb-object relationship, and the part of speech of the word having a syntactic relationship with said any candidate verb is noun, determining said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the verb-object relationship, and the second filtering condition further comprises that the part of speech of the word having the syntactic relationship is noun;
    in a case where the syntactic relationship between said any candidate verb and a word having a syntactic relationship with said any candidate verb is a coordinate relationship, determining said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the coordinate relationship.
  9. The method according to any one of claims 1 to 8, wherein the determining the target word and the associated word of the target word as a target event slot of the target control statement comprises:
    in a case where there are a plurality of target words, determining each target word and the associated word of each target word as an event slot corresponding to that target word;
    determining the combination of the event slots corresponding to the respective target words as the target event slot.
  10. An apparatus for extracting an event slot, comprising:
    an acquisition unit, configured to acquire a target control statement from which an event slot is to be extracted, wherein the target control statement comprises a plurality of words;
    an analysis unit, configured to perform syntactic analysis on the target control statement to determine a set of candidate words from the plurality of words, wherein the number of candidate words in the set of candidate words is less than the number of words in the plurality of words;
    a first determination unit, configured to determine a target word from the set of candidate words according to the syntactic relationship between each candidate word in the set of candidate words and an associated word of that candidate word, wherein the target word and the associated word of the target word are in a specified syntactic relationship;
    a second determination unit, configured to determine the target word and the associated word of the target word as a target event slot of the target control statement.
  11. The apparatus according to claim 10, wherein the first determination unit comprises:
    a first determination module, configured to determine a target verb from a set of candidate verbs according to the syntactic relationship between each candidate verb in the set of candidate verbs and an associated word of that candidate verb, wherein the set of candidate words comprises the set of candidate verbs, and the target word comprises the target verb.
  12. The apparatus according to claim 11, wherein the analysis unit comprises:
    an analysis module, configured to perform syntactic analysis on the target control statement to determine a core verb of the target control instruction;
    a fourth determination module, configured to determine each verb among the plurality of words that has the specified syntactic relationship with the core verb as a candidate verb, to obtain the set of candidate verbs;
    a search module, configured to search for a verb that has the specified syntactic relationship with any candidate verb in the set of candidate verbs;
    an adding module, configured to, in a case where a verb having the specified syntactic relationship with said any candidate verb is found, add the found verb to the set of candidate verbs as a candidate verb.
  13. The apparatus according to claim 11, wherein the associated word of each candidate verb is an adjacent word of that candidate verb, the adjacent word of each candidate verb being the word located after and adjacent to that candidate verb; and the first determination unit comprises:
    a first determination sub-module, configured to determine, as the target verb, a candidate verb in the set of candidate verbs that satisfies a first filtering condition with respect to its adjacent word, wherein the first filtering condition comprises that the syntactic relationship is the specified syntactic relationship.
  14. The apparatus according to claim 13, wherein the first determination sub-module comprises:
    a first determination sub-unit, configured to, in a case where the syntactic relationship between any candidate verb in the set of candidate verbs and the adjacent word of said any candidate verb is a verb-object relationship, and the part of speech of the adjacent word of said any candidate verb is noun, determine said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the verb-object relationship, and the first filtering condition further comprises that the part of speech of the adjacent word is noun;
    a second determination sub-unit, configured to, in a case where the syntactic relationship between said any candidate verb and the adjacent word of said any candidate verb is a coordinate relationship, determine said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the coordinate relationship.
  15. The apparatus according to claim 11, wherein the first determination module comprises:
    a second determination sub-module, configured to determine, as the target verb, a candidate verb in the set of candidate verbs that satisfies a second filtering condition with respect to a word with which it has a syntactic relationship, wherein the second filtering condition comprises that the syntactic relationship is the specified syntactic relationship.
  16. The apparatus according to claim 15, wherein the second determination sub-module comprises:
    a third determination sub-unit, configured to, in a case where the syntactic relationship between any candidate verb in the set of candidate verbs and a word having a syntactic relationship with said any candidate verb is a verb-object relationship, and the part of speech of the word having a syntactic relationship with said any candidate verb is noun, determine said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the verb-object relationship, and the second filtering condition further comprises that the part of speech of the word having the syntactic relationship is noun;
    a fourth determination sub-unit, configured to, in a case where the syntactic relationship between said any candidate verb and a word having a syntactic relationship with said any candidate verb is a coordinate relationship, determine said any candidate verb as the target verb, wherein the specified syntactic relationship comprises the coordinate relationship.
  17. A computer-readable storage medium, comprising a stored program, wherein the program, when run, executes the method according to any one of claims 1 to 9.
  18. An electronic apparatus, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to execute the method according to any one of claims 1 to 9 by means of the computer program.
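
The flow recited in claims 1, 3, 7, 8 and 9 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that a dependency parser has already produced, for every word, a part of speech, the index of its syntactic head, and a relation label. The `Token` class, the function names, the labels "v", "n", "HED", "VOB" and "COO", and the hand-built example parse are all hypothetical choices made for illustration. Only the verb-object branch of claim 8 is shown; the coordinate-relationship branch and the core-verb expansion of claim 4 are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str   # part of speech; "v" = verb, "n" = noun (assumed labels)
    head: int  # index of the syntactic head, -1 for the root
    rel: str   # relation to the head; "VOB" = verb-object (assumed label)


def candidate_verbs(tokens):
    """Claim 3: every word whose part of speech is verb is a candidate."""
    return [i for i, t in enumerate(tokens) if t.pos == "v"]


def extract_event_slots(tokens):
    """Claims 7 to 9, verb-object branch only.

    A candidate verb becomes a target verb when a word attached to it
    stands in a verb-object relationship and is a noun. Each target
    verb is paired with that word, and the pairs are combined.
    """
    slots = []
    for v in candidate_verbs(tokens):
        for t in tokens:
            if t.head == v and t.rel == "VOB" and t.pos == "n":
                slots.append((tokens[v].text, t.text))
    return slots


# Hypothetical parse of a control statement glossed as
# "open air-conditioner close curtain".
example = [
    Token("open", "v", -1, "HED"),
    Token("air conditioner", "n", 0, "VOB"),
    Token("close", "v", 0, "COO"),
    Token("curtain", "n", 2, "VOB"),
]

slots = extract_event_slots(example)
```

In this sketch the candidate set holds two of the four words, which matches the requirement in claim 1 that the candidates be fewer than the words of the statement, and only the candidate verbs are then checked against their related words.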
PCT/CN2022/096436 2022-04-29 2022-05-31 Event slot extraction method and apparatus, storage medium and electronic apparatus WO2023206703A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210468880.X 2022-04-29
CN202210468880.XA CN117010364A (en) 2022-04-29 2022-04-29 Method and device for extracting event slots, storage medium and electronic device

Publications (1)

Publication Number Publication Date
WO2023206703A1 true WO2023206703A1 (en) 2023-11-02

Family

ID=88517108

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096436 WO2023206703A1 (en) 2022-04-29 2022-05-31 Event slot extraction method and apparatus, storage medium and electronic apparatus

Country Status (2)

Country Link
CN (1) CN117010364A (en)
WO (1) WO2023206703A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460787A (en) * 2020-03-27 2020-07-28 深圳价值在线信息科技股份有限公司 Topic extraction method and device, terminal device and storage medium
CN112231494A (en) * 2020-12-16 2021-01-15 完美世界(北京)软件科技发展有限公司 Information extraction method and device, electronic equipment and storage medium
US20210357599A1 (en) * 2020-05-14 2021-11-18 Google Llc Systems and methods to identify most suitable grammar suggestions among suggestions from a machine translation model

Also Published As

Publication number Publication date
CN117010364A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
JP7346609B2 (en) Systems and methods for performing semantic exploration using natural language understanding (NLU) frameworks
US11164568B2 (en) Speech recognition method and apparatus, and storage medium
JP2021018797A (en) Conversation interaction method, apparatus, computer readable storage medium, and program
US20210225380A1 (en) Voiceprint recognition method and apparatus
WO2019232991A1 (en) Method for recognizing conference voice as text, electronic device and storage medium
WO2017166650A1 (en) Voice recognition method and device
EP3779972A1 (en) Voice wake-up method and apparatus
US20160328467A1 (en) Natural language question answering method and apparatus
WO2019076286A1 (en) User intent recognition method and device for a statement
CN103309846B (en) A kind of processing method of natural language information and device
CN105354180B (en) A kind of method and system for realizing open Semantic interaction service
WO2018153273A1 (en) Semantic parsing method and apparatus, and storage medium
WO2020220914A1 (en) Voice question and answer method and device, computer readable storage medium and electronic device
CN108920649B (en) Information recommendation method, device, equipment and medium
KR102271361B1 (en) Device for automatic question answering
WO2017000809A1 (en) Linguistic interaction method
WO2021208392A1 (en) Voice skill jumping method for man-machine dialogue, electronic device, and storage medium
CN111090727A (en) Language conversion processing method and device and dialect voice interaction system
US10740401B2 (en) System for the automated semantic analysis processing of query strings
US11721328B2 (en) Method and apparatus for awakening skills by speech
KR20210060897A (en) Method and apparatus for processing speech
WO2018094952A1 (en) Content recommendation method and apparatus
JP2021174511A (en) Query analyzing method, device, electronic equipment, program, and readable storage medium
CN116797695A (en) Interaction method, system and storage medium of digital person and virtual whiteboard
WO2023206703A1 (en) Event slot extraction method and apparatus, storage medium and electronic apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22939536

Country of ref document: EP

Kind code of ref document: A1