WO2014033799A1 - 単語意味関係抽出装置 - Google Patents
単語意味関係抽出装置 (Word semantic relationship extraction device)
- Publication number: WO2014033799A1 (PCT/JP2012/071535)
- Authority: WO — WIPO (PCT)
- Prior art keywords: word, similarity, words, semantic relationship, characters
Classifications
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06F—ELECTRIC DIGITAL DATA PROCESSING
      - G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
        - G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
          - G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
            - G06F16/374—Thesaurus
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06F—ELECTRIC DIGITAL DATA PROCESSING
      - G06F40/00—Handling natural language data
        - G06F40/20—Natural language analysis
          - G06F40/237—Lexical tools
            - G06F40/247—Thesauruses; Synonyms
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06F—ELECTRIC DIGITAL DATA PROCESSING
      - G06F40/00—Handling natural language data
        - G06F40/30—Semantic analysis
Definitions
- The present invention relates to a technique for extracting semantic relationships between words from text.
- Synonym dictionaries and thesauri are language resources for absorbing variation in linguistic expression within documents and resolving the synonym problem, and they are used in a wide range of language processing applications. Because such data is highly valuable, many dictionaries have been compiled by hand over a long history.
- Non-Patent Document 1 discloses a synonym extraction technique based on the contexts in which words appear (a context-based method). There are also methods that address notational variation among synonyms.
- Non-Patent Document 2 discloses a notation-based synonym extraction technique that detects katakana spelling variants based on pronunciation rules. There are also synonym extraction techniques that use patterns explicitly indicating relationships between words, such as “C such as A or B”.
- Non-Patent Document 3 discloses such a pattern-based synonym extraction technique.
- The synonym extraction techniques above are based on unsupervised learning, that is, learning techniques that do not use manually assigned correct answers. Because unsupervised learning does not require correct-answer data to be created, it has the advantage of low labor cost. However, large manually created dictionaries are now widely available and can be used as correct answers, which reduces the benefit of unsupervised learning. With supervised learning, on the other hand, high accuracy can be obtained by using manually created correct-answer data.
- A synonym extraction method based on supervised learning is disclosed in Non-Patent Document 5: synonym extraction is performed by supervised learning using a manually created synonym dictionary as the correct answer. Specifically, the meaning of a word is represented by its appearance context (described later), an identification model is learned using the synonym dictionary as the correct answer, and synonyms are extracted.
- Non-Patent Document 6 discloses a technique for extracting broader/narrower terms based on an existing thesaurus and on the context-based similarity between words.
- Non-Patent Document 4 discloses a technique for extracting broader/narrower-term relationships between words based on word inclusion relationships.
- Non-Patent Document 7 discloses a technique for extracting synonyms with high accuracy by additionally using a pattern-based synonym extraction method when extracting synonyms.
- Patent Document 1 discloses a technique for distinguishing synonyms from other similar words and from dissimilar words by supervised ranking learning.
- An object of the present invention is to realize a word semantic relationship extraction technique that can distinguish and extract, with higher accuracy than conventional methods, detailed types of word semantic relationships among similar words.
- In an unsupervised learning approach such as that of Non-Patent Document 7, it is difficult to achieve high accuracy because a manually created thesaurus cannot be used as correct-answer data.
- Moreover, there is no technique for determining, at an arbitrary level of detail, multiple types of word semantic relationships such as synonyms, broader/narrower terms, antonyms, and sibling words.
- In the method of Non-Patent Document 5, synonym extraction is solved as a binary classification problem of determining whether or not a pair is a synonym pair, so word semantic relationships other than synonymy cannot be extracted. Similar words other than synonyms are either classified as dissimilar words, when the classifier operates correctly, or mistakenly classified as synonyms.
- The word semantic relationship extraction technique disclosed in Patent Document 1 attempts to distinguish synonyms from other similar words by treating the problem as a ranking problem.
- In that approach, a synonym pair is very similar and is given rank 1; broader/narrower terms and sibling words are not as similar as synonyms but somewhat similar, and are given rank 2; pairs that are none of these are considered to have low similarity and are given rank 3.
- Consequently, similar words other than synonyms, such as broader/narrower terms and sibling words, cannot be distinguished in more detail.
- The present invention has been made to solve the above problems, and its object is to provide a word semantic relationship extraction method that achieves high accuracy by utilizing a thesaurus as the correct answer and, at the same time, can extract multiple types of word semantic relationships in detail.
- (4) Antonyms: word pairs denoting paired (opposing) concepts, e.g., "man" and "woman". (5) Sibling words: word pairs that are not synonymous but share a common superordinate concept, e.g., "router" and "server". (6) Related words: word pairs that are neither similar nor hierarchically related but are conceptually associated, e.g., "cell" and "cytology".
- FIG. 1 is a block diagram illustrating a configuration example of a computer system that implements the present embodiment.
- The computer system shown in FIG. 1 is used in the first embodiment of the present invention. Note that it also includes functions that are not used in some embodiments.
- The word semantic relationship extraction device 100 includes a CPU 101, a main memory 102, an input/output device 103, and a disk device 110.
- The CPU 101 performs various processes by executing programs stored in the main memory 102. Specifically, the CPU 101 loads a program stored in the disk device 110 into the main memory 102 and executes it.
- The main memory 102 stores the programs executed by the CPU 101, the information required by the CPU 101, and the like.
- The input/output device 103 receives information from the user and outputs information in response to instructions from the CPU 101. It includes at least one of a keyboard, a mouse, and a display.
- The disk device 110 stores various information. Specifically, the disk device 110 stores an OS 111, a word semantic relationship extraction program 112, text 113, a thesaurus 114, a similarity matrix 115, a context matrix 116, a part-of-speech pattern 117, an identification model 118, and a character similarity table 119.
- The OS 111 controls the overall processing of the word semantic relationship extraction device 100.
- The word semantic relationship extraction program 112 is a program for extracting word semantic relationships from the text 113 and the thesaurus 114, and includes a feature vector extraction subprogram 1121, a correct answer label setting subprogram 1122, an identification model learning subprogram 1123, and an identification model application subprogram 1124.
- The text 113 is the input to the word semantic relationship extraction program 112 and need not be in any special format; a document containing tags, such as an HTML or XML document, may also be used.
- The thesaurus 114 is a manually created dictionary storing synonyms, broader/narrower terms, and sibling words.
- The similarity matrix 115 is a matrix that stores, for each word pair extracted from the text and the synonym dictionary, a feature vector, a label indicating the word semantic relationship (for example, whether the pair is a synonym), and the like.
- The context matrix 116 is a matrix that stores the context information of words needed to calculate context-based similarity.
- The identification model 118 is a model, learned from the similarity matrix, for identifying which word semantic relationship a word pair belongs to.
- The character similarity table 119 is a table storing relationships between characters with similar meanings.
- The feature vector extraction subprogram 1121 reads the text 113, extracts all words in the text, calculates various similarities for arbitrary pairs of words, and outputs them as the similarity matrix 115.
- For context-based similarity, the context matrix 116, which holds the necessary information, is created in advance.
- The part-of-speech pattern 117 is used to create the context matrix 116.
- The correct answer label setting subprogram 1122 reads the thesaurus 114 as correct-answer data and sets, for each word pair in the similarity matrix 115, a label indicating the correct word semantic relationship type.
- The identification model learning subprogram 1123 reads the similarity matrix 115 and learns the identification model 118 for identifying the word semantic relationship type of a word pair.
- The identification model application subprogram 1124 reads the identification model 118 and assigns a word semantic relationship type determination result to each word pair in the similarity matrix 115.
- Consider an arbitrary word pair contained in the text data.
- Suppose, for example, that the word pair is <computer, electronic computer>.
- Various measures can be considered for determining what word semantic relationship a word pair has.
- For example, there is a method that uses the similarity between the appearance contexts of the words (hereinafter, context-based similarity). Similarity based on notation, such as focusing on the number of overlapping characters (hereinafter, notation-based similarity), can also be considered. Furthermore, patterns called lexico-syntactic patterns can be used (hereinafter, pattern-based similarity).
- There are also many variations within each method.
- For example, context-based similarity varies depending on how the appearance context of a word is defined and how the distance calculation is defined.
- In this embodiment, these various measures are regarded as features of the word pair, and the word pair is represented by a feature vector consisting of a value for each feature.
- A feature configuration suitable for each word relationship type will be described later.
- For example, the word pair <コンピュータ, コンピューター> (two katakana notations of "computer") is represented as a vector whose feature-1 dimension has the value 0.3, feature-2 dimension 0.2, and feature-N dimension 0.8.
- Here, feature 1 is, for example, a score based on context-based similarity, and feature 2 is a score based on notation-based similarity.
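To make the data layout concrete, the following is a minimal sketch of how one row of such a similarity matrix might be assembled from per-feature similarity functions. This is an illustration only: the function and field names (build_row, context_sim, notation_sim) are hypothetical placeholders, not identifiers from the specification.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-feature similarity functions; each maps a word pair
# to a score. Real candidates (context-based, notation-based,
# pattern-based) are described later in the text.
FeatureFn = Callable[[str, str], float]

def build_row(pair: Tuple[str, str],
              features: Dict[str, FeatureFn]) -> Dict[str, object]:
    """Build one similarity-matrix row: the pair, its feature vector,
    and a label slot (0 = unknown until the thesaurus is consulted)."""
    vector: List[float] = [fn(*pair) for fn in features.values()]
    return {"pair": pair, "vector": vector, "label": 0}

# Dummy feature functions standing in for the real similarity measures.
features: Dict[str, FeatureFn] = {
    "context_sim":  lambda a, b: 0.3,
    "notation_sim": lambda a, b: 0.2,
}
row = build_row(("コンピュータ", "コンピューター"), features)
print(row["vector"])  # [0.3, 0.2]
```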
- For learning, the word semantic relationship of each word pair is determined using the thesaurus, and a label is assigned. That is, if <コンピュータ, コンピューター> is a synonym pair in the thesaurus, the label corresponding to synonyms is assigned to that row of the similarity matrix, and if the pair is a broader/narrower-term pair, the label corresponding to broader/narrower terms is assigned. If the pair is not a similar-word pair, a label indicating a dissimilar pair is assigned. Among the word semantic relationships between similar words, only the broader/narrower-term relationship has a direction; the others are undirected.
- In this embodiment, the two words of a pair are arranged in character order, and <A, B> and <B, A> are treated as the same pair. For broader/narrower terms, the direction of the relationship is taken into account: if the left word is the broader term, the pair is labeled as broader/narrower, and if the left word is the narrower term, as narrower/broader.
- For example, the label for synonyms is 1, for narrower/broader terms 2, for broader/narrower terms 3, for antonyms 4, and for sibling words 5; the label for dissimilar words is -1, and the label for unknown word pairs is 0.
- In this way, a word pair is represented by a vector of feature values, and with correct-answer data attached, the problem is solved as a multi-class (multi-category) identification problem.
- The multi-class identification problem is the task of identifying which of three or more classes an unknown case belongs to, and methods for learning an identification model for it by supervised learning are known.
- Word semantic relationship types such as synonym, broader/narrower term, antonym, and sibling word are mutually exclusive: in principle, a word pair does not belong to more than one category at the same time, except when a word is ambiguous. Therefore, by solving word semantic relationship typing as a multi-class identification problem, it becomes possible not only to distinguish detailed word semantic relationship types among similar words, but also to improve the extraction accuracy for each relationship, for example synonym extraction.
- The above is the basic concept of this embodiment.
- In this embodiment, supervised learning is performed using each of the asymmetric scores as a feature. By using the two asymmetric scores as features, boundaries can be set such that, for example, a pair is a synonym if both scores are high, a broader/narrower term if one score is higher than the other, and a sibling word if both are moderately high.
- Asymmetric similarity is a similarity for which, given a word pair <A, B>, the value computed with word A as the reference differs from the value computed with B as the reference.
- For example, an asymmetric similarity can be constructed as follows: generate a ranking of similar words with A as the reference, and use the position of B in that ranking.
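One plausible realization of such a rank-based asymmetric score is sketched below; the 1/rank scoring is an assumption made for illustration, since the text only states that B's position in A's similar-word ranking is considered.

```python
def rank_score(query: str, other: str, sim: dict, vocab: list) -> float:
    """Rank `other` among all candidates sorted by their similarity to
    `query`, then turn the rank into a score (1.0 = most similar).
    `sim` maps (query, candidate) pairs to a base similarity."""
    ranking = sorted(vocab, key=lambda w: sim.get((query, w), 0.0),
                     reverse=True)
    return 1.0 / (ranking.index(other) + 1)

vocab = ["machine", "server", "appliance"]
sim = {("computer", "machine"): 0.9, ("computer", "server"): 0.7}
print(rank_score("computer", "machine", sim, vocab))  # 1.0 (rank 1)
print(rank_score("computer", "server", sim, vocab))   # 0.5 (rank 2)
# rank_score(A, B, ...) and rank_score(B, A, ...) generally differ,
# giving exactly the asymmetry that is used as two separate features.
```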
- (B) Notation-based method: a technique is used for extracting broader/narrower terms that stand in a word-level inclusion relationship, such as "circuit" and "electronic circuit".
- A score that becomes high for such a pair of a compound word and its head word is used as a feature.
- This feature is not universal, since broader/narrower terms of the "dog" / "animal" kind cannot be extracted, but many broader/narrower terms among technical terms do have inclusion relationships, so in practice it is a powerful clue.
- (C) Pattern-based method: the pattern-based method is the method most frequently used for identifying word pair types; by devising the patterns to be extracted, various word pair types can be extracted. For broader/narrower terms, patterns such as "B such as A" and "B like A" are used.
- A known technique can be adopted for determining the positive or negative polarity of a word. For example, negative expressions such as "to suffer" and positive expressions such as "to achieve" are extracted using dictionaries of positive and negative terms, and the polarity of a word (its degree of positivity, with negative values indicating negativity) is determined from the proportion of such expressions contained in its context. As the antonym feature, the antonym degree is considered higher the larger the magnitude of the negative product of the two words' polarity scores. With this feature alone, pairs of a positive word and a negative word that are not antonyms, such as <heaven, evil>, are also extracted, but combining it with other similarities makes it possible to identify antonyms.
- Kanji are ideograms, and many antonym pairs contain kanji that are themselves antonymous. Since there are not very many kinds of kanji, antonyms can be extracted by mining antonymous kanji pairs from the correct-answer antonym data and using them as clues. However, merely containing an antonymous kanji pair does not establish that a word pair is an antonym pair, so an auxiliary condition is added. In many antonym pairs, the characters other than the antonymous kanji pair coincide, as in "連勝" (consecutive wins) and "連敗" (consecutive losses). Even when they do not coincide completely, the words often contain kanji with similar meanings, such as "極" and "酷" in "極寒" (extreme cold) and "酷暑" (severe heat).
- Accordingly, the feature is constructed from whether the pair contains an antonymous kanji pair and also shares, in common, kanji that are identical or similar in meaning.
- Similar processing is possible for languages written with phonetic characters, such as English: by treating words in units of meaningful morphemes, morphemes in opposing relationships such as "for" / "back" and "pre" / "post" can be extracted, so the approach is not limited to kanji.
- Although it is not a feature of the word pair itself, whether a word is a proper noun is important information. Word pairs such as "Iraq" and "Afghanistan" are very similar under context-based similarity, but proper nouns are not synonyms unless they refer to the same thing. Thus, when both words of a pair are proper nouns and do not refer to the same thing, the two words are determined not to be synonyms.
- FIG. 4 shows a conceptual diagram of similar word extraction by unsupervised learning.
- The feature vector of each word pair corresponds to a point in the N-dimensional space spanned by features 1 to N, shown as a black circle in FIG. 4. Black circles for word pairs belonging to the same word relationship are expected to be distributed in nearby regions of this space.
- In unsupervised learning, a score is calculated by a similarity function, which corresponds to projecting each word pair onto a one-dimensional line.
- A ranking is defined by this projection, and a threshold is set to decide whether a pair is a similar-word pair.
- The problems with the unsupervised method are that the projection function (the similarity function) is designed manually, making it difficult to correct using correct answers, and that the threshold cannot be determined automatically.
- FIG. 5 shows a conceptual diagram of similar word extraction by binary supervised learning.
- In binary supervised learning, the most appropriate boundary for distinguishing the two classes is determined automatically from the correct-answer data. This solves the problems of the unsupervised approach, but only two types can be distinguished, which is unsuitable for the purpose of distinguishing many types of word relationships.
- FIG. 6 shows a conceptual diagram of similar word extraction by supervised ranking learning.
- Ranking learning, unlike binary supervised learning, can handle classification into three or more classes. Based on the correct-answer data it learns an ordering of cases, and for similar-word extraction it learns the degree of similarity of word pairs, so it can distinguish synonyms (very similar), broader/narrower terms and sibling words (somewhat similar), and dissimilar words (not similar). However, since only a one-dimensional similarity value is learned, word pairs that are similar in different ways, such as broader/narrower terms, sibling words, and antonyms, cannot be distinguished from one another.
- FIG. 7 shows a conceptual diagram of similar word extraction by multi-class supervised learning according to this embodiment.
- In this method, a class is assigned to each word semantic relationship, and the boundaries defining the region to which the word pairs of each relationship belong are determined automatically.
- Since word pairs can thus be distinguished from multiple viewpoints, detailed word pair types among similar words can be distinguished.
- FIG. 8 is a flowchart of word semantic relationship extraction processing executed by the word semantic relationship extraction device according to the first embodiment of this invention.
- In step 11, it is determined whether all word pairs have been processed. If so, the process proceeds to step 17; if an unprocessed word pair remains, it proceeds to step 12.
- In step 12, it is determined whether all types of features have been processed for the current pair. If so, the process proceeds to step 16; if an unprocessed feature remains, it proceeds to step 13.
- In step 13, the i-th word pair is acquired.
- Word pairs can be obtained, for example, by building a list of all words through morphological analysis of the text and taking combinations of any two words from the list.
- In step 14, the j-th feature is calculated for the acquired i-th word pair; the details of this processing are described later.
- The process then proceeds to step 15, where the feature calculation result is stored in the similarity matrix.
- An example of the similarity matrix is as described in FIG.
- In step 16, a label is set in the similarity matrix by referring to the thesaurus.
- The thesaurus is data describing word pairs and their word relationship types.
- One word is stored in the headword column, the other in the related-word column, and the type of the related word with respect to the headword in the type column.
- For example, for a word pair in a broader/narrower relationship such as <computer, personal computer>, the headword "computer" has "personal computer" as a related word, and the type column records that "personal computer" is a narrower (more specific) word of "computer".
- The thesaurus of FIG. 9 is assumed to hold data redundantly for ease of dictionary lookup.
- To set a label, one word of the pair is matched against the thesaurus headword field, and among the rows whose headword matches, the row whose related-word field matches the other word of the pair is identified.
- The type field of that row is then read, and the corresponding label is set.
- As before, the label for synonyms is 1, for narrower/broader terms 2, for broader/narrower terms 3, for antonyms 4, and for sibling words 5. If the word pair does not exist in the thesaurus, separate processing is performed.
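A minimal sketch of this label assignment follows. The fallback for pairs absent from the thesaurus is an assumption (label -1, dissimilar, when both words are known; 0, unknown, otherwise), since the text leaves that branch unspecified; the label coding matches the example above.

```python
LABELS = {"synonym": 1, "narrower": 2, "broader": 3,
          "antonym": 4, "sibling": 5}

def set_label(word_a, word_b, thesaurus, vocabulary):
    """thesaurus: dict mapping (headword, related word) -> type string,
    stored redundantly in both directions as in FIG. 9."""
    rel = thesaurus.get((word_a, word_b))
    if rel is not None:
        return LABELS[rel]
    # Assumed fallback: dissimilar if both words are known, else unknown.
    return -1 if word_a in vocabulary and word_b in vocabulary else 0

thesaurus = {("computer", "PC"): "broader", ("PC", "computer"): "narrower"}
vocabulary = {"computer", "PC", "server"}
print(set_label("computer", "PC", thesaurus, vocabulary))      # 3
print(set_label("computer", "server", thesaurus, vocabulary))  # -1
```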
- In step 17, the identification model is learned: from the similarity matrix, a multi-class identification model is learned using only the rows whose label is not 0.
- An arbitrary learning method can be used for the multi-class identification model; for example, the One-versus-Rest (one-against-the-rest) method disclosed in J. Weston and C. Watkins, "Multi-class support vector machines," Royal Holloway Technical Report CSD-TR-98-04, 1998, is used.
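As a hedged illustration of this step, the sketch below trains a One-versus-Rest multi-class model with scikit-learn; the library choice and the toy data are assumptions, not part of the specification.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# X: feature vectors from the similarity matrix; y: relationship labels
# (1=synonym, 3=broader/narrower, -1=dissimilar, 0=unknown).
X = np.array([[0.9, 0.8, 0.1], [0.7, 0.2, 0.6],
              [0.1, 0.1, 0.0], [0.8, 0.7, 0.2]])
y = np.array([1, 3, -1, 1])

mask = y != 0                       # train only on labeled rows
model = OneVsRestClassifier(LinearSVC()).fit(X[mask], y[mask])
print(model.predict([[0.85, 0.75, 0.15]]))  # e.g. [1] (synonym)
```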
- In step 18, word semantic relationships are extracted from the values of the similarity matrix according to the identification model.
- Specifically, the feature vectors are input to the learned classifier to identify the word semantic relationships.
- The classifier's determination results are stored in the determination-result column of the similarity matrix.
- In this way, a label corresponding to the identified word semantic relationship is obtained for each word pair whose label was "unknown", that is, 0.
- The results can also be used to check a manually created thesaurus for errors.
- For word pairs already assigned labels other than "unknown", extracting only those whose determination result differs from their label allows the thesaurus to be checked efficiently.
- In step 14, various similarities are calculated as the features representing a word pair.
- Each type of similarity is described below.
- Context-based similarity computes the similarity of a word pair from the similarity of the words' contexts.
- The context of a word consists of the words or word strings in the vicinity of the positions where the word appears in the text.
- Various contexts can be defined depending on what is taken to be the "vicinity".
- As a representative example, the following description uses the succeeding verb and the immediately preceding adjective or adjectival verb as the appearance context, but other appearance contexts may be used instead, or added and used in combination.
- The context-based similarity is calculated from the context matrix 116.
- The context matrix has a headword field and a context-information field, and for each word in the headword field it stores context information consisting of repeated combinations of a context word string and its frequency.
- FIG. 10 shows an example of a context matrix.
- The example of FIG. 10 uses the particle and predicate following the focal word as the context: for "computer", for example, "start up" appears 15 times and "connect" appears 4 times.
- To calculate the similarity of two words, the context-information rows corresponding to the two words are retrieved, and the similarity is calculated from the frequency vectors of their context word strings.
- For this calculation, methods used for document retrieval with term vector models can be applied, such as those disclosed in Kita, Tsuda, and Shishibori, "Information Retrieval Algorithms," Kyoritsu Shuppan (2002).
- In this embodiment, the similarity s is calculated by a predetermined similarity formula of this kind.
- As the similarity of a word pair, two similarities between the context information of the two words are calculated: one computed with one word of the pair as the reference and one computed with the other as the reference. In other words, by using the two asymmetric scores as features, it becomes possible to set boundaries such that a pair is a synonym if both scores are high, a broader/narrower term if one is higher than the other, a sibling word if both are moderately high, and so on.
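The specific formula is not reproduced above; as an assumption for illustration, the sketch below uses cosine similarity over the context-frequency vectors of FIG. 10, a common choice in the term-vector-model literature cited.

```python
import math
from collections import Counter

def cosine(ctx_a: Counter, ctx_b: Counter) -> float:
    """Cosine similarity between two context-frequency vectors."""
    dot = sum(f * ctx_b[w] for w, f in ctx_a.items())
    na = math.sqrt(sum(f * f for f in ctx_a.values()))
    nb = math.sqrt(sum(f * f for f in ctx_b.values()))
    return dot / (na * nb) if na and nb else 0.0

ctx_computer = Counter({"start up": 15, "connect": 4})
ctx_server = Counter({"start up": 10, "connect": 7})
print(round(cosine(ctx_computer, ctx_server), 3))  # ~0.939
```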
- The context matrix can be created by known methods, for example by applying the part-of-speech pattern to morphological analysis results, or by performing syntactic analysis after morphological analysis of the text.
- Notation-based similarity is calculated for a pair of words from character information.
- When synonyms are variant notations of the same word, such as "コンピュータ" and "コンピューター" (both "computer"), many characters overlap, as disclosed in Non-Patent Document 2, so the proportion of overlapping characters can be used as a similarity.
- Variant notations are in principle katakana words, but word pairs written in kanji also often share characters when their meanings are similar, as in "分析" and "解析" (both "analysis") or "信頼" and "信用" (both "trust").
- Below, the similarity based on the proportion of overlapping characters is called the character overlap degree.
- For kanji words, especially short ones such as two-character words, many words share a character yet differ in meaning; in this embodiment, the character overlap degree therefore works effectively in combination with similarities of a different kind, such as context-based similarity.
- (A) Character overlap degree: the character overlap degree can be calculated in various ways. Here, as an example, we describe a method that counts the characters shared by the two words and normalizes by the character-string length of the shorter word. When the same character occurs multiple times, with m occurrences in one word and n in the other, the correspondence is m-to-n; in such cases the smaller of m and n is taken as the number of overlapping occurrences.
- In step 1411, it is checked whether all characters of word i have been processed. If so, the process proceeds to step 1415; if an unprocessed character remains, it proceeds to step 1412. In step 1412, it is checked whether all characters of word j have been processed. If so, the process returns to step 1411; if an unprocessed character remains, it proceeds to step 1413.
- In step 1413, the m-th character of word i is compared with the n-th character of word j. If they match, the process proceeds to step 1414; if not, it returns to step 1412. In step 1414, a flag is set on the m-th character of word i and on the n-th character of word j, and the process returns to step 1412.
- In step 1415, the flagged characters of word i and of word j are counted, and the smaller count is taken as the number of matching characters. For example, for "ウインドウ" and "ウィンドー" (two notations of "window"), the three characters "ウ", "ン", and "ド" match; since "ウ" occurs twice in "ウインドウ", four characters are flagged in "ウインドウ" and three in "ウィンドー", so the number of matching characters is taken to be three.
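A minimal sketch of this flag-and-count procedure (steps 1411 to 1415); simple membership tests stand in for the explicit flag loops of the flowchart:

```python
def char_overlap(word_i: str, word_j: str) -> float:
    """Character overlap degree: flag every character of each word that
    also occurs in the other word, take the smaller flag count as the
    number of matching characters (handling m-to-n duplicates), and
    normalize by the length of the shorter word."""
    flagged_i = sum(1 for c in word_i if c in word_j)
    flagged_j = sum(1 for c in word_j if c in word_i)
    matched = min(flagged_i, flagged_j)
    return matched / min(len(word_i), len(word_j))

# The "window" example: 4 characters are flagged in ウインドウ,
# 3 in ウィンドー, so 3 characters are taken to match.
print(char_overlap("ウインドウ", "ウィンドー"))  # 3 / 5 = 0.6
```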
- Variations are also possible, such as taking the length of the common prefix of the two words as the overlap degree, taking the length of the common suffix as the overlap degree, and normalizing by the average of the two string lengths or by the longer length.
- The weight applied when characters match can also be varied based on character frequency; for example, an Inverse Document Frequency (IDF) weight can be used.
- In step 1421, word pairs that are synonyms are acquired from the synonym dictionary.
- In step 1422, character pairs consisting of one character taken from each word of the pair are acquired for all combinations. For example, for a synonym pair of two-character words such as "尊敬" and "尊重" (both "respect"), the four character pairs 尊/尊, 尊/重, 敬/尊, and 敬/重 are acquired.
- The process then proceeds to step 1423, where the frequency of each character over all words in the synonym dictionary is calculated.
- In step 1424, the character similarity is calculated for all character pairs.
- The character similarity is obtained by dividing the frequency of a character pair by the frequencies of the two characters constituting the pair (the Dice coefficient); pointwise mutual information or the like may also be used as the similarity.
- In step 1425, the similarities calculated in step 1424 are normalized separately for identical and for different characters. Specifically, the average similarity AS over identical-character pairs and the average similarity AD over different-character pairs are calculated; for identical characters the final similarity is set to 1.0 regardless of the calculated value, and for different characters the value calculated in step 1424 multiplied by AD / AS is used as the final similarity.
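A compact sketch of steps 1421 to 1425 follows, assuming the standard Dice form 2·f(c1,c2) / (f(c1)+f(c2)) and applying the AD / AS rescaling exactly as described; the synonym pairs used are illustrative.

```python
from collections import Counter
from itertools import product

def learn_char_sim(synonym_pairs):
    """Steps 1421-1425 (sketch): collect character pairs from synonym
    word pairs, score them with the Dice coefficient, then rescale so
    identical characters get similarity 1.0."""
    pair_freq, char_freq = Counter(), Counter()
    for w1, w2 in synonym_pairs:                  # steps 1421, 1423
        char_freq.update(w1 + w2)
        pair_freq.update(product(w1, w2))         # step 1422
    dice = {p: 2 * f / (char_freq[p[0]] + char_freq[p[1]])
            for p, f in pair_freq.items()}        # step 1424
    same = [s for (a, b), s in dice.items() if a == b]
    diff = [s for (a, b), s in dice.items() if a != b]
    as_, ad = sum(same) / len(same), sum(diff) / len(diff)
    return {p: 1.0 if p[0] == p[1] else s * ad / as_
            for p, s in dice.items()}             # step 1425

table = learn_char_sim([("尊敬", "尊重"), ("信頼", "信用")])
print(table[("敬", "重")])  # similarity of two different characters
```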
- An example of the character similarity table is shown in the figure; the similar-character overlap degree can be calculated using the character similarity table.
- The similar-character overlap degree is calculated in the same way as the character overlap degree; the difference is that where the character overlap adds 1 to the count when two characters match, the similar-character overlap looks the character pair up in the similar-character table and adds the character similarity. Since 1.0 is stored in the table when the characters are identical, the behavior for exact matches is the same as for the character overlap degree.
- Alternatively, a similarity obtained by a method using the similarity between morphemes (partial character strings of words) with similar meanings, or by a method using word inclusion relationships as disclosed in Non-Patent Document 4, can be used.
- For example, "銀行" (bank) is decomposed into the character set {銀, 行} and "投資銀行" (investment bank) into {投, 資, 銀, 行}; the intersection (the matching characters) has 2 elements and the union has 4, so the Jaccard coefficient is 2/4 = 0.5. The Jaccard coefficient is symmetric.
- An asymmetric counterpart normalizes the overlap by each word's own character set: normalized by "銀行" the value is 2/2 = 1.0, which expresses that "銀行" (bank) is the broader term (the head) of "投資銀行" (investment bank). By constructing such a pair of asymmetric feature values and using both as features, detailed word semantic relationships can be extracted with high accuracy.
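These set computations are simple enough to show directly; a small sketch using the "銀行" / "投資銀行" example (the function names are illustrative):

```python
def jaccard(a: str, b: str) -> float:
    """Symmetric: |intersection| / |union| of the character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

def inclusion(a: str, b: str) -> float:
    """Asymmetric counterpart: the fraction of a's characters that
    also appear in b; 1.0 means a is fully contained in b."""
    sa = set(a)
    return len(sa & set(b)) / len(sa)

print(jaccard("投資銀行", "銀行"))    # 0.5, same in both directions
print(inclusion("銀行", "投資銀行"))  # 1.0 -> 銀行 is the head/broader term
print(inclusion("投資銀行", "銀行"))  # 0.5
```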
- Pattern-based similarity uses patterns that explicitly indicate word semantic relationships, such as "B like A" and "C such as A or B".
- Word pairs matching a pattern are obtained by collating predetermined patterns against character strings or against morphological analysis results.
- The numbers of extracted word pairs are aggregated, and statistical processing such as normalization is applied to obtain the feature dimension values. The calculation of pattern-based similarity is disclosed in Non-Patent Document 3, so its description is omitted.
- Here too, two kinds of feature values are calculated: one computed with one word of the pair as the reference and one computed with the other as the reference.
- This is because patterns for extracting broader/narrower terms, such as "B such as A" and "B like A", have directionality: when "B such as A" is a natural expression, "A such as B" is essentially never used.
- In this embodiment, the word pairs <A, B> and <B, A> are not distinguished; instead, the direction is expressed by the broader/narrower and narrower/broader labels.
- A parenthesis expression such as "customer relationship management (CRM)" is an expression that often indicates a synonym and is effective. However, it is not used only for synonyms; for example, it may express a noun and its attribute, as in "Company A (Tokyo)". In the synonym case the expressions inside and outside the parentheses are interchangeable and there is no directionality, whereas in the attribute case they cannot be exchanged.
- Therefore, by using both a feature indicating that "A (B)" appeared and a feature indicating that "B (A)" appeared, the synonym case and the attribute case can be distinguished.
- Parallel expressions such as "A や B" and "A と B" ("A and B") inherently have no direction, but they cannot be processed accurately unless the sentence structure is analyzed correctly.
- For example, "と" may function not as a parallel particle but as "with", as in "contract with Company A", and yet be mistakenly processed as a parallel particle. Even in such cases, by constructing the feature with attention to whether the expression also appears in the reverse order ("Company A と contract" versus "contract と Company A"), only word pairs that truly stand in the parallel relation can be extracted.
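A hedged sketch of such directional pattern counting follows. The regular expressions and the crude kanji/katakana token class are illustrative stand-ins for the morphological-analysis-based matching the text describes; keeping (A, B) and (B, A) distinct is what makes each direction a separate feature.

```python
import re
from collections import Counter

TOKEN = r"[一-龥ァ-ヴーA-Za-z]+"  # crude stand-in for a morpheme
PATTERNS = {
    "no_youna": re.compile(f"({TOKEN})のような({TOKEN})"),  # "B like A"
    "paren":    re.compile(f"({TOKEN})（({TOKEN})）"),      # "A（B）"
}

def count_pattern_hits(corpus):
    """Count directional pattern matches over a corpus of sentences."""
    hits = Counter()
    for sentence in corpus:
        for name, pat in PATTERNS.items():
            for a, b in pat.findall(sentence):
                hits[(name, a, b)] += 1
    return hits

corpus = ["犬のような動物を飼う", "顧客関係管理（CRM）を導入する"]
print(count_pattern_hits(corpus))
# Counter({('no_youna', '犬', '動物'): 1, ('paren', '顧客関係管理', 'CRM'): 1})
```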
- As described above, in this embodiment an additional information source, a manually created thesaurus, is used as the correct answer, and at the same time features of different types, namely context-based, notation-based, and pattern-based, are used in combination.
- FIG. 14 is a schematic diagram of the content cloud system.
- The content cloud system includes an Extract/Transform/Load (ETL) module 2703, a content storage 2704, a search engine module 2705, a metadata server module 2706, and a multimedia server module 2707.
- The content cloud system runs on one or more general-purpose computers, each comprising CPUs, memory, and storage devices, and the system itself is composed of various modules.
- Each module may also be executed on an independent computer.
- In that case, each storage and each module are connected via a network or the like, and the system is realized by distributed processing in which data is communicated through them.
- The application program 2701 sends a request to the content cloud system via a network or the like, and the content cloud system sends the information corresponding to the request to the application 2701.
- The content cloud system accepts data in any format as input, such as audio data 2701-1, medical data 2701-2, and mail data 2701-3.
- The various data are, for example, call-center call audio, mail data, document data, and so on, and may be structured or unstructured.
- Data input to the content cloud system is temporarily stored in the storages 2702.
- The ETL 2703 in the content cloud system monitors the storages.
- When data arrives, the information extraction processing module appropriate to that data is run, and the extracted information (metadata) is archived and saved in the content storage 2704.
- The ETL 2703 includes, for example, a text indexing module, an image recognition module, and the like.
- Examples of metadata include time, an N-gram index, image recognition results (object names), image feature amounts and their related words, and speech recognition results.
- Any program that extracts some kind of information (metadata) can be used as an information extraction module, and known techniques can be adopted, so the various information extraction modules are not described here.
- The metadata may be compressed in size by a data compression algorithm.
- A process of registering the data file name, the data registration date, the original data type, metadata text information, and the like in a Relational Database (RDB) may also be performed.
- The search engine 2705 searches the text based on the index created by the ETL 2703 and transmits the search results to the application program 2701.
- Known techniques can be applied to the search engine and its algorithms.
- The search engine may also include modules that search not only text but also data such as images and audio.
- The metadata server 2706 manages the metadata stored in the RDB. For example, if the ETL 2703 has registered data file names, data registration dates, original data types, metadata text information, and the like in the RDB, then upon receiving a request from the application 2701, the metadata server transmits the corresponding information in the database to the application 2701.
- In the multimedia server 2707, the pieces of metadata extracted by the ETL 2703 are associated with one another, structured in graph form, and stored.
- As an example of the association mapping, for the speech recognition result "apple" stored in the content storage 2704, the original audio file, image data, related words, and the like are represented in a network format.
- Upon request, the multimedia server 2707 transmits the corresponding meta-information to the application 2701; for example, for a request for "apple", related meta-information such as images of apples, the average market price, and song titles by the artist is provided based on the constructed graph structure.
- In such a content cloud system, the thesaurus is used as follows.
- The first usage pattern is using it to search metadata.
- For example, when a speech recognition result is stored as metadata such as "りんご" (ringo, "apple") and a query such as "アップル" (the katakana loanword "apple") is entered, the query can be converted into its synonyms using the thesaurus and then searched.
- Likewise, even when the assigned metadata is inconsistent, with "りんご" assigned to some data and "アップル" to others, the metadata can be handled uniformly.
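A minimal sketch of this first usage pattern, with a toy thesaurus mapping a query term to its synonyms (the data is illustrative):

```python
def expand_query(query: str, thesaurus: dict) -> set:
    """Expand a metadata search query with its synonyms, so that data
    tagged りんご is also found by the query アップル."""
    return {query} | thesaurus.get(query, set())

thesaurus = {"アップル": {"りんご", "林檎"}}
print(expand_query("アップル", thesaurus))  # {'アップル', 'りんご', '林檎'}
```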
- The second usage pattern is metadata assignment, in particular assigning metadata using text information.
- Image metadata, for example, is obtained by statistically processing the words contained in the accompanying text, but it is known that accuracy drops because of the sparseness problem: the amount of data is insufficient for accurate statistical processing.
- Using a thesaurus makes it possible to avoid this problem and to extract metadata with high accuracy.
Description
(1) Synonyms: word pairs that have the same meaning and are interchangeable in text, e.g., "コンピュータ" (computer) and "電子計算機" (electronic computer).
(2) Broader/narrower terms: word pairs in which one is a superordinate concept of the other, e.g., "コンピュータ" (computer) and "サーバ" (server).
(3) Part/whole words: word pairs in which one is a part of the other, e.g., "帽子" (hat) and "つば" (brim).
(4) Antonyms: word pairs denoting paired (opposing) concepts, e.g., "男" (man) and "女" (woman).
(5) Sibling words: word pairs that are not synonymous but share a common superordinate concept, e.g., "ルータ" (router) and "サーバ" (server).
(6) Related words: word pairs that are neither similar nor hierarchically related but are conceptually associated, e.g., "細胞" (cell) and "細胞学" (cytology).
(1) Broader/narrower terms
(a) Context-based method
In a simple context-based method, the similarity for a word pair is given as a scalar value: if the value is large, the pair is taken to be a synonym (in the narrow sense), and if it is moderate or lower, the pair is taken to be one of the similar-word relationships other than synonymy. It is therefore difficult to distinguish broader/narrower terms, antonyms, and sibling words.
In this embodiment, a technique is used for extracting broader/narrower terms that stand in a word-level inclusion relationship, such as "回路" (circuit) and "電子回路" (electronic circuit). A score that becomes high for such a pair of a compound word and its head word is used as a feature. This feature cannot extract broader/narrower terms of the "犬" (dog) / "動物" (animal) kind and is thus not universal, but many broader/narrower terms among technical terms have inclusion relationships, so in practice it is a powerful clue.
The pattern-based method is the method most frequently used for identifying word pair types; by devising the patterns to be extracted, various word pair types can be extracted. For broader/narrower terms, patterns such as "A等のB" ("B such as A") and "AのようなB" ("B like A") are used.
(a) Context-based method
With context-based features, extracting antonyms is difficult. An antonym pair is a word pair in which all attributes but one coincide, so the two words are contextually very similar. In this embodiment, the features described below are used to extract some antonyms. Many antonym pairs, such as "天国" (heaven) and "地獄" (hell) or "善" (good) and "悪" (evil), have one member with a positive meaning and one with a negative meaning. Therefore, whether a word has a positive or a negative meaning is determined from its context, a quantity that becomes large when the word pair is a positive/negative pair is considered, and it is used as a feature indicating whether the pair is an antonym pair. Known techniques can be adopted for determining the positivity or negativity of a word. As one example, negative expressions such as "を被る" ("to suffer") and positive expressions such as "を達成する" ("to achieve") are extracted using dictionaries of positive and negative terms, and the polarity of a word (its degree of positivity, with negative values indicating negativity) is determined from the proportion of such expressions contained in its context. As the antonym feature, the antonym degree is considered higher the larger the magnitude of the negative product of the two words' polarity scores. With this feature alone, pairs of a positive word and a negative word that are not antonyms, such as <天国, 悪> (<heaven, evil>), are also extracted, but combining it with other similarities makes antonym identification possible.
Kanji are ideograms, and many antonym pairs contain kanji that are themselves antonymous. Since there are not very many kinds of kanji, it is considered possible to extract antonyms by extracting antonymous kanji pairs from the correct-answer antonym data and using them as clues. However, merely containing an antonymous kanji pair does not establish that a pair is an antonym pair, so auxiliary conditions are added. In many antonym pairs, such as "連勝" (consecutive wins) and "連敗" (consecutive losses), the characters other than the antonymous kanji pair coincide. Even when they do not coincide completely, the words often contain kanji with similar meanings, such as "極" and "酷" in "極寒" (extreme cold) and "酷暑" (severe heat). The feature is therefore constructed from whether the pair contains an antonymous kanji pair and also shares kanji that are identical or similar in meaning. Similar processing is possible for languages written with phonetic characters, such as English: by considering words in units of meaningful morphemes, morphemes in opposing relationships such as "for"/"back" and "pre"/"post" can be extracted, so the approach is not limited to kanji.
Parallel particles such as "や" and "と" ("and") are the most basic patterns used in similar-word extraction. It is usually assumed that they extract synonyms, but in practice they often introduce antonyms or sibling words, as in "男と女" ("men and women") and "日本や中国" ("Japan and China"), and conversely they are rarely used between synonyms in the strict sense. For example, variant notations are synonyms in the strictest sense, yet a phrase like "コンピュータやコンピューター" is not normally used. Parallel-expression patterns are therefore introduced as features for extracting antonyms and sibling words.
(a) Context-based
The case where both asymmetric similarities are moderately high is considered to indicate sibling words.
(b) Notation-based
No feature was added specifically for extracting sibling words.
(c) Pattern-based
The same patterns as for antonyms were used; no pattern specific to sibling words is used.
Although it is not a feature of the word pair itself, whether a word is a proper noun is important information. Word pairs such as "イラク" (Iraq) and "アフガニスタン" (Afghanistan) are very similar under context-based similarity. However, proper nouns are not synonyms unless they refer to the same thing. Therefore, when both words of a pair are proper nouns and do not refer to the same thing, the two words are determined not to be synonyms.
Context-based similarity is a method of computing the similarity of a word pair from the similarity of the words' contexts. The context of a word consists of the words or word strings in the "vicinity" of the positions where the word appears in the text, and various contexts can be defined depending on what is taken to be the "vicinity". As a representative example, the following uses the succeeding verb and the immediately preceding adjective or adjectival verb as the appearance context, but other appearance contexts may be substituted, added, or combined. There are also various formulas for computing the similarity between contexts.
The method of computing notation-based similarity is described below. Notation-based similarity is computed for a pair of words from character information. When synonyms are variant notations such as "コンピュータ" and "コンピューター", many characters overlap, as disclosed in Non-Patent Document 2, so the proportion of overlapping characters can be used as a similarity. Variant notations are in principle katakana words, but kanji word pairs with similar meanings also often share characters, as in "分析" and "解析" (both "analysis") or "信頼" and "信用" (both "trust"). Below, the similarity based on the proportion of overlapping characters is called the character overlap degree. For kanji words, especially short ones such as two-character words, many words share a character yet differ in meaning, such as "分析" (analysis) and "透析" (dialysis). In this embodiment, the character overlap degree works effectively in combination with similarities of a different kind, such as context-based similarity.
The character overlap degree can be computed in various ways; here, as an example, the characters shared by the two words are counted and the count is normalized by the character-string length of the shorter word. When the same character occurs multiple times, with m occurrences in one word and n in the other, the correspondence is m-to-n; in such cases the smaller of m and n is taken as the number of overlapping occurrences.
Character similarity is learned from the synonym dictionary, and the character overlap degree is computed taking similar characters into account. The method of computing character similarity is described using the flowchart shown in FIG. 12.
Pattern-based similarity uses patterns that explicitly indicate word semantic relationships, such as "AのようなB" ("B like A") and "AやBなどのC" ("C such as A or B"). Word pairs matching a pattern are obtained by collating predetermined patterns against character strings or against morphological analysis results. The numbers of extracted word pairs are aggregated and statistically processed, for example normalized, to obtain the feature dimension values. The computation of pattern-based similarity is disclosed in Non-Patent Document 3 and is not described here.
101 CPU
102 Main memory
103 Input/output device
110 Disk device
111 OS
112 Word semantic relationship extraction program
1121 Feature vector extraction subprogram
1122 Correct answer label setting subprogram
1123 Identification model learning subprogram
1124 Identification model application subprogram
113 Text
114 Thesaurus
115 Similarity matrix
116 Context matrix
117 Part-of-speech pattern
118 Identification model
119 Character similarity table
Claims (6)
- 1. A word semantic relationship extraction device comprising: means for generating, for a pair of words extracted from text, a feature vector whose elements are a plurality of different types of similarity; means for assigning, by referring to a known dictionary, a label indicating a word semantic relationship to the feature vector; means for learning, as a multi-category identification problem, word semantic relationship identification data used to identify word semantic relationships on the basis of the plurality of labeled feature vectors; and means for identifying, on the basis of the learned word semantic relationship identification data, the word semantic relationship of an arbitrary pair of words.
- 2. The word semantic relationship extraction device according to claim 1, wherein the means for generating the feature vector comprises: means for extracting, as context information of a word of interest, the words in the vicinity of the positions where that word appears in the text; and means for calculating, as the similarity of the word pair, two similarities between the context information of the two words of the pair: one calculated with one word of the pair as the reference and one calculated with the other as the reference.
- 3. The word semantic relationship extraction device according to claim 1, wherein the means for generating the feature vector comprises: means for calculating the correspondence between the characters contained in the two words of the pair on the basis of whether the characters are identical or similar in meaning; and means for calculating, as the similarity of the word pair, two similarities based on that character correspondence: one calculated with one word of the pair as the reference and one calculated with the other as the reference.
- 4. The word semantic relationship extraction device according to claim 1, wherein the means for generating the feature vector comprises: means for extracting pairs of words using prestored patterns indicating relationships between words; and means for taking, as a feature value, a statistic based on the frequency of the extracted word pairs, wherein two kinds of feature values are calculated: one calculated with one word of the pair as the reference and one calculated with the other as the reference.
- 5. The word semantic relationship extraction device according to claim 1, wherein the word semantic relationship indicates whether the two words constituting the pair are synonyms, broader/narrower terms, antonyms, sibling words, or none of these.
- 6. The word semantic relationship extraction device according to claim 1, further comprising means for determining, when the two words constituting the pair are proper nouns and do not refer to the same thing, that the two words are not synonyms.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/423,142 US20150227505A1 (en) | 2012-08-27 | 2012-08-27 | Word meaning relationship extraction device |
PCT/JP2012/071535 WO2014033799A1 (ja) | 2012-08-27 | 2012-08-27 | 単語意味関係抽出装置 |
JP2014532583A JP5936698B2 (ja) | 2012-08-27 | 2012-08-27 | 単語意味関係抽出装置 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/071535 WO2014033799A1 (ja) | 2012-08-27 | 2012-08-27 | 単語意味関係抽出装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014033799A1 (ja) | 2014-03-06 |
Family
ID=50182650
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/071535 WO2014033799A1 (ja) | 2012-08-27 | 2012-08-27 | 単語意味関係抽出装置 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20150227505A1 (ja) |
JP (1) | JP5936698B2 (ja) |
WO (1) | WO2014033799A1 (ja) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106469144A (zh) * | 2016-08-29 | 2017-03-01 | 东软集团股份有限公司 | 文本相似度计算方法及装置 |
CN107301248A (zh) * | 2017-07-19 | 2017-10-27 | 百度在线网络技术(北京)有限公司 | 文本的词向量构建方法和装置、计算机设备、存储介质 |
JP2018088101A (ja) * | 2016-11-28 | 2018-06-07 | 富士通株式会社 | 同義表現抽出装置、同義表現抽出方法、及び同義表現抽出プログラム |
CN109408824A (zh) * | 2018-11-05 | 2019-03-01 | 百度在线网络技术(北京)有限公司 | 用于生成信息的方法和装置 |
WO2019082362A1 (ja) | 2017-10-26 | 2019-05-02 | 三菱電機株式会社 | 単語意味関係推定装置および単語意味関係推定方法 |
JP2019149097A (ja) * | 2018-02-28 | 2019-09-05 | 株式会社日立製作所 | 語彙間関係性推測装置および語彙間関係性推測方法 |
CN110287337A (zh) * | 2019-06-19 | 2019-09-27 | 上海交通大学 | 基于深度学习和知识图谱获取医学同义词的系统及方法 |
US10437932B2 (en) | 2017-03-28 | 2019-10-08 | Fujitsu Limited | Determination method and determination apparatus |
WO2020040883A1 (en) * | 2018-08-22 | 2020-02-27 | Ebay Inc. | Conversational assistant using extracted guidance knowledge |
CN110852056A (zh) * | 2018-07-25 | 2020-02-28 | 中兴通讯股份有限公司 | 一种获取文本相似度的方法、装置、设备及可读存储介质 |
CN111046657A (zh) * | 2019-12-04 | 2020-04-21 | 东软集团股份有限公司 | 一种实现文本信息标准化的方法、装置及设备 |
CN111144129A (zh) * | 2019-12-26 | 2020-05-12 | 成都航天科工大数据研究院有限公司 | 一种基于自回归与自编码的语义相似度获取方法 |
JP2020190970A (ja) * | 2019-05-23 | 2020-11-26 | 株式会社日立製作所 | 文書処理装置およびその方法、プログラム |
CN113836939A (zh) * | 2021-09-24 | 2021-12-24 | 北京百度网讯科技有限公司 | 基于文本的数据分析方法和装置 |
Families Citing this family (134)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US9460078B2 (en) * | 2012-12-06 | 2016-10-04 | Accenture Global Services Limited | Identifying glossary terms from natural language text documents |
KR20240132105A (ko) | 2013-02-07 | 2024-09-02 | 애플 인크. | 디지털 어시스턴트를 위한 음성 트리거 |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
KR101772152B1 (ko) | 2013-06-09 | 2017-08-28 | 애플 인크. | 디지털 어시스턴트의 둘 이상의 인스턴스들에 걸친 대화 지속성을 가능하게 하기 위한 디바이스, 방법 및 그래픽 사용자 인터페이스 |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
DE112014003653B4 (de) | 2013-08-06 | 2024-04-18 | Apple Inc. | Automatisch aktivierende intelligente Antworten auf der Grundlage von Aktivitäten von entfernt angeordneten Vorrichtungen |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
CN110797019B (zh) | 2014-05-30 | 2023-08-29 | 苹果公司 | 多命令单一话语输入方法 |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
JP6352695B2 (ja) * | 2014-06-19 | 2018-07-04 | 株式会社東芝 | 文字検出装置、方法およびプログラム |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
CN105630763B (zh) * | 2014-10-31 | 2019-08-02 | 国际商业机器公司 | 用于提及检测中的消歧的方法和系统 |
CN105824797B (zh) * | 2015-01-04 | 2019-11-12 | 华为技术有限公司 | 一种评价语义相似度的方法、装置和系统 |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9898458B2 (en) * | 2015-05-08 | 2018-02-20 | International Business Machines Corporation | Generating distributed word embeddings using structured information |
US9672814B2 (en) | 2015-05-08 | 2017-06-06 | International Business Machines Corporation | Semi-supervised learning of word embeddings |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
CN108027823B (zh) * | 2015-07-13 | 2022-07-12 | 帝人株式会社 | 信息处理装置、信息处理方法以及计算机可读取的存储介质 |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
WO2017189768A1 (en) * | 2016-04-26 | 2017-11-02 | Ponddy Education Inc. | Affinity knowledge based computational learning system |
CN107402933A (zh) * | 2016-05-20 | 2017-11-28 | 富士通株式会社 | 实体多音字消歧方法和实体多音字消歧设备 |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
KR102565274B1 (ko) * | 2016-07-07 | 2023-08-09 | 삼성전자주식회사 | 자동 통역 방법 및 장치, 및 기계 번역 방법 및 장치 |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | USER INTERFACE FOR CORRECTING RECOGNITION ERRORS |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770428A1 (en) | 2017-05-12 | 2019-02-18 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK201770411A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | MULTI-MODAL INTERFACES |
US10311144B2 (en) * | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US20180336275A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Intelligent automated assistant for media exploration |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
DK179549B1 (en) | 2017-05-16 | 2019-02-12 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
CN107729509B (zh) * | 2017-10-23 | 2020-07-07 | 中国电子科技集团公司第二十八研究所 | 基于隐性高维分布式特征表示的篇章相似度判定方法 |
CN107977358A (zh) * | 2017-11-23 | 2018-05-01 | 浪潮金融信息技术有限公司 | 语句识别方法及装置、计算机存储介质和终端 |
CN107992472A (zh) * | 2017-11-23 | 2018-05-04 | 浪潮金融信息技术有限公司 | 句子相似度计算方法及装置、计算机存储介质和终端 |
US10685183B1 (en) * | 2018-01-04 | 2020-06-16 | Facebook, Inc. | Consumer insights analysis using word embeddings |
JP6509391B1 (ja) * | 2018-01-31 | 2019-05-08 | 株式会社Fronteo | 計算機システム |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US11076039B2 (en) | 2018-06-03 | 2021-07-27 | Apple Inc. | Accelerated task performance |
US11138278B2 (en) * | 2018-08-22 | 2021-10-05 | Gridspace Inc. | Method for querying long-form speech |
CN110209810B (zh) * | 2018-09-10 | 2023-10-31 | 腾讯科技(深圳)有限公司 | 相似文本识别方法以及装置 |
CN109284490B (zh) * | 2018-09-13 | 2024-02-27 | 长沙劲旅网络科技有限公司 | 一种文本相似度计算方法、装置、电子设备及存储介质 |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
CN109754159B (zh) * | 2018-12-07 | 2022-08-23 | State Grid Jiangsu Electric Power Co., Ltd. Nanjing Power Supply Branch | Information extraction method and system for power grid operation logs |
US11640422B2 (en) * | 2018-12-21 | 2023-05-02 | Atlassian Pty Ltd. | Machine resolution of multi-context acronyms |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK201970511A1 (en) | 2019-05-31 | 2021-02-15 | Apple Inc | Voice identification in digital assistant systems |
US11227599B2 (en) | 2019-06-01 | 2022-01-18 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
JP7343311B2 (ja) * | 2019-06-11 | 2023-09-12 | FANUC Corporation | Document retrieval device and document retrieval method |
JP7316165B2 (ja) * | 2019-09-20 | 2023-07-27 | Hitachi, Ltd. | Information processing method and information processing device |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
CN111259655B (zh) * | 2019-11-07 | 2023-07-18 | Shanghai University | Semantics-based question similarity calculation method for intelligent logistics customer service |
CN113302683B (zh) * | 2019-12-24 | 2023-08-04 | UBTECH Robotics Corp., Ltd. | Polyphone prediction method and disambiguation method, apparatus, device, and computer-readable storage medium |
CN111160012B (zh) * | 2019-12-26 | 2024-02-06 | Shanghai Kingstar Winning Software Technology Co., Ltd. | Medical term recognition method, apparatus, and electronic device |
CN113282779 (zh) | 2020-02-19 | 2021-08-20 | Alibaba Group Holding Limited | Image search method, apparatus, and device |
CN111539213B (zh) * | 2020-04-17 | 2022-07-01 | Huaqiao University | Intelligent detection method for semantic mutual exclusion among multi-source management clauses |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11038934B1 (en) | 2020-05-11 | 2021-06-15 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
CN113763061B (zh) * | 2020-06-03 | 2024-07-19 | Beijing Wodong Tianjun Information Technology Co., Ltd. | Method and apparatus for aggregating similar items |
US20230274085A1 (en) * | 2020-06-30 | 2023-08-31 | National Research Council Of Canada | Vector space model for form data extraction |
CN111813896B (zh) * | 2020-07-13 | 2022-12-02 | Chongqing Unisinsight Technology Co., Ltd. | Text triple relation recognition method and apparatus, training method, and electronic device |
US12106051B2 (en) | 2020-07-16 | 2024-10-01 | Optum Technology, Inc. | Unsupervised approach to assignment of pre-defined labels to text documents |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
CN112183088B (zh) * | 2020-09-28 | 2023-11-21 | Unisound Intelligent Technology Co., Ltd. | Word hierarchy determination method, model construction method, apparatus, and device |
CN112507114A (zh) * | 2020-11-04 | 2021-03-16 | Fuzhou University | Multi-input LSTM_CNN text classification method and system based on a word attention mechanism |
US11941357B2 (en) | 2021-06-23 | 2024-03-26 | Optum Technology, Inc. | Machine learning techniques for word-based text similarity determinations |
KR102579908B1 (ko) * | 2021-07-26 | 2023-09-18 | NAVER Corporation | Method, computer device, and computer program for classifying data through unsupervised contrastive learning |
US12118350B1 (en) | 2021-09-30 | 2024-10-15 | Amazon Technologies, Inc. | Hierarchical clustering for coding practice discovery |
US11989240B2 (en) | 2022-06-22 | 2024-05-21 | Optum Services (Ireland) Limited | Natural language processing machine learning frameworks trained using multi-task training routines |
US12112132B2 (en) | 2022-06-22 | 2024-10-08 | Optum Services (Ireland) Limited | Natural language processing machine learning frameworks trained using multi-task training routines |
US12111869B2 (en) | 2022-08-08 | 2024-10-08 | Bank Of America Corporation | Identifying an implementation of a user-desired interaction using machine learning |
CN116975167B (zh) * | 2023-09-20 | 2024-02-27 | China Unicom Online Information Technology Co., Ltd. | Metadata grading method and system based on weighted Jaccard coefficient |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4849898A (en) * | 1988-05-18 | 1989-07-18 | Management Information Technologies, Inc. | Method and apparatus to identify the relation of meaning between words in text expressions |
US5559940A (en) * | 1990-12-14 | 1996-09-24 | Hutson; William H. | Method and system for real-time information analysis of textual material |
EP0494573A1 (en) * | 1991-01-08 | 1992-07-15 | International Business Machines Corporation | Method for automatically disambiguating the synonymic links in a dictionary for a natural language processing system |
US6810376B1 (en) * | 2000-07-11 | 2004-10-26 | Nusuara Technologies Sdn Bhd | System and methods for determining semantic similarity of sentences |
US7548863B2 (en) * | 2002-08-06 | 2009-06-16 | Apple Inc. | Adaptive context sensitive analysis |
JP4525154B2 (ja) * | 2004-04-21 | 2010-08-18 | Fuji Xerox Co., Ltd. | Information processing system, information processing method, and computer program
JP4426479B2 (ja) * | 2005-02-18 | 2010-03-03 | Toshiba Information Systems (Japan) Corporation | Word hierarchical relationship analysis device, method used therefor, and word hierarchical relationship analysis program
JP2006285419A (ja) * | 2005-03-31 | 2006-10-19 | Sony Corp | Information processing device and method, and program
CN100592293C (zh) * | 2007-04-28 | 2010-02-24 | Li Shude | Knowledge search engine based on intelligent ontology and implementation method thereof
US7899666B2 (en) * | 2007-05-04 | 2011-03-01 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
US7962507B2 (en) * | 2007-11-19 | 2011-06-14 | Microsoft Corporation | Web content mining of pair-based data |
US8306983B2 (en) * | 2009-10-26 | 2012-11-06 | Agilex Technologies, Inc. | Semantic space configuration |
US8874432B2 (en) * | 2010-04-28 | 2014-10-28 | Nec Laboratories America, Inc. | Systems and methods for semi-supervised relationship extraction |
WO2011153392A2 (en) * | 2010-06-03 | 2011-12-08 | Thomson Licensing | Semantic enrichment by exploiting top-k processing |
EP2588970A1 (en) * | 2010-06-29 | 2013-05-08 | Springsense Pty Ltd | Method and system for determining word senses by latent semantic distance |
JP5544602B2 (ja) * | 2010-11-15 | 2014-07-09 | Hitachi, Ltd. | Word semantic relationship extraction device and word semantic relationship extraction method
US9037452B2 (en) * | 2012-03-16 | 2015-05-19 | Afrl/Rij | Relation topic construction and its application in semantic relation extraction |
US20140015855A1 (en) * | 2012-07-16 | 2014-01-16 | Canon Kabushiki Kaisha | Systems and methods for creating a semantic-driven visual vocabulary |
US20140067368A1 (en) * | 2012-08-29 | 2014-03-06 | Microsoft Corporation | Determining synonym-antonym polarity in term vectors |
- 2012-08-27 JP JP2014532583A patent/JP5936698B2/ja not_active Expired - Fee Related
- 2012-08-27 US US14/423,142 patent/US20150227505A1/en not_active Abandoned
- 2012-08-27 WO PCT/JP2012/071535 patent/WO2014033799A1/ja active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007011775A (ja) * | 2005-06-30 | 2007-01-18 | Nippon Telegr & Teleph Corp <Ntt> | Dictionary creation device, dictionary creation method, program, and recording medium
JP2011118526A (ja) * | 2009-12-01 | 2011-06-16 | Hitachi Ltd | Word semantic relationship extraction device
JP2011175497A (ja) * | 2010-02-25 | 2011-09-08 | Nippon Telegr & Teleph Corp <Ntt> | Data extraction device, data extraction method, and program
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106469144A (zh) * | 2016-08-29 | 2017-03-01 | Neusoft Corporation | Text similarity calculation method and apparatus
JP2018088101A (ja) * | 2016-11-28 | 2018-06-07 | Fujitsu Limited | Synonymous expression extraction device, synonymous expression extraction method, and synonymous expression extraction program
US10437932B2 (en) | 2017-03-28 | 2019-10-08 | Fujitsu Limited | Determination method and determination apparatus |
CN107301248A (zh) * | 2017-07-19 | 2017-10-27 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for constructing word vectors for text, computer device, and storage medium
JPWO2019082362A1 (ja) * | 2017-10-26 | 2020-02-27 | Mitsubishi Electric Corporation | Word semantic relation estimation device and word semantic relation estimation method
US11328006B2 (en) | 2017-10-26 | 2022-05-10 | Mitsubishi Electric Corporation | Word semantic relation estimation device and word semantic relation estimation method |
WO2019082362A1 (ja) | 2017-10-26 | 2019-05-02 | Mitsubishi Electric Corporation | Word semantic relation estimation device and word semantic relation estimation method
JP2019149097A (ja) * | 2018-02-28 | 2019-09-05 | Hitachi, Ltd. | Inter-vocabulary relationship estimation device and inter-vocabulary relationship estimation method
CN110852056A (zh) * | 2018-07-25 | 2020-02-28 | ZTE Corporation | Method, apparatus, and device for obtaining text similarity, and readable storage medium
WO2020040883A1 (en) * | 2018-08-22 | 2020-02-27 | Ebay Inc. | Conversational assistant using extracted guidance knowledge |
US11238508B2 (en) | 2018-08-22 | 2022-02-01 | Ebay Inc. | Conversational assistant using extracted guidance knowledge |
CN109408824A (zh) * | 2018-11-05 | 2019-03-01 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for generating information
CN109408824B (zh) * | 2018-11-05 | 2023-04-25 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for generating information
JP2020190970A (ja) * | 2019-05-23 | 2020-11-26 | Hitachi, Ltd. | Document processing device, method therefor, and program
CN110287337A (zh) * | 2019-06-19 | 2019-09-27 | Shanghai Jiao Tong University | System and method for acquiring medical synonyms based on deep learning and knowledge graphs
CN111046657A (zh) * | 2019-12-04 | 2020-04-21 | Neusoft Corporation | Method, apparatus, and device for standardizing text information
CN111046657B (zh) * | 2019-12-04 | 2023-10-13 | Neusoft Corporation | Method, apparatus, and device for standardizing text information
CN111144129A (zh) * | 2019-12-26 | 2020-05-12 | Chengdu Aerospace Science and Industry Big Data Research Institute Co., Ltd. | Semantic similarity acquisition method based on autoregression and autoencoding
CN111144129B (zh) * | 2019-12-26 | 2023-06-06 | Chengdu Aerospace Science and Industry Big Data Research Institute Co., Ltd. | Semantic similarity acquisition method based on autoregression and autoencoding
CN113836939A (zh) * | 2021-09-24 | 2021-12-24 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Text-based data analysis method and apparatus
CN113836939B (zh) * | 2021-09-24 | 2023-07-21 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Text-based data analysis method and apparatus
Also Published As
Publication number | Publication date |
---|---|
JP5936698B2 (ja) | 2016-06-22 |
US20150227505A1 (en) | 2015-08-13 |
JPWO2014033799A1 (ja) | 2016-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5936698B2 (ja) | Word semantic relationship extraction device | |
Jung | Semantic vector learning for natural language understanding | |
JP5356197B2 (ja) | Word semantic relationship extraction device | |
US9792277B2 (en) | System and method for determining the meaning of a document with respect to a concept | |
US8972408B1 (en) | Methods, systems, and articles of manufacture for addressing popular topics in a social sphere | |
US20150120738A1 (en) | System and method for document classification based on semantic analysis of the document | |
US11657076B2 (en) | System for uniform structured summarization of customer chats | |
US20160155058A1 (en) | Non-factoid question-answering system and method | |
Mohamed et al. | A hybrid approach for paraphrase identification based on knowledge-enriched semantic heuristics | |
Gaur et al. | Semi-supervised deep learning based named entity recognition model to parse education section of resumes | |
Zhang et al. | Natural language processing: a machine learning perspective | |
JP2006244262A (ja) | Question answering retrieval system, method, and program | |
JP2011118689A (ja) | Search method and system | |
US20200272696A1 (en) | Finding of asymmetric relation between words | |
Han et al. | Text Summarization Using FrameNet‐Based Semantic Graph Model | |
Dhole | Resolving intent ambiguities by retrieving discriminative clarifying questions | |
Karpagam et al. | A framework for intelligent question answering system using semantic context-specific document clustering and Wordnet | |
Wang et al. | A joint chinese named entity recognition and disambiguation system | |
Han et al. | Text summarization using sentence-level semantic graph model | |
Sultana et al. | Identifying similar sentences by using n-grams of characters | |
Kalender et al. | THINKER-entity linking system for Turkish language | |
Gao et al. | Exploiting linked open data to uncover entity types | |
Xu et al. | Incorporating Feature-based and Similarity-based Opinion Mining-CTL in NTCIR-8 MOAT. | |
Nishy Reshmi et al. | Textual entailment classification using syntactic structures and semantic relations | |
Mishra et al. | Identifying and Analyzing Reduplication Multiword Expressions in Hindi Text Using Machine Learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12883859; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 2014532583; Country of ref document: JP; Kind code of ref document: A |
NENP | Non-entry into the national phase | Ref country code: DE |
WWE | Wipo information: entry into national phase | Ref document number: 14423142; Country of ref document: US |
122 | Ep: pct application non-entry in european phase | Ref document number: 12883859; Country of ref document: EP; Kind code of ref document: A1 |