KR101805607B1 - Method for making abstracts from Voice of Customer data - Google Patents

Publication number
KR101805607B1
Authority
KR
South Korea
Prior art keywords
lsp
voc
sentence
data
sentences
Prior art date
Application number
KR1020160008005A
Other languages
Korean (ko)
Other versions
KR20170088095A (en)
Inventor
김현태
고준호
김문종
안영민
장정훈
Original Assignee
주식회사 와이즈넛
Application filed by 주식회사 와이즈넛
Priority to KR1020160008005A
Publication of KR20170088095A
Application granted
Publication of KR101805607B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/28 Constructional details of speech recognition systems

Abstract

There is provided a method for effectively generating a summary composed of the sentences with important meaning in VOC data. The method for generating a summary from VOC data comprises the steps of: (a) constructing LSP knowledge in advance by defining a concept, a semantic feature, and an LSP for the VOC; (b) analyzing the morphemes of the sentences constituting the input VOC data and detecting, from the LSP knowledge, the LSPs matching each sentence; (c) calculating the importance of each sentence using the detected LSPs and the concepts and semantic features associated with the detected LSPs; and (d) extracting a predetermined number of sentences in order of importance from the sentences constituting the VOC data and generating a summary.

Description

Method for generating a summary from Voice of Customer data

The present invention relates to a method of processing Voice of Customer (hereinafter "VOC") data, and more particularly to a method for generating a summary from VOC data.

In general, many companies operate call centers that respond to customer complaints, requests, and inquiries about their products or services. Rather than stopping at resolving each item as it is received, companies have come to use the Voice of Customer (VOC) collected through these response services to improve the overall quality of the products or services they provide.

Specifically, a VOC takes the form of a conversation between an agent and a customer. Analyzing VOCs reveals customer dissatisfaction and needs, which can be used to improve service. However, although it varies with the scale of the business, the amount of VOC data received through a call center is very large, which makes it very difficult to analyze systematically and comprehensively. Call centers receive hundreds to thousands of calls every day, and checking each day's VOC data takes a great deal of time and manpower.

Recently, there have been attempts to summarize the original text of VOC data in order to manage it efficiently. Examples include methods based on ontology construction, methods based on keyword extraction, and methods that calculate the similarity between words appearing in a sentence. However, these methods either require complex relationship definitions or make it difficult to grasp the exact meaning of the original text relative to the amount of knowledge that must be constructed. In particular, generating a summary from the sentences with the highest frequency of occurrence is poorly suited to conversational text such as VOC. For example, a greeting such as "Hello" is among the most frequently occurring sentences in VOC data, but it is meaningless in a summary.

Thus, conventional methods of generating a summary from VOC data extract the summary by simply analyzing keywords and their associations, and so have difficulty providing a meaningful summary.

It is an object of the present invention to provide a method of effectively generating a summary by extracting only the sentences with important meaning from VOC data.

The problems to be solved by the present invention are not limited to the above-mentioned problems, and other problems not mentioned can be clearly understood by those skilled in the art from the following description.

According to an aspect of the present invention, there is provided a method for generating a summary from Voice of Customer (VOC) data using a lexico-semantic pattern (LSP), the method comprising: (a) constructing LSP knowledge in advance by defining a concept, a semantic feature, and an LSP for the VOC; (b) analyzing the morphemes of the sentences constituting the input VOC data and detecting, from the LSP knowledge, the LSPs matching each sentence; (c) calculating the importance of each sentence using the detected LSP and the concept and semantic feature associated with the detected LSP; and (d) generating a summary by extracting a predetermined number of sentences in order of importance from among the sentences constituting the VOC data.

Step (a) may include: defining the concept as a set to which LSPs belong; collecting VOC sample data and classifying it according to the concept; constructing a semantic feature dictionary in which one or more entries having the same meaning are grouped into one set, the semantic feature being a basic unit constituting the meaning of the concept; and constructing the LSP knowledge defined by the concept, the semantic feature, and the LSP.

The method may further include, before the step (b), recognizing the voice data from the input VOC data and converting the voice data into a text sentence.

In step (c), the importance may be calculated using the respective weights of the concept, the LSP, and the semantic feature, which represent the degree to which each is needed for generating the summary, together with the positive/negative level, which quantifies the strength of the positive or negative expression of the sentence.

The importance may be proportional to the sum of all the weights of the concepts, LSPs, and semantic features included in the sentence.

In step (d), the extracted sentences may be arranged in the order in which they appear in the original text of the VOC data.

If the meaning is difficult to grasp from the summary alone, the sentence placed before, after, or both before and after at least one of the extracted sentences in the original text may be extracted and added to the summary.

The method may further include: summing and normalizing the importance of the sentences constituting the VOC data to calculate the average importance of the VOC data; and, for a plurality of VOC data belonging to the same category, calculating the average importance of each VOC data and comparing the averages with one another.

The details of other embodiments are included in the detailed description and drawings.

As described above, the method for generating a summary from VOC data according to the present invention computes a numerical importance for each sentence constituting the VOC data, so that sentences with important meaning can be extracted efficiently to generate the summary.

In addition, since the importance is calculated based on the LSP of the sentence, sentences belonging to a specific pattern can be evaluated consistently.

Furthermore, since the importance of a sentence is calculated by individually weighting not only the LSP but also the related semantic features and concepts, the meaning of the sentence can be grasped more accurately when generating the summary.

FIG. 1 is a block diagram schematically showing the configuration of a VOC summarizing apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart sequentially illustrating a method for generating a summary from VOC data using an LSP according to an embodiment of the present invention.
FIG. 3 is a flowchart specifically illustrating the step of constructing VOC-related LSP knowledge of FIG. 2.
FIG. 4 is a diagram illustrating a screen configuration of an administrator terminal used in defining the concept of FIG. 3.
FIG. 5 is a diagram illustrating a semantic feature dictionary table that defines semantic features in constructing the semantic feature dictionary of FIG. 3.
FIG. 6 is a diagram illustrating the configuration of an entry table for the semantic feature "meeting (4469)" of FIG. 5.
FIG. 7 is a diagram illustrating the configuration of an LSP construction table generated according to the method of the present invention.

The advantages and features of the present invention, and the manner of achieving them, will become apparent with reference to the embodiments described in detail below together with the accompanying drawings. The present invention may, however, be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art, and the invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Each block of the flowcharts may represent a module, segment, or portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that in some alternative implementations, the functions mentioned in the blocks may occur out of order. For example, two blocks shown one after the other may actually be executed substantially concurrently, or the blocks may sometimes be performed in reverse order, depending on the functions involved.

Voice of Customer (VOC) refers to a customer management system that manages processing status in real time, from the receipt of a customer complaint at the call center of the management system until processing is complete, in order to improve service. In the present invention, VOC data refers to a data file storing the contents of a conversation between a customer and an agent at the call center, and may be composed of voice data or text data. One VOC data item refers to the data generated between one customer and an agent, and the summary-generating process can be performed on a plurality of VOC data belonging to the same category.

A VOC summarizing apparatus according to an embodiment of the present invention will be described with reference to FIG. 1.

As shown in FIG. 1, the VOC summarizing apparatus 100 of the present invention includes a voice recognition unit 10 for recognizing voice data, a text conversion unit 20 for converting the voice data into text data, an LSP knowledge construction unit 30 for constructing LSP knowledge about VOC data, an LSP detecting unit 40 for detecting an LSP matching each sentence constituting the VOC data, an importance calculating unit 50 for calculating the importance of each sentence based on the LSP, a summary generating unit 60 for generating a summary from a predetermined number of high-importance sentences, and a DB 70 for storing the data necessary for the functions and operations of these components.

Specifically, the voice recognition unit 10 recognizes the voice in a call between the agent and the customer, converts it into data, and stores it in the DB 70. The voice recognition unit 10 may include a filter that removes noise occurring in a call so that the text conversion is accurate. It can also distinguish and separately recognize the agent's voice and the customer's voice.

The text conversion unit 20 converts the voice data into text and stores it in the DB 70. The converted data may include time tags for synchronizing with the voice data and speaker tags for distinguishing the agent's voice from the customer's. In addition, to improve the accuracy of the text conversion, the text conversion unit 20 may correct word spacing in the converted sentences and automatically recognize the boundaries between sentences.
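The converted data described above can be pictured as a list of tagged sentences. The following is a minimal sketch only: the class name `Utterance`, its field names, and the sample call are hypothetical and do not come from the patent, which does not specify a storage format.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One transcribed sentence of a call (all field names are hypothetical)."""
    speaker: str      # speaker tag: "agent" or "customer"
    start_sec: float  # time tag: where the sentence starts in the audio
    end_sec: float    # time tag: where the sentence ends in the audio
    text: str         # converted text of the sentence


def sentences_by(utterances, speaker):
    """Return the text of every utterance spoken by the given speaker, in order."""
    return [u.text for u in utterances if u.speaker == speaker]


# Invented sample call, used only to exercise the structure.
call = [
    Utterance("agent", 0.0, 1.8, "Hello, how can I help you?"),
    Utterance("customer", 2.1, 4.9, "The TV I ordered does not turn on."),
    Utterance("agent", 5.2, 7.0, "I am sorry to hear that."),
]
customer_lines = sentences_by(call, "customer")
print(customer_lines)
```

Keeping the speaker and time tags next to each sentence lets later stages score sentences individually while still being able to trace them back to the audio.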

The LSP knowledge construction unit 30 defines and constructs LSP knowledge about VOC data. For example, it collects a large amount of VOC sample data and constructs the LSP knowledge by classifying the data into concepts, semantic features, and vocabulary entries and structuring LSPs for the VOC data. In this way the LSP knowledge construction unit 30 builds knowledge about the specific vocabulary and expressions that mainly appear in VOCs. How VOC-related LSP knowledge is built is explained in detail later.

The LSP detecting unit 40 analyzes the morphemes of each text sentence of the VOC data to be analyzed and detects, from the LSP knowledge, the LSP matching the sentence.

For each sentence, the importance calculating unit 50 calculates the importance based on the weights of the LSP, semantic features, and concepts, and on the positive/negative level of the sentence. The weights and the positive/negative level are values preset by the user and may be defined differently according to the category of the domain to which the VOC belongs.

The summary generating unit 60 extracts a predetermined number of the sentences constituting the VOC data in descending order of importance and arranges the extracted sentences in their original order to generate the summary. The number of sentences extracted for the summary can be changed by the user. If the extracted sentences do not cohere well enough for the meaning to be grasped, the summary generating unit 60 may, for at least one of the extracted sentences, add to the summary the sentence placed before it, after it, or both before and after it in the original text.

Hereinafter, a method of generating a summary of a VOC using the LSP according to an embodiment of the present invention will be described in detail with reference to FIGS. 2 to 7.

The LSP knowledge construction unit 30 constructs the LSP knowledge about VOC data (S100). Specifically, LSPs are structured from the collected VOC sample data to construct the LSP knowledge, as described in detail with reference to FIGS. 3 to 7.

The LSP knowledge construction method according to an embodiment of the present invention is a process of text analysis and LSP dictionary construction. Each of its steps can be performed by an administrator terminal, which is a computing system with built-in hardware/software modules.

First, the LSP knowledge construction unit 30 defines and constructs concepts (S101). A concept is defined as the meaning of a sentence as interpreted through semantic analysis of the sentence. A concept can be expressed as the set of LSPs that belong to it, and LSPs can be managed more easily by grouping those that analyze similar texts into one concept.

Concepts can also have a hierarchical structure. As shown in the concept generation screen 200 of FIG. 4, a plurality of concepts form a hierarchy. For example, if a navigation concept is defined as a top-level category, search and TV concepts are defined as its subcategories, and map and path concepts are defined and registered under the search concept. In this way, concepts can be defined by classifying the meaning of sentences from broad categories down to detailed meanings. Each concept in the present embodiment preferably includes at least one LSP.
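The hierarchy in this example can be sketched as a nested mapping. This is an illustrative assumption about representation only: the patent shows the hierarchy as a screen in FIG. 4 and does not prescribe a data structure, and the names `CONCEPT_TREE` and `concept_paths` are hypothetical.

```python
# Hypothetical nested-dict encoding of the concept hierarchy from the example:
# navigation -> (search -> (map, path), TV).
CONCEPT_TREE = {
    "navigation": {
        "search": {"map": {}, "path": {}},
        "TV": {},
    }
}


def concept_paths(tree, prefix=()):
    """Yield every concept as a tuple path starting at its top-level category."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from concept_paths(children, path)


paths = list(concept_paths(CONCEPT_TREE))
for p in paths:
    print(" > ".join(p))
```

Storing the full path with each concept makes it straightforward to attach an LSP to a detailed concept such as map while still reporting the broad category it falls under.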

For example, for VOCs related to a shopping mall, concepts such as product refund, return, and inquiry can be constructed separately, and a sentence matched to an LSP belonging to the refund concept carries the meaning of a refund.

To construct the LSPs belonging to each concept, the target VOC sample data must first be acquired. The VOC sample data is preferably text data; voice data can be used after voice-to-text conversion. The VOC sample data is collected and classified according to concept (S102). The more sample data is collected, the more elaborate the concepts and LSPs that can be constructed, which directly affects the accuracy of VOC summaries. The collected sample data is classified according to the constructed concepts. If collected sample data is difficult to classify into a specific concept, that is, if no concept corresponds to it, a concept can be added or modified.

For the sake of conceptual understanding and explanation of LSP knowledge construction, sample data such as the following sentence is illustrated:

(A) "Is there a good meeting place nearby?"

(B) "Tell me a good restaurant to meet at in Gangnam"

(C) "Show the baseball channel"

To accurately understand and analyze Korean sentence structures and components, it is necessary to structure words that differ in form but have the same meaning. To this end, the LSP knowledge construction unit 30 defines the basic unit constituting the meaning of a concept as a semantic feature and constructs a semantic feature dictionary (S103).

The semantic feature is one of the basic units of the LSP, and each semantic feature in the dictionary is a set of one or more entries having the same meaning.

Among the sample sentences, sentence (A) consists of semantic features such as "request", "place", and "meeting". Each semantic feature may include entries, for example "request (inform)" and "place (restaurant)". Sentence (B) consists of semantic features such as "region", "meeting", and "place", and sentence (C) consists of semantic features such as "sports" and "channel". The concept covering these sentences can be regarded as "navigation". Thus, based on these sample sentences, the concept can consist of semantic features such as "request", "place", "meeting", "region", "sports", and "channel".

In FIG. 4, the map concept under the navigation concept means a request to search for a place, and it may be composed of semantic features such as "place", "request", and "meeting". The path concept under the navigation concept may have an additional "path" semantic feature instead of the "meeting" semantic feature.

Semantic features will be described in detail with reference to FIG. 5, taking as an example the semantic feature 220 "meeting (4469)" in the semantic feature dictionary table 210.

In a sentence, words such as "discussion", "meeting", "promise", and "talk" have the same meaning. These words can therefore be grouped as entries of the "meeting" semantic feature 220 and, as in the entry table 230 of FIG. 6, structured so that the semantic feature is the set of its entries.

Semantic features thus play the role of a dictionary: because vocabulary entries with the same meaning are added to a defined semantic feature, each semantic feature is a set of vocabulary entries. Both keyword-type semantic features and semantic features for predicate expressions may be included, depending on the domain.
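The dictionary role described above can be sketched as a mapping from each semantic feature to its set of entries, plus a reverse index from a word to the features it belongs to. The sketch is illustrative: the entries are the English glosses from the text, and the names `SEMANTIC_FEATURES`, `build_entry_index`, and `features_of` are hypothetical.

```python
# Hypothetical semantic feature dictionary: feature name -> set of entries.
SEMANTIC_FEATURES = {
    "meeting": {"meeting", "discussion", "promise", "talk"},
    "place": {"place", "restaurant"},
}


def build_entry_index(features):
    """Invert the dictionary: entry word -> set of semantic feature names."""
    index = {}
    for feature, entries in features.items():
        for entry in entries:
            index.setdefault(entry, set()).add(feature)
    return index


ENTRY_INDEX = build_entry_index(SEMANTIC_FEATURES)


def features_of(word):
    """Return the semantic features a word belongs to (empty set if none)."""
    return ENTRY_INDEX.get(word, set())


print(features_of("talk"))
print(features_of("hello"))
```

A reverse index of this kind is one way to look up, during matching, which semantic features a morpheme could stand for; a greeting that belongs to no feature simply contributes nothing.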

In an LSP, the symbol "@" marks a semantic feature, as in "@meeting". A semantic feature written this way serves as a kind of lexical variable for which any of its entries can be substituted. Once the semantic feature dictionary is complete, it is used to construct LSP knowledge for the collected and classified sample data (S104).

When constructing LSP knowledge, it is possible to use not only semantic features but also phrases, morphemes, syllables, dictionaries, variables based on various grammatical expressions, and various operators. As described above, in the present invention each LSP must belong to some concept.

Because the semantic feature dictionary is constructed as described above (S103), an LSP expressing one representative sentence pattern can recognize as many sentences as there are combinations of the entries of the semantic features constituting the LSP.
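The combinatorial effect can be illustrated with a deliberately simplified pattern made only of "@feature" slots and literal words. This is a sketch under stated assumptions: real LSPs also use the part-of-speech tags and operators of Table 1, which are ignored here, and the names `FEATURES` and `expand_pattern` are hypothetical.

```python
from itertools import product

# Hypothetical entries for two semantic features.
FEATURES = {
    "meeting": ["discussion", "meeting", "talk"],
    "place": ["place", "restaurant"],
}


def expand_pattern(pattern, features):
    """List every word sequence a simplified pattern can match.

    A token starting with "@" is a semantic feature slot and may be replaced
    by any of its entries; any other token is matched literally.
    """
    slots = [
        features[token[1:]] if token.startswith("@") else [token]
        for token in pattern
    ]
    return [" ".join(words) for words in product(*slots)]


matches = expand_pattern(["good", "@meeting", "@place"], FEATURES)
print(len(matches))  # 3 entries x 2 entries = 6 combinations
for m in matches:
    print(m)
```

One pattern with slots of 3 and 2 entries covers 6 surface forms, which is why adding an entry to a semantic feature extends every LSP that uses it without editing the LSPs themselves.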

Referring to FIG. 7, the LSP construction table 240 according to an exemplary embodiment of the present invention shows some of the LSPs for the representative sentence patterns related to sample sentences (A), (B), and (C). The basic structure of an LSP includes vocabulary, parts of speech, and morphemes. Table 1 below describes the meanings of the symbols (operators and parts of speech) used to express the LSPs in FIG. 7.

[Table 1]

Operator: ;
Grammar: [stmt1] [; stmtn], [(W1)] [(W2)], [W1] [; W2]
Meaning: OR

Operator: {}
Grammar: {stmt1} {stmt2}
Meaning: Qualifiers that delimit one expression unit

Operator: ()
Grammar: ()
Meaning: Qualifiers that delimit priority and units

Operator: =
Grammar: [value-stmt = [stmt] [= stmt]+]
Meaning: An operator that specifies the category of values

Operator: /
Grammar: [L]? [/pos]
Meaning: Expresses the part of speech of the morpheme

Operator: +
Grammar: [stmt1] + [stmt2]
Meaning: Checks for morphological bonding to the left and right; confirms the combination of the rightmost morpheme of the expression on the left and the leftmost morpheme of the expression on the right

Operator: ^
Grammar: ^[Cmin~Cmax], where Cmin ≤ Cmax and Cmin ≥ 0
Shorthand: ^ := ^0~8, ^~%d := ^0~%d, ^%d := ^0~%d, ^%d~ := ^%d~8
Meaning: Replaces any L, W, P from Cmin to Cmax times. When enclosed by {} qualifiers, L and P can be substituted but not W. When used at the front or back of a word, it acts as a wildcard that can match within the word.

Operator: #
Grammar: #Cmin~Cmax, where Cmin ≤ Cmax and Cmin ≥ 1
Shorthand: # := #0~8, #~%d := #0~%d, #%d := #0~%d, #%d~ := #%d~8
Meaning: Replaces any L, W, P from Cmin to Cmax times. When enclosed by {} qualifiers, L and P can be substituted but not W. When used at the front or back of a word, it acts as a wildcard that can match within the word.

Operator: ?
Grammar: [[?] or [{?}] or [stmt]?]
Meaning: The expression may or may not apply; it substitutes for the whole expression once or is omitted

Operator: !
Grammar: [![stmt] or ![W|L|P] or !(stmt|W)]
Meaning: Negates the original meaning of the expression or word

Operator: \
Grammar: \[character], where character ∈ {(, ), {, }, =, +, *, #, @, ?, &, !, \, ~}
Meaning: The following value is interpreted literally; the literal \ applies to one character

Operator: *
Grammar: [stmt1]*
Meaning: Repeating expression; the cardinality of * is the same as that of ^ or #

Operator: []
Grammar: POSIX character classes [:alpha:], [:digit:], [:lower:], [:upper:]; PERL character classes [A-Za-z0-9], [!"#$%&'()*+,./:;<=>?@\^_`{|}~-]
Meaning: Partial representation of POSIX and PERL character classes

Operator: $
Grammar: ^$, #$
Meaning: Comes after ^ or #; post-processing is performed if $ is present

Once the VOC-related LSP knowledge is constructed in this way, a basic knowledge building process for generating a summary from VOC data is completed. Hereinafter, the process of generating the summary text from the VOC data to be processed will be described in detail.

Referring to FIG. 2, the voice recognition unit 10 recognizes voice data from a call between an agent and a customer, and the text conversion unit 20 converts the voice data into text data (S110). One VOC data may be composed of a plurality of text sentences.

Then, the LSP detecting unit 40 analyzes the morphemes of each text sentence (S120). A morpheme is the smallest grammatical unit that carries meaning. A sentence is the minimum unit that expresses a complete thought or feeling in words. In principle a sentence has a subject and a predicate, though these are sometimes omitted. A sentence can also include phrases or clauses. A phrase is a unit of two or more words that forms one sentence component and has no subject-predicate relation; types include adjective phrases and adverb phrases. A clause is a unit that has a subject and a predicate but cannot be used independently; there are noun clauses, predicate clauses, adnominal clauses, adverbial clauses, and quotation clauses.

Morpheme analysis is explained in detail using example sentence 1 below.

Example sentence 1

Can you pass the pepper?

When the LSP detecting unit 40 analyzes the morphemes of example sentence 1, the result shown in example sentence 2 below is obtained.

Example sentence 2

Pepper / NNG / MO / VV + EM / EM / VX + City / EP + EM / SC

Then, the LSP detecting unit 40 detects, from the LSP knowledge, an LSP matching the text sentence based on its morphemes (S130). Example sentence 2 is expressed as an LSP as in example sentence 3 below.

Example sentence 3

(/ NN_) (/ MA) * 2? (3) + (EP / EP) + (/ EM)? / SC

For each text sentence, the importance calculating unit 50 defines the weights W of the LSP, the semantic features, and the concepts, together with the positive/negative level (neg) of the sentence (S140), and calculates the importance of the sentence (S150). The weight W indicates the degree to which an expression (LSP, semantic feature, or concept) is needed for generating a summary in the VOC analysis. For example, when analyzing VOC data related to a shopping mall, expressions related to the products sold in the mall may be given higher weights. The positive/negative level (neg) expresses the strength of the positive or negative expression of the sentence as a normalized level, computed from the weights of the semantic features and the weight W of the LSP. For example, if a customer is very satisfied or very dissatisfied with a product, the sentence is made up of strong expressions that reveal these emotions, and the positive/negative level is high. Conversely, if a sentence contains no positive or negative expression and consists of plain content, the positive/negative level is low.

The importance of one sentence can be calculated according to Equation 1 below.

[Equation 1]

Figure 112016007390167-pat00001

In Equation 1, the importance f is a function of α, β, γ, and δ, where α is the number of concepts matched in one sentence, β is the number of LSPs matched in one sentence, γ is the number of semantic features matched in one sentence, and δ is a category constant that the user can specify according to the VOC category. n_k denotes the number of times the k-th concept is matched in the sentence, n_h the number of times the h-th LSP is matched, and n_j the number of times the j-th semantic feature is matched. Further, W_k_concept is the weight of the k-th concept, W_h_LSP is the weight of the h-th LSP, and W_j_SF is the weight of the j-th semantic feature. The weights can be set arbitrarily by the user, for example between 1 and 10, depending on the relationship to the VOC population, suitability for the summary, and the like. Qualitatively, Equation 1 makes the importance f of a sentence proportional to the sum of all the weights of the concepts, LSPs, and semantic features contained in the sentence.
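The qualitative statement above can be illustrated with a small sketch. It is not Equation 1: the equation itself is available here only as an image reference, so the code below assumes the simplest form consistent with the description, a category constant δ multiplied by the sum of count times weight over matched concepts, LSPs, and semantic features, and it leaves out the positive/negative level entirely. The function names and sample numbers are hypothetical.

```python
def weighted_sum(matches):
    """Sum count * weight over an iterable of (count, weight) pairs."""
    return sum(count * weight for count, weight in matches)


def importance_sketch(concepts, lsps, features, delta=1.0):
    """Assumed proportional form only; NOT the patent's Equation 1.

    concepts: (n_k, W_k_concept) pairs for the matched concepts
    lsps:     (n_h, W_h_LSP) pairs for the matched LSPs
    features: (n_j, W_j_SF) pairs for the matched semantic features
    delta:    category constant chosen by the user
    The positive/negative level (neg) is deliberately omitted.
    """
    total = weighted_sum(concepts) + weighted_sum(lsps) + weighted_sum(features)
    return delta * total


# One concept (weight 5), one LSP (weight 10), two semantic features (4 and 10).
score = importance_sketch([(1, 5)], [(1, 10)], [(1, 4), (1, 10)])
print(score)  # 29.0
```

Under this assumed form, a greeting that matches no concept, LSP, or semantic feature scores 0 no matter how often it occurs in the call.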

The positive/negative level (neg) in Equation 1 can be defined by Equation 2 below.

[Equation 2]

Figure 112016007390167-pat00002

In Equation 2, M_h is the set of weights of the semantic features contained in the h-th LSP matched to the sentence, avg(M_h) is the average of the set M_h, W_h_LSP is the weight of the h-th LSP, and β is the number of LSPs matched to the sentence. E is 1 if the sentence is positive, -1 if the sentence is negative, and 0 if the sentence is neutral.

For example, consider the positive/negative level (neg) of the example sentence "No TV." This sentence can be matched to the LSP "@commodity+/J_ @complaints". Assuming that the "@commodity" semantic feature has a weight of 4, the "@complaints" semantic feature has a weight of 10, the LSP has a weight of 10, and only this one LSP matches the example sentence, the positive/negative level (neg) has a value of 1.857.

The above process is repeated to calculate the importance of every sentence constituting the VOC data. The summary generating unit 60 then extracts a predetermined number of sentences in order of calculated importance and generates the summary (S160). When composing the summary, the extracted sentences are preferably arranged in the order of the original text. For example, suppose three sentences are to be extracted and the fifth sentence has the highest importance, followed by the tenth sentence and then the first sentence. The top three sentences are extracted in the order 5th, 10th, 1st, but the summary is generated in the order 1st, 5th, 10th. Arranging the extracted sentences in the same order as the original text makes the meaning of the whole text easy to grasp from the summary alone.
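The extract-then-reorder step in this example can be sketched as follows. The importance values are invented so that the 5th, 10th, and 1st sentences rank highest, and the function name `select_summary_indices` is hypothetical.

```python
def select_summary_indices(importances, n):
    """Pick the n most important sentences, then restore original order.

    importances: one score per sentence, in original order (0-based).
    Returns the 0-based indices of the selected sentences, ascending.
    """
    ranked = sorted(
        range(len(importances)),
        key=lambda i: importances[i],
        reverse=True,
    )
    return sorted(ranked[:n])


# Invented scores for a 10-sentence call: sentence 5 is highest,
# then sentence 10, then sentence 1 (1-based numbering as in the text).
scores = [7, 1, 2, 3, 9, 1, 2, 3, 1, 8]
selected = select_summary_indices(scores, 3)
print([i + 1 for i in selected])  # [1, 5, 10]
```

Ranking and ordering are kept as two separate sorts so that the number of extracted sentences can be changed without affecting how the summary is laid out.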

If the extracted sentences do not cohere well enough for the meaning to be grasped from the summary alone, the sentence placed before, after, or both before and after at least one of the extracted sentences in the original text can be added to the summary. In the example above, the fourth and sixth sentences, placed before and after the fifth sentence, can be added to generate a summary consisting of the first, fourth, fifth, sixth, and tenth sentences.
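Adding neighboring sentences can be sketched as below. How to decide that a summary is hard to follow is not specified in the text, so the caller chooses which extracted sentences to expand; the function name `add_context` and its parameters are hypothetical.

```python
def add_context(selected, targets, total, before=1, after=1):
    """Add neighboring sentences around chosen summary sentences.

    selected: 0-based indices already in the summary
    targets:  indices (a subset of selected) whose neighbors should be added
    total:    number of sentences in the original text
    before/after: how many neighbors to add on each side
    Returns all indices in original order, without duplicates.
    """
    expanded = set(selected)
    for i in targets:
        first = max(0, i - before)
        last = min(total - 1, i + after)
        expanded.update(range(first, last + 1))
    return sorted(expanded)


# Summary holds sentences 1, 5, 10; add the neighbors of sentence 5.
with_context = add_context([0, 4, 9], targets=[4], total=10)
print([i + 1 for i in with_context])  # [1, 4, 5, 6, 10]
```

Setting `before=0` or `after=0` covers the cases where only the following or only the preceding sentence is added.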

In this way, a summary composed of at least one sentence is generated for one VOC data. In some cases, morpheme analysis can be used to exclude unnecessary vocabulary or phrases defined in a sentence from the summary.

One VOC data item usually consists of a plurality of original sentences. The importance values of all its sentences are summed and then normalized to obtain the average importance of that VOC data (S170). The average importance can therefore be calculated for each of a plurality of VOC data belonging to the same category and compared, so that an importance priority can be set among the VOC data.
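Step S170 and the comparison across VOC data can be sketched as follows. The text does not state how the sum is normalized; this sketch assumes division by the number of sentences. The names `average_importance` and `rank_voc`, the identifiers, and the scores are all hypothetical.

```python
def average_importance(sentence_scores):
    """Sum the sentence importances and normalize by the sentence count.

    Dividing by the sentence count is an assumption made for this sketch.
    """
    if not sentence_scores:
        return 0.0
    return sum(sentence_scores) / len(sentence_scores)


def rank_voc(voc_scores):
    """Order VOC identifiers by descending average importance.

    voc_scores -- dict mapping a VOC identifier to its list of sentence scores
    """
    averages = {voc_id: average_importance(s) for voc_id, s in voc_scores.items()}
    return sorted(averages.items(), key=lambda item: -item[1])


# Hypothetical VOC records that belong to the same category.
category = {
    "voc-001": [2.0, 4.0, 6.0],        # average 4.0
    "voc-002": [9.0, 1.0],             # average 5.0
    "voc-003": [3.0, 3.0, 3.0, 3.0],   # average 3.0
}
print(rank_voc(category))
# [('voc-002', 5.0), ('voc-001', 4.0), ('voc-003', 3.0)]
```

Normalizing keeps a long VOC record from outranking a short one merely because it has more sentences.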

For reference, the LSP pre-construction and Korean machine translation methods according to various preferred embodiments of the present invention may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention, or may be known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and storage devices such as a hard disk drive and flash memory. Examples of program instructions include machine code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. A hardware device may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

While the present invention has been described in connection with what are presently considered to be practical exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, as those skilled in the art will understand. The above-described embodiments are therefore to be understood as illustrative in all aspects and not restrictive.

10: voice recognition unit 20: text conversion unit
30: LSP knowledge construction unit 40: LSP detection unit
50: importance calculating unit 60: summary generating unit
70: DB 100: VOC summary device
200: concept generation screen 210: semantic qualification dictionary table
220: Meaning qualities 230: Entry table
240: LSP building table

Claims (8)

1. A method for generating a summary from voice of customer (VOC) data using a lexical semantic pattern (LSP), the method comprising:
(a) establishing LSP knowledge in advance by defining a concept, semantic qualities, and an LSP for VOC;
(b) analyzing the morphemes of the sentences constituting the input VOC data and detecting, from the LSP knowledge, the LSPs matching the respective sentences;
(c) calculating the importance of each sentence by using the detected LSP and the concept and semantic feature associated with the detected LSP; and
(d) extracting a predetermined number of sentences in order of importance from the sentences constituting the VOC data to generate a summary sentence.
2. The method of claim 1, wherein step (a) comprises:
defining the concept as a set to which the LSPs belong;
collecting VOC sample data and classifying it according to the concept;
constructing a dictionary of semantic features, each being a basic unit constituting the meaning of the concept, in which one or more entries having the same meaning are grouped into one set; and
constructing the LSP knowledge defined by the concept, the semantic feature, and the LSP.
3. The method of claim 1, further comprising, before step (b), recognizing speech data from the input VOC data and converting the speech data into text sentences.
4. The method of claim 1, wherein step (c) calculates the importance by using respective weights of the LSP and the semantic qualities, each weight representing the degree of necessity for generating the summary sentence, and a positive/negative level obtained by quantifying the strength of the positive or negative expression of the sentence.
5. The method of claim 4, wherein the importance is proportional to the sum of all the weights of the concept, the LSP, and the semantic qualities included in the sentence.
6. The method of claim 1, wherein, in step (d), the extracted sentences are arranged in the order in which they appear in the original text of the VOC data.
7. The method of claim 1, further comprising, when it is difficult to understand the meaning from the summary sentence alone, extracting the original-text sentences placed before, after, or both before and after at least one of the extracted sentences and adding them to the summary sentence.
8. The method of claim 1, further comprising:
summing and normalizing the importance of the sentences constituting the VOC data to calculate an average importance of the VOC data; and
calculating the average importance for each of a plurality of VOC data belonging to the same category and comparing the average importances with each other.
KR1020160008005A 2016-01-22 2016-01-22 Method for making abstracts from Voice of Customer data KR101805607B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020160008005A KR101805607B1 (en) 2016-01-22 2016-01-22 Method for making abstracts from Voice of Customer data

Publications (2)

Publication Number Publication Date
KR20170088095A KR20170088095A (en) 2017-08-01
KR101805607B1 true KR101805607B1 (en) 2017-12-06

Family

ID=59650363

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020160008005A KR101805607B1 (en) 2016-01-22 2016-01-22 Method for making abstracts from Voice of Customer data

Country Status (1)

Country Link
KR (1) KR101805607B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190112367A (en) 2018-03-26 2019-10-07 주식회사 와이즈넛 Method for extracting major semantic feature from voice of customer data, and data concept classification method using thereof

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102266061B1 (en) * 2019-07-16 2021-06-17 주식회사 한글과컴퓨터 Electronic device capable of summarizing speech data using speech to text conversion technology and time information and operating method thereof
KR102332268B1 (en) * 2019-11-08 2021-11-29 주식회사 엘지유플러스 Customer Consultation Summary Apparatus and Method
CN112347241A (en) * 2020-11-10 2021-02-09 华夏幸福产业投资有限公司 Abstract extraction method, device, equipment and storage medium
KR102445748B1 (en) * 2020-12-18 2022-09-21 주식회사 와이즈넛 The pattern recognition method of text sentences using language resources

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
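김문종 and 2 others, 'VOC summarization system based on syntactic-semantic understanding', Proceedings of the Korean Institute of Information Scientists and Engineers (KIISE) Conference, pp. 805-807, June 2015.*
장동현, 맹성현, 'Development and evaluation of a document summarization system using text component identification techniques and features', Journal of KIISE: Software and Applications, 27(6), pp. 678-689, June 2000.*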

Also Published As

Publication number Publication date
KR20170088095A (en) 2017-08-01

Similar Documents

Publication Publication Date Title
Katz et al. ConSent: Context-based sentiment analysis
KR101805607B1 (en) Method for making abstracts from Voice of Customer data
CN107480143B (en) Method and system for segmenting conversation topics based on context correlation
CN102866989B (en) Viewpoint abstracting method based on word dependence relationship
Baldwin et al. Extracting the unextractable: A case study on verb-particles
KR101723862B1 (en) Apparatus and method for classifying and analyzing documents including text
US20160299955A1 (en) Text mining system and tool
CN109710744B (en) Data matching method, device, equipment and storage medium
CN114580382A (en) Text error correction method and device
JP2001075966A (en) Data analysis system
CN110096599B (en) Knowledge graph generation method and device
KR20090004216A (en) System and method for classifying named entities from speech recongnition
CN112860896A (en) Corpus generalization method and man-machine conversation emotion analysis method for industrial field
Arai et al. Grammar fragment acquisition using syntactic and semantic clustering
CN107526721A (en) A kind of disambiguation method and device to electric business product review vocabulary
CN107632974B (en) Chinese analysis platform suitable for multiple fields
KR102206781B1 (en) Method of fake news evaluation based on knowledge-based inference, recording medium and apparatus for performing the method
Ahmad et al. Urdu speech and text based sentiment analyzer
WO2014002774A1 (en) Synonym extraction system, method, and recording medium
CN116628173B (en) Intelligent customer service information generation system and method based on keyword extraction
CN109800430B (en) Semantic understanding method and system
JP2011065380A (en) Opinion classification device and program
Kasmuri et al. Subjectivity analysis in opinion mining—a systematic literature review
KR101837003B1 (en) Method for monitoring online communities
KR102497539B1 (en) An ontology based knowledge base construction method using semantic role labeling

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right