CN110543559A - Method for generating interview report, computer-readable storage medium and terminal device

Info

Publication number
CN110543559A
CN110543559A
Authority
CN
China
Prior art keywords
semantic
vector
candidate
units
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910577275.4A
Other languages
Chinese (zh)
Inventor
谭浩
李文良
彭盛兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201910577275.4A
Publication of CN110543559A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F16/353: Clustering; Classification into predefined classes
    • G06F16/355: Class or cluster creation or modification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a semantic unit clustering method, which comprises: acquiring a plurality of semantic units; and determining one or more cluster centers based on the plurality of semantic units. Determining the one or more cluster centers comprises determining them from the plurality of semantic units based on the pairwise similarity between the semantic units. Specifically, each of the plurality of semantic units is selected in turn as a candidate semantic unit; for each candidate semantic unit, the similarity between the candidate and each of the remaining semantic units is calculated, and the candidate is determined to be a cluster center if at least one of the remaining semantic units has a similarity higher than a predetermined threshold.

Description

Method for generating interview report, computer-readable storage medium and terminal device
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, a computer-readable storage medium, and a terminal device for generating an interview report.
Background
Interviewing is an important part of design research and plays an important role in many industries. There are many forms of interview, such as user interviews, expert interviews, joint interviews and field interviews; in design research, interviews can provide insight into users' real thoughts and into social and industry trends.
Traditional interviews are usually conducted by a team of two or more people, with some members communicating with the user while others record and supplement what is said. To uncover the user's real, deep-seated thoughts, an interview usually starts with greetings and warm-up questions, and the order of the questions is adjusted according to the user's actual answers, so the recorder must quickly extract the effective information in the interview and record and mark the key points in real time. After the interview, the recording of the whole session is replayed, the speech is converted into text, and a detailed analysis is performed in combination with the interview notes. Content such as user needs, product pain points and industry opportunities is then analyzed according to the actual interview objectives. All of these steps are completed manually by the interviewers, and analyzing one hour of interview takes about 5 to 6 hours, so the interview cost is high and the analysis work is tedious and time-consuming, which is a significant problem in interviewing.
Disclosure of Invention
The present application is directed to solving at least one of the problems in the prior art. To this end, one object of the present application is to propose a method of generating interview reports that can reduce the cost and time of demand analysis in interviews and is simple to carry out.
A second object of the present application is to provide a computer-readable storage medium.
A third objective of the present application is to provide a terminal device.
In order to achieve the above object, a first aspect of the present application provides a semantic unit clustering method, including: acquiring a plurality of semantic units; and determining one or more cluster centers based on the plurality of semantic units.
In some embodiments, determining one or more cluster centers based on the plurality of semantic units comprises: determining the one or more cluster centers from the plurality of semantic units by an AP algorithm.
In some embodiments, determining one or more cluster centers based on the plurality of semantic units comprises: determining each of the plurality of semantic units as a cluster center.
In some embodiments, determining one or more cluster centers based on the plurality of semantic units comprises: determining the one or more clustering centers from the plurality of semantic units based on a similarity between the plurality of semantic units.
In some embodiments, determining one or more cluster centers based on the similarity between the plurality of semantic units comprises: sequentially selecting each semantic unit from the plurality of semantic units as a candidate semantic unit; and for each candidate semantic unit: calculating a similarity between each of the candidate semantic units and each of remaining semantic units of the plurality of semantic units, respectively, and determining each of the candidate semantic units as a cluster center if there is at least one semantic unit in the remaining semantic units having a similarity higher than a predetermined threshold.
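The cluster-center rule described above can be made concrete with a short sketch. This is an illustrative assumption rather than the claimed implementation: the semantic units are assumed to be encoded as vectors already, cosine similarity stands in for the unspecified similarity measure, and the threshold value is arbitrary.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two semantic vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def select_cluster_centers(vectors, threshold=0.8):
    """Return the indices of semantic units that qualify as cluster centers,
    i.e. units with at least one sufficiently similar remaining unit."""
    centers = []
    for i, candidate in enumerate(vectors):
        if any(cosine(candidate, other) > threshold
               for j, other in enumerate(vectors) if j != i):
            centers.append(i)
    return centers
```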
In some embodiments, separately calculating the similarity between each candidate semantic unit and each of the remaining semantic units of the plurality of semantic units comprises: calculating a candidate semantic vector of each candidate semantic unit; and respectively calculating the similarity between the candidate semantic vector of each candidate semantic unit and the semantic vector of each of the remaining semantic units.
In some embodiments, calculating the candidate semantic vector for each candidate semantic unit comprises: acquiring a feature semantic unit table, wherein the feature semantic unit table comprises one or more feature semantic units; respectively determining the degree of association between each candidate semantic unit and each feature semantic unit; and generating the candidate semantic vector according to the degree of association of each candidate semantic unit with each feature semantic unit.
In some embodiments, the degree of association is proportional to the frequency of occurrence of each feature semantic unit in each candidate semantic unit.
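As one hedged illustration of the feature-table approach above, the degree of association can be taken as the raw occurrence count of each feature semantic unit (for example, a feature word) inside the candidate semantic unit (for example, a segmented sentence). The feature table, tokens and the choice of raw counts are assumptions of the sketch, not requirements of the application.

```python
from collections import Counter

def candidate_semantic_vector(candidate_tokens, feature_units):
    # One dimension per feature semantic unit, valued by how often that
    # feature unit occurs in the candidate semantic unit.
    counts = Counter(candidate_tokens)
    return [counts[f] for f in feature_units]

# Hypothetical feature table and a segmented candidate sentence.
vec = candidate_semantic_vector(["morning", "alarm clock", "weather"],
                                ["alarm clock", "weather", "price"])
print(vec)  # [1, 1, 0]
```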
In some embodiments, calculating the candidate semantic vector for each candidate semantic unit comprises: assigning a sub-semantic unit vector to each of one or more sub-semantic units in each candidate semantic unit; inputting all sub-semantic unit vectors into a predetermined prediction model to output a target vector; and designating the target vector as the candidate semantic vector.
In some embodiments, computing the candidate semantic vector for each of the candidate semantic units further comprises assigning an identity vector to each of the candidate semantic units, and inputting all sub-semantic unit vectors into a predetermined predictive model to output the target vector comprises inputting the identity vector and all sub-semantic unit vectors together into a predetermined predictive model to output the target vector.
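The application does not name the predetermined prediction model. One common realization of combining sub-semantic-unit (word) vectors with a per-unit identity vector is a paragraph-vector model; the sketch below uses gensim's Doc2Vec purely as an analogous, assumed stand-in, not as the claimed implementation.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus of semantic units, each already split into sub-semantic units (words).
units = [["morning", "alarm", "clock"], ["weather", "forecast", "today"]]
# The tag plays the role of the identity vector assigned to each candidate unit.
docs = [TaggedDocument(words=u, tags=[i]) for i, u in enumerate(units)]

model = Doc2Vec(docs, vector_size=64, min_count=1, epochs=40)

# The inferred vector plays the role of the target/candidate semantic vector.
candidate_vector = model.infer_vector(["morning", "alarm", "clock"])
```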
In some embodiments, the semantic unit clustering method further includes: ranking the one or more cluster centers.
In some embodiments, ranking the one or more cluster centers comprises: for each cluster center in the one or more cluster centers, respectively calculating the similarity between the semantic unit corresponding to each cluster center and each of the remaining semantic units in the plurality of semantic units; and sorting all the cluster centers based on the number of semantic units with a similarity higher than a predetermined threshold.
In some embodiments, separately calculating the similarity between the semantic unit corresponding to each cluster center and each of the remaining semantic units in the plurality of semantic units comprises: and respectively calculating the similarity between the semantic vector of the semantic unit corresponding to each clustering center and the semantic vector of each of the rest semantic units in the plurality of semantic units.
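A minimal sketch of this ranking step follows, under the same assumptions as the earlier sketch (vector-encoded units, cosine similarity, an arbitrary threshold): each cluster center is scored by how many other semantic units exceed the similarity threshold, and the centers are sorted by that count.

```python
import numpy as np

def rank_cluster_centers(vectors, centers, threshold=0.8):
    def support(c):
        # Number of remaining semantic units whose similarity to center c
        # exceeds the predetermined threshold.
        vc = vectors[c]
        return sum(
            1 for j, v in enumerate(vectors) if j != c and
            np.dot(vc, v) / (np.linalg.norm(vc) * np.linalg.norm(v) + 1e-12) > threshold
        )
    return sorted(centers, key=support, reverse=True)
```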
Another aspect of the application provides a computer readable storage medium comprising a program which, when executed by a processor, performs the semantic unit clustering method according to any one of the above aspects.
Another aspect of the present application provides a semantic unit clustering apparatus, including: a semantic unit obtaining component configured to obtain a plurality of semantic units; and a cluster center determination component configured to determine one or more cluster centers based on the plurality of semantic units.
In some embodiments, the cluster center determination component comprises: a cluster center determination module configured to determine the one or more cluster centers from the plurality of semantic units through an AP clustering algorithm.
In some embodiments, the cluster center determination component comprises: a cluster center determination module configured to determine each of the plurality of semantic units as a cluster center.
In some embodiments, the cluster center determination component comprises: a cluster center determination module configured to determine the one or more cluster centers from the plurality of semantic units based on a similarity between each two of the plurality of semantic units.
In some embodiments, the cluster center determination module comprises: a candidate semantic unit selection module configured to select each of the plurality of semantic units as a candidate semantic unit in sequence; a similarity calculation module configured to calculate, for each candidate semantic unit, a similarity between the each candidate semantic unit and each of remaining semantic units of the plurality of semantic units, respectively; and a cluster center determination module configured to determine each candidate semantic unit as a cluster center when at least one semantic unit with a similarity higher than a predetermined threshold exists in the remaining semantic units.
In some embodiments, the similarity calculation module comprises: a candidate semantic vector calculation module configured to calculate a candidate semantic vector for each of the candidate semantic units; and a semantic vector similarity calculation module configured to calculate a similarity between the candidate semantic vector of each candidate semantic unit and the semantic vector of each of the remaining semantic units, respectively.
In some embodiments, the candidate semantic vector calculation module comprises: a feature semantic unit obtaining module configured to obtain a feature semantic unit table, wherein the feature semantic unit table includes one or more feature semantic units; a relevancy determination module configured to determine the degree of association between each candidate semantic unit and each feature semantic unit respectively; and a candidate semantic vector generation module configured to generate the candidate semantic vector according to the degree of association of each candidate semantic unit with each feature semantic unit.
In some embodiments, the degree of association is proportional to the frequency of occurrence of each feature semantic unit in each candidate semantic unit.
In some embodiments, the candidate semantic vector calculation module comprises: a sub-semantic unit vector allocation module configured to allocate a sub-semantic unit vector to each of one or more sub-semantic units of each candidate semantic unit; a target vector calculation module configured to input all sub-semantic unit vectors into a predetermined prediction model to output a target vector; and a candidate semantic vector designation module configured to designate the target vector as the candidate semantic vector.
In some embodiments, the candidate semantic vector calculation module further comprises an identity vector assignment module configured to assign an identity vector to each candidate semantic unit; and the target vector calculation module is further configured to input the identity vector and all sub-semantic unit vectors together into the predetermined prediction model to output the target vector.
In some embodiments, the semantic unit clustering device further includes: an ordering component configured to order the one or more cluster centers.
In some embodiments, the ordering component comprises: a similarity calculation module configured to: for each cluster center in the one or more cluster centers, respectively calculating the similarity between the semantic unit corresponding to each cluster center and each of the remaining semantic units in the plurality of semantic units; and a cluster center ordering module configured to order all cluster centers based on the number of semantic units with similarity higher than a predetermined threshold.
In some embodiments, the similarity calculation module is further configured to calculate a similarity between the semantic vector of the semantic unit corresponding to each cluster center and the semantic vector of each of the remaining semantic units in the plurality of semantic units, respectively.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow diagram of a method of generating an interview report according to one embodiment of the present application;
FIG. 2 is a schematic diagram of a process of training a neural network model according to one embodiment of the present application;
FIG. 3 is a schematic diagram for further illustration of the training process in FIG. 2;
FIG. 4 is a block diagram of a terminal device according to one embodiment of the present application;
FIG. 5 is a flow diagram of an intelligent speech material requirement extraction process according to one embodiment of the present application;
FIG. 6 is a flow diagram of a process for building a demand judgment model and a feature dictionary according to one embodiment of the present application;
FIG. 7 is a block diagram of the constituent modules of an intelligent speech profile requirement extraction system according to one embodiment of the present application;
FIG. 8 is a flow diagram of an availability determination process according to one embodiment of the present application;
FIG. 9 is a flow diagram of a process for building an availability determination model according to one embodiment of the present application;
FIG. 10 is a block diagram of an availability extraction system according to one embodiment of the present application;
FIG. 11 is a flow diagram of a method of speech content analysis according to an embodiment of the present application;
FIG. 12 is a flow diagram of a semantic element clustering method according to one embodiment of the present application;
FIG. 13 is a flow diagram of determining one or more cluster centers based on similarity between a plurality of semantic units, pairwise, according to an embodiment of the present application;
FIG. 14 is a flow diagram for separately calculating a similarity between each candidate semantic unit and each of the remaining semantic units of the plurality of semantic units according to one embodiment of the present application;
FIG. 15 is a flow diagram of computing a candidate semantic vector for each candidate semantic unit according to one embodiment of the present application;
FIG. 16 is a flow diagram of computing a candidate semantic vector for each candidate semantic unit according to another embodiment of the present application;
FIG. 17 is a schematic diagram of a semantic unit clustering apparatus according to one embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below with reference to the accompanying drawings; these embodiments are exemplary.
The method for generating an interview report integrates artificial intelligence with design research techniques and builds an autonomous interview research process with fewer steps and in less time. It helps designers and others complete interview data collection, analysis and reporting autonomously, reduces the labor cost of interview analysis, shortens the time consumed by interview analysis, and is simple and easy to implement.
A method of generating an interview report according to an embodiment of the first aspect of the present application is described below with reference to fig. 1-3.
Fig. 1 is a flowchart of a method of generating an interview report according to an embodiment of the present application, which includes steps S1 to S4, as shown in fig. 1.
Step S1: acquiring the interview corpus.
Specifically, in this embodiment, the method for generating an interview report may be loaded on a terminal device, such as a smartphone, tablet computer or notebook computer, in the form of an application program. The human-computer interaction window of the terminal device may provide an application icon associated with the method. In response to a program start instruction, the human-computer interaction interface provides a recording start trigger unit, and in response to a trigger instruction from the recording start trigger unit, the interview corpus may be collected by the recording module of the terminal device itself. In short, when the interview starts, recording may be started by operating the recording start trigger unit, and the interview corpus is thereby obtained.
In other embodiments, the interview may also be recorded by another recording device, for example a microphone, a microphone array, a recorder or a recording pen, and the interview corpus is then transmitted to a terminal device loaded with the application program of the method according to the embodiment of the present application. Obtaining the interview corpus in this way and then analyzing it autonomously can likewise shorten the time for interview analysis and save labor cost.
Step S2: in response to an interview report generation instruction, analyzing the interview corpus with the neural network model to obtain the performance information of the interviewee.
Specifically, after the interview is completed, the human-computer interaction interface of the terminal device provides an interview report generation trigger unit, or a corresponding key or knob is provided. When the trigger unit receives a trigger instruction, that is, when the interview report generation instruction is received, the processor of the terminal device can autonomously analyze the interview corpus to generate the interview report.
In some embodiments, a series of processing steps is applied to the original interview corpus to facilitate the subsequent analysis with the neural network model, such as preprocessing and word segmentation of the original corpus. The preprocessing may include, for example, recognizing the interview corpus and converting it into text data, and splitting and cleansing the text data.
Specifically, a neural network continuously optimizes its results through training and learning, and a neural network model can be constructed through such training. The training of the neural network model is briefly explained below.
Fig. 2 is a schematic diagram of the neural network model training process according to an embodiment of the present application. As shown in Fig. 2, an algorithm optimization interface, such as the expert system in Fig. 2, may be opened to expert users. The main function of the expert system is to construct a training set for the project's core-feature neural network according to the customized functions of the product; when the neural network model is generated by training on the feature parameters, the final result is determined by the contents of the training set. Expert prior knowledge is input into the expert system to construct the training set, the contents of the training set are input into the neural network algorithm model, and the experts evaluate and correct the output of the model based on the training set. The expert knowledge is thereby optimized and fed back into the expert system, the training set is further corrected and input into the neural network algorithm model again, and this loop is iterated until the output of the neural network model approaches the optimal solution.
Further, Fig. 3 is a schematic diagram that further explains the neural network model training process in Fig. 2. Specifically, a training set may be constructed by a team which, combining linguistic, psychological and design methods, mines the constituent features of the original database and determines the standard form of the training text according to the constraints of the neural network specified by the research results. For example, as shown in Fig. 3, on the expert system side, expert prior knowledge is input into the expert knowledge entry module, the conditions of feature determination are defined (for example whether a determination is made, or to what degree), key information such as keywords is labeled, and descriptive information such as tone or grammar is annotated, so as to construct the training set. On the algorithm model side, the contents of the training set are input into the algorithm model for training, for example performing semantic analysis and generating an analysis result; the analysis result is corrected based on the experts' evaluation of it, the training set is supplemented, and its contents are continuously optimized, thereby constructing the required neural network model.
In the embodiment of the present application, a neural network model can be constructed through the above process based on the characteristics of the interview corpus and expert knowledge. The interview corpus is analyzed by the neural network model to obtain various kinds of performance information of the interviewee contained in the corpus, such as emotion information, demand information, views on industry trends, disputes over certain issues, and usability information; different neural network models are constructed to obtain different kinds of interviewee performance information.
Step S3: generating the interview report according to the interviewee performance information.
For example, in user research on a certain product, it is desirable to determine the needs of users or their attitude towards the product through user research and analysis, in order to improve the product. Specifically, when the user is interviewed, the recording trigger unit is operated to start recording and the interview corpus is obtained. After the interview is finished, the interview report generation instruction is input, the interview corpus is analyzed through the neural network model, and a set of text keywords is obtained to form an overall description of the interview; the interviewee's emotion information, demand information and/or usability information regarding the product is obtained, and from this the places where the user is satisfied and/or dissatisfied with the product, the functions that the user likes or wants the product to have, and so on, are judged. An interview report covering the user's preferences and the product's advantages and disadvantages is therefore generated based on the analysis result, and the interview report can be consulted to perfect the product and improve its performance.
In some embodiments, a demand refers to a desire or wish that a user proposes from their own perspective. Through demand-class corpora, information such as the user's motivation, the functions that the user likes or hopes the product will have, and suggestions or opinions on the product can be obtained, thereby guiding product design and providing insight into the industry market. An example of a demand-class statement is: I want the home appliances to be consistent with the home decoration style.
In some embodiments, usability refers to the effectiveness, efficiency and satisfaction perceived by a user when using a product to achieve a particular goal in a particular usage scenario. In particular, effectiveness refers to how correctly and completely the user accomplishes the particular goal; efficiency refers to how efficiently the user accomplishes it, which is inversely proportional to the resources consumed (e.g. time); satisfaction refers to the subjective level of satisfaction experienced by the user when using the product. In some embodiments, usability has five indicators: learnability, memorability, fault tolerance, interaction efficiency and user satisfaction. A product has high usability only if it reaches a good level on each indicator. By analyzing and extracting usability information from the interview corpus, the advantages and disadvantages of the product can be obtained, so that the product is optimized and its performance improved. An example of a usability statement is: the speech recognition of this smart speaker is not good at all. Another example of a usability statement is: I feel that this smart speaker is still very easy to get started with. In some embodiments, the usability information expresses the user's experience with the product, such as a positive or negative attitude held by the user regarding the usability or ease of use of the product, or a suggestion or opinion of the user for improving some aspect of the product.
In some embodiments, emotion analysis refers to the process of analyzing, processing, summarizing and reasoning over subjective texts with emotional coloring. By analyzing the original interview text about a product, the user's preferences, emotional attitudes towards the product and environment, and living conditions can be obtained.
In some embodiments, the demand information may include product demand information and personal demand information. Specifically, the product demand information mainly expresses the user's demand for the functions or features of a specific product, or the functions that the user likes or wants the product to have; for example, the user wants a specific mobile phone product to have a narrow bezel, dual cameras, dual-card dual-standby, and so on. The personal demand information mainly expresses the user's own needs; its evaluation object is generally not limited to a specific product and can concern any aspect, such as the user wanting a new mobile phone, a new pair of sneakers or a movie ticket. In some embodiments, the emotion polarity information may include product emotion polarity information and personal emotion polarity information. Specifically, the product emotion polarity information mainly expresses the user's likes and dislikes regarding specific functions or features of a specific product, or the places where the user is satisfied or dissatisfied with the product, such as the user liking a particular shell color of a specific mobile phone, disliking the notch screen, or disliking the protruding camera. The personal emotion polarity information mainly expresses the user's likes and dislikes of other objects; its evaluation object is generally not limited to a specific product and can concern any aspect, such as the user liking basketball, liking digital products or disliking watching movies.
According to the method for generating an interview report of the embodiment of the present application, the interview corpus can be automatically analyzed by the neural network model in response to the interview report generation instruction, the performance information of the interviewee is obtained, and the interview report is generated according to that performance information. The interview report can thus be generated with one key, which saves manpower and material resources, reduces the interview cost and shortens the time consumed by interviewing.
In some embodiments, generating the interview report based on the interviewee performance information includes: calculating the similarity between sentences of at least one type among interviewee demand sentences, interviewee non-demand sentences, positive emotion sentences, negative emotion sentences and usability sentences; clustering according to the similarity and obtaining cluster centers; and generating the interview report according to the sentence semantics contained in the cluster centers.
Specifically, after the interview corpus including the interviewee performance information is obtained, sentence vectors of the various types of performance information are calculated to form a sentence vector group, and similarity is calculated between the sentence vectors in the group, for example using the HANLP toolkit. Clustering is then performed according to the similarity, for example using the AP clustering algorithm or another algorithm, and the cluster centers are determined.
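A hedged sketch of this clustering step is shown below: sentence vectors are compared with cosine similarity and grouped with scikit-learn's AffinityPropagation, a common AP clustering implementation. The random placeholder vectors stand in for real sentence vectors, and HANLP-based similarity is not reproduced; both are assumptions of the sketch.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

sentence_vectors = np.random.rand(12, 64)          # placeholder sentence vector group
similarity = cosine_similarity(sentence_vectors)   # pairwise similarity matrix

ap = AffinityPropagation(affinity="precomputed", random_state=0)
labels = ap.fit_predict(similarity)                # cluster label for each sentence
centers = ap.cluster_centers_indices_              # indices of the cluster-center sentences
```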
Further, keywords can be extracted from the interview corpus and key points can be marked, so that the key points are more easily found when reviewing the text, reports and other materials. Specifically, a marking trigger unit may be provided through the human-computer interaction interface of the terminal device, for example a highlight button, or a corresponding touch or mechanical key may be provided on the terminal device. When keywords are marked, weight calculation and ranking are performed on the sentences in the text data, pre-selected keywords are obtained according to the ranking result, the stop words among the pre-selected keywords are filtered out according to a stop word list, the keywords in the text data are obtained and output, and a text keyword set is formed to describe the interview as a whole. The stop words can be entered manually, and the generated stop words form the stop word list. When key points are marked, in response to a key point marking instruction, the weight of the sentence in the text data corresponding to the instruction is increased, so that the sentence is marked as a key sentence and the key points can be found more easily during subsequent browsing.
Specifically, a word segmentation suite can be used to perform lexical analysis on the text data to obtain short sentences and remove the mood words in them. Some word segmentation kits use a prefix dictionary to realize efficient word-graph scanning and generate a directed acyclic graph of all possible word formations of the Chinese characters in a sentence; dynamic programming is then used to find the maximum-probability path and the best segmentation combination based on word frequency, and for unknown words an HMM model based on the word-forming capability of Chinese characters is used together with the Viterbi algorithm. Keyword extraction automatically extracts several meaningful words or phrases from a given piece of text data. In some embodiments, the TextRank algorithm can be used: TextRank is a graph-based ranking algorithm for text data that divides the text into several constituent units, builds a graph model, ranks the important components of the text using a voting mechanism, and can extract keywords from a single document.
For example, the basic steps of extracting keywords may include: (1) splitting the given text data T into complete sentences; (2) for each sentence, performing word segmentation and part-of-speech tagging, filtering out stop words, and keeping only words of specified parts of speech, such as nouns, verbs and adjectives, which become the retained pre-selected keywords; (3) constructing a pre-selected keyword graph G = (V, E), where V is the node set consisting of the pre-selected keywords generated in step (2), and edges are constructed between nodes using the co-occurrence relation: an edge exists between two nodes only if the corresponding words co-occur within a window of length K, where K is the window size, i.e. at most K words co-occur; (4) iteratively propagating the weight of each node until convergence; (5) sorting the node weights in descending order to obtain the most important T words as pre-selected keywords; (6) marking the most important T words in the original text, and combining adjacent words into multi-word keywords if they form adjacent phrases.
This is illustrated by the following example.
The original short sentence is: it can be used as an alarm clock when getting up in the morning, and the weather can be asked at any time.
The preselected keywords are: morning, getting up, alarm clock, weather, can, time
The stop words are: can, at the same time
The final keywords are: morning, getting up, alarm clock, weather
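Following the steps and example above, a minimal sketch using jieba's built-in TextRank extractor is given below. The sample sentence, topK and the allowed parts of speech (nouns, verbs, adjectives) are illustrative assumptions; they are not the exact configuration used in the application.

```python
import jieba.analyse

# Illustrative Chinese phrase, paraphrasing the example above.
text = "早上起床的时候可以当闹钟，同时可以随时问天气"

# TextRank keyword extraction restricted to nouns, verbs and adjectives.
keywords = jieba.analyse.textrank(text, topK=5, withWeight=True,
                                  allowPOS=("n", "ns", "vn", "v", "a"))
for word, weight in keywords:
    print(word, round(weight, 3))
```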
In short, the method for generating an interview report in the embodiment of the present application can mark the key points of a piece or segment of text data with one key, so that the key points can be easily identified during subsequent processing of the text data; the manual steps are few, the method is simple and convenient, and labor time is saved.
Further, in some embodiments, the interview report may also be turned into visual information and provided to the user. For example, interview reports are presented to users as bar charts, pie charts, line graphs or various combinations thereof, so that users can understand the content and key information of the interview report more intuitively.
In some embodiments, after the interview is completed, an editing trigger unit can further be provided on the human-computer interaction interface of the terminal device, and the interview report can be edited in response to an editing instruction, for example adding interviewee information or modifying any of the converted text data, results and report content, which makes the method more flexible. Further, the user can annotate the interview report according to their own needs. After the user finishes editing, the terminal device records the edited content and acquires the annotation data in it; the annotation data is compared with a set threshold, and when it reaches the preset annotation threshold, or at preset time intervals (for example every 5 or 15 days), it is fed back to the corpus database of the neural network model to optimize the model. In other words, the neural network model has an adaptive function, so that the results obtained through its analysis come closer to the results expected by the user.
In some embodiments, after the interview report is generated, an output trigger unit may be provided on the human-computer interaction interface of the terminal device, and the interview report is output to a mobile terminal, such as a smartphone, personal computer or notebook computer, in response to an output instruction, so that the interview report can be conveniently viewed and exported anytime and anywhere.
based on the method for generating an interview report of the above embodiment, the second aspect of the present application also provides a computer storage medium storing computer-executable instructions configured to perform the method for generating an interview report of the above embodiment.
Based on the method for generating an interview report of the above embodiment, a terminal device according to an embodiment of the third aspect of the present application is described below.
Fig. 4 is a block diagram of a terminal device according to an embodiment of the present application, and as shown in fig. 4, the terminal device 100 of the embodiment of the present application includes a processor 10 and a memory 20.
Wherein the memory 20 is communicatively connected to the processor 10, the memory 20 stores instructions executable by the processor 10, and the instructions, when executed by the processor 10, cause the processor 10 to execute the method of generating an interview report of the above embodiment. The method for generating the interview report can refer to the description of the above embodiments.
Specifically, the terminal device 100 may include, but is not limited to, a mobile terminal such as a smartphone, personal computer or tablet computer. The terminal device 100 may provide a trigger unit, for example a human-computer interaction interface with an interview report generation trigger unit, or a touch key or mechanical key; when a trigger instruction is received, the above method for generating an interview report is performed autonomously. A one-key operation is thus implemented, which is simple and efficient; an autonomous interview research process can be built with fewer operation steps and less time, the interview cost is reduced, and the time consumed by interviewing is shortened.
The general process of generating interview reports for embodiments of the present application is described above. The method for generating the interview report in the embodiment of the application is described below by taking the example of analyzing the interview corpus and obtaining the emotion information and the demand information of the interviewee through the neural network model, that is, further describing the emotion analysis algorithm and the demand extraction algorithm and further describing the clustering process in detail.
In the embodiment of the present application, the interviewee performance information can include one or both of demand information and emotion information. In some embodiments, for the demand information, after the original corpus information is preprocessed, the interview corpus is input into the first neural network model to obtain the statements in the interview corpus that reflect the interviewee's demands and the statements that do not, that is, the interviewee's demand information is extracted; for example, an SVM (Support Vector Machine) classifier can be used to implement the classification into demand and non-demand. Further, in some embodiments, the interviewee non-demand sentences are treated as candidate neutral sentences: they can be input into the second neural network model to extract the sentences reflecting the interviewee's polar emotions, and those polar emotion sentences are then input into the third neural network model to obtain the positive emotion sentences and negative emotion sentences among them. For example, two TextCNN classifiers in series are used to first separate polar from neutral emotion sentences and then further classify the polar emotion sentences, where the two TextCNN classifiers are trained with different corpora.
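A minimal sketch of this cascaded analysis is given below. The three predict functions are placeholders for the trained SVM and TextCNN models described above; their names and the dictionary layout of the result are assumptions made for illustration only.

```python
def analyze_sentences(sentences, is_demand, is_polar, is_positive):
    """Cascade: demand vs. non-demand, then (for non-demand sentences)
    polar vs. neutral, then positive vs. negative."""
    report = {"demand": [], "positive": [], "negative": [], "neutral": []}
    for s in sentences:
        if is_demand(s):                 # first classifier (e.g. SVM)
            report["demand"].append(s)
        elif is_polar(s):                # second classifier (e.g. first TextCNN)
            key = "positive" if is_positive(s) else "negative"  # second TextCNN
            report[key].append(s)
        else:
            report["neutral"].append(s)
    return report
```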
The following describes in detail a process of extracting a requirement from an intelligent speech material according to an embodiment of the present application with reference to fig. 5 to 7.
FIG. 5 is a flow chart illustrating a process for intelligent speech profile requirement extraction according to an embodiment of the present application.
At step 101, voice data is acquired. Specifically, in step 101, a voice signal is acquired through a microphone or a recording device such as a microphone array, a recorder, a recording pen, and the like.
In step 102, the acquired voice data is preprocessed to obtain text data for analysis. Preprocessing is the process of loading the voice data into memory and adding or deleting parts of its content as needed. It comprises: recognition, which means recognizing the voice data as characters to form text data; splitting, which means splitting the long sentences in the text data (separated by periods) into short sentences according to the punctuation marks that indicate pauses within them; and cleansing, which means removing from the text data the invalid content of the original voice data that is unrelated to the interview content. In some embodiments, preprocessing includes comment cleansing, word segmentation, part-of-speech tagging and syntactic dependency analysis.
In some embodiments, speech recognition converts the lexical content of speech uttered by a person into text that a computer can read. For example, the speech wave may be cut into small segments of a certain duration (e.g. 0.05 second); each small segment is called a frame, giving a certain number of frames (e.g. 20) within a certain time (e.g. 1 second). Information reflecting the essential features of the speech is extracted from each frame (redundant information in the speech signal that is useless for speech recognition is removed while reducing the dimensionality), and the features of each frame waveform are extracted to obtain the feature vector of the frame. Phonemes are the smallest units of speech, divided according to the natural properties of the speech; states are units of speech finer than phonemes, and a phoneme is generally divided into three states. Speech recognition is achieved by recognizing frames as states, combining the states into phonemes, and then combining the phonemes into words.
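As a hedged illustration of the framing and feature-extraction step, the sketch below uses librosa to cut a waveform into 0.05-second frames (as in the example above) and compute one MFCC feature vector per frame. The file name, sample rate and choice of MFCC features are assumptions; the application does not specify them.

```python
import librosa

y, sr = librosa.load("interview.wav", sr=16000)   # hypothetical recording
frame_len = int(0.05 * sr)                        # 0.05 s frames, as in the example

# One 13-dimensional MFCC feature vector per frame (shape: 13 x number_of_frames).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=frame_len, hop_length=frame_len)
```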
In some embodiments, speech recognition may include real-time speech recognition and offline speech recognition. In the interview, the voice can be converted into words in real time for the interviewer to view. After the interview ends, the entire interview process can be reviewed through playback.
Specifically, a third-party speech-to-text development platform can be used, and the conversion of the interview corpus into text data can be completed through the platform's downloadable Software Development Kit (SDK). In the embodiment of the present application, real-time speech recognition and offline speech recognition can be performed in parallel: the speech-to-text result can be viewed in real time during the interview, and the whole interview process can be reviewed through playback after the interview is finished.
In some embodiments, before speech recognition, in addition to the general vocabulary that the interviewer or interviewee may use, a domain-specific word bank can also be established. The user can improve the accuracy of the speech recognition result by storing some uncommon professional vocabulary in the word bank and uploading the word bank to the speech recognition module. For example, the term "tidal current" means the movement of water caused by tides, the trend of fashion in sociology, and the distribution of voltage, current and power in the various parts of a power grid in the power industry; likewise, in the interview field, uncommon professional words such as "research" and "package" can be uploaded to the word bank of the speech-to-text platform, and these words can then be recognized more effectively during text conversion. In short, by adopting the special-domain word bank corresponding to the domain of the interview subject, the recognition accuracy of professional wording can be further improved.
in some embodiments, text data corresponding to the speech material can be obtained through a speech recognition process. In some embodiments, the format of the text data may include, but is not limited to, Microsoft (MICROSOFT) OFFICE documents such as TXT, WORD, EXCEL, etc., or WPS format documents, or documents in other formats for WORD processing.
In some embodiments, splitting refers to splitting a long sentence in the text data (long sentences are separated by periods) into short sentences according to the punctuation marks that indicate pauses within it, so that the short sentences can be analyzed independently. In some embodiments, the method uses the comma "," or the semicolon ";" to split the long sentence. In some embodiments, the long sentence may be, for example: "put it this way, I feel that the smart speaker should be simple and elegant, the color should not be flashy, and it should be simple and convenient to use". After splitting, the following short sentences are formed: 1. put it this way; 2. I feel that the smart speaker should be simple and elegant; 3. the color should not be flashy; 4. it should be simple and convenient to use. Compared with a long sentence, taking short sentences as the analysis object simplifies the system's processing, reduces the amount of computation and improves efficiency.
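A minimal sketch of this splitting step is shown below, assuming Chinese full-width and half-width commas, enumeration commas and semicolons as the delimiter set; the exact delimiters and the sample sentence are assumptions of the sketch.

```python
import re

def split_long_sentence(sentence: str) -> list:
    # Split on full-width/half-width commas, enumeration commas and semicolons.
    parts = re.split(r"[，,；;、]", sentence)
    return [p.strip() for p in parts if p.strip()]

# Illustrative sentence, paraphrasing the example above.
print(split_long_sentence("这么说吧，我觉得智能音箱应该简约大气，颜色不要花哨，使用起来简单方便"))
```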
In some embodiments, cleansing refers to removing from the text data the invalid content of the original speech material that is unrelated to the content of the speech data. In some embodiments, because of the spoken nature of interviews, the presence of modal auxiliary words or interjections in the text data is inevitable. In some embodiments, garbled characters may also appear during the conversion of speech into text. These mood words and garbled characters are unrelated to the interview content and are also useless for demand extraction.
Specifically, given the colloquial nature of interviews, lexical analysis is performed on the interview text with a word segmentation tool to determine the words and their parts of speech in each short sentence, and the mood words in the sentences are removed so that the more meaningful words in each sentence are kept, which facilitates the interview analysis. There are various word segmentation tools, such as ancient word segmentation, Yaha word segmentation, Jieba word segmentation and Tsinghua THULAC. Taking a call to Jieba's lexical analysis interface as an example, the original sentence after splitting is: "Welcome you to this user interview on our smart speaker experience, ah." The sentence obtained after lexical analysis with Jieba word segmentation and removal of the mood word is: "Welcome you to this user interview on our smart speaker experience." In this way, a phrase expression with actual meaning is obtained.
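A hedged sketch of this cleansing step with the jieba part-of-speech segmenter is given below: tokens tagged as modal particles ("y"), interjections ("e") or non-words ("x") are dropped. Treating only these tags as mood words, and the sample phrase itself, are assumptions of the sketch.

```python
import jieba.posseg as pseg

def strip_mood_words(phrase: str) -> str:
    # Keep tokens whose part-of-speech tag is not a modal particle,
    # interjection or non-word symbol.
    kept = [p.word for p in pseg.cut(phrase) if p.flag not in ("y", "e", "x")]
    return "".join(kept)

print(strip_mood_words("嗯，欢迎你参加我们这次智能音箱的用户访谈啊"))
```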
In step 103, each short sentence in the preprocessed text data is participled. Word segmentation is the segmentation of a series of consecutive character strings into individual words according to certain logic. In some embodiments, word segmentation may be performed using a maximum matching method, an inverse maximum matching method, a two-way matching method, a best matching method, an association-backtracking method, and the like. In some embodiments, the user may select an exact word segmentation, or may select a list of all possible words that may appear. After word segmentation, each text is a corpus of text consisting of words (words) separated by spaces.
In step 104, a sentence vector corresponding to each short sentence in the segmented text data is obtained based on the feature dictionary. In some embodiments, the feature dictionary is a two-dimensional matrix composed of feature words and feature values. Feature words are words in the corpus that are highly likely to indicate that the target corpus can be classified as a demand. The feature value is a mathematical expression of the likelihood that the feature word marks a demand. The corpus can be a database containing all corpora, such as the 1998 People's Daily corpus, or an existing corpus in a special domain can be used. The sentence vector is a matrix corresponding to each short sentence in the text data, composed of the lookup results in the corpus and the feature dictionary for the words in the short sentence.
In some embodiments, for each word in the segmented phrase, a look-up is made in the corpus and the feature dictionary, respectively. If the word does not appear in the corpus, the search result is 0; if the word appears in the corpus but does not appear in the feature dictionary, the search result is 1; if the word appears in the feature dictionary, the search result is 2; thus, a sentence vector corresponding to the short sentence is formed. Because the phrases for demand analysis have definite characteristics and large granularity (namely, the phrases are either demand phrases or non-demand phrases, and the phrases which cannot be clearly classified are few), practice tests show that the sentence vector forming mode has a good classification effect.
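A minimal sketch of this 0/1/2 encoding is shown below: 0 if a word is absent from the corpus, 1 if it is in the corpus but not in the feature dictionary, and 2 if it is a feature word. The toy corpus, feature dictionary and sentence are assumptions made for illustration.

```python
def sentence_vector(words, corpus, feature_dict):
    vec = []
    for w in words:
        if w in feature_dict:
            vec.append(2)      # word appears in the feature dictionary
        elif w in corpus:
            vec.append(1)      # word appears in the corpus only
        else:
            vec.append(0)      # word appears in neither
    return vec

print(sentence_vector(["希望", "响应", "速度", "快些"],
                      corpus={"希望", "响应", "速度"},
                      feature_dict={"希望"}))   # -> [2, 1, 1, 0]
```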
At step 105, the sentence vector is input to the demand judgment model. In some embodiments, the demand judgment model is configured to output a judgment result according to an input semantic unit vector (e.g., sentence vector). In some embodiments, the determination result may be a value indicating whether the statement belongs to a demand statement.
In step 106, it is determined into which category the short sentence is classified according to the output result of the demand judgment model. In some embodiments, the output result may be a 1 or a 0. When the output result is 1, the short sentence is divided into demand sentences; when the output result is 0, the sentence is judged as a non-demand sentence. In some embodiments, the output result may be: the possibility that the phrase is divided into the demand phrase is, for example, 0.7, and the possibility that the phrase is divided into the non-demand phrase is, for example, 0.3, and therefore, the phrase is finally judged as the demand phrase.
In some embodiments, all sentences that express the requirement are clustered. The detailed steps of the clustering process can be referred to the description of the embodiments below.
In some embodiments, polar emotion analysis is performed on the portion determined to be non-demand statements. The granularity of polar emotion analysis is finer than that of demand analysis, and the accuracy of an SVM cannot reach more than 90% on it, so the polar emotion analysis uses a Convolutional Neural Network (CNN) classifier. In some embodiments, a usability analysis may also be performed on the portions determined to be non-demand statements.
In some embodiments, a demand refers to a desire for something that the interviewee exhibits during the interview process. For example, when the interview object is a smart speaker, the extracted demands can be expressed as: the response speed should be faster; it is suggested to optimize the appearance design; the appearance should be fashionable and elegant; the color should not be flashy; the lines of the box should be smooth; and so on. After clustering, the topics of the interview can be obtained as: the response speed should be faster; it is suggested to optimize the design. In some embodiments, during an interview the user may inadvertently give information that is only partially related or unrelated to the current interview subject; for example, the user may evaluate competitors of the interview subject or reveal other demand information unrelated to the field of the interview subject.
In some embodiments, polar emotion refers to the emotional tendency of the interviewee, which can be divided, for example, into positive, negative and neutral. Specifically, positive emotion expresses the advantages of the product and content that the interviewee likes or is satisfied with; negative emotion expresses the shortcomings and usability problems of the product and content that the interviewee dislikes or is dissatisfied with; neutral emotion expresses a neutral position. For example, when the interview object is a smart speaker, the polar emotions may include: it can bring me some convenience to some extent (positive emotion); in fact I do not do it through it (negative emotion); I feel the first aspect (neutral emotion). Analyzing polar emotions on the non-demand statements rather than the entire interview content can improve the efficiency and accuracy of the polar emotion analysis. In some embodiments, the emotional tendency may be towards the product itself or towards other aspects outside the product. For example, during an interview, the user may inadvertently give information that is only partially related or unrelated to the current interview subject, such as evaluating competitors of the interview subject or revealing emotional-tendency information unrelated to the field of the interview subject.
The construction and training process of the requirement judgment model and the feature dictionary will be described in detail below with reference to fig. 6.
FIG. 6 is a flow diagram illustrating a process of building and/or training a demand judgment model and a feature dictionary according to an embodiment of the present description. In some embodiments, the construction and/or training process is performed manually. In some embodiments, the construction and/or training process is accomplished by a computer program.
At step 201, voice data is acquired. At step 202, the speech data is pre-processed. Steps 201 to 202 are similar to steps 101 to 102 described above. In step 203, feature labeling is performed on the text data using an expert labeling database. For example, the annotated text data may be: x: a statement. Where X may be 0 (representing no demand) or 1 (representing demand). In step 204, the tagged text data is segmented. Step 204 is similar to step 104 described above.
At step 205, the segmented text data is input into a classifier for training the demand judgment model. In some embodiments, the classifier employs a Support Vector Machine (SVM) classifier. The SVM classifier is a classical binary classification model; it classifies well when the features are relatively distinct, and is therefore effective for the relatively coarse-grained demand analysis. The basic model of an SVM classifier is a linear classifier defined over a feature space that maximizes the margin between the two classes. The SVM classifier may also include a kernel function, whose effect is to map low-dimensional data into a higher-dimensional space. By introducing a kernel function, a linearly inseparable problem can be converted into a separable one, which makes the SVM essentially a non-linear classifier that can handle linearly inseparable data.
In some embodiments, the method selects a linear kernel function, e.g., k(x1, x2) = x1^T x2. In some embodiments, other kernel functions may be selected, such as an aggregation kernel function, a radial basis kernel function, or other non-linear kernel functions, depending on the size of the text data and other factors.
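Purely as an illustrative sketch (not part of the original disclosure), the training step above might look as follows, using scikit-learn as one possible SVM implementation; the sentence vectors, labels, and dimensionality are hypothetical toy data.

```python
# Illustrative sketch only: train a linear-kernel SVM on labeled sentence vectors.
# scikit-learn is one possible implementation; the data below are hypothetical.
from sklearn.svm import SVC

# Each row is a sentence vector built from the feature dictionary; labels follow
# the "X: statement" convention above (1 = demand, 0 = non-demand).
sentence_vectors = [
    [2, 1, 0, 2],
    [1, 1, 1, 0],
    [2, 2, 1, 1],
    [0, 1, 1, 0],
]
labels = [1, 0, 1, 0]

# Linear kernel k(x1, x2) = x1^T x2, matching the kernel chosen in the text.
model = SVC(kernel="linear")
model.fit(sentence_vectors, labels)

# Predict the category (1 = demand, 0 = non-demand) for a new sentence vector.
print(model.predict([[2, 1, 1, 2]]))
```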
In some embodiments, the method finally obtains the demand judgment model and a series of feature words through a series of operations such as Lagrangian dual optimization. The demand judgment model is a set of algorithms executable by a computer; its input is a sentence vector, and its output is the category to which the sentence belongs.
In some embodiments, the demand judgment model may determine whether a statement is a demand based on dependency parsing. In some embodiments, the dependency parsing may include one or more rules. For example, a statement satisfying one or more of the rules may be determined to be a demand statement with an output of 1; otherwise, the output is 0. In some embodiments, each rule may be given a certain weight, and all the rules are combined to compute the probability, or a reference value, that the statement is a demand statement. In some embodiments, by applying the rules, the object of each viewpoint keyword can further be determined to obtain a demand object value list, the number of emotional-tendency degree adverbs can be counted to obtain a demand degree value list, and finally the two lists can be combined to generate an improved demand list.
In some embodiments, feature words are extracted and the demand judgment model is constructed by identifying the dependencies of the demand statements. A feature word is the object to which an expressed viewpoint refers, and is generally a noun, a verbal noun, or a verb; a viewpoint word is the word expressing the viewpoint, and is typically an adjective, adverb, or verb. In some embodiments, the dependencies between words may include subject-verb (SBV), verb-object (VOB), verb-complement (CMP), head (HED), or coordinate (COO) relationships. Furthermore, words may also carry modifiers. The relationship between a core word and its modifiers may include attributive (ATT) or adverbial (ADV).
For example, when a demand phrase satisfies an SBV, CMP, or ATT relationship, the noun (or verb) in the phrase is a feature word and the corresponding adjective is a viewpoint word. For example, in the demand statement "workflow to go", the dependency of the short sentence is subject-verb, so "workflow" is the feature word and "go" is the viewpoint word. In the demand statement "want the management part of the document to be better used", the dependency relationship is verb-complement, so "the management part of the document" is the feature word and "better used" is the viewpoint word. For example, when two adjacent words in a short sentence satisfy the ADV relationship, the two words are respectively a modifier and a viewpoint word: in the phrase "better-used", the two words satisfy the ADV relationship, so "better" is identified as the modifier and the remaining word as the viewpoint word. For example, when two adjacent nouns (or a verb plus a noun) in a short sentence satisfy the ATT relationship, the two words constitute a noun phrase consisting of a modifier and a feature word; for example, in the noun phrase "management part of a document", "management part" is the feature word and "document" is the modifier. In some embodiments, the more often a feature word or keyword is repeated, the higher the attention it receives, and when the emotional expression is derogatory, the higher the product demand.
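As an illustrative sketch only, the dependency-based extraction rules above might be expressed as follows; the (head, relation, dependent) triples are assumed to come from an external dependency parser, and the relation labels and example data are hypothetical. Other relations such as CMP could be handled analogously.

```python
# Illustrative sketch only: extract (feature word, viewpoint word) pairs from
# dependency triples according to the SBV / ATT / ADV rules described above.
def extract_pairs(triples):
    """triples: list of (head_word, relation, dependent_word)."""
    pairs = []
    for head, rel, dep in triples:
        if rel == "SBV":
            # subject-verb: the subject noun is the feature word,
            # the predicate is the viewpoint word
            pairs.append({"feature": dep, "viewpoint": head})
        elif rel == "ATT":
            # attributive: the dependent modifies the head noun
            pairs.append({"modifier": dep, "feature": head})
        elif rel == "ADV":
            # adverbial: the dependent modifies the viewpoint word
            pairs.append({"modifier": dep, "viewpoint": head})
    return pairs

triples = [
    ("go", "SBV", "workflow"),               # "workflow (should) go"
    ("management part", "ATT", "document"),  # "management part of the document"
    ("used", "ADV", "better"),               # "better used"
]
print(extract_pairs(triples))
```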
In some embodiments, feature words are extracted and the demand model is constructed by identifying words in the demand clause that directly express the requirements of the user. Specifically, words (verbs) indicating "increase" or "decrease" in the demand sentence and feature words (nouns) indicating the subject are identified. Words indicating "increase" include: grow, supplement, expand, fill, promote, add, strengthen, increase, and the like. Words indicating "decrease" include: cut, omit, weaken, delete, reduce, shrink, lighten, eliminate, and the like. For example, for the demand phrase "increase some resolution bars", "increase" and "resolution" can be identified, where the verb is "increase" and the feature word is "resolution". For the demand phrase "delete some unnecessary flows", "delete" and "flow" can be identified, where the verb is "delete" and the feature word is "flow".
In some embodiments, feature words are extracted and the demand model is constructed by identifying words in the demand clause that indirectly express the requirements of the user. This embodiment judges the demand through the user's repeated emphasis, for example by extracting feature words through recognizing degree adverbs, frequency adverbs, or punctuation marks. In some embodiments, feature words may be extracted by identifying the structure of a word (noun), plus a word (adverb) representing emphasis repeated n times (n is a positive integer), plus a viewpoint word. For example, words that indicate emphasis include: fixed, frequent, ten-thousandths, tens of millions, tenths, too many, quick, stiff, very, general, regular, extra, more, perhaps, just, bright, slight, tasteful, temporary, far, intentional, residential, none, forever, mandatory, and the like. For example, in the short sentence "the interface design is too ... too ugly!", the emphasis word "too" is recognized with a repetition count n of 5, so the feature word is identified as "interface design" and the viewpoint word as "ugly". In some embodiments, feature words may also be extracted by identifying the structure of a word (noun), plus a viewpoint word (verb or adjective), plus a punctuation mark repeated n times (n is a positive integer). The punctuation mark may include "!", ".", "…", "?", and the like. For example, in the short sentence "the dark green is really unsightly......", the repeated punctuation mark "." is recognized with a repetition count n of 5, so the feature word is identified as "dark green" and the viewpoint word as "unsightly".
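The repetition-based rules above might be sketched as follows; this is an assumption-laden illustration, with a hypothetical token and part-of-speech representation, emphasis word list, and repetition threshold.

```python
# Illustrative sketch only: detect repeated emphasis adverbs or repeated
# punctuation in a tagged short sentence and treat the most recent noun as
# the candidate feature word. All lists and tags are hypothetical.
EMPHASIS_WORDS = {"too", "very", "really", "extra"}
REPEAT_PUNCT = {"!", ".", "?", "…"}

def extract_by_repetition(tagged_tokens, min_repeats=3):
    """tagged_tokens: list of (token, pos) pairs; pos 'n' marks nouns, 'w' punctuation."""
    results = []
    feature = None
    for i, (tok, pos) in enumerate(tagged_tokens):
        if pos == "n":
            feature = tok                      # most recent noun = candidate feature word
            continue
        if i > 0 and tagged_tokens[i - 1][0] == tok:
            continue                           # only examine the start of a repeated run
        run = 1
        while i + run < len(tagged_tokens) and tagged_tokens[i + run][0] == tok:
            run += 1
        if (tok in EMPHASIS_WORDS or tok in REPEAT_PUNCT) and run >= min_repeats and feature:
            results.append({"feature": feature, "repeated": tok, "count": run})
    return results

tagged = [("interface design", "n"), ("too", "d"), ("too", "d"), ("too", "d"),
          ("ugly", "a"), ("!", "w"), ("!", "w"), ("!", "w")]
print(extract_by_repetition(tagged))
# -> one hit for the repeated "too" and one for the repeated "!"
```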
Through the above processing, a series of feature words and viewpoint words and the demand judgment model can finally be obtained. In some embodiments, the frequency of the feature words may also be counted. The more often a feature word is repeated, the higher the attention it receives; and when the emotional polarity is derogatory, the higher the demand.
In some embodiments, the frequency of the viewpoint words may also be counted and a viewpoint value assigned. The viewpoint value represents the emotional polarity of the viewpoint, and its value lies in the range [-1, 1], where negative numbers represent negative emotion, positive numbers represent positive emotion, and the larger the absolute value, the more pronounced the emotional polarity.
In step 207, the method builds a feature dictionary from the series of feature words. In some embodiments, a chi-squared test is used to construct the feature dictionary. The chi-squared test is a commonly used hypothesis test based on the χ² distribution; its null hypothesis H0 is that the observed frequencies do not differ from the expected frequencies. The specific process is as follows: first, assuming that H0 holds, a χ² value is calculated on this premise, which represents the degree of deviation between the observed and theoretical values. From the χ² distribution and the degrees of freedom, the probability P of obtaining the current statistic or a more extreme one when H0 is true can be determined. If the P value is small, the deviation between the observed and theoretical values is too large, and the null hypothesis should be rejected, i.e., the two compared quantities differ significantly and are not independent; otherwise, the null hypothesis cannot be rejected, i.e., the two compared quantities cannot be considered significantly different. The chi-squared test is widely used for feature selection in natural language processing.
In some embodiments, the degree of influence of these feature words on the demand judgment, that is, the probability that a feature word indicates a demand, is calculated; this probability is referred to as the feature value. If the feature value of a feature word is below a threshold, the feature word is discarded; otherwise, the feature word is retained. The feature values are then sorted, and the feature words corresponding to the highest-ranked feature values are added to the feature dictionary.
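A minimal sketch of the chi-squared feature-dictionary construction described above, using scikit-learn's chi2 as one possible implementation; the toy corpus, labels, and threshold are hypothetical.

```python
# Illustrative sketch only: rank candidate feature words by a chi-squared score
# against the demand / non-demand labels and keep the top-scoring words.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

documents = [
    "increase the resolution",          # demand
    "delete some unnecessary flows",    # demand
    "the weather is nice today",        # non-demand
    "we met in the afternoon",          # non-demand
]
labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(documents)
scores, _p_values = chi2(counts, labels)

# Keep the highest-ranked words whose score is not below a chosen threshold.
threshold = 0.5
ranked = sorted(zip(vectorizer.get_feature_names_out(), scores),
                key=lambda item: item[1], reverse=True)
feature_dictionary = {word: score for word, score in ranked if score >= threshold}
print(feature_dictionary)
```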
FIG. 7 is a diagram illustrating the components of an intelligent speech material demand extraction system according to an embodiment of the present disclosure. Referring to fig. 7, the system includes a recording module 301, a speech recognition module 302, a corpus preprocessing module 303, a word segmentation module 304, and a requirement judging module 305. The recording module 301 is used to obtain voice data. The speech recognition module 302 is configured to preprocess the voice data to obtain text data; the corpus preprocessing module 303 is configured to perform word segmentation processing on each sentence in the text data; the word segmentation module 304 is used to compare the segmented text data with the feature dictionary to obtain a sentence vector corresponding to the text data; the requirement judging module 305 is configured to judge, from the input sentence vector, whether the short sentence corresponding to the sentence vector is a demand sentence or a non-demand sentence. For the function and implementation of each module in the device, reference may be made to the implementation of the corresponding step in the previous method embodiment. Details are omitted here for simplicity.
The following describes in detail the usability judgment process in the intelligent voice data according to the embodiment of the present application with reference to fig. 8 to 10.
FIG. 8 shows a flow chart of a process for usability determination on intelligent voice data according to an embodiment of the application.
In step 501, voice data is acquired. Specifically, in step 501, a voice signal is obtained through a microphone or a recording device such as a microphone array, a recorder, a recording pen, and the like.
In step 502, the acquired voice data is preprocessed to obtain text data for analysis. The preprocessing is a process of loading the voice data into memory and adding or deleting parts of the content as needed. The preprocessing comprises the following: recognition, i.e., recognizing the voice data as character data to form text data; splitting, i.e., splitting each long sentence in the text data (delimited by periods) into short sentences according to the punctuation marks within it that indicate pauses; and purification, i.e., removing from the text data invalid content of the original voice data that is irrelevant to the interview. In some embodiments, the preprocessing may include extraneous symbol filtering and non-core component filtering.
In step 503, a word segmentation process is performed on each short sentence in the preprocessed text data. In some embodiments, the part of speech of each word may be tagged.
In step 504, a sentence vector corresponding to each short sentence in the segmented text data is obtained based on the feature dictionary. In some embodiments, the feature dictionary is a two-dimensional matrix composed of feature words and feature values. The feature words are words in the corpus with a high probability of indicating that the target corpus can be classified as a usability statement. The feature value is a mathematical expression of the likelihood that the feature word indicates usability.
In some embodiments, for each word in the segmented phrase, a look-up is made in the corpus and the feature dictionary, respectively. If the word does not appear in the corpus, the search result is 0; if the word appears in the corpus but does not appear in the feature dictionary, the search result is 1; if the word appears in the feature dictionary, the search result is 2; thus, a sentence vector corresponding to the short sentence is formed. In some embodiments, the availability judgment model may include a plurality of rules, and for a statement satisfying one or more rules, the statement may be judged as an availability statement, and the output result is 1; otherwise, it can be judged as a non-availability statement, and the output result is 0.
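A minimal sketch of the 0/1/2 lookup scheme described above; the corpus, feature dictionary, and segmented phrases are hypothetical placeholders.

```python
# Illustrative sketch only: build the 0/1/2 sentence vector by looking each
# segmented word up in the corpus and the feature dictionary.
corpus = {"page", "load", "slow", "interface", "design"}
feature_dictionary = {"load", "slow"}

def sentence_vector(words):
    vector = []
    for word in words:
        if word in feature_dictionary:
            vector.append(2)   # word appears in the feature dictionary
        elif word in corpus:
            vector.append(1)   # word appears in the corpus only
        else:
            vector.append(0)   # word not seen in the corpus
    return vector

print(sentence_vector(["page", "load", "slow"]))    # -> [1, 2, 2]
print(sentence_vector(["page", "color", "ugly"]))   # -> [1, 0, 0]
```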
At step 505, a sentence vector is input to the usability judgment model. In some embodiments, the usability judgment model is configured to output a judgment result according to an input semantic unit vector (e.g., sentence vector). In some embodiments, the determination result may be a value indicating whether the statement belongs to an availability statement.
In step 506, the category into which the short sentence is classified is determined according to the output of the usability judgment model. In some embodiments, the output may be 1 or 0. When the output is 1, the short sentence is classified as a usability statement; when the output is 0, the sentence is judged to be a non-usability statement. In some embodiments, the output may be a probability: for example, the probability that the short sentence is a usability statement may be 0.7 and the probability that it is a non-usability statement may be 0.3, in which case the short sentence is finally judged to be a usability statement.
In some embodiments, all sentences that express usability are clustered. The detailed steps of the clustering process can be referred to in the description of the embodiments below.
The construction and/or training process of the usability judgment model and the feature dictionary will be described in detail below in conjunction with fig. 9.
FIG. 9 illustrates a flow diagram of a process of building and/or training a usability judgment model and a feature dictionary according to an embodiment of the present description. In some embodiments, the construction and/or training process is performed manually. In some embodiments, the construction and/or training process is accomplished by a computer program.
In step 601, voice data is obtained. At step 602, the speech data is pre-processed. Steps 601 to 602 are similar to steps 501 to 502 described above. In step 603, feature labeling is performed on the text data using an expert labeling database. For example, the annotated text data may be: x: a statement. Where X may be 0 (representing a non-availability statement) or 1 (representing an availability statement). In step 604, the tagged text data is segmented. Step 604 is similar to step 504 described above.
At step 605, the segmented text data is input into a classifier for training the usability judgment model. In some embodiments, the classifier employs a Support Vector Machine (SVM) classifier. The SVM classifier is a classical binary classification model; it classifies well when the features are relatively distinct, and is therefore effective for the relatively coarse-grained usability analysis. The basic model of an SVM classifier is a linear classifier defined over a feature space that maximizes the margin between the two classes. The SVM classifier may also include a kernel function, whose effect is to map low-dimensional data into a higher-dimensional space. By introducing a kernel function, a linearly inseparable problem can be converted into a separable one, which makes the SVM essentially a non-linear classifier that can handle linearly inseparable data.
In some embodiments, the method selects a linear kernel function, e.g., k(x1, x2) = x1^T x2. In some embodiments, other kernel functions may be selected, such as an aggregation kernel function, a radial basis kernel function, or other non-linear kernel functions, depending on the size of the text data and other factors.
In some embodiments, the method finally obtains the usability judgment model and a series of feature words through a series of operations such as Lagrangian dual optimization. The usability judgment model is a set of algorithms executable by a computer; its input is a sentence vector, and its output is the category to which the sentence belongs.
In some embodiments, the usability judgment model is constructed by identifying the dependencies of a statement, i.e., performing dependency parsing (Dependency Parsing) on the statement. Specifically, the components of a sentence may be divided into subjects, predicates, objects, attributes, adverbials, complements, and the like. The relationships among the components mainly include subject-verb (SBV), verb-object (VOB), attributive (ATT), adverbial (ADV), verb-complement (CMP), and coordinate (COO) relationships. Dependency parsing (DP) reveals the syntactic structure of a language unit by analyzing the dependency relationships between the components within it, i.e., recognizing the grammatical components in a sentence and analyzing the relationships between those components. Specifically, dependency parsing identifies grammatical components such as the subject, predicate, object, attribute, adverbial, and complement in a sentence, and analyzes the relationships between them.
The most critical step of usability statement analysis is how to express the opinion of the evaluation holder in a structured way; <evaluation object, evaluation phrase> can be regarded as one evaluation unit. The evaluation object can be a noun phrase, a verb phrase, or a simple clause, and is mainly located at the subject position, the object position, or the verb position of a complement structure. The evaluation phrase is mainly located at the predicate position, the verb of a verb-object structure, or the complement position. An evaluation phrase is a group of consecutively appearing words, and can be a combination of degree adverbs, negation adverbs, and evaluation words, or a noun phrase, adjective phrase, verb phrase, or simple clause formed by combining these. The usability evaluation unit can be extracted as long as the subject-verb, verb-object, and verb-complement pairs in the sentence are recalled by applying the corresponding rules.
In some embodiments, in the case where an SBV is present in a sentence, the part of speech of the modifier in the dependency pair is a noun, an abbreviation, or a foreign word, and the part of speech of the core word is a verb: if only the SBV is present in the sentence, the evaluation object and the evaluation phrase are at the subject and predicate positions, respectively, i.e., <modifier of the SBV, core word of the SBV> is extracted as the usability evaluation unit, for example, <stability, improvement>. If both an SBV and a VOB exist in the sentence, where the core word of the SBV is the core word of the VOB, then <modifier of the SBV, core word of the VOB, modifier of the VOB> is extracted as the usability evaluation unit, for example, <evaluation box, no feature>. If both an SBV and a CMP exist in the sentence, where the core word of the SBV is the core word of the CMP, then <modifier of the SBV, core word of the CMP, modifier of the CMP> is extracted, for example, <page, load, slow>.
In some embodiments, in the case where an SBV is present in a sentence, the part of speech of the modifier in the dependency pair is a noun, an abbreviation, or a foreign word, and the part of speech of the core word is an adjective, a noun-modifying word, or an idiom: if only the SBV is present in the sentence, the evaluation object and the evaluation phrase are at the subject and predicate positions, respectively, i.e., <modifier of the SBV, core word of the SBV> is extracted, for example, <interface, good-looking>. If both an SBV and a COO exist in the sentence, where the core word of the SBV is the core word of the COO and the part of speech of the modifier in the COO pair is an adjective, a noun-modifying word, or an idiom, then <modifier of the SBV, core word of the SBV, modifier of the COO> is extracted, for example, <page turning, slow>. If only a VOB exists in the sentence, and the part of speech of the modifier in the pair is a noun, an abbreviation, or a foreign word and the part of speech of the core word is a verb, then <modifier of the VOB, core word of the VOB> is extracted, for example, <evaluation box, none>.
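As an illustrative sketch only, the assembly of usability evaluation units from dependency pairs might look as follows; the dependency-pair representation and the example data are hypothetical, and only the SBV/VOB/CMP cases are shown.

```python
# Illustrative sketch only: assemble <evaluation object, evaluation phrase>
# usability evaluation units from dependency pairs per the rules above.
# The pairs are assumed to come from an external dependency parser.
def extract_evaluation_units(pairs):
    """pairs: list of dicts with keys 'rel', 'core', 'modifier', 'mod_pos'."""
    units = []
    sbv_pairs = [p for p in pairs if p["rel"] == "SBV" and p["mod_pos"] == "noun"]
    for sbv in sbv_pairs:
        # look for a VOB or CMP pair whose core word is the SBV core word
        extension = next((p for p in pairs
                          if p["rel"] in ("VOB", "CMP") and p["core"] == sbv["core"]), None)
        if extension:
            # e.g. <page, load, slow>
            units.append((sbv["modifier"], sbv["core"], extension["modifier"]))
        else:
            # only SBV present: e.g. <stability, improvement>
            units.append((sbv["modifier"], sbv["core"]))
    return units

pairs = [
    {"rel": "SBV", "core": "load", "modifier": "page", "mod_pos": "noun"},
    {"rel": "CMP", "core": "load", "modifier": "slow", "mod_pos": "adj"},
]
print(extract_evaluation_units(pairs))   # -> [('page', 'load', 'slow')]
```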
Through the above processing, a series of evaluation objects and evaluation phrases, usability evaluation units, and usability judgment models can be finally obtained.
In step 607, the method builds a feature dictionary from the series of feature words. In some embodiments, a chi-squared test is used to construct the feature dictionary. The chi-squared test is a commonly used hypothesis test based on the χ² distribution; its null hypothesis H0 is that the observed frequencies do not differ from the expected frequencies. The specific process is as follows: first, assuming that H0 holds, a χ² value is calculated on this premise, which represents the degree of deviation between the observed and theoretical values. From the χ² distribution and the degrees of freedom, the probability P of obtaining the current statistic or a more extreme one when H0 is true can be determined. If the P value is small, the deviation between the observed and theoretical values is too large, and the null hypothesis should be rejected, i.e., the two compared quantities differ significantly and are not independent; otherwise, the null hypothesis cannot be rejected, i.e., the two compared quantities cannot be considered significantly different. The chi-squared test is widely used for feature selection in natural language processing.
In some embodiments, the degree of influence of these feature words on the usability judgment, that is, the probability that a feature word indicates usability, is calculated; this probability is referred to as the feature value. If the feature value of a feature word is below a threshold, the feature word is discarded; otherwise, the feature word is retained. The feature values are then sorted, and the feature words corresponding to the highest-ranked feature values are added to the feature dictionary.
FIG. 10 is a diagram illustrating the components of an intelligent speech material availability extraction system, according to an embodiment of the present disclosure. Referring to fig. 10, the system includes a recording module 701, a speech recognition module 702, a corpus preprocessing module 703, a word segmentation module 704, and an availability determination module 705. The recording module 701 is configured to obtain voice data. The speech recognition module 702 is configured to preprocess the voice data to obtain text data; the corpus preprocessing module 703 is configured to perform word segmentation processing on each sentence in the text data; the word segmentation module 704 is used to compare the segmented text data with the feature dictionary to obtain a sentence vector corresponding to the text data; the availability judging module 705 is configured to judge, from the input sentence vector, whether the short sentence corresponding to the sentence vector is an availability sentence or a non-availability sentence. For the function and implementation of each module in the device, reference may be made to the implementation of the corresponding step in the previous method embodiment. Details are omitted here for simplicity.
The foregoing describes analyzing the interview corpus to obtain demand information, usability analysis, and further emotion analysis; the following further describes analyzing the interview corpus to obtain emotion information through a neural network model.
In some embodiments, for the extraction of emotion information, the interview corpus can be input into a second neural network model to extract the sentences of the interviewee that reflect polar emotion and the sentences with neutral emotion, and the polar-emotion sentences can then be input into a third neural network model to obtain the positive-emotion sentences and negative-emotion sentences among them. Similarly, two TextCNN classifiers connected in series can be adopted, where one serves as an emotion extraction model to perform the binary classification between polar-emotion and neutral-emotion sentences, and the other serves as an emotion classification model to further classify the polar-emotion sentences.
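A minimal sketch of chaining the two classifiers in series as described above; the models are stand-ins exposed as callables rather than real TextCNN networks, and the numeric labels follow the "2" / "1" / "0" labelling used in the examples later in this description.

```python
# Illustrative sketch only: two classifiers applied in series. `extract_fn`
# separates polar from neutral sentences (label 2 = neutral), `classify_fn`
# splits polar sentences into positive (1) and negative (0). `vectorize` is
# a placeholder for the sentence-vector computation.
def analyse_emotion(sentences, vectorize, extract_fn, classify_fn):
    neutral, positive, negative = [], [], []
    for sentence in sentences:
        vec = vectorize(sentence)
        if extract_fn(vec) == 2:
            neutral.append(sentence)
        elif classify_fn(vec) == 1:
            positive.append(sentence)
        else:
            negative.append(sentence)
    return {"neutral": neutral, "positive": positive, "negative": negative}

# Toy usage with stand-in functions; a real system would plug in the two
# trained TextCNN classifiers and the real sentence-vector computation here.
result = analyse_emotion(
    ["it saves me a lot of time", "I feel it is average"],
    vectorize=lambda s: s,
    extract_fn=lambda v: 2 if "average" in v else 1,
    classify_fn=lambda v: 1 if "saves" in v else 0,
)
print(result)
```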
The analysis process for extracting emotion information in interview corpus is further described in detail with reference to fig. 11.
FIG. 11 illustrates a flow diagram of a method of speech content analysis, shown in accordance with some embodiments of the present application. The process 800 may be implemented as a set of instructions in a non-transitory storage medium in a voice content analysis apparatus. The voice content analysis device may execute the set of instructions and may perform the steps in flow 800 accordingly.
The operations of the illustrated flow 800 presented below are intended to be illustrative and not limiting. In some embodiments, flow 800 may be implemented with one or more additional operations not described, and/or without one or more of the operations described herein. Further, the order of the operations shown in FIG. 11 and described below is not intended to be limiting.
In 810, the voice content analysis device may obtain voice data.
The voice data may be a recording or a video. In some embodiments, the voice data may be a recording or video of an interview. For example, the voice data may be a recording of a merchant's interview with a consumer. The interview recordings may include interviewer recordings and interviewee recordings.
In 820, the voice content analysis means may acquire corresponding text data based on the voice data.
Specifically, the speech content analysis device may perform speech recognition on the speech data, convert the speech data into an original text, and then convert the original text into text data satisfying the data format requirements of the emotion analysis model (in step 830).
In some embodiments, the voice content analysis means may acquire only text data corresponding to a part of the voice data. For example, for an interview recording, the voice content analysis means may only obtain text data corresponding to the interviewee recording. Further, the voice content analysis device can analyze the emotion of the interviewee (for example, product user) more accurately.
In some embodiments, the text data is a sentence vector. The sentence vector may be a one-dimensional or multidimensional vector, and the speech content analysis apparatus may acquire the sentence vector by the following steps.
Step one, the voice content analysis device can obtain an original text, namely a voice recognition result of the voice data.
And secondly, for each complete sentence in the original text, the voice content analysis device can divide the complete sentence into sentences to obtain at least one short sentence.
In some cases, the speech content analysis means may divide a complete sentence by punctuation marks in the complete sentence, such as comma, pause, colon, semicolon. As an example, the speech content analysis device may divide a complete sentence "i like the size and color of the mobile phone well, but the setting of the volume control key of the mobile phone is unreasonable, i feel that the setting of the volume control key of the mobile phone on the right side is convenient for the user to operate", into three short sentences. The three phrases are divided into sentences according to commas in the complete sentence, and the sentences are respectively 'I like the size and the color of the mobile phone', 'but the setting of the volume control key of the mobile phone is unreasonable' and 'I feel that the setting of the volume control key of the mobile phone on the right side is convenient for the user to operate'. It should be understood that, by segmenting a complete long sentence into a plurality of short sentences, the complexity of the sentence is reduced, the sentence analysis is facilitated, and the accuracy of the sentence analysis can be increased.
Step three, the speech content analysis device may determine a sentence vector of the at least one short sentence.
Specifically, for each short sentence in the at least one short sentence, the speech content analysis device may determine the word vectors of the short sentence based on a word2vec model; the sentence vector is then determined based on the word vectors of the short sentence. The word2vec model can be trained by the user, or can be an off-the-shelf word2vec model provided by an existing toolkit.
In some cases, the process by which the speech content analysis device determines the word vectors based on the word2vec model may include: (1) word segmentation / stemming and lemmatization — for Chinese corpora, word segmentation is required; for English corpora, word segmentation is not required, but because English involves various tenses, stemming and lemmatization are required; (2) building a dictionary and counting word frequencies — in this step, all texts are traversed once, all words that appear are found, and the frequency of each word is counted; (3) building a tree structure, for example a Huffman tree, according to the occurrence probability of each word, such that all classes are at leaf nodes; (4) generating the binary code of each node, where the code reflects the node's position in the tree, so that the corresponding leaf node can be found step by step from the root node according to the code; (5) initializing the intermediate vectors of the non-leaf nodes and the word vectors in the leaf nodes — each node in the Huffman tree stores a vector of length m, but the vectors in leaf nodes and non-leaf nodes have different meanings: the word vector of each word is stored in a leaf node and serves as the input of the neural network, while the intermediate vector stored in a non-leaf node corresponds to the parameters of a hidden layer of the neural network and, together with the input, determines the classification result; (6) training the intermediate vectors and word vectors — during training, the model assigns to these abstract intermediate nodes a suitable vector that represents all of their child nodes; for the CBOW model, the word vectors of several words around the central word are first summed as the input of the system, then classification proceeds step by step according to the binary code of the central word generated in the previous step, and the intermediate vectors and word vectors are trained according to the classification results.
In some cases, the speech content analysis device may take the mean of the word vectors of the short sentence as the sentence vector of the short sentence. Alternatively, the speech content analysis device may concatenate all the word vectors of the short sentence as the sentence vector.
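A minimal sketch of computing word vectors with word2vec and averaging them into a sentence vector, as described above; gensim is used as one possible toolkit (parameter names follow recent gensim releases), and the toy corpus is hypothetical.

```python
# Illustrative sketch only: train a small word2vec model on segmented short
# sentences and use the mean of the word vectors as the sentence vector.
import numpy as np
from gensim.models import Word2Vec

segmented_sentences = [
    ["i", "like", "the", "size", "of", "the", "phone"],
    ["the", "volume", "key", "is", "hard", "to", "reach"],
]

w2v = Word2Vec(segmented_sentences, vector_size=50, window=3, min_count=1)

def sentence_vector(words):
    # Mean of the word vectors of the short sentence; zero vector if no word is known.
    vectors = [w2v.wv[w] for w in words if w in w2v.wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(w2v.wv.vector_size)

print(sentence_vector(["i", "like", "the", "phone"]).shape)   # (50,)
```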
In some embodiments, the process by which the speech content analysis device obtains the sentence vector may further include preprocessing the original text. The preprocessing includes analyzing the vocabulary of the original text and deleting unnecessary words. As an example, the speech content analysis device may delete at least one of tone words, stop words, and garbled characters in the original text.
Tone words are function words that express tone or mood (for example, modal particles). A stop word is a word that is automatically ignored during information processing and can be filtered out according to the purpose of the processing. For example, for a product interview, stop words may be words in the keyword extraction results that do not match the actual demand. Garbled characters are the parts of the speech recognition output that cannot be recognized. The speech content analysis device may delete the tone words and stop words in the original text based on a pre-built tone word list and stop word list.
at 830, the speech content analysis device may input the text data into a trained emotion analysis model that includes a trained emotion extraction model and a trained emotion classification model.
The trained emotion extraction model can extract polar emotion text data, and the trained emotion classification model can classify the polar emotion text data. The trained emotion extraction model and the trained emotion classification model are obtained by initial model training, and the specific training process is as follows.
For the trained emotion extraction model, the speech content analysis device can obtain the model by the following steps:
Step one, obtaining marked training data. The marked training data comprises marked neutral emotion text data and marked non-neutral emotion text data.
Neutral emotion text data herein means that the emotion expressed by the text data is neutral, for example, "i feel" the first aspect "," general ". The non-neutral emotion text data, also called as polar emotion data, comprises positive emotion text data and negative emotion text data, and means that the emotion expressed by the text data is stronger than that expressed by neutral emotion. For example, the forward emotion text data may include "like", "it may bring some convenience to me to some extent", "this design saves a lot of time". For another example, negative emotion text data may include "actually i will not do it with him", "this color is uncomfortable", and "nobody will choose this way". Of course, other classification criteria can be used to classify emotions, and the classification and emotion analysis method adapted to the classification still fall within the scope of the present application.
In some cases, the labeled training data may be labeled by an expert or by a user. Training data are labeled by experts, and the accuracy of labeling results is high; the training set is marked by the user, so that the marking result is more personalized and is suitable for personal requirements.
in some cases, the labeled training set is domain-specific text data. It should be understood that the emotion classification model trained from the domain-specific text data can be used specifically for emotion analysis of the domain-specific speech data.
And step two, inputting the marked training data into an initial emotion extraction model. The initial emotion extraction model is an initial neural network model, such as TextCNN. The initial emotion extraction model contains a plurality of features and a plurality of initial parameters.
A feature vocabulary of the emotion extraction model can be created based on a plurality of features of the emotion extraction model. The feature vocabulary contains a plurality of words representing polar emotions (positive and negative emotions), such as "like", "love", "dislike", and "dislike".
And step three, when the initial emotion extraction model reaches a convergence condition after being trained, determining the trained emotion extraction model.
In the training process, the emotion extraction model judges whether the output result is good or bad according to the marked training data, further continuously adjusts the initial parameters and continuously optimizes the result until the trained emotion extraction model reaches the convergence condition. The convergence condition may be that the loss function is smaller than a first threshold or that the training period is larger than a second threshold, and the first threshold and the second threshold may be set manually empirically.
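The convergence condition above might be sketched as a generic training loop as follows; the per-epoch training step is a placeholder, not a real TextCNN update, and the thresholds are hypothetical.

```python
# Illustrative sketch only: stop training when the loss falls below the first
# threshold or the training period exceeds the second threshold.
LOSS_THRESHOLD = 0.05    # "first threshold"
MAX_EPOCHS = 100         # "second threshold"

def train_until_convergence(train_one_epoch):
    epoch, loss = 0, float("inf")
    while loss >= LOSS_THRESHOLD and epoch < MAX_EPOCHS:
        loss = train_one_epoch()
        epoch += 1
    return epoch, loss

# Toy stand-in whose loss simply decays each epoch.
state = {"loss": 1.0}
def fake_epoch():
    state["loss"] *= 0.8
    return state["loss"]

print(train_until_convergence(fake_epoch))
```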
For the trained emotion classification model, the speech content analysis device can acquire the emotion classification model by the following steps:
Step one, obtaining marked training data. The marked training data comprises marked positive emotion text data and marked negative emotion text data.
In some cases, the labeled training data may be labeled by an expert or by a user. Training data are labeled by experts, and the accuracy of labeling results is high; the training set is marked by the user, so that the marking result is more personalized and is suitable for personal requirements.
in some cases, the labeled training set is domain-specific text data. It should be understood that the emotion classification model trained from the domain-specific text data can be used specifically for emotion analysis of the domain-specific speech data.
And step two, inputting the marked training data into an initial emotion classification model for training. The initial emotion classification model is an initial neural network model, such as TextCNN, and contains a plurality of features and a plurality of initial parameters.
A feature vocabulary of the emotion classification model can be created based on a plurality of features of the emotion classification model. The feature vocabulary contains a plurality of words representing polar emotions (positive and negative emotions), such as "like", "love", "dislike", and "dislike".
And step three, when the initial emotion classification model reaches a convergence condition after being trained, determining the trained emotion classification model.
In the training process, the emotion classification model judges whether the output result is good or bad according to the marked training data, further continuously adjusts the initial parameters and continuously optimizes the result until the trained emotion classification model reaches the convergence condition. The convergence condition may be that the loss function is smaller than a first threshold or that the training period is larger than a second threshold, and the first threshold and the second threshold may be set manually empirically.
At 840, the speech content analysis device may separate the text data into polar emotion text data and neutral emotion text data via the trained emotion extraction model.
Specifically, the speech content analysis device can label the polar emotion text data and the neutral emotion text data differently through the trained emotion extraction model, and then separate them. As an example, the emotion extraction model may label neutral emotion text data as "2" and polar emotion text data with a label other than "2". Exemplary output results of the emotion extraction model are listed below:
"it may bring me some convenience to some extent";
"i did not really do anything through him";
"I feel the first aspect".
In some embodiments, neutral emotion text data determined by the speech content analysis model may be used to analyze the text data (step 820) for relevant user requirements. For example, the voice data is a recording of an interview of a product, the interview is targeted to a user, and the content of the interview is the opinion of the user on the product. At this time, the user's requirement is the user requirement related to the text data. The description of the user's needs is analyzed using neutral emotion text data, and reference may be made to other related descriptions in this application.
At 850, the speech content analysis device may separate the polar emotion text data into positive emotion text data and negative emotion text data through the trained emotion classification model.
Specifically, the speech content analysis device can label the positive emotion text data and the negative emotion text data differently through the trained emotion classification model, and then classify the positive emotion text data and the negative emotion text data. As an example, the emotion classification model may label positive emotion text data as "1" and negative emotion text data as "0". Exemplary output results of the emotion classification model are listed below:
"1 it can bring me some convenience to some extent";
"0 really i am not doing so by him".
At 860, the speech content analysis device may derive emotion analysis results from the positive emotion text data and the negative emotion text data.
In some embodiments, after classifying each short sentence in each sentence in the speech data, the speech content analysis apparatus may determine the emotion analysis result according to a ratio of the positive emotion text data to the negative emotion text data. By way of example, positive emotion text data accounts for 65% of all speech data (i.e., its corresponding text data), negative emotion text data accounts for 10% of all speech data (i.e., its corresponding text data), and neutral emotion text data accounts for 25% of all speech data (i.e., its corresponding text data). Then, the speech content analysis means may derive that the emotional tendency of the speech data is positive.
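A minimal sketch of deriving the overall emotional tendency from the proportions described above; the counts are hypothetical.

```python
# Illustrative sketch only: decide the overall emotional tendency from the
# proportions of positive and negative short sentences (cf. 65% / 10% / 25%).
def overall_tendency(n_positive, n_negative, n_neutral):
    total = n_positive + n_negative + n_neutral
    positive_ratio = n_positive / total
    negative_ratio = n_negative / total
    if positive_ratio > negative_ratio:
        return "positive", positive_ratio
    if negative_ratio > positive_ratio:
        return "negative", negative_ratio
    return "neutral", 1 - positive_ratio - negative_ratio

print(overall_tendency(65, 10, 25))   # -> ('positive', 0.65)
```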
In some embodiments, the speech content analysis device may analyze the specific content of the positive emotion text data and the negative emotion text data, respectively, to determine emotion analysis results. As an example, the voice content analysis device analyzes voice data of a product interview, resulting in positive emotion text data and negative emotion text data. The voice content analysis device can further analyze the positive emotion text data to obtain the advantages of the product, and analyze the negative emotion text data to obtain the disadvantages of the product. The advantages and disadvantages of the product can be used as the result of emotion analysis.
In some embodiments, the voice content analysis method may further include: determining the domain to which the content of the voice data belongs, and determining and calling the trained emotion analysis model (for example, a trained emotion extraction model and a trained emotion classification model) according to the domain to which the content of the voice data belongs.
For example, the voice content analysis means may determine, from the text data, a field to which the content of the corresponding voice data belongs. As an example, the voice content analysis means may extract keywords from the text data, and determine a field, such as a home appliance, sports, to which the content of the voice data corresponding to the text data belongs, according to the keywords.
For another example, the voice content analysis means may receive a user input to determine a domain to which the content of the voice data belongs. The user input includes a domain to which the content of the voice data belongs.
In some embodiments, the preprocessed corpus may be input into the demand judgment model to obtain a demand-class corpus and a non-demand-class corpus; the non-demand-class corpus is then input into the emotion classification model to obtain a polar corpus and a neutral corpus, and finally the polar corpus is further classified into a positive corpus and a negative corpus. In the above embodiment, after obtaining the non-demand-class corpus, the non-demand corpus and its copy may be input into the emotion classification model and the usability classification model, respectively, so as to obtain the polar corpus and the neutral corpus from the emotion classification model and the usability corpus and the non-usability corpus from the usability classification model.
In some embodiments, the preprocessed corpus may be input into the emotion classification model to obtain the polar corpus and the neutral corpus, and then the neutral corpus is input into the requirement judgment model to obtain the requirement-class corpus and the non-requirement-class corpus. In the above embodiment, after obtaining the non-requirement corpus, the non-requirement corpus may be input into the availability classification model to obtain the availability corpus and the non-availability corpus.
In some embodiments, the preprocessed corpus and its copy may be respectively input into the demand judgment model and the emotion classification model to obtain a demand-class corpus and a non-demand-class corpus from the demand judgment model and a polar corpus and a neutral corpus from the emotion classification model, and finally the polar corpus is further classified into a positive corpus and a negative corpus.
In some embodiments, the preprocessed corpus and its copies may be respectively input into the demand judgment model and the usability classification model to obtain a demand-class corpus and a non-demand-class corpus from the demand judgment model and a usability corpus and a non-usability corpus from the usability classification model; the non-demand-class corpus is then input into the emotion classification model to obtain a polar corpus and a neutral corpus, and finally the polar corpus is further classified into a positive corpus and a negative corpus.
In some embodiments, the preprocessed corpus and its copy may be respectively input into the emotion classification model and the usability classification model to obtain the polar corpus and the neutral corpus from the emotion classification model, and the usability classification model to obtain the usability corpus and the non-usability corpus, and then the neutral corpus is input into the requirement judgment model to obtain the requirement-class corpus and the non-requirement-class corpus. In the above embodiment, the non-availability corpus may be input into the demand judging model. In the above embodiment, the non-availability corpus and the neutral corpus may be merged and input into the requirement determining model.
In some embodiments, the preprocessed corpus and the first copy and the second copy thereof may be respectively input into the demand judgment model, the emotion classification model and the availability classification model to obtain a demand-class corpus and a non-demand-class corpus from the demand judgment model, obtain a polar corpus and a neutral corpus from the emotion classification model, and obtain an availability corpus and a non-availability corpus from the availability classification model.
In some embodiments, the polar corpus may include polar emotion information of the user on the product, or polar emotion information of other aspects besides the product, and the usability corpus may also analyze the polar emotion information of the user on the product. In some embodiments, product-independent polar emotion information in the polar corpus may be extracted by merging the availability corpus with the polar corpus. In some embodiments, the positive and negative corpora may be input into the availability classification model separately to filter out the positive and negative availability corpora, respectively, therefrom. In some embodiments, the non-requirement class information is also marked as neutral corpuses.
In some embodiments, the emotion analysis model and/or usability judgment model may be constructed and/or trained by a method of constructing and/or training a demand analysis model. In some embodiments, the emotion analysis model and/or the demand judgment model may be constructed and/or trained by a method of constructing and/or training an availability judgment model. In some embodiments, the usability judgment model and/or the demand judgment model may be constructed and/or trained by a method of constructing and/or training an emotion analysis model.
The sentence clustering method according to the embodiment of the present application will be described in detail with reference to fig. 12 to 17.
FIG. 12 is a flow diagram of a semantic unit clustering method according to one embodiment of the present application. In this embodiment, the method for generating an interview report of the present application may be implemented by an app loaded on the terminal device, and the semantic unit clustering method may be executed by a processor of the terminal device. The method may be stored in a memory and executed by the terminal device upon receiving a trigger instruction to generate an interview report.
As shown in fig. 12, the semantic unit clustering method includes the steps of 2000: a plurality of semantic units is obtained. The semantics can be expressed by the units of all levels of the language and the combination of the units, in other words, the semantics is expressed by the morphemes, words, phrases, sentences and sentence groups of the language. In the present application, a semantic unit may be not only a morpheme, a word, a phrase, a sentence group, but also any object that can be configured to have a specific semantic meaning or make a person associate with the specific semantic meaning, such as a letter, a number, a symbol, an action, and the like, as needed, or may be any combination of one or more of the above. In some embodiments, the semantic units are selected from any form of corpus, such as audio corpus, text corpus, video corpus, corpus expressed in a computer language, and the like. In some embodiments, the semantic elements may be from the audio and/or textual representation of the interview report described previously. In some embodiments, a semantic unit may contain one or more keywords of interest to a user. In some embodiments, the semantic unit may be a morpheme, word, phrase, sentence group, letter, number, symbol, action, etc. containing the user's requirement, in which case, for example, a semantic unit may be a sentence "i want a cell phone", or may be a word "cell phone". In some embodiments, the semantic units may be morphemes, words, phrases, sentences, sentence clusters, letters, numbers, symbols, actions, etc. that contain a sentiment polarity, where the sentiment polarity (such as positive, negative) indicates the user's preference for an object, in which case, for example, one semantic unit may be a sentence "i like touch screen cell phone". In some implementations, the semantic units can be one or more words or sentences that have been subjected to emotion polarity classification at the emotion analysis model. In some implementations, the semantic units can be one or more words or sentences that have been subjected to a requirement classification at the requirement analysis model. In some embodiments, the corpus being clustered may be a corpus set classified as positive emotions via an emotion classification model. In some embodiments, the corpus being clustered may be a corpus set classified as negative-going emotion via an emotion polarity analysis model. In some embodiments, the corpus being clustered may be a corpus that is classified as neutral emotion via an emotion polarity analysis model. In some embodiments, the corpora being clustered may be a corpus that is classified into usability assessments via a usability classification model. In some embodiments, the corpus to be clustered may be a corpus set that is determined by the requirement determining module to be non-usability evaluation. In some embodiments, the corpus to be clustered may be a corpus set determined to be required by the requirement determining module. In some embodiments, the corpus to be clustered may be a corpus set determined to be non-demand by the demand determination module. In some embodiments, the corpora being clustered may be a combination of one or more of the aforementioned corpus sets. In some embodiments, the corpus being clustered may be pre-processed, such as keyword recognition, keyword extraction, non-keyword removal, punctuation recognition, and the like.
As shown in fig. 12, the semantic unit clustering method further includes the step 4000: determining one or more cluster centers based on the plurality of semantic units. The process of dividing a collection of physical or abstract objects into classes composed of similar objects is called clustering. The cluster (or clusters) generated by the clustering operation is a collection of a set of data objects that are similar to objects in the same cluster and different from objects in other clusters. The cluster center is the most important one of the objects in the cluster, which is the most representative of the cluster and the most capable of interpreting the other objects in the cluster. For example, the cluster center sentence expresses the subject or core idea of the interview to some extent. In some embodiments, there is only one cluster center for a cluster. In some embodiments, the cluster center may be one or more semantic units selected from a plurality of semantic units, and each cluster center is used as a reference object when calculating the similarity between the cluster center and other semantic units in the plurality of semantic units, in other words, in the similarity calculation process, the reference object needs to perform similarity calculation once with each of the other semantic units.
In some embodiments, the step 4000 of determining one or more cluster centers based on the plurality of semantic units comprises: determining one or more cluster centers from the plurality of semantic units by the AP algorithm. The AP (Affinity Propagation) method is also known as the affinity propagation algorithm, in which the size of each message reflects the affinity of the current data point for selecting another data point as its cluster center at that point in time. In the AP algorithm, all data points are treated as potential cluster centers, the connection between every two data points forms a network, and each data point is regarded as a network node. The AP algorithm computes the cluster center of each sample through the transmission of messages (namely, the attraction degree and the attribution degree) along each edge of the network, where the attraction degree refers to the degree to which a first data point is suitable to serve as the cluster center of a second data point, and the attribution degree refers to the degree to which the second data point selects the first data point as its cluster center. In other words, the AP algorithm recursively transmits information along the edges of the network until a good set of cluster centers emerges and the corresponding clusters are generated.
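A minimal sketch of running affinity propagation over a precomputed similarity matrix between semantic vectors, so that every semantic unit is a potential cluster center; scikit-learn is used as one possible implementation, and the vectors are hypothetical toy data.

```python
# Illustrative sketch only: affinity propagation over a precomputed similarity
# matrix between semantic vectors.
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

semantic_vectors = np.array([
    [0.9, 0.1, 0.0],   # e.g. "the response speed should be higher"
    [0.8, 0.2, 0.1],   # e.g. "it reacts too slowly"
    [0.1, 0.9, 0.2],   # e.g. "the appearance should be more stylish"
    [0.2, 0.8, 0.1],   # e.g. "please optimize the design"
])

similarity = cosine_similarity(semantic_vectors)
ap = AffinityPropagation(affinity="precomputed", random_state=0)
ap.fit(similarity)

print(ap.labels_)                   # cluster index assigned to each semantic unit
print(ap.cluster_centers_indices_)  # indices of the units chosen as cluster centers
```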
In some embodiments, the step 4000 of determining one or more cluster centers based on the plurality of semantic units comprises: determining each of the plurality of semantic units as a cluster center.
In some embodiments, the step 4000 of determining one or more cluster centers based on the plurality of semantic units comprises: determining one or more cluster centers from the plurality of semantic units based on the similarity between each two of the plurality of semantic units. As mentioned above, clustering refers to dividing objects (such as semantic units with similar semantics) into groups or subsets so that the member objects within the same group or subset share a certain similarity. In some embodiments, similarity refers to the degree to which two different semantic units resemble each other, and it may be expressed as a distance between their respective mathematical representations, such as the Euclidean distance, Manhattan distance, infinity-norm distance, Mahalanobis distance, cosine distance, Hamming distance, and the like. For example, the similarity between two semantic units can be calculated with the HanLP toolkit, a Java toolkit composed of a series of models and algorithms intended to promote the application of natural language processing in production environments.
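As a purely illustrative sketch of the distance-based notion of similarity described above (HanLP itself is a Java toolkit, so plain NumPy is used here instead), cosine similarity between two semantic vectors can be computed as follows; any of the other distances listed could be substituted.

import numpy as np

def cosine_similarity(u, v):
    # cosine similarity = dot product divided by the product of the norms
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([0.0, 0.0, 0.0, 0.0, -0.5, 0.5, 1.0])
v = np.array([0.0, 0.0, 0.0, 0.0, -0.4, 0.6, 0.9])
print(cosine_similarity(u, v))  # close to 1.0 for semantically similar units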
FIG. 13 is a flow diagram of determining one or more cluster centers based on the similarity between each two of the plurality of semantic units according to one embodiment of the present application.
As shown in fig. 13, determining one or more cluster centers based on the similarity between each two of the plurality of semantic units includes step 4200: sequentially selecting each of the plurality of semantic units as a candidate semantic unit.
As shown in fig. 13, determining one or more cluster centers based on the similarity between each two of the plurality of semantic units further includes step 4400: for each candidate semantic unit, calculating the similarity between that candidate semantic unit and each of the remaining semantic units of the plurality of semantic units, respectively, and determining that candidate semantic unit as a cluster center if there is at least one semantic unit among the remaining semantic units whose similarity is higher than a predetermined threshold.
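The following Python sketch, offered only as an illustration and assuming that a vector has already been computed for every semantic unit, walks through steps 4200 and 4400 as just described: each unit is taken in turn as a candidate, and it is kept as a cluster center if at least one other unit exceeds the predetermined similarity threshold.

import numpy as np

def find_cluster_centers(vectors, threshold=0.8):
    """Return the indices of the semantic units determined to be cluster centers."""
    centers = []
    for i, candidate in enumerate(vectors):            # step 4200: each unit in turn
        for j, other in enumerate(vectors):
            if i == j:
                continue
            sim = float(np.dot(candidate, other) /
                        (np.linalg.norm(candidate) * np.linalg.norm(other)))
            if sim > threshold:                        # step 4400: threshold test
                centers.append(i)
                break
    return centers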
FIG. 14 is a flow diagram for separately calculating a similarity between each candidate semantic unit and each of the remaining semantic units of the plurality of semantic units according to one embodiment of the present application.
As shown in fig. 14, calculating the similarity between each candidate semantic unit and each of the remaining semantic units in the plurality of semantic units, respectively, includes step 4420: calculating a candidate semantic vector for each candidate semantic unit. A semantic vector is a vector representation of a semantic unit. In some embodiments, the semantic vector may be a number vector, a symbol vector, a letter vector, a word vector, a sentence vector, a segment vector, and the like. In some embodiments, a word vector may be obtained based on one or more character vector calculations. In some embodiments, a sentence vector may be obtained based on one or more word vector calculations. In some embodiments, a segment vector may be obtained based on one or more sentence vector calculations. In some embodiments, the same semantic vector may be used for the same semantic unit in the emotion analysis model and the demand analysis model. In some embodiments, different semantic vectors may be used for the same semantic unit in the emotion analysis model and the demand analysis model. In some embodiments, the same semantic vector may be used for the same semantic unit in the emotion analysis model and the clustering model. In some embodiments, the same semantic vector may be used for the same semantic unit in the demand analysis model and the clustering model. In some embodiments, the same semantic vector may be used for the same semantic unit in the emotion analysis model, the demand analysis model, and the clustering model. In some embodiments, the semantic vector is randomly specified. In some embodiments, each element of the semantic vector represents a degree of association or weight of the semantic unit with respect to some aspect of interest.
FIG. 15 is a flow diagram of computing a candidate semantic vector for each candidate semantic unit according to one embodiment of the present application.
As shown in fig. 15, calculating the candidate semantic vector for each candidate semantic unit includes step 4441: acquiring a feature semantic unit table, wherein the feature semantic unit table comprises one or more feature semantic units. In some embodiments, the feature semantic units may be letters, numbers, symbols, words, sentences, paragraphs, articles, etc. that express an emotion polarity. In some embodiments, the feature semantic units may also be morphemes, words, phrases, sentences, sentence groups, letters, numbers, symbols, actions, etc. that represent some objective attribute of an object. In some embodiments, the feature semantic units are morphemes, words, phrases, sentences, sentence groups, letters, numbers, symbols, actions, etc. that represent objects desired by the user. In some embodiments, the feature semantic units may be selected from an expert-annotated lexicon, or may be customized according to requirements.
As shown in fig. 15, calculating the candidate semantic vector for each candidate semantic unit includes step 4442: respectively determining the degree of association between each candidate semantic unit and each feature semantic unit. In some embodiments, the degree of association may be the degree to which the semantic unit expresses an emotion polarity with respect to a feature semantic unit. In some embodiments, the degree of association may be the degree of demand that the semantic unit expresses with respect to a feature semantic unit. In some embodiments, the degree of association is proportional to the frequency with which each feature semantic unit occurs in each candidate semantic unit.
As shown in fig. 15, calculating the candidate semantic vector for each candidate semantic unit includes step 4443: generating the candidate semantic vector according to the degree of association between each candidate semantic unit and each feature semantic unit. In some embodiments, the degree of association may be proportional to the frequency with which each feature semantic unit appears in the candidate semantic unit. In some embodiments, the degree of association may be proportional to the tone of the modifiers that qualify the feature semantic unit within the candidate semantic unit. For example, to capture the colors a user likes, assume that the feature semantic unit table (or feature semantic unit dictionary) of interest to the user contains the keywords "red", "orange", "yellow", "green", "blue", "white", and "black". For the semantic unit "I do not like blue very much, I prefer white, but I like black most", it can be analyzed that the user holds a positive attitude toward "black" and "white", holds a negative attitude toward "blue", expresses no attitude toward the other colors, and likes "black" more than "white". When calculating the semantic vector, a positive weight is assigned to a positive attitude, a negative weight is assigned to a negative attitude, 0 is assigned to an unknown attitude, and different degrees of liking are represented by different weight values. Based on the above principle, if the vector is defined in the following order: {"red", "orange", "yellow", "green", "blue", "white", "black"}, a semantic vector such as [0, 0, 0, 0, -0.5, 0.5, 1] can be obtained for this semantic unit, where the positive weight for "white" (here 0.5, an illustrative value) is smaller than the weight for "black". In some embodiments, the selection of the keywords in the feature semantic unit table, the numerical range of the weights, and the rule mapping keywords to weights may be changed according to actual requirements. In some embodiments, the sentence vectors required by the usability analysis model may be generated using the method described above. In some embodiments, the sentence vectors required by the demand judgment model may be generated using the method described above. In some embodiments, the sentence vectors required by the emotion analysis model may be generated using the method described above.
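A minimal sketch of the color example above follows; the attitude weights and the tiny lexicon are invented for demonstration and are not part of the original disclosure.

feature_table = ["red", "orange", "yellow", "green", "blue", "white", "black"]

# illustrative attitude weights assumed to have been extracted upstream from
# "I do not like blue very much, I prefer white, but I like black most"
attitude_weights = {"blue": -0.5, "white": 0.5, "black": 1.0}

# one vector element per feature semantic unit; unknown attitudes receive 0
candidate_semantic_vector = [attitude_weights.get(w, 0.0) for w in feature_table]
print(candidate_semantic_vector)  # [0.0, 0.0, 0.0, 0.0, -0.5, 0.5, 1.0]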
FIG. 16 shows a flow diagram for computing a candidate semantic vector for each candidate semantic unit according to another embodiment of the present application.
As shown in fig. 16, calculating the candidate semantic vector for each candidate semantic unit includes step 4445: allocating an identity vector to each candidate semantic unit. In some embodiments, each semantic unit (e.g., a sentence) may be assigned a unique segment ID (paragraph ID). Like an ordinary word, the segment ID is first mapped to a segment vector (paragraph vector) that has the same dimensionality as the word vectors but comes from a different vector space. During the training of a sentence or document, the segment ID remains unchanged, which is equivalent to using the semantics of the entire sentence each time the probability of a word is predicted. In the prediction stage, a new segment ID is allocated to the sentence to be predicted; the word vectors and the model parameters are kept unchanged, and the segment vector of the sentence to be predicted is obtained after convergence.
As shown in fig. 16, calculating the candidate semantic vector for each candidate semantic unit includes step 4446: assigning a sub-semantic unit vector to each of the one or more sub-semantic units in each candidate semantic unit. In some embodiments, each candidate semantic unit includes a plurality of sub-semantic units, some or all of which are assigned corresponding vectors (referred to as sub-semantic unit vectors). In some embodiments, the candidate semantic unit is a sentence, the sub-semantic units are the words contained in the sentence, and the sub-semantic unit vectors are word vectors. In some embodiments, the word vectors are generated during the training of the model and are parameters of the model; they take random values at the beginning of training and are updated continuously as training progresses. In some embodiments, each sub-semantic unit may be assigned a vector by one-hot encoding.
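Purely as an illustration of the one-hot option mentioned above (the vocabulary here is a toy example), a sub-semantic unit vector could be assigned as follows.

vocabulary = ["i", "like", "touch", "screen", "phone"]

def one_hot(word):
    # a vector with a single 1 at the position of the word in the vocabulary
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(word)] = 1
    return vector

print(one_hot("phone"))  # [0, 0, 0, 0, 1]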
As shown in fig. 16, calculating the candidate semantic vector for each candidate semantic unit includes step 4447: inputting the identity vector and all sub-semantic unit vectors together into a predetermined prediction model to output a target vector. In some embodiments, the vector representation of a sentence may instead be the average of the vectors of all the words in the sentence.
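For the averaging variant just mentioned, the following sketch (with toy word vectors) shows how a sentence vector could be taken as the mean of the word vectors; it is an illustration, not the required implementation.

import numpy as np

word_vectors = np.array([
    [0.2, 0.7, -0.1],   # e.g. the vector for "touch"
    [0.3, 0.6,  0.0],   # e.g. the vector for "screen"
    [0.1, 0.8, -0.2],   # e.g. the vector for "phone"
])
sentence_vector = word_vectors.mean(axis=0)  # element-wise average of the word vectors
print(sentence_vector)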
In some embodiments, the word vectors required in the emotion analysis model and/or the clustering are generated using the Word2vec language model. In some embodiments, the sentence vectors required by the usability analysis model are generated using Word2vec. In some embodiments, the vectors required by the demand analysis model are generated using the Word2vec language model. A language model makes assumptions about and models natural language so that natural language can be expressed in a way a computer can process; its core remains the representation of context and the modeling of the relationship between the context and the target word. Word2vec adopts a window assumption similar to an n-gram model, i.e., a word is assumed to be related only to the surrounding n words and not to the other words in the text. Word2vec draws on ideas from deep learning: through training, the processing of text content can be simplified into vector operations in a K-dimensional vector space, and similarity in that vector space can be used to express similarity in text semantics. The dimensionality of the word vectors obtained with Word2vec can be freely controlled. Word2vec performs semantic analysis at the word level; after the word vectors are obtained, sentence vectors still need to be derived on the basis of the words, and the word vectors carry the semantics of their context. The general flow of the Word2vec model includes: (1) word segmentation, or stemming and lemmatization; for example, a Chinese corpus requires word segmentation, while an English corpus does not, but because English involves various tenses, stemming and lemmatization are required; (2) building a dictionary and counting word frequencies, for example by traversing all texts once, finding all the words that appear, and counting the frequency of each word; (3) building a tree structure, for example a Huffman tree, according to the occurrence probability of each word, such that all classes sit at leaf nodes; (4) generating the binary code of each node, where the code reflects the position of the node in the tree, so that the corresponding leaf node can be found step by step from the root node according to the code; (5) initializing the intermediate vectors of the non-leaf nodes and the word vectors in the leaf nodes; for example, every node in the Huffman tree stores a vector of length m, but the vectors in leaf nodes and non-leaf nodes have different meanings: the word vector of each word stored in a leaf node serves as input to the neural network, whereas the intermediate vector stored in a non-leaf node corresponds to the parameters of a hidden layer of the neural network and determines the classification result together with the input; and (6) training the intermediate vectors and word vectors; for example, during training the model assigns each abstract intermediate node a suitable vector that represents all of its child nodes; for the CBOW model, the word vectors of several words around the central word are first summed as the input of the system, classification then proceeds step by step according to the binary code generated for the central word in the previous step, and the intermediate vectors and word vectors are trained according to the classification result.
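As a sketch only, assuming the gensim library (which the original text does not name), the flow above could be exercised as follows; hierarchical softmax corresponds to the Huffman-tree steps, and the sg flag switches between CBOW and Skip-gram.

from gensim.models import Word2Vec

# already-segmented toy corpus (a Chinese corpus would be segmented first, e.g. with jieba)
segmented_corpus = [
    ["i", "like", "touch", "screen", "phones"],
    ["i", "want", "a", "phone"],
]

model = Word2Vec(
    sentences=segmented_corpus,
    vector_size=100,   # dimension K of the vector space (called `size` in older gensim versions)
    window=5,          # a word is assumed to relate only to the surrounding n words
    min_count=1,       # keep every word of this tiny corpus in the dictionary
    sg=0,              # 0 = CBOW, 1 = Skip-gram
    hs=1,              # hierarchical softmax over the Huffman tree
)
word_vector = model.wv["phone"]  # the trained K-dimensional word vector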
In some embodiments, the word2vec model used in emotion analysis is trained by the user, while the word2vec model bundled with the HanLP toolkit is used in clustering. In some embodiments, both the word2vec model used in emotion analysis and the word2vec model used in clustering come from the HanLP toolkit. In some embodiments, both the word2vec model used in emotion analysis and the word2vec model used in clustering are trained by the user. In some embodiments, the target vector may be output by inputting the identity vector and all sub-semantic unit vectors together into a Continuous Bag of Words (CBOW) model. For example, the input of the CBOW model is the sum of the word vectors of the n words around the core word of the sentence, and the output is the word vector of the core word itself, where n is an integer greater than 1. As another example, in some embodiments, the target vector may be output by inputting the identity vector and all sub-semantic unit vectors together into a Skip-gram model. For example, the input of the Skip-gram model is the core word of the sentence itself, and the output consists of the word vectors of the n words around the core word. In some embodiments, the target vector is a word vector. In some embodiments, the word vectors may be computed and trained with the Word2vec tool.
In some embodiments, the sentence vectors required in the emotion analysis model and/or the clustering are generated using Doc2vec. In some embodiments, the sentence vectors required by the usability analysis model are generated using Doc2vec. In some embodiments, the vectors required by the demand analysis model are generated using the Doc2vec language model. Doc2vec has two models: the Distributed Memory (DM) model, which predicts the probability of a word given the context and the document vector, and the Distributed Bag of Words (DBOW) model, which predicts the probability of a set of random words in the document given the document vector. In some embodiments, the target vector may be output by inputting the identity vector and all sub-semantic unit vectors together into the DBOW model. In some embodiments, the target vector may be output by inputting the identity vector and all sub-semantic unit vectors together into the DM model. In some embodiments, the target vector is a sentence vector. In some embodiments, the sentence vectors may be computed and trained with the Doc2vec tool.
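As an illustrative sketch, again assuming gensim, sentence vectors could be obtained with Doc2vec as follows; the tag plays the role of the segment ID described earlier, and dm=1 selects the DM model while dm=0 selects DBOW.

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

sentences = [
    ["i", "like", "touch", "screen", "phones"],
    ["i", "want", "a", "phone"],
]
# each sentence receives a unique tag, which plays the role of the segment ID
tagged = [TaggedDocument(words=s, tags=[i]) for i, s in enumerate(sentences)]

model = Doc2Vec(tagged, vector_size=50, dm=1, min_count=1, epochs=40)
# infer the segment vector of a new sentence with word vectors and parameters frozen
new_sentence_vector = model.infer_vector(["i", "like", "black", "phones"])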
As shown in fig. 16, calculating the candidate semantic vector for each candidate semantic unit includes step 4448: designating the target vector as the candidate semantic vector.
As shown in fig. 14, calculating the similarity between each candidate semantic unit and each of the remaining semantic units in the plurality of semantic units further includes step 4440: respectively calculating the similarity between the candidate semantic vector of each candidate semantic unit and the semantic vector of each remaining semantic unit. In some embodiments, the similarity of semantic vectors may be characterized by the cosine distance or cosine similarity between the semantic vectors. In some embodiments, the predetermined threshold for cosine similarity may be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, or 0.9. In some embodiments, the similarity of semantic vectors may be calculated with the HanLP toolkit.
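The following sketch, assuming scikit-learn rather than the HanLP toolkit named above, illustrates step 4440: the similarities between the candidate semantic vector and all remaining semantic vectors are computed in one call and then compared with a predetermined threshold; the vectors and the threshold value are toy examples.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

candidate_vector = np.array([[0.0, 0.0, 0.0, 0.0, -0.5, 0.5, 1.0]])
remaining_vectors = np.array([
    [0.0, 0.0, 0.0, 0.0, -0.4, 0.6, 0.9],
    [1.0, 0.2, 0.0, 0.0,  0.0, 0.0, 0.0],
])

similarities = cosine_similarity(candidate_vector, remaining_vectors)[0]
threshold = 0.8                                   # one of the example thresholds above
is_cluster_center = bool((similarities > threshold).any())
print(similarities, is_cluster_center)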
As shown in fig. 12, the semantic unit clustering method further includes step 6000: ranking the one or more cluster centers. In some embodiments, ranking the one or more cluster centers comprises calculating the similarity between the semantic unit corresponding to each of the one or more cluster centers and each of the remaining semantic units of the plurality of semantic units, respectively, and ranking all cluster centers based on the number of semantic units whose similarity is above a predetermined threshold. In some embodiments, the similarity between the semantic vector of the semantic unit corresponding to each of the one or more cluster centers and the semantic vector of each of the remaining semantic units of the plurality of semantic units is calculated, respectively. In some embodiments, the similarity between the sentence vector of the sentence corresponding to each of the one or more cluster centers and the sentence vector of each of the other sentences is calculated separately, and the cluster centers are sorted based on the number of sentences in each cluster whose similarity is higher than a predetermined threshold. In some embodiments, different predetermined thresholds may be employed in the calculations for different cluster centers. In some embodiments, the predetermined threshold used in the step of ranking the cluster centers may differ from the predetermined threshold used in the step of determining the cluster centers. In some embodiments, the ranking of the cluster centers may be updated each time the similarities between one semantic unit and each of the remaining semantic units have been calculated. In some embodiments, after each cluster is determined or found, the number of its semantic units whose similarity is higher than the predetermined threshold is compared with the corresponding number for the previously found cluster, and the ranking of the cluster centers is updated once based on the comparison result. For example, if a newly generated cluster contains a greater number of such semantic units, the importance or priority of the newly generated cluster is ranked ahead of the previous cluster. In some embodiments, after a cluster center is determined, the text of the semantic unit corresponding to it may be output. In some embodiments, only the text of the semantic unit corresponding to the first-ranked cluster center is output.
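A minimal sketch of step 6000, assuming the cluster centers and semantic vectors have already been computed, is given below: each center is scored by how many remaining units exceed the predetermined similarity threshold, and the centers are then sorted by that count.

import numpy as np

def rank_cluster_centers(vectors, center_indices, threshold=0.8):
    """Sort cluster centers by the number of sufficiently similar semantic units."""
    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    counts = {
        c: sum(1 for j, v in enumerate(vectors)
               if j != c and cosine(vectors[c], v) > threshold)
        for c in center_indices
    }
    return sorted(center_indices, key=lambda c: counts[c], reverse=True)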
FIG. 17 is a schematic diagram of a semantic unit clustering apparatus according to one embodiment of the present application. As shown in fig. 17, the semantic unit clustering apparatus includes a semantic unit acquisition component 7000, a cluster center determination component 8000, and a ranking component 9000.
In some embodiments, the semantic unit acquisition component 7000 is configured to acquire a plurality of semantic units. In some embodiments, the cluster center determination component 8000 is configured to determine one or more cluster centers based on the plurality of semantic units. In some embodiments, the ranking component 9000 is configured to rank the one or more cluster centers. In some embodiments, the ranking component 9000 is optional.
In some embodiments, the cluster center determination component comprises a cluster center determination module. In some embodiments, the cluster center determination module is configured to determine the one or more cluster centers from the plurality of semantic units by the AP clustering algorithm. In some embodiments, the cluster center determination module is configured to determine each of the plurality of semantic units as a cluster center. In some embodiments, the cluster center determination module is configured to determine the one or more cluster centers from the plurality of semantic units based on the similarity between each two of the plurality of semantic units.
In some embodiments, the cluster center determining module further includes a candidate semantic unit selecting module, a similarity calculating module, and a cluster center determining module, wherein the candidate semantic unit selecting module is configured to sequentially select each of the plurality of semantic units as a candidate semantic unit, the similarity calculating module is configured to calculate, for each candidate semantic unit, a similarity between each candidate semantic unit and each of remaining semantic units in the plurality of semantic units, respectively, and the cluster center determining module is configured to determine each candidate semantic unit as a cluster center when there is at least one semantic unit whose similarity is higher than a predetermined threshold among the remaining semantic units.
In some embodiments, the similarity calculation module further comprises a candidate semantic vector calculation module configured to calculate a candidate semantic vector for each of the candidate semantic units, and a semantic vector similarity calculation module configured to calculate a similarity between the candidate semantic vector for each of the candidate semantic units and the semantic vectors for each of the remaining semantic units, respectively.
In some embodiments, the candidate semantic vector calculation module comprises a feature semantic unit acquisition module, an association determination module, and a candidate semantic vector generation module, wherein the feature semantic unit acquisition module is configured to acquire a feature semantic unit table, wherein the feature semantic unit table comprises one or more feature semantic units; the association degree determining module is configured to respectively determine the association degree of each candidate semantic unit and each feature semantic unit, and the candidate semantic vector generating module is configured to generate the candidate semantic vector according to the association degree of each candidate semantic unit and each feature semantic unit. In some embodiments, the degree of association is proportional to a frequency of occurrence of each feature semantic unit in the each candidate semantic unit.
In some embodiments, the candidate semantic vector calculation module comprises an identity vector assignment module configured to assign an identity vector to each of the candidate semantic units, a sub-semantic unit vector assignment module configured to assign a sub-semantic unit vector to each of one or more of the candidate semantic units, a target vector calculation module configured to input the identity vector and all sub-semantic unit vectors together into a predetermined prediction model to output a target vector, a candidate semantic vector designation module configured to designate the target vector as the candidate semantic vector. In some embodiments, the identity vector assignment module is optional. In embodiments where the identity vector assignment module is not present (e.g., when the target vector is a word vector), the target vector calculation module is configured to input only all sub-semantic unit vectors into a predetermined prediction model to output the target vector. In embodiments where an identity vector allocation module is present (e.g., where the target vector is a sentence vector), the target vector calculation module may be configured to input the identity vector and all sub-semantic unit vectors into a predetermined prediction model to output the target vector.
In some embodiments, the ranking component 9000 further comprises a similarity calculation module configured to calculate the similarity between the semantic unit corresponding to each of the one or more cluster centers and each of the remaining semantic units of the plurality of semantic units, respectively, and a cluster center ranking module configured to rank all cluster centers based on the number of semantic units whose similarity is above a predetermined threshold. In some embodiments, the similarity calculation module is further configured to calculate the similarity between the semantic vector of the semantic unit corresponding to each of the one or more cluster centers and the semantic vector of each of the remaining semantic units of the plurality of semantic units, respectively.
The present application also provides a computer-readable storage medium, wherein the computer-readable storage medium comprises a program which, when executed by a processor, performs the semantic unit clustering method according to the previous description.
In summary, the method for generating an interview report according to the embodiments of the present application can autonomously generate an interview report from an interview corpus with a single action based on a neural network. The interview corpus is converted into text data, and preprocessing of the interview corpus, keyword extraction, and the like can be performed by inputting the corresponding trigger instructions, without manual processing, which saves interview analysis time, reduces the number of interview personnel required, and lowers the interview cost.
It should be noted that in the description of the present specification, reference to the description of the terms "one embodiment", "some embodiments", "illustrative embodiments", "examples", "specific examples", or "some examples", etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the application, the scope of which is defined by the claims and their equivalents.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A semantic unit clustering method, characterized by comprising the following steps:
Acquiring a plurality of semantic units; and
Determining one or more cluster centers based on the plurality of semantic units.
2. The method of semantic unit clustering according to claim 1, wherein determining one or more cluster centers based on the plurality of semantic units comprises:
Determining the one or more cluster centers from the plurality of semantic units by an AP algorithm.
3. The method of semantic unit clustering according to claim 1, wherein determining one or more cluster centers based on the plurality of semantic units comprises:
Determining each of the plurality of semantic units as a cluster center.
4. The method of semantic unit clustering according to claim 1, wherein determining one or more cluster centers based on the plurality of semantic units comprises:
Determining the one or more clustering centers from the plurality of semantic units based on a similarity between the plurality of semantic units.
5. The semantic unit clustering method of claim 4, wherein determining one or more clustering centers based on the similarity between the plurality of semantic units comprises:
Sequentially selecting each semantic unit from the plurality of semantic units as a candidate semantic unit; and
For each candidate semantic unit:
Calculating a similarity between each of the candidate semantic units and each of the remaining semantic units of the plurality of semantic units, respectively, and
If there is at least one semantic unit among the remaining semantic units whose similarity is higher than a predetermined threshold, determining each candidate semantic unit as a cluster center.
6. The semantic unit clustering method according to claim 5, wherein separately calculating the similarity between each candidate semantic unit and each of the remaining semantic units of the plurality of semantic units comprises:
Calculating a candidate semantic vector of each candidate semantic unit; and
Respectively calculating the similarity between the candidate semantic vector of each candidate semantic unit and the semantic vector of each remaining semantic unit.
7. the method of clustering semantic units according to claim 6, wherein calculating the candidate semantic vector for each candidate semantic unit comprises:
Acquiring a characteristic semantic unit table, wherein the characteristic semantic unit table comprises one or more characteristic semantic units;
Respectively determining the association degree of each candidate semantic unit and each characteristic semantic unit; and
Generating the candidate semantic vector according to the degree of association between each candidate semantic unit and each characteristic semantic unit.
8. The semantic unit clustering method according to claim 7, wherein the degree of association is proportional to the frequency of occurrence of each feature semantic unit in each candidate semantic unit.
9. The method of clustering semantic units according to claim 6, wherein calculating the candidate semantic vector for each candidate semantic unit comprises:
Assigning a sub-semantic unit vector to each of one or more sub-semantic units in the each candidate semantic unit;
Inputting all sub-semantic unit vectors into a predetermined prediction model to output a target vector; and
Designating the target vector as the candidate semantic vector.
10. The semantic unit clustering method according to claim 9, wherein calculating the candidate semantic vector for each of the candidate semantic units further comprises assigning an identity vector to each of the candidate semantic units, and wherein inputting all sub-semantic unit vectors into a predetermined prediction model to output a target vector comprises inputting the identity vector and all sub-semantic unit vectors together into a predetermined prediction model to output a target vector.
CN201910577275.4A 2019-06-28 2019-06-28 Method for generating interview report, computer-readable storage medium and terminal device Pending CN110543559A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910577275.4A CN110543559A (en) 2019-06-28 2019-06-28 Method for generating interview report, computer-readable storage medium and terminal device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910577275.4A CN110543559A (en) 2019-06-28 2019-06-28 Method for generating interview report, computer-readable storage medium and terminal device

Publications (1)

Publication Number Publication Date
CN110543559A true CN110543559A (en) 2019-12-06

Family

ID=68709698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910577275.4A Pending CN110543559A (en) 2019-06-28 2019-06-28 Method for generating interview report, computer-readable storage medium and terminal device

Country Status (1)

Country Link
CN (1) CN110543559A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062955A (en) * 2017-12-12 2018-05-22 深圳证券信息有限公司 A kind of intelligence report-generating method, system and equipment
CN109243460A (en) * 2018-08-15 2019-01-18 浙江讯飞智能科技有限公司 A method of automatically generating news or interrogation record based on the local dialect
CN109473103A (en) * 2018-11-16 2019-03-15 上海玖悦数码科技有限公司 A kind of meeting summary generation method
CN109767757A (en) * 2019-01-16 2019-05-17 平安科技(深圳)有限公司 A kind of minutes generation method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
林琴等: "一种基于近邻传播聚类的语音端点检测方法", 《安徽大学学报(自然科学版)》 *
黄栋等: "基于词向量和EMD距离的短文本聚类", 《山东大学学报(理学版)》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521858A (en) * 2023-04-20 2023-08-01 浙江浙里信征信有限公司 Context semantic sequence comparison method based on dynamic clustering and visualization
CN116521858B (en) * 2023-04-20 2024-04-30 浙江浙里信征信有限公司 Context semantic sequence comparison method based on dynamic clustering and visualization

Similar Documents

Publication Publication Date Title
CN110297907B (en) Method for generating interview report, computer-readable storage medium and terminal device
Ghosh et al. Fracking sarcasm using neural network
CN110297906B (en) Method for generating interview report, computer-readable storage medium and terminal device
US11070879B2 (en) Media content recommendation through chatbots
Yeomans et al. The politeness Package: Detecting Politeness in Natural Language.
Wu et al. Emotion recognition from text using semantic labels and separable mixture models
CN110457466A (en) Generate method, computer readable storage medium and the terminal device of interview report
US11315551B2 (en) System and method for intent discovery from multimedia conversation
Millstein Natural language processing with python: natural language processing using NLTK
US20220245354A1 (en) Automated classification of emotio-cogniton
Griol et al. Combining speech-based and linguistic classifiers to recognize emotion in user spoken utterances
CN113704451A (en) Power user appeal screening method and system, electronic device and storage medium
CN110457424A (en) Generate method, computer readable storage medium and the terminal device of interview report
CN113032552B (en) Text abstract-based policy key point extraction method and system
CN112562669A (en) Intelligent digital newspaper automatic summarization and voice interaction news chat method and system
Yordanova et al. Automatic detection of everyday social behaviours and environments from verbatim transcripts of daily conversations
El Janati et al. Adaptive e-learning AI-powered chatbot based on multimedia indexing
EP4046054A1 (en) Automatic summarization of transcripts
US20220414338A1 (en) Topical vector-quantized variational autoencoders for extractive summarization of video transcripts
Kshirsagar et al. A Review on Application of Deep Learning in Natural Language Processing
US20230350929A1 (en) Method and system for generating intent responses through virtual agents
CN110543559A (en) Method for generating interview report, computer-readable storage medium and terminal device
Van Enschot et al. Taming our wild data: On intercoder reliability in discourse research
Wang et al. Opinion summarization on spontaneous conversations
CN117493548A (en) Text classification method, training method and training device for model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination