CN110717023B - Method and device for classifying interview answer text, electronic equipment and storage medium - Google Patents

Method and device for classifying interview answer text, electronic equipment and storage medium Download PDF

Info

Publication number
CN110717023B
CN110717023B (application CN201910882034.0A)
Authority
CN
China
Prior art keywords
text
interview
length
answer
sample answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910882034.0A
Other languages
Chinese (zh)
Other versions
CN110717023A (en)
Inventor
郑立颖
徐亮
金戈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910882034.0A priority Critical patent/CN110717023B/en
Priority to PCT/CN2019/118036 priority patent/WO2021051586A1/en
Publication of CN110717023A publication Critical patent/CN110717023A/en
Application granted granted Critical
Publication of CN110717023B publication Critical patent/CN110717023B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/105Human resources
    • G06Q10/1053Employment or hiring

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of artificial intelligence, and particularly discloses a method and a device for classifying interview answer text. The method comprises the following steps: acquiring the interview answer text of an interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview; constructing a semantic vector of the interview answer text through a feature extraction layer of a constructed classification model; performing full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer; and performing classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item. Interview evaluation of the interviewee is thereby performed automatically.

Description

Method and device for classifying interview answer text, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of artificial intelligence, and in particular relates to a method and a device for classifying interview answer texts, electronic equipment and a computer readable storage medium.
Background
In an interview, the interviewee's ability on multiple set capability items needs to be evaluated according to the interviewee's replies to the interview questions; that is, the interviewee's scoring level on each set capability item is determined separately.
In the prior art, the interviewee is generally interviewed by an interviewer, and the interviewer then evaluates the interviewee's abilities in various respects according to the interviewee's answer corpus from the interview. Because the interviewer must determine the interviewee's scoring level on each set capability item from the answer corpus, this approach is inefficient.
From the foregoing, a need exists for a method of automatically evaluating the interviewee that does not rely on an interviewer, thereby improving the efficiency of interview evaluation.
Disclosure of Invention
In order to solve the problem of low interview evaluation efficiency caused by relying on interviewers to perform interview evaluation in the prior art, embodiments of the present disclosure provide a method and an apparatus for classifying interview answer text, an electronic device, and a computer readable storage medium, so as to implement automatic interview evaluation.
The technical scheme adopted by the application is as follows:
In a first aspect, a method of classifying interview answer text comprises:
acquiring the interview answer text of an interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview;
constructing a semantic vector of the interview answer text through a feature extraction layer of a constructed classification model, wherein the classification model is obtained by training on a plurality of sample answer texts and the label data annotated for each sample answer text, and the label data indicates the scoring level annotated for the interviewee on a set capability item according to the sample answer text;
performing full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer, the classification model comprises at least two fully connected layers, and each fully connected layer corresponds to one set capability item;
and performing classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item.
In a second aspect, an apparatus for classifying interview answer text comprises:
an acquisition module, configured to acquire the interview answer text of an interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview;
a semantic vector construction module, configured to construct a semantic vector of the interview answer text through a feature extraction layer of a constructed classification model, wherein the classification model is obtained by training on a plurality of sample answer texts and the label data annotated for each sample answer text, and the label data indicates the scoring level annotated for the interviewee on a set capability item according to the sample answer text;
a full connection module, configured to perform full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer, the classification model comprises at least two fully connected layers, and each fully connected layer corresponds to one set capability item;
and a classification prediction module, configured to perform classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item.
In a third aspect, an electronic device, comprising:
a processor; and
a memory having stored thereon computer readable instructions which, when executed by the processor, implement the method of classifying interview answer text as described above.
In a fourth aspect, a computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor of a computer, implement the interview answer text classification method as described above.
According to the technical scheme, the interviewee's scoring level on each set capability item is automatically determined from the interviewee's interview answer text, so that the interviewee's ability on each set capability item is evaluated according to the interview answer text; in other words, interview evaluation is performed automatically. An interviewer is no longer required to evaluate the interviewee on each capability item according to the interview, which greatly improves the efficiency of interview evaluation. Moreover, because no interviewer participates in the interview evaluation, the problem of inaccurate or non-objective scoring on each capability item caused by an interviewer's subjective intentions and personal preferences can be avoided.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a block diagram of an apparatus according to an example embodiment;
FIG. 2 is a flow chart illustrating a method of classifying interview answer text according to an example embodiment;
FIG. 3 is a flow chart of step 310 of FIG. 2 in one embodiment;
FIG. 4 is a flow chart of step 330 of FIG. 2 in one embodiment;
FIG. 5 is a flow chart of steps preceding step 351 in FIG. 4 in one embodiment;
FIG. 6 is a flow chart of steps in one embodiment for determining the text cutoff length based on the text length of each of the sample answer texts;
FIG. 7 is a flow chart of steps preceding step 330 of FIG. 2 in one embodiment;
FIG. 8 is a block diagram illustrating a device for classifying interview answer text according to an example embodiment;
fig. 9 is a block diagram of an electronic device, according to an example embodiment.
Specific embodiments of the application have been shown in the drawings and are described hereinafter, with the understanding that the present disclosure is to be considered in all respects as illustrative and not restrictive, the scope of the inventive concepts being indicated by the appended claims.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
Fig. 1 illustrates a block diagram of an apparatus 200 according to an example embodiment. The apparatus 200 may serve as the execution body of the present disclosure for implementing the method of classifying interview answer text of the present disclosure. Of course, the method of the present disclosure is not limited to being performed by the apparatus 200; other electronic devices having processing capabilities may also serve as the execution body for implementing the method of classifying interview answer text of the present disclosure.
It should be noted that the apparatus 200 is only an example adapted to the present application, and should not be construed as providing any limitation to the scope of the present application. Nor should the apparatus be construed as necessarily relying on or necessarily having one or more of the components of the exemplary apparatus 200 shown in fig. 1.
The hardware structure of the apparatus 200 may vary widely depending on its configuration or performance. As shown in fig. 1, the apparatus 200 includes: a power supply 210, an interface 230, at least one memory 250, and at least one processor 270.
Wherein the power supply 210 is configured to provide an operating voltage for each hardware device on the apparatus 200.
The interface 230 includes at least one wired or wireless network interface 231, at least one serial-to-parallel interface 233, at least one input-output interface 235, and at least one USB interface 237, etc., for communicating with external devices.
The memory 250 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk. The resources stored thereon include an operating system 251, application programs 253, and data 255, and the storage mode may be transient or permanent. The operating system 251 is used for managing and controlling the hardware devices and application programs 253 on the apparatus 200 so that the processor 270 can compute and process the mass data 255; it may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc. The application 253 is a computer program that performs at least one specific task based on the operating system 251 and may include at least one module (not shown in fig. 1), each of which may contain a series of computer readable instructions for the apparatus 200. The data 255 may be, for example, sample texts and label data stored on the disk.
Processor 270 may include one or more processors and is configured to communicate with the memory 250 via a bus for computing and processing the mass data 255 in the memory 250.
As described in detail above, the apparatus 200 embodying the present application performs the method of classifying interview answer text by the processor 270 reading a series of computer readable instructions stored in the memory 250.
Furthermore, the present application can be realized by hardware circuitry or by a combination of hardware circuitry and software, and thus, the implementation of the present application is not limited to any specific hardware circuitry, software, or combination of the two.
Fig. 2 is a flow chart illustrating a method of classifying interview answer text according to an exemplary embodiment, which may be performed by the apparatus 200 shown in fig. 1, or by other electronic devices having processing capabilities, and is not specifically limited herein. As shown in fig. 2, the method comprises at least the following steps:
Step 310, acquiring the interview answer text of the interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview.
In an interview, the interviewee answers the interview questions, and the answer content is the reply to each interview question. The interview answer text is the text expression of that reply: for example, if the interviewee answers an interview question in text, the reply itself is the interview answer text; if the interviewee answers an interview question by voice, the text obtained by performing speech recognition on the reply is the interview answer text.
In one embodiment, the interviewee is interviewed by an intelligent interview system. In the intelligent interview system, a plurality of questions are set in advance for the interviewee to be interviewed, for example, questions set according to data such as the interviewee's resume. When the interviewee is interviewed, the questions are asked according to the set question list, and the interviewee's replies to the questions are collected, thereby obtaining the interview answer text. In this embodiment, the intelligent interview system classifies the interview answer text of the interviewee by the methods of the present disclosure.
In step 330, a semantic vector of the interview answer text is constructed through a feature extraction layer of the constructed classification model, wherein the classification model is obtained by training on a plurality of sample answer texts and the label data annotated for each sample answer text, and the label data indicates the scoring level annotated for the interviewee on a set capability item according to the sample answer text.
The semantic vector of the interview answer text is the vector representation of the corresponding semantic of the interview answer text.
The classification model is constructed through a neural network, and the constructed classification model is used for classifying the interview answer text. The neural network may combine various types of neural networks, such as a deep feed-forward network, a convolutional neural network (CNN), a recurrent neural network (RNN), and the like, to obtain the classification model for classifying the interview answer text.
In the technical scheme of the application, the purpose of classifying the interview answer text is to obtain, from the interview answer text, the interviewee's scoring level on each set capability item, so that the interview answer text is classified to a scoring level on each set capability item, thereby realizing the ability evaluation of the interviewee according to the interview answer text.
It will be appreciated that the ability evaluation of an interviewee is performed on a plurality of set capability items. The classification model of the present disclosure is therefore constructed for classifying the interview answer text on a plurality of set capability items.
Set capability items include, for example, learning ability, planning ability, stability, team collaboration ability, leadership, and the like. Of course, the set capability items that need to be evaluated for the interviewee may differ between application scenarios, so a plurality of set capability items to be evaluated for the interviewee can be selected according to actual needs.
Specifically, the classification model includes a feature extraction layer, fully connected layers respectively constructed for the set capability items (each set capability item corresponds to one fully connected layer), and output layers (each fully connected layer corresponds to one output layer). The feature extraction layer is used for constructing the semantic vector of the interview answer text; each fully connected layer is used for performing full connection on the semantic vector for the set capability item corresponding to that layer, obtaining a feature vector characterizing the features of the interview answer text on that set capability item; and each output layer produces an output according to the feature vector, thereby obtaining the scoring level on the set capability item. It is worth mentioning that the scoring level produced by an output layer is the scoring level on the set capability item corresponding to that output layer.
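By way of illustration, the following minimal sketch shows such a structure in PyTorch; the embedding size, convolution kernel widths, number of set capability items and number of scoring levels are assumptions made for the example and are not values specified by the application.

    # Illustrative sketch of the described structure (assumed sizes); not the claimed implementation.
    import torch
    import torch.nn as nn

    class InterviewClassifier(nn.Module):
        def __init__(self, vocab_size, embed_dim=128, num_capability_items=5, num_levels=4):
            super().__init__()
            # Feature extraction layer: embedding plus Text-CNN convolutions over the word sequence.
            self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.convs = nn.ModuleList(
                [nn.Conv1d(embed_dim, 64, kernel_size=k) for k in (2, 3, 4)]
            )
            # One fully connected layer per set capability item.
            self.fc_layers = nn.ModuleList(
                [nn.Linear(64 * 3, 64) for _ in range(num_capability_items)]
            )
            # One output layer per fully connected layer, producing logits over the scoring levels.
            self.out_layers = nn.ModuleList(
                [nn.Linear(64, num_levels) for _ in range(num_capability_items)]
            )

        def forward(self, token_ids):                        # token_ids: (batch, seq_len)
            x = self.embedding(token_ids).transpose(1, 2)    # (batch, embed_dim, seq_len)
            pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
            semantic_vector = torch.cat(pooled, dim=1)       # semantic vector of the answer text
            logits = []
            for fc, out in zip(self.fc_layers, self.out_layers):
                feature_vector = torch.relu(fc(semantic_vector))  # features for one capability item
                logits.append(out(feature_vector))                # scores over the scoring levels
            return logits                                    # one logits tensor per set capability item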
In order to ensure the accuracy of the classification model when classifying interview answer text, model training is performed, before classification, according to a plurality of sample answer texts and the label data annotated for each sample answer text, thereby obtaining the classification model. As described above, the classification model is used to output the interviewee's scoring level on each set capability item according to the interview answer text, so the label data used for model training characterizes the scoring level of the corresponding sample answer text on each set capability item.
Step 350, performing full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer, the classification model comprises at least two fully connected layers, and each fully connected layer corresponds to one set capability item.
As described above, a fully connected layer is constructed in the classification model for each set capability item. Although the semantic vector of the interview answer text is obtained through the feature extraction layer, the interview answer text needs to be classified on at least two set capability items. The semantic vector characterizes all the features of the interview answer text, but the degree to which the features are expressed differs between set capability items: the features on some set capability items are obvious, while the features on other set capability items are not. Therefore, if classification on at least two set capability items were performed from the semantic vector alone, the classification accuracy would be low.
Thus, in order to ensure the classification accuracy on each set capability item, the features used for classification on a given set capability item must be further extracted from the semantic vector, thereby activating the features expressed by the interview answer text on each set capability item. This is realized by the fully connected layer corresponding to the set capability item performing full connection on the semantic vector, correspondingly obtaining a feature vector that characterizes the features of the interview answer text on the set capability item corresponding to that fully connected layer.
Since each fully connected layer in the classification model corresponds to one set capability item, in order to classify the interview answer text on each set capability item, the feature vector corresponding to a set capability item is obtained from the semantic vector through the fully connected layer corresponding to that set capability item.
Step 370, performing classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item.
Classification prediction means predicting, for the scoring levels set on each set capability item, the probability that the feature vector corresponds to each scoring level, so that the scoring level of the interview answer text on the set capability item is determined according to the predicted probabilities.
For example, suppose 4 scoring levels are preset for learning ability: level A, level B, level C, and level D. Then, based on the feature vector obtained from the fully connected layer corresponding to learning ability, the probabilities that the interview answer text is classified to levels A, B, C and D are predicted respectively. For example, the probability that the interview answer text is classified to level A is predicted to be P1, the probability for level B is P2, the probability for level C is P3, and the probability for level D is P4.
The predicted probabilities P1, P2, P3 and P4 are then compared; if P1 is the largest, the interview answer text is classified to level A on the set capability item of learning ability, that is, the interviewee's score on learning ability is A.
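Continuing the example, and reusing the InterviewClassifier sketch above, the classification prediction step can be written as follows; the capability item names and the untrained dummy input are assumptions for illustration only.

    # Illustrative prediction: softmax over the scoring levels of each capability item, then argmax.
    import torch

    levels = ["A", "B", "C", "D"]                       # scoring levels from the example above
    capability_items = ["learning ability", "planning ability", "stability",
                        "team collaboration", "leadership"]   # assumed set capability items

    model = InterviewClassifier(vocab_size=5000)        # class from the earlier sketch (untrained here)
    token_ids = torch.randint(1, 5000, (1, 64))         # dummy encoded interview answer text

    with torch.no_grad():
        for item, logits in zip(capability_items, model(token_ids)):
            probs = torch.softmax(logits, dim=1)        # P1..P4, the probabilities of each level
            print(item, levels[probs.argmax(dim=1).item()])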
Therefore, through the above steps, the interviewee's scoring level on each set capability item can be determined from the interviewee's interview answer text, so that the interviewee's ability on each set capability item is evaluated according to the interview answer text; in other words, interview evaluation is performed automatically, which improves the efficiency of interview evaluation. An interviewer is no longer required to evaluate the interviewee on each capability item according to the interview, which greatly reduces the workload of interview evaluation. Moreover, because no interviewer participates in the interview evaluation, inaccurate or non-objective evaluation results caused by an interviewer's subjective intentions and personal preferences can be avoided.
In one embodiment, as shown in FIG. 3, step 310 includes:
Step 311, collecting the reply voice of the interviewee for the interview questions during the interview.
In this embodiment, the interviewee is interviewed by voice, and voice is collected during the interview, so that the reply voice of the interviewee for the interview questions is obtained.
Step 313, performing voice recognition on the reply voice to obtain the interview answer text corresponding to the reply voice.
Performing speech recognition means recognizing the reply voice as text, thereby obtaining the interview answer text corresponding to the reply voice.
In a specific embodiment, an existing speech recognition tool may be invoked directly to perform the speech recognition, as sketched below.
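As a minimal sketch, and only by way of example, the open-source SpeechRecognition package can be invoked as such a tool; the application does not name a specific tool, and the recording file used here is hypothetical.

    # Minimal sketch: recognize a recorded reply as text with the SpeechRecognition package (assumed tool).
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile("reply.wav") as source:            # hypothetical recording of the interviewee's reply
        audio = recognizer.record(source)
    # Recognize Mandarin speech; any other speech recognition service could be substituted here.
    interview_answer_text = recognizer.recognize_google(audio, language="zh-CN")
    print(interview_answer_text)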
In one embodiment, as shown in FIG. 4, step 330 includes:
Step 331, performing word segmentation on the interview answer text through the feature extraction layer of the classification model to obtain a word sequence formed by a plurality of words.
Word segmentation refers to the process of dividing a continuous interview answer text into word sequences according to certain specifications, thereby obtaining word sequences composed of a plurality of individual words.
The word segmentation may be a word segmentation method based on character string matching, a word segmentation method based on understanding, or a word segmentation method based on statistics, and is not particularly limited herein.
In a specific embodiment, a word segmentation tool, such as jieba, SnowNLP, THULAC, or NLPIR, may also be invoked directly to perform the word segmentation, as sketched below.
It should be noted that the word segmentation method may differ for different languages. For example, English text can be segmented directly by spaces and punctuation, while Chinese text has no spaces between characters, so segmentation by spaces is not feasible and a word segmentation method adapted to Chinese is required.
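For example, a minimal word segmentation sketch using jieba, one of the tools named above, is given here; the reply sentence is hypothetical.

    # Minimal word segmentation sketch with jieba.
    import jieba

    interview_answer_text = "我在上一份工作中负责制定项目计划并带领团队按时交付"  # hypothetical reply
    word_sequence = jieba.lcut(interview_answer_text)   # list of individual words
    print(word_sequence)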
Step 333, constructing, through the feature extraction layer, the semantic vector of the interview answer text according to the codes corresponding to the words in the word sequence and the semantic weights corresponding to the words.
It is understood that in text, the degree of contribution of different types of words to the semantics of the text is different. The corresponding semantic weight is a quantized representation of the degree of contribution of the word to the semantics of the text in which it is located.
In interview answer text, words of different parts of speech have different semantic weights; for example, among nouns, verbs and auxiliary words, the semantic weights of nouns and verbs are greater than those of auxiliary words.
For the classification of interview answer text, a semantic dictionary is constructed, in which the codes of a number of words and the semantic weights of those words are stored. The feature extraction layer then generates the semantic vector of the interview answer text according to the codes and semantic weights, in the semantic dictionary, of the words in the word sequence corresponding to the interview answer text.
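The application does not fix the exact construction; the following sketch only illustrates one way of combining each word's code with its semantic weight, where the dictionary contents and the weighted-average scheme are assumptions.

    # Illustrative sketch: weight each word's code (vector) by its semantic weight, then average.
    import numpy as np

    # Hypothetical semantic dictionary: word -> (code vector, semantic weight).
    semantic_dictionary = {
        "负责": (np.array([0.2, 0.7, 0.1]), 1.0),   # verb: higher weight
        "项目": (np.array([0.5, 0.1, 0.4]), 1.0),   # noun: higher weight
        "的":   (np.array([0.0, 0.1, 0.0]), 0.2),   # auxiliary word: lower weight
    }

    def build_semantic_vector(word_sequence):
        vectors, weights = [], []
        for word in word_sequence:
            if word in semantic_dictionary:
                code, weight = semantic_dictionary[word]
                vectors.append(weight * code)
                weights.append(weight)
        # Weighted average of the word codes serves as the semantic vector in this sketch.
        return np.sum(vectors, axis=0) / max(sum(weights), 1e-8)

    print(build_semantic_vector(["负责", "项目", "的"]))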
In one embodiment, the classification model is constructed by text-CNN neural network, as shown in FIG. 5, and before step 331, the method further includes:
in step 410, a text cutoff length determined for word segmentation is obtained.
Step 430, truncating the interview answer text according to the obtained text cutoff length, and taking the text retained after truncation as the object of word segmentation.
text-CNN is an algorithm that utilizes convolutional neural networks to classify text. Before the text-CNN neural network classifies the interview answer text, the interview answer text needs to be truncated according to the text truncation length set for the text-CNN neural network.
The text cutoff length defines the length of the text input to the classification model for classification. That is, if the text length of a text exceeds the text cutoff length, the text is truncated to the text cutoff length and the part beyond the cutoff length is removed, so that the truncated text has exactly the text cutoff length. If the text length does not exceed the text cutoff length, the text is padded when its semantic vector is constructed, i.e. padded with a padding character such as 0, so that the semantic vector constructed for the text remains consistent with the text cutoff length.
The text cutoff length is determined for determining training parameter values of the classification model. The reasonable text cut-off length can improve the training efficiency of the classification model on the basis of ensuring that the semantic features of the text are fully captured.
Thus, after the training parameters of the classification model are set according to the text cutoff length, the text (i.e., the sample answer text or the interview answer text) is cut according to the text cutoff length, whether in training the classification model or in classifying the interview answer text.
The text length of a text is the number of words obtained after word segmentation of the text.
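A minimal sketch of the truncation and padding rule just described, using 0 as the padding character as in the example, is:

    # Truncate a word sequence to the text cutoff length, or pad it with 0 so the
    # input to the classification model always has exactly that length.
    def truncate_or_pad(word_ids, text_cutoff_length, pad_id=0):
        if len(word_ids) > text_cutoff_length:
            return word_ids[:text_cutoff_length]          # remove the part beyond the cutoff length
        return word_ids + [pad_id] * (text_cutoff_length - len(word_ids))  # pad up to the cutoff length

    print(truncate_or_pad([12, 7, 33], 5))   # -> [12, 7, 33, 0, 0]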
In one embodiment, prior to step 410, the method further comprises:
and determining the text cut length according to the text length of each sample answer text.
For a classification model constructed through a text-CNN neural network, if the text cutoff length is too short, on the one hand the information captured from the interview answer text is insufficient, which reduces the accuracy of classifying the interview answer text; on the other hand, too little text is processed per batch and the path to convergence becomes more random, so the classification accuracy of the trained model is not high. Conversely, if the text cutoff length is too long, the training time of the classification model becomes excessive, each batch takes too long to train, and training easily falls into a local optimum.
Therefore, in order to ensure the training efficiency of the classification model and the classification precision of the classification model, the text cut-off length is determined for the classification model according to the actual application scene of the classification model, namely, the text cut-off length is determined according to the text length of each sample answer text.
It will be appreciated that the text length of each sample answer text characterizes to some extent the range of text lengths of the interview answer texts, so that determining the text cutoff length from the text length of each sample answer text can adapt the determined text cutoff length to the actual situation in classifying the interview answer texts.
In one embodiment, as shown in fig. 6, determining a text cutoff length according to the text length of each sample answer text includes:
In step 510, the text length of each sample answer text is obtained by performing word segmentation on each sample answer text, the number of words obtained by segmenting a sample answer text being taken as the text length of that sample answer text.
In step 530, a text length mean and a text length standard deviation are calculated according to the text length of each sample answer text.
In step 550, a text cutoff length is determined based on the text length mean and the text length standard deviation.
In a specific embodiment, a weighted sum of the text length mean and the text length standard deviation, for example, a sum of the text length mean and the text length standard deviation, is used as the text cutoff length.
The text cutoff length, as determined by the text length mean and the text length standard deviation, balances the full retention of information for sample answer text or interview answer text with the improvement of training efficiency for classification models.
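For instance, taking the sum of the mean and the standard deviation as in the specific embodiment above, the calculation can be sketched as follows; the word counts are hypothetical.

    # Compute the text cutoff length as mean + standard deviation of the sample text lengths.
    import numpy as np

    sample_text_lengths = [56, 80, 43, 120, 95, 61]             # hypothetical word counts of sample answer texts
    mean_length = np.mean(sample_text_lengths)
    std_length = np.std(sample_text_lengths)
    text_cutoff_length = int(round(mean_length + std_length))   # weighted sum with weights (1, 1)
    print(text_cutoff_length)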
In one embodiment, as shown in fig. 7, prior to step 330, the method further comprises:
step 610, pre-constructing a neural network model according to the set capability items, wherein the neural network model comprises a fully-connected layer correspondingly constructed for each set capability item.
Step 630, training the neural network model through the plurality of sample answer texts and the label data corresponding to each sample answer text until the loss function of the neural network model converges, wherein the loss function is a weighted sum of the cross entropy on each set capability item.
Step 650, taking the neural network model when the loss function converges as a classification model.
For a given set capability item, the scoring level of the sample answer text or the interview answer text on that set capability item is a discrete random variable X with value set C and probability distribution function P(x) = P(X = x), x ∈ C. The information amount of the event X = x_0 is then:

I(x_0) = -\log(p(x_0))

Since the variable X can take several values, each value x_i having a corresponding probability p(x_i), the cross entropy on the set capability item is the expectation of the information amounts of all values on that set capability item, namely:

H(p_1) = \sum_{i=1}^{n} p_1(x_i)\, I(x_i) = -\sum_{i=1}^{n} p_1(x_i) \log(p_1(x_i))

where H(p_1) denotes the cross entropy on the set capability item p_1, p_1(x_i) denotes the probability that the variable X takes the value x_i on the set capability item p_1, and n denotes the number of values the variable X can take.

Thus, the loss function of the neural network model is:

L = \sum_{j=1}^{m} \lambda_j\, H(p_j)

where λ_j denotes the weight assigned to the j-th set capability item and m represents the number of set capability items.
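By way of illustration only, this weighted sum of per-item cross entropies can be written in a deep learning framework such as PyTorch as follows; the weight values are assumed hyperparameters and are not specified by the application.

    # Illustrative loss: weighted sum of the cross entropy on each set capability item.
    import torch
    import torch.nn.functional as F

    def multi_item_loss(logits_per_item, labels_per_item, item_weights):
        # logits_per_item: list of (batch, num_levels) tensors, one per set capability item
        # labels_per_item: list of (batch,) tensors holding the annotated scoring-level indices
        loss = 0.0
        for logits, labels, weight in zip(logits_per_item, labels_per_item, item_weights):
            loss = loss + weight * F.cross_entropy(logits, labels)
        return loss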
The training process of the pre-constructed neural network model is as follows: the scoring level of each sample answer text on each set capability item is predicted through the neural network model; if a predicted scoring level on a set capability item is inconsistent with the scoring level on that set capability item in the label data corresponding to the sample answer text, the model parameters of the neural network model are adjusted; otherwise, if they are consistent, training continues with the next sample answer text. During training, if the loss function converges, training is stopped, and the neural network model at the time the loss function converges is taken as the classification model. A minimal training sketch is given below.
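The following sketch reuses the InterviewClassifier and multi_item_loss sketches above; the batch contents, label values, item weights and number of steps are dummy assumptions rather than training settings of the application.

    # Minimal training sketch with dummy data; in practice the batches come from the
    # annotated sample answer texts and their label data.
    import torch

    model = InterviewClassifier(vocab_size=5000)                 # class from the earlier sketch
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    item_weights = [1.0] * 5                                     # assumed equal weights per set capability item

    token_ids = torch.randint(1, 5000, (8, 64))                  # dummy batch: 8 texts of cutoff length 64
    labels_per_item = [torch.randint(0, 4, (8,)) for _ in range(5)]  # dummy scoring-level labels

    for step in range(100):                                      # in practice, stop once the loss converges
        logits_per_item = model(token_ids)
        loss = multi_item_loss(logits_per_item, labels_per_item, item_weights)
        optimizer.zero_grad()
        loss.backward()                                          # adjust the model parameters
        optimizer.step()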
The following are embodiments of the apparatus of the present disclosure that may be used to perform the classification method embodiments of interview answer text performed by the apparatus 200 of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the classification method of the interview answer text of the present disclosure.
Fig. 8 is a block diagram illustrating an apparatus for classifying interview answer text according to an exemplary embodiment; the apparatus may be configured in the apparatus 200 of fig. 1 to perform all or part of the steps of the method of classifying interview answer text illustrated in any of the above method embodiments. As shown in fig. 8, the apparatus for classifying interview answer text includes, but is not limited to:
The acquisition module 710 is configured to acquire the interview answer text of the interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview.
The semantic vector construction module 730 is configured to construct a semantic vector of the interview answer text through a feature extraction layer of a constructed classification model, wherein the classification model is obtained by training on a plurality of sample answer texts and the label data annotated for each sample answer text, and the label data indicates the scoring level annotated for the interviewee on a set capability item according to the sample answer text.
The full connection module 750 is configured to perform full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer, the classification model comprises at least two fully connected layers, and each fully connected layer corresponds to one set capability item.
The classification prediction module 770 is configured to perform classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item.
The implementation process of the functions and roles of each module in the device is specifically shown in the implementation process of the corresponding steps in the interview answer text classification method, and is not repeated here.
It is to be understood that these modules may be implemented in hardware, software, or a combination of both. When implemented in hardware, these modules may be implemented as one or more hardware modules, such as one or more application specific integrated circuits. When implemented in software, the modules may be implemented as one or more computer programs executing on one or more processors, such as the program stored in memory 250 executed by processor 270 of fig. 1.
In one embodiment, the acquisition module 710 includes:
the collecting unit is used for collecting the reply voice of the interviewee aiming at the interview question in the interview process.
And the voice recognition unit is used for carrying out voice recognition on the reply voice to obtain an interview answer text corresponding to the reply voice.
In one embodiment, the semantic vector construction module 730 includes:
the word segmentation unit is used for segmenting the face test answer text through the feature extraction layer of the classification model to obtain a word sequence formed by a plurality of words.
The semantic vector construction unit is used for constructing the semantic vector of the interview answer text according to the codes corresponding to the words in the word sequence and the semantic weights corresponding to the words through the feature extraction layer.
In one embodiment, the classification model is constructed by text-CNN neural network, the apparatus further comprising:
and the text stage length acquisition module is used for acquiring the text cut length determined for word segmentation.
And the cutting module is used for cutting the text of the face test answer according to the obtained text cutting length, and taking the text reserved by cutting as an object for word segmentation.
In one embodiment, the apparatus further comprises:
and the text cut-off length determining module is used for determining the text cut-off length according to the text length of each sample answer text.
In one embodiment, the text cutoff length determining module includes:
the text length obtaining unit is used for obtaining the text length of each sample answer text by word segmentation of each sample answer text, and the number of words obtained by word segmentation of the sample answer text is used as the text length of the sample answer text.
And the calculating unit is used for calculating a text length mean value and a text length standard deviation according to the text length of each sample answer text.
And the determining unit is used for determining the text cut-off length according to the text length mean value and the text length standard deviation.
In one embodiment, the apparatus further comprises:
the pre-construction module is used for pre-constructing a neural network model according to the set capacity items, and the neural network model comprises a full-connection layer correspondingly constructed for each set capacity item.
The training module is used for training the neural network model through a plurality of sample answer texts and label data corresponding to each sample answer text until the loss function of the neural network model is converged, and the convergence function is a weighted sum of cross entropy on each set capacity item.
And the classification model obtaining module is used for taking the neural network model when the loss function converges as a classification model.
The implementation process of the functions and roles of each module/unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method for classifying interview answer texts, and will not be described herein again.
Optionally, the present disclosure further provides an electronic device, which may perform all or part of the steps of the method for classifying interview answer text shown in any of the above method embodiments. As shown in fig. 9, the electronic device includes:
a processor 1001; and
Memory 1002, on which memory 1002 is stored computer readable instructions which when executed by processor 1001 implement a method of any of the above method implementations.
The processor 1001 reads the computer readable instructions stored in the memory 1002 over a communication line/bus 1003 coupled to the memory, and the computer readable instructions, when executed by the processor 1001, implement the method in any of the embodiments above.
The specific manner in which the processor of the apparatus in this embodiment performs the operations has been described in detail in connection with the embodiment of the method of classifying interview response text and will not be described in detail herein.
In an exemplary embodiment, a computer readable storage medium is also provided, on which a computer program is stored which, when executed by a processor, implements the method in any of the method embodiments above. The computer readable storage medium may be, for example, the memory 250 storing the computer program, and the instructions are executable by the processor 270 of the apparatus 200 to implement the method of classifying interview answer text in any of the embodiments described above.
The particular manner in which the processor in this embodiment performs the operations has been described in detail in connection with embodiments of the method of classifying interview answer text and will not be described in detail herein.
The foregoing is merely illustrative of the preferred embodiments of the present application and is not intended to limit the embodiments of the present application, and those skilled in the art can easily make corresponding variations or modifications according to the main concept and spirit of the present application, so that the protection scope of the present application shall be defined by the claims.

Claims (9)

1. A method of classifying interview answer text, the method comprising:
acquiring the interview answer text of an interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview;
constructing a semantic vector of the interview answer text through a feature extraction layer of a constructed classification model, wherein the classification model is obtained by training on a plurality of sample answer texts and the label data annotated for each sample answer text, and the label data indicates the scoring level annotated for the interviewee on a set capability item according to the sample answer text;
performing full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer, the classification model comprises at least two fully connected layers, and each fully connected layer corresponds to one set capability item;
performing classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item;
wherein the classification model is constructed through a text-CNN neural network, and the method further comprises:
determining a text cutoff length according to the text length of each sample answer text;
acquiring a text cutoff length determined for word segmentation;
truncating the interview answer text according to the obtained text cutoff length, and taking the text retained after truncation as the object of word segmentation;
wherein the determining the text cutoff length according to the text length of each sample answer text comprises:
obtaining the text length of each sample answer text by performing word segmentation on each sample answer text, the number of words obtained by segmenting a sample answer text being taken as the text length of that sample answer text;
calculating a text length mean and a text length standard deviation according to the text length of each sample answer text;
determining the text cutoff length according to the text length mean and the text length standard deviation;
the method further comprises the steps of:
pre-constructing a neural network model according to a plurality of set capability items, wherein the neural network model comprises a fully connected layer correspondingly constructed for each set capability item;
training the neural network model through the plurality of sample answer texts and the label data corresponding to each sample answer text until a loss function of the neural network model converges, wherein the loss function is a weighted sum of the cross entropy on each set capability item;
taking the neural network model when the loss function converges as a classification model;
wherein, for a given set capability item, the scoring level of the sample answer text or the interview answer text on that set capability item is a discrete random variable X with value set C and probability distribution function P(x) = P(X = x), x ∈ C, and the information amount of the event X = x_0 is:

I(x_0) = -\log(p(x_0))

since the variable X can take several values, each value x_i having a corresponding probability p(x_i), the cross entropy on the set capability item is the expectation of the information amounts of all values on that set capability item, namely:

H(p_1) = \sum_{i=1}^{n} p_1(x_i)\, I(x_i) = -\sum_{i=1}^{n} p_1(x_i) \log(p_1(x_i))

wherein H(p_1) denotes the cross entropy on the set capability item p_1, p_1(x_i) denotes the probability that the variable X takes the value x_i on the set capability item p_1, and n denotes the number of values the variable X can take;

thus, the loss function of the neural network model is:

L = \sum_{j=1}^{m} \lambda_j\, H(p_j)

where λ_j denotes the weight assigned to the j-th set capability item and m represents the number of set capability items.
2. The method of claim 1, wherein the acquiring the interview answer text of the interviewee comprises:
collecting the reply voice of the interviewee for the interview questions during the interview;
and performing speech recognition on the reply voice to obtain the interview answer text corresponding to the reply voice.
3. The method of claim 1, wherein said constructing semantic vectors of said interview answer text by a feature extraction layer of said constructed classification model comprises:
the interview answer text is segmented through a feature extraction layer of the classification model, and a word sequence formed by a plurality of words is obtained;
and constructing a semantic vector of the interview answer text through the feature extraction layer according to codes corresponding to words in the word sequence and semantic weights corresponding to the words.
4. The method of claim 1, wherein prior to the obtaining the text cutoff length determined for word segmentation, the method further comprises:
and determining the text cut-off length according to the text length of each sample answer text.
5. The method of claim 4, wherein said determining said text cutoff length based on a text length of each of said sample answer texts comprises:
obtaining the text length of each sample answer text by performing word segmentation on each sample answer text, the number of words obtained by segmenting a sample answer text being taken as the text length of that sample answer text;
according to the text length of each sample answer text, calculating to obtain a text length mean value and a text length standard deviation;
and determining the text cut-off length according to the text length mean value and the text length standard deviation.
6. The method of claim 1, wherein prior to constructing the semantic vector of interview answer text by the feature extraction layer of the constructed classification model, the method further comprises:
pre-constructing a neural network model according to a plurality of set capability items, wherein the neural network model comprises a fully connected layer correspondingly constructed for each set capability item;
training the neural network model through the sample answer texts and the label data corresponding to each sample answer text until a loss function of the neural network model converges, wherein the loss function is a weighted sum of the cross entropy on each set capability item;
and taking the neural network model when the loss function converges as the classification model.
7. An apparatus for classifying interview answer text, the apparatus comprising:
an acquisition module, configured to acquire the interview answer text of an interviewee, wherein the interview answer text is obtained from the interviewee's replies to interview questions during the interview;
a semantic vector construction module, configured to construct a semantic vector of the interview answer text through a feature extraction layer of a constructed classification model, wherein the classification model is obtained by training on a plurality of sample answer texts and the label data annotated for each sample answer text, and the label data indicates the scoring level annotated for the interviewee on a set capability item according to the sample answer text;
a full connection module, configured to perform full connection on the semantic vector through each fully connected layer of the classification model to correspondingly obtain feature vectors, wherein the feature vector obtained at a fully connected layer characterizes the features of the sample answer text on the set capability item corresponding to that fully connected layer, the classification model comprises at least two fully connected layers, and each fully connected layer corresponds to one set capability item;
a classification prediction module, configured to perform classification prediction on the feature vector obtained at each fully connected layer to respectively obtain the interviewee's scoring level on each set capability item;
wherein the acquisition module is further configured to acquire the text cutoff length determined for word segmentation;
the apparatus further comprises a truncation module, configured to determine the text cutoff length according to the text length of each sample answer text, truncate the interview answer text according to the obtained text cutoff length, and take the text retained after truncation as the object of word segmentation;
the truncation module is further configured, in determining the text cutoff length according to the text length of each sample answer text, to obtain the text length of each sample answer text by performing word segmentation on each sample answer text, the number of words obtained by segmenting a sample answer text being taken as the text length of that sample answer text; calculate a text length mean and a text length standard deviation according to the text length of each sample answer text; and determine the text cutoff length according to the text length mean and the text length standard deviation;
the apparatus further comprises a model construction module, configured to pre-construct a neural network model according to a plurality of set capability items, wherein the neural network model comprises a fully connected layer correspondingly constructed for each set capability item;
train the neural network model through the plurality of sample answer texts and the label data corresponding to each sample answer text until a loss function of the neural network model converges, wherein the loss function is a weighted sum of the cross entropy on each set capability item;
taking the neural network model when the loss function converges as a classification model;
For a set capability item, the scoring level of the sample answer text or the interview answer text on that set capability item is regarded as a discrete random variable X with value set C and probability distribution function P(x) = p(X = x), x ∈ C. The information amount of the event X = x_0 is then:

I(x_0) = -log(p(x_0))

Since the variable X has a plurality of values, each value x_i having a corresponding probability p(x_i), the cross entropy on a set capability item is the expectation of the information amounts of all values on that set capability item, namely:

H(p_1) = -∑_{i=1}^{n} p_1(x_i)·log(p_1(x_i))

wherein H(p_1) represents the cross entropy on the set capability item p_1, p_1(x_i) represents the probability that the variable X takes the value x_i, and n represents the number of values of the variable X on the set capability item p_1.

Thus, the loss function of the neural network model is:

L = ∑_{j=1}^{m} w_j·H(p_j)

wherein m represents the number of set capability items and w_j represents the weight of the cross entropy on the j-th set capability item.
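The weighted sum of per-item cross entropies can be computed directly from the per-head outputs of such a network. The sketch below is a hedged illustration: the function name and the use of torch.nn.functional.cross_entropy (which applies log-softmax internally, so it expects raw logits) are assumptions, and the weights are left as free parameters because the description only fixes the loss as a weighted sum of the cross entropy on each set capability item.

```python
import torch
import torch.nn.functional as F

def weighted_cross_entropy_loss(logits_per_item, labels_per_item, weights):
    # logits_per_item: list of (batch, num_levels) tensors, one per head
    # labels_per_item: list of (batch,) tensors of annotated scoring levels
    # weights:         one scalar weight per set capability item
    loss = torch.zeros(())
    for w, logits, labels in zip(weights, logits_per_item, labels_per_item):
        # Cross entropy between the one-hot label distribution and the
        # predicted distribution for this set capability item.
        loss = loss + w * F.cross_entropy(logits, labels)
    return loss
```

Training then minimizes this loss until it converges, and the converged network serves as the classification model.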
8. An electronic device, comprising:
a processor; and
a memory having stored thereon computer readable instructions which, when executed by the processor, implement the method of any one of claims 1 to 6.
9. A computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor of a computer, implement the method of any of claims 1 to 6.
CN201910882034.0A 2019-09-18 2019-09-18 Method and device for classifying interview answer text, electronic equipment and storage medium Active CN110717023B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910882034.0A CN110717023B (en) 2019-09-18 2019-09-18 Method and device for classifying interview answer text, electronic equipment and storage medium
PCT/CN2019/118036 WO2021051586A1 (en) 2019-09-18 2019-11-13 Interview answer text classification method, device, electronic apparatus and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910882034.0A CN110717023B (en) 2019-09-18 2019-09-18 Method and device for classifying interview answer text, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110717023A CN110717023A (en) 2020-01-21
CN110717023B true CN110717023B (en) 2023-11-07

Family

ID=69210550

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910882034.0A Active CN110717023B (en) 2019-09-18 2019-09-18 Method and device for classifying interview answer text, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110717023B (en)
WO (1) WO2021051586A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111234A (en) * 2020-02-13 2021-07-13 北京明亿科技有限公司 Regular expression-based alarm condition category determination method and device
CN111522916B (en) * 2020-04-20 2021-03-09 马上消费金融股份有限公司 Voice service quality detection method, model training method and device
CN111695591B (en) * 2020-04-26 2024-05-10 平安科技(深圳)有限公司 AI-based interview corpus classification method, AI-based interview corpus classification device, AI-based interview corpus classification computer equipment and AI-based interview corpus classification medium
CN111695352A (en) * 2020-05-28 2020-09-22 平安科技(深圳)有限公司 Grading method and device based on semantic analysis, terminal equipment and storage medium
CN111709630A (en) * 2020-06-08 2020-09-25 深圳乐信软件技术有限公司 Voice quality inspection method, device, equipment and storage medium
CN113449095A (en) * 2021-07-02 2021-09-28 中国工商银行股份有限公司 Interview data analysis method and device
CN116452047A (en) * 2023-04-12 2023-07-18 上海才历网络有限公司 Candidate competence evaluation method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241288A (en) * 2018-10-12 2019-01-18 平安科技(深圳)有限公司 Update training method, device and the equipment of textual classification model
CN109522395A (en) * 2018-10-12 2019-03-26 平安科技(深圳)有限公司 Automatic question-answering method and device
CN109978339A (en) * 2019-02-27 2019-07-05 平安科技(深圳)有限公司 AI interviews model training method, device, computer equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548124B (en) * 2015-09-17 2021-09-07 松下知识产权经营株式会社 Theme estimation system and theme estimation method
CN113627458A (en) * 2017-10-16 2021-11-09 因美纳有限公司 Variant pathogenicity classifier based on recurrent neural network
CN108519975B (en) * 2018-04-03 2021-09-28 北京先声教育科技有限公司 Composition scoring method, device and storage medium
CN109670168B (en) * 2018-11-14 2023-04-18 华南师范大学 Short answer automatic scoring method, system and storage medium based on feature learning
CN109299246B (en) * 2018-12-04 2021-08-03 北京容联易通信息技术有限公司 Text classification method and device
CN109918497A (en) * 2018-12-21 2019-06-21 厦门市美亚柏科信息股份有限公司 A kind of file classification method, device and storage medium based on improvement textCNN model
CN109918506B (en) * 2019-03-07 2022-12-16 安徽省泰岳祥升软件有限公司 Text classification method and device

Also Published As

Publication number Publication date
WO2021051586A1 (en) 2021-03-25
CN110717023A (en) 2020-01-21

Similar Documents

Publication Publication Date Title
CN110717023B (en) Method and device for classifying interview answer text, electronic equipment and storage medium
CN111444344B (en) Entity classification method, entity classification device, computer equipment and storage medium
CN111222305A (en) Information structuring method and device
CN111191032B (en) Corpus expansion method, corpus expansion device, computer equipment and storage medium
CN110263854B (en) Live broadcast label determining method, device and storage medium
CN110502742B (en) Complex entity extraction method, device, medium and system
CN110516057B (en) Petition question answering method and device
CN111159404B (en) Text classification method and device
CN112183994A (en) Method and device for evaluating equipment state, computer equipment and storage medium
CN111428448A (en) Text generation method and device, computer equipment and readable storage medium
CN110717021B (en) Input text acquisition and related device in artificial intelligence interview
CN112100377B (en) Text classification method, apparatus, computer device and storage medium
CN113761868B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN117112744B (en) Assessment method and device for large language model and electronic equipment
CN110689359A (en) Method and device for dynamically updating model
CN109062977A (en) A kind of automatic question answering text matching technique, automatic question-answering method and system based on semantic similarity
CN111708890A (en) Search term determining method and related device
CN114817478A (en) Text-based question and answer method and device, computer equipment and storage medium
CN115905187B (en) Intelligent proposition system oriented to cloud computing engineering technician authentication
CN117195046A (en) Abnormal text recognition method and related equipment
CN113095073B (en) Corpus tag generation method and device, computer equipment and storage medium
CN112559713B (en) Text relevance judging method and device, model, electronic equipment and readable medium
Andersen et al. Towards More Reliable Text Classification on Edge Devices via a Human-in-the-Loop.
CN114357152A (en) Information processing method, information processing device, computer-readable storage medium and computer equipment
CN115774778A (en) Resume processing method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant