WO2021218028A1 - Artificial intelligence-based interview content refining method, apparatus and device, and medium


Publication number
WO2021218028A1
Authority
WO
WIPO (PCT)
Prior art keywords
interview
text
sentence
basic
vector
Prior art date
Application number
PCT/CN2020/118928
Other languages
French (fr)
Chinese (zh)
Inventor
邓悦
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021218028A1

Classifications

    • G06F 40/216: Parsing using statistical methods
    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G06F 18/23: Clustering techniques
    • G06F 18/23213: Non-hierarchical clustering techniques with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/24: Classification techniques
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295: Named entity recognition
    • G06F 40/30: Semantic analysis
    • G06N 3/045: Combinations of networks
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08: Learning methods
    • G06Q 10/0639: Performance analysis of employees; performance analysis of enterprise or organisation operations
    • G06Q 10/1053: Employment or hiring

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a method, device, equipment and medium for refining interview content based on artificial intelligence.
  • during peak recruitment seasons at large companies, many interviewees take part in interviews. At present, most employers and interviewers conduct interviews on site or through video conferences, and employers typically evaluate an interviewee after the interview based on the interviewee's responses.
  • conventional manual interviews have at least the following problems: (1) different interviewers have their own questioning preferences, and even the same interviewer may reach different judgments depending on workplace experience, interviewing skill and emotional state; (2) the cost is high. In view of this, some companies use artificial intelligence-based interview robots to conduct interviews and provide the interview content to decision makers for evaluation. This helps improve the fairness of interviews, but it also introduces a new problem: when there are many interviewees, more interview content is produced, which increases the time cost of decision-making evaluation and makes intelligent interviews inefficient.
  • existing solutions mainly apply keyword matching to the interview content to obtain key sentences, or use natural language processing (NLP) models for semantic recognition.
  • with keyword matching, different interviewees may phrase their answers differently, so the preset keywords may fail to match, resulting in low accuracy of the final interview assessment.
  • when a general-purpose natural language processing model is used for semantic recognition, the recognition accuracy often fails to meet requirements.
  • the embodiments of this application provide an artificial intelligence-based interview content refining method, apparatus, device and medium, so as to improve the accuracy of interview content evaluation in intelligent interviews.
  • an embodiment of the present application provides an artificial intelligence-based method for refining interview content, including:
  • obtaining interview recordings and converting the interview recordings into interview text, where the interview text includes self-introduction text and interview response text;
  • the basic information of the interviewer and the refined corpus of the interview are sent to the management terminal, so that the management terminal determines the interview result according to the basic information of the interviewer and the refined corpus of the interview.
  • an embodiment of the present application also provides an artificial intelligence-based interview content refining device, including:
  • a text acquisition module for acquiring interview recordings and converting the interview recordings into interview texts, where the interview texts include self-introduction texts and interview response texts;
  • the text analysis module is used for text analysis of the self-introduction text to obtain basic information of the interviewer
  • the text classification module is used to classify the interview response text according to the interview angle involved to obtain the classified text
  • the corpus extraction module is used to extract sentences from each type of the classified text through the language extraction model to obtain the extracted sentences, and use the Transformer model to refine the extracted sentences to obtain the interview refined corpus;
  • the information sending module is configured to send the basic information of the interviewer and the refined corpus of the interview to the management terminal, so that the management terminal determines the interview result according to the basic information of the interviewer and the refined corpus of the interview.
  • an embodiment of the present application also provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor executes the computer-readable instructions to implement the following steps of the artificial intelligence-based interview content refining method:
  • obtaining interview recordings and converting the interview recordings into interview text, where the interview text includes self-introduction text and interview response text;
  • the basic information of the interviewer and the refined corpus of the interview are sent to the management terminal, so that the management terminal determines the interview result according to the basic information of the interviewer and the refined corpus of the interview.
  • embodiments of the present application also provide a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps of the artificial intelligence-based interview content refining method:
  • obtaining interview recordings and converting the interview recordings into interview text, where the interview text includes self-introduction text and interview response text;
  • the basic information of the interviewer and the refined corpus of the interview are sent to the management terminal, so that the management terminal determines the interview result according to the basic information of the interviewer and the refined corpus of the interview.
  • the artificial intelligence-based interview content refining method, apparatus, device and medium above obtain an interview recording and convert the interview recording into interview text.
  • the interview texts include self-introduction text and interview response text.
  • the self-introduction text is analyzed to obtain the basic information of the interviewer, and the interview response text is classified according to the interview angle involved to obtain the classified text.
  • the Transformer model is used to refine the extracted sentences to obtain the refined interview corpus, which can accurately extract the core content from large volumes of interview-record content, improving the accuracy of content extraction and helping improve the accuracy of intelligent interview evaluation.
  • the interviewer's basic information and the refined interview corpus are sent to the management side, so that the management side can determine the interview result based on them, avoiding the inaccurate evaluation results caused by direct semantic recognition and improving the accuracy and efficiency of intelligent interview evaluation.
  • Fig. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • Fig. 2 is a flowchart of an embodiment of the artificial intelligence-based interview content refining method of the present application;
  • Fig. 3 is a schematic structural diagram of an embodiment of an artificial intelligence-based interview content refining device according to the present application;
  • Fig. 4 is a schematic structural diagram of an embodiment of a computer device according to the present application.
  • the system architecture 100 may include terminal devices 101, 102, and 103, a network 104 and a server 105.
  • the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104 to receive or send messages and so on.
  • the terminal devices 101, 102, 103 may be various electronic devices with a display screen that support web browsing, including but not limited to smart phones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and so on.
  • the server 105 may be a server that provides various services, for example, a background server that provides support for pages displayed on the terminal devices 101, 102, and 103.
  • the artificial intelligence-based interview content refining method provided by the embodiment of the present application is executed by the server, and accordingly, the artificial intelligence-based interview content refining device is set in the server.
  • terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to implementation needs, there may be any number of terminal devices, networks, and servers.
  • the terminal devices 101, 102, and 103 in the embodiments of the present application may specifically correspond to application systems in actual production.
  • FIG. 2 shows an artificial intelligence-based interview content refining method provided by an embodiment of the present application.
  • the application of the method to the server in FIG. 1 is taken as an example for description. The details are as follows:
  • S201 Obtain interview recordings and convert the interview recordings into interview texts, where the interview text includes self-introduction text and interview response text.
  • in an enterprise's recruitment process, many interviewees take part, and because the number of positions is limited, multiple interviewees may interview for the same position. To avoid confusing or forgetting interviewee information, this embodiment records the interview process of each interviewee, converts the recorded content into interview text afterwards, and carries out subsequent processing.
  • the interview text includes self-introduction text and interview response text.
  • the self-introduction text refers to the text obtained by voice-to-text conversion of the interviewee's self-introduction, and the interview response text refers to the text of the questions the interviewer asks after the self-introduction and the interviewee's responses to them.
  • interviewer mentioned in this embodiment may be a person or a question-and-answer robot participating in a smart interview, which is not specifically limited here.
  • this embodiment takes the self-introduction as the starting point because the information in the self-introduction can already summarize a large part of the interviewee's abilities; other aspects of the interview, such as skill probes and business-acumen probes, can serve as references and be used as training data to supplement and verify the self-introduction, yielding a more comprehensive result.
  • for voice-to-text conversion, a tool that supports voice-to-text conversion can be used, or a voice-to-text algorithm can be used; there is no specific limitation here.
  • S202 Perform text analysis on the self-introduction text to obtain basic information of the interviewer.
  • the self-introduction text generally includes basic personal information, experience information, areas of expertise and skills, past honors and awards, and self-evaluation, the content modules involved are relatively similar.
  • therefore, this embodiment uses regular expressions to analyze the self-introduction text, quickly extracting its content to obtain the basic information of the interviewer.
  • the basic information of the interviewer includes, but is not limited to: fixed personal information such as name, household registration, school attended, major and years of work experience, and professional information such as honors received, companies worked for, and experience and skills mastered.
  • specifically, the basic information to be obtained is divided into multiple dimensions, and at least one regular expression is set for each dimension; the self-introduction text is matched and analyzed against these expressions, and the matched content is taken as the analysis content of that dimension.
  • a regular expression describes a string-matching pattern, which can be used to check whether a string contains a certain substring, to replace a matched substring, or to extract substrings that meet a given condition.
  • text analysis is carried out from the seven dimensions of name, household registration, graduate school, major, working years, employment experience, and skills mastered.
  • for example, the household registration dimension can be set with matching keywords containing specific characters, e.g. matching sentence patterns that contain phrases such as "I am from XXX", "I come from XXX" or "I grew up in XXX".
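As an illustration of the dimension-matching idea above, the following is a minimal Python sketch for the household-registration dimension; the pattern list and function name are hypothetical stand-ins, and a real deployment would use the sentence templates of the original-language text.

```python
import re

# Hypothetical patterns mirroring templates like "I come from XXX" / "I grew up in XXX".
HOMETOWN_PATTERNS = [
    re.compile(r"I come from (\w+)"),
    re.compile(r"I am from (\w+)"),
    re.compile(r"I grew up in (\w+)"),
]

def extract_hometown(self_intro: str):
    """Return the first hometown match found in the self-introduction text."""
    for pattern in HOMETOWN_PATTERNS:
        m = pattern.search(self_intro)
        if m:
            return m.group(1)
    return None  # this dimension stays empty when nothing matches

print(extract_hometown("Hello, my name is Li Lei and I grew up in Shenzhen."))  # -> Shenzhen
```

In practice each dimension would carry its own pattern list, and unmatched dimensions are simply left blank rather than guessed.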
  • S203 According to the interview angle involved, classify the interview response text to obtain the classified text.
  • interviewer questions usually revolve around work experience, fields of expertise and skills mastered; these interview angles are preset according to actual needs, and the interview response text is classified accordingly to obtain the classified text, so that key sentences can subsequently be extracted and refined per category, which helps improve the accuracy of content refining.
  • interview angle involved refers to the focus of questions and responses, such as salary requirements, awards, working years, professional skills, etc.
  • specifically, semantic recognition is performed on the interview response text, and sentences are classified according to the semantic recognition results to obtain the classified text.
  • classifying sentences by semantic recognition result may specifically involve clustering the recognition results, computing the Euclidean distance between each clustering result and the word vector corresponding to each interview angle, and taking the nearest interview angle as the one corresponding to that clustering result.
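The nearest-angle assignment can be sketched as follows; the angle names and the toy 4-dimensional vectors are illustrative assumptions, not values from the application.

```python
import numpy as np

# Hypothetical word vectors for preset interview angles (dimension 4 for brevity).
angle_vectors = {
    "work_experience": np.array([1.0, 0.0, 0.0, 0.0]),
    "professional_skills": np.array([0.0, 1.0, 0.0, 0.0]),
    "salary_requirements": np.array([0.0, 0.0, 1.0, 0.0]),
}

def nearest_angle(cluster_center: np.ndarray) -> str:
    """Assign a cluster to the interview angle whose vector is closest in Euclidean distance."""
    return min(angle_vectors,
               key=lambda a: np.linalg.norm(cluster_center - angle_vectors[a]))

print(nearest_angle(np.array([0.1, 0.9, 0.0, 0.1])))  # -> professional_skills
```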
  • S204 Use the language extraction model to extract sentences from each type of classified text to obtain the extracted sentences, and use the Transformer model to refine the extracted sentences to obtain interview refined corpus.
  • the language extraction model is used to extract sentences from each type of classified text to obtain the extracted sentences, and then use the Transformer model to refine the extracted sentences to obtain the interview refined corpus.
  • language extraction models include but are not limited to: the Embeddings from Language Models (ELMo) algorithm, OpenAI GPT, and the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model.
  • the improved OpenAI GPT model is used as the semantic extraction model.
  • sentence extraction please refer to the description of the subsequent embodiments. In order to avoid repetition, it will not be repeated here.
  • the specific expression form of the extracted sentence obtained in this embodiment may also be in the form of a vector, so that it can be quickly input to the Transformer model for refinement and extraction.
  • the Transformer model can quickly extract sentences of higher importance based on the weight through the attention mechanism.
  • in the decoding stage of the Transformer model of this embodiment, the sum of the generated document feature vectors is input to the decoder.
  • this autoregressive long short-term memory network predicts the next sentence to be extracted, and its output is concatenated to the input when decoding the next sentence.
  • the biggest difference between the decoder used in the Transformer model of this embodiment and other commonly used decoders is that, in the process of obtaining attention through dot products, if the same index appears twice in succession the entire extraction process ends, so that similar information is not extracted repeatedly and information redundancy is avoided.
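The stop-on-repeated-index rule can be sketched as follows; the score lists are toy stand-ins for the attention scores the decoder would produce via dot products.

```python
import numpy as np

def extract_until_repeat(scores_per_step):
    """Pick the argmax sentence index at each decode step; end the whole
    extraction as soon as the same index is chosen twice in succession,
    avoiding redundant extraction of similar information."""
    chosen = []
    for scores in scores_per_step:
        idx = int(np.argmax(scores))
        if chosen and chosen[-1] == idx:  # same index twice in a row -> stop everything
            break
        chosen.append(idx)
    return chosen

# The second step repeats index 2, so extraction stops after one sentence.
print(extract_until_repeat([[0.1, 0.2, 0.7], [0.05, 0.15, 0.8], [0.9, 0.05, 0.05]]))  # -> [2]
```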
  • it should be noted that there is no necessary logical order between steps S203-S204 and step S202; they can also be executed in parallel, which is not limited here.
  • S205 Send the interviewer's basic information and the interview refined corpus to the management terminal, so that the management terminal determines the interview result based on the interviewer's basic information and the interview refined corpus.
  • the extracted basic information of the interviewer and the refined interview corpus are sent to the management end; because the extracted content is accurate and concise, the management-end user can subsequently determine the evaluation result accurately and quickly, which helps improve the accuracy and efficiency of intelligent interviews.
  • the interview recording is obtained and converted into interview text.
  • the interview text includes self-introduction text and interview response text.
  • the self-introduction text is analyzed to obtain the basic information of the interviewer, and the interview response text is classified according to the interview angle involved to obtain the classified text.
  • the sentence is extracted from each type of classified text to obtain the extracted sentence, and the Transformer model is used to refine the extracted sentence to obtain the interview refined corpus.
  • the core content is thus accurately extracted from a large volume of interview-record content, improving the accuracy of content extraction and, in turn, the accuracy of intelligent interview evaluation.
  • the basic information of the interviewee and the refined interview corpus are sent to the management end so that the management terminal can determine the interview results based on them, avoiding the inaccurate evaluation results caused by direct semantic recognition, which helps improve the accuracy and efficiency of intelligent interview evaluation.
  • it should be emphasized that the obtained basic interviewee information and refined interview corpus can also be stored on a blockchain network; blockchain storage allows the data to be shared between different platforms and prevents the data from being tampered with.
  • a blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • further, in step S201, converting the interview recording into interview text includes:
  • traversing the interview recording file to find a speech segment whose voice information matches a preset question-and-answer start identifier, and using that segment as a demarcation point: the speech before the segment is converted into text as the self-introduction text, and the speech after it is converted into text as the interview response text.
  • specifically, the speech signal can undergo amplitude normalization, pre-emphasis, and framing with windowing to obtain a set of speech frames; the frame segment identical to the frames of the preset question-and-answer start identifier is then found in this set by traversal and comparison, and that segment is determined to be the speech segment matching the identifier.
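A minimal numpy sketch of that front end; the frame length, hop size and pre-emphasis coefficient are assumed values typical for 16 kHz audio and are not specified by the application.

```python
import numpy as np

def preprocess(signal, frame_len=400, hop=160, alpha=0.97):
    """Sketch of the front end described above: amplitude normalization,
    pre-emphasis, then framing with a Hamming window (assumed 25 ms frames,
    10 ms hop at 16 kHz, pre-emphasis coefficient 0.97)."""
    x = signal / (np.max(np.abs(signal)) + 1e-12)      # amplitude normalization
    x = np.append(x[0], x[1:] - alpha * x[:-1])        # pre-emphasis filter
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([x[i * hop : i * hop + frame_len] * window  # framing + windowing
                       for i in range(n_frames)])
    return frames

frames = preprocess(np.random.randn(16000))  # one second of fake 16 kHz audio
print(frames.shape)  # -> (98, 400)
```

The resulting frame set is what would then be scanned for the segment matching the question-and-answer start identifier.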
  • the preset question-and-answer start identifier signals that the self-introduction stage is over and the question-and-answer session begins, for example a voice prompt such as "Thank you for your introduction, now I would like to ask you a few questions", which can be set according to the actual situation.
  • voice-to-text conversion can use a speech recognition algorithm, or a third-party tool with a voice conversion function; the specifics are not limited here.
  • speech-to-text algorithms include, but are not limited to: speech recognition algorithms based on vocal tract models, speech template matching algorithms, and artificial neural network speech recognition algorithms, etc.
  • the interview recording text is converted into the self-introduction text and the interview response text, so that the two types of text are processed separately in the subsequent process, which is more targeted and the processing result obtained is more accurate.
  • further, in step S203, classifying the sentences of the interview response text according to the interview angle involved to obtain the classified text includes:
  • obtaining the cluster center corresponding to each sentence, and then calculating the distance between the cluster center and the word vector of each preset interview angle to determine the classification to which each sentence belongs.
  • the preset word segmentation methods include, but are not limited to: third-party word segmentation tools or word segmentation algorithms, etc.
  • common third-party word segmentation tools include but are not limited to: Stanford NLP word segmentation, ICTCLAS word segmentation system, ansj word segmentation tool and HanLP Chinese word segmentation tool, etc.
  • word segmentation algorithms include, but are not limited to: rule-based word segmentation methods, statistics-based word segmentation methods, understanding-based word segmentation methods, and neural network word segmentation methods.
  • rule-based word segmentation methods mainly include: minimum matching, maximum matching, reverse maximum matching, bidirectional maximum matching (BMM), sign segmentation, full-segmentation path selection, and the association-backtracking method (AB method), etc.
  • statistics-based word segmentation methods mainly include: the N-gram model, Hidden Markov Model (HMM) sequence labeling, Maximum Entropy Model (MEM) sequence labeling, Maximum Entropy Markov Model (MEMM) sequence labeling, and Conditional Random Field (CRF) sequence labeling, etc.
  • this embodiment adopts an improved CRF model for word segmentation, and the specific implementation process can refer to the description of the subsequent embodiments. In order to avoid repetition, it will not be repeated here.
  • a clustering algorithm, also called cluster analysis, is a statistical analysis method for classifying samples or indicators, and an important algorithm in data mining.
  • clustering algorithms include but are not limited to: the K-Means algorithm, mean-shift clustering, density-based clustering (Density-Based Spatial Clustering of Applications with Noise, DBSCAN), expectation-maximization clustering based on Gaussian mixture models, agglomerative hierarchical clustering, and graph community detection algorithms, etc.
  • a K-Means clustering algorithm is adopted.
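A minimal K-Means sketch on toy 2-D points standing in for sentence vectors; the deterministic initialization is chosen for reproducibility, and a library such as scikit-learn would normally be used instead.

```python
import numpy as np

def kmeans(points, k=2, iters=10):
    """Minimal K-Means: assign each point to its nearest centroid,
    then recompute centroids, for a fixed number of iterations."""
    # deterministic init: spread the initial centroids across the data
    centroids = points[np.linspace(0, len(points) - 1, k).astype(int)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = np.argmin(dists, axis=1)          # nearest-centroid assignment
        centroids = np.stack([points[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

# Two well-separated blobs standing in for sentence vectors from two interview angles.
pts = np.array([[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 4.9]])
labels, centers = kmeans(pts)
print(labels)  # -> [0 0 1 1]
```

Each resulting cluster center would then be compared against the interview-angle word vectors as described above.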
  • the classification of each sentence in the interview response text is determined, which is beneficial to the subsequent refinement of sentences of different classifications in a targeted manner.
  • performing word segmentation processing on the basic sentence through a preset word segmentation method to obtain the basic word segmentation includes:
  • the weight of the initial word segmentation is generated, and the weighted initial word segmentation is marked as the basic word segmentation.
  • specifically, a conditional random field model is used to segment the basic sentence into initial word segments; the word frequency of each initial segment is then obtained from historical interview response texts, and a weight is generated for each initial segment according to its word frequency. The resulting weighted basic word segments make the proportion of each segment better match the needs of the interview scenario when the segments are subsequently marked.
  • a conditional random field (CRF) model is a discriminative probabilistic model and a type of random field, often used to label or analyze sequence data; it represents the Markov random field of an output random variable sequence Y conditioned on a given set of input random variables X, and performs well in sequence tagging tasks such as word segmentation, part-of-speech tagging and named entity recognition.
  • the historical interview response text refers to the interview response text generated by the interview that has occurred.
  • the word frequency of the historical interview response text can reflect the proportion of some word segmentation in the interview process.
  • a weight is assigned to each initial segment produced by the conditional random field model, yielding basic word segments that better suit the intelligent interview scenario, which is beneficial for improving classification accuracy.
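The frequency-based weighting can be sketched as follows. The CRF segmentation itself is assumed to have already produced the token lists, and the relative-frequency weight with a count floor for unseen words is an illustrative choice, not a detail taken from the patent.

```python
from collections import Counter

def weight_segments(initial_segments, historical_texts):
    """Weight each initial segment by its relative frequency in historical
    interview response texts (each historical text assumed pre-segmented)."""
    freq = Counter(tok for text in historical_texts for tok in text)
    total = sum(freq.values()) or 1
    # unseen segments get a count floor of 1 so their weight is never zero
    return [(tok, max(freq[tok], 1) / total) for tok in initial_segments]

history = [["project", "experience", "team"], ["team", "communication"]]
weighted = weight_segments(["team", "hobby"], history)  # "team" seen twice, "hobby" unseen
```

Frequent interview vocabulary thus receives a larger share of the total weight, which is what lets later steps emphasize segments typical of the interview scene.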
  • the language extraction model is a bidirectional long short-term memory (BiLSTM) network model.
  • the bidirectional LSTM model includes a sentence encoder and a document encoder; performing sentence extraction from each class of classified text to obtain the extracted sentences includes:
  • the forward and reverse hidden-layer outputs of the sentence encoding results are concatenated into a hidden-layer vector, and the hidden-layer vector is input to the document encoder;
  • the hidden layer vector is weighted by the document encoder to obtain the document feature vector, and the document feature vector is decoded, and the output result obtained by the decoding is used as the extraction sentence.
  • the text in the classified text is split and encoded by the sentence encoder to obtain the encoded content; the encoded content is then input into the character encoding layer to obtain the character vector corresponding to each code; each character vector is used as the sentence encoding result, passed to the document encoder through the hidden layer, and weighted by the document encoder to obtain the extracted sentences.
  • the forward and reverse hidden-layer outputs corresponding to each character in the model are concatenated into a hidden-layer vector h_i = [h_i^+; h_i^-], where the forward direction is denoted by superscript +, the reverse direction by superscript -, and the i-th character by subscript i.
  • Long Short-Term Memory (LSTM) is a recurrent neural network suited to processing and predicting important events with relatively long intervals and delays in a time series.
  • a one-way LSTM reads a sentence from the first word to the last, following human reading order.
  • such a structure captures only the preceding context and not the following context, whereas a bidirectional LSTM is composed of two LSTMs running in opposite directions.
  • one LSTM reads the data front to back in sentence word order, and the other reads it back to front, so that the first LSTM obtains the preceding context.
  • the other LSTM obtains the following context.
  • the joint states of the two LSTMs represent the context of the entire sentence; because this context is provided by the whole sentence, it naturally contains more abstract semantic information (the meaning of the sentence).
  • the advantage of this method is that it makes full use of the strength of LSTMs in processing sequential data with temporal characteristics, and because position features are included in the input, the entity-direction information they contain can be extracted after bidirectional LSTM encoding.
  • the sentence encoder and the document encoder analyze and extract the classified sentences with two bidirectional LSTM networks at different levels, improving the accuracy of key-sentence extraction.
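The concatenation h_i = [h_i^+; h_i^-] that both encoders rely on can be illustrated as below. The toy state values and the per-direction hidden size of 2 are assumptions for the example; in a real BiLSTM these states come from learned weights.

```python
import numpy as np

def bidirectional_states(forward_states, backward_states):
    """Concatenate the forward (superscript +) and reverse (superscript -)
    hidden states of each character into one hidden-layer vector."""
    return np.concatenate([forward_states, backward_states], axis=-1)

# toy states for a 3-character sentence, hidden size 2 per direction;
# backward_states[i] is the reverse LSTM's state at character i
fwd = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
bwd = np.array([[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]])
h = bidirectional_states(fwd, bwd)  # one 4-dimensional vector per character
```

Each concatenated vector carries both the preceding and the following context of its character, which is exactly why the document encoder can weight them to find key sentences.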
  • weighting the hidden-layer vectors by the document encoder to obtain the document feature vector includes computing C_i = Σ_{j=1}^{n} b_ij · h_j, where:
  • C_i is the i-th document feature vector
  • j is the index of the embedding code
  • n is the number of embedding codes
  • b_ij is the weight of the i-th document feature vector on the j-th hidden-layer vector
  • h_j is the j-th hidden-layer vector.
  • generating document feature vectors through this weighted calculation is beneficial for accurately extracting key sentences.
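A minimal sketch of the weighted calculation C_i = Σ_j b_ij · h_j: the patent does not specify how the weights b_ij are produced, so softmax-normalizing alignment scores (a common attention-style choice) is assumed here for illustration.

```python
import numpy as np

def document_feature_vector(hidden_vectors, scores):
    """Compute C_i = sum_j b_ij * h_j; the weights b_ij are produced here by
    softmax-normalizing alignment scores, an assumed (common) choice."""
    e = np.exp(scores - scores.max())
    b = e / e.sum()                     # weights b_ij sum to 1
    return (b[:, None] * hidden_vectors).sum(axis=0)

h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # three hidden-layer vectors h_j
c = document_feature_vector(h, np.array([0.0, 0.0, 0.0]))  # equal scores -> mean of h_j
```

With equal scores the result reduces to the mean of the hidden-layer vectors; unequal scores let the document encoder emphasize the vectors most relevant to the i-th output.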
  • Fig. 3 shows a schematic block diagram of an artificial intelligence-based interview content refining device, corresponding one-to-one to the above-mentioned embodiment of the artificial intelligence-based interview content refining method.
  • the artificial intelligence-based interview content refining device includes a text acquisition module 31, a text analysis module 32, a text classification module 33, a corpus extraction module 34, and an information sending module 35.
  • the detailed description of each functional module is as follows:
  • the text acquisition module 31 is used to acquire interview recordings and convert the interview recordings into interview texts, where the interview text includes self-introduction text and interview response text;
  • the text analysis module 32 is used for text analysis of the self-introduction text to obtain basic information of the interviewer;
  • the text classification module 33 is used to classify the interview response text according to the interview angle involved to obtain the classified text
  • the corpus extraction module 34 is used to extract sentences from each type of classified text through the language extraction model to obtain the extracted sentences, and use the Transformer model to refine the extracted sentences to obtain the interview refined corpus;
  • the information sending module 35 is used to send the interviewer's basic information and the interview refined corpus to the management terminal, so that the management terminal determines the interview result based on the interviewer's basic information and the interview refined corpus.
  • the text acquisition module 31 includes:
  • the text determination unit is used to convert the interview recording to text using a speech-to-text method, take the text converted from the recording content before the question-and-answer start marker as the self-introduction text, and take the text converted from the recording content after the question-and-answer start marker as the interview response text.
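The split performed by the text determination unit can be sketched as follows; the marker string `[Q&A START]` and the sample transcript are assumptions made for the example, not conventions fixed by the patent.

```python
def split_interview_text(full_text, qa_marker):
    """Split the speech-to-text transcript at the question-and-answer start
    marker: text before it is the self-introduction, text after it the
    interview response."""
    before, _, after = full_text.partition(qa_marker)
    return before.strip(), after.strip()

intro, response = split_interview_text(
    "My name is Li Lei. [Q&A START] I led a payments project last year.",
    "[Q&A START]",
)
```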
  • the text classification module 33 includes:
  • the word segmentation unit is used to treat each sentence in the interview response text as a basic sentence, and use the preset word segmentation method to segment the basic sentence to obtain the basic word segmentation;
  • the clustering unit is used to convert the basic word segmentation into a word vector, and use a clustering algorithm to cluster the word vector to obtain the cluster center corresponding to the basic sentence;
  • the classification unit is used to calculate, for each basic sentence, the Euclidean distance between the cluster center corresponding to the basic sentence and the word vector corresponding to each preset interview angle, use the preset interview angle with the smallest distance as the target classification of the basic sentence, and take the basic sentence as classified text of the target classification.
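The nearest-angle assignment performed by the classification unit can be sketched as below; the two angle vectors and their meanings are illustrative assumptions.

```python
import numpy as np

def classify_sentence(cluster_center, angle_vectors):
    """Return the index of the preset interview-angle vector closest to the
    sentence's cluster center under Euclidean distance."""
    dists = np.linalg.norm(angle_vectors - cluster_center, axis=1)
    return int(dists.argmin())

angles = np.array([[0.0, 0.0], [5.0, 5.0]])  # e.g. two preset interview-angle vectors
idx = classify_sentence(np.array([4.5, 4.8]), angles)  # nearest is angles[1]
```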
  • the word segmentation unit includes:
  • the initial word segmentation subunit is used to segment the basic sentence with the conditional random field model to obtain the initial word segments
  • the word frequency acquisition subunit is used to obtain the word frequency of each initial segment from the historical interview response texts
  • the word segmentation weighting subunit is used to generate the weight of each initial segment based on its word frequency, and to use the weighted initial segments as the basic word segmentation.
  • the corpus extraction module 34 includes:
  • the splitting unit is used to split the text in the classified text by the sentence encoder according to the characters to obtain the basic characters
  • the coding unit is used to encode the basic character to obtain the coding content corresponding to the basic character
  • the mapping unit is used to input the encoded content to the character encoding layer with initialized weights, map each code into a character vector through the character encoding layer, and use each character vector as a sentence encoding result;
  • the splicing unit is used to splice the sentence encoding result into a hidden layer vector in the forward and reverse hidden layer output, and input the hidden layer vector to the document encoder;
  • the weighting unit is used to weight the hidden layer vector by the document encoder to obtain the document feature vector, decode the document feature vector, and use the decoded output result as the extraction sentence.
  • the weighted decoding unit includes:
  • the calculation subunit is used to determine the document feature vector using the formula C_i = Σ_{j=1}^{n} b_ij · h_j, where:
  • C_i is the i-th document feature vector
  • j is the index of the embedding code
  • n is the number of embedding codes
  • b_ij is the weight of the i-th document feature vector on the j-th hidden-layer vector
  • h_j is the j-th hidden-layer vector.
  • the device for refining interview content based on artificial intelligence further includes:
  • the storage module is used to store the basic information of the interviewer and the refined corpus of the interview in the blockchain network.
  • each module in the above artificial intelligence-based interview content refining device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • FIG. 4 is a block diagram of the basic structure of the computer device in this embodiment.
  • the computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are communicatively connected to each other via a system bus. It should be pointed out that the figure only shows a computer device 4 with the memory 41, the processor 42, and the network interface 43; it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions.
  • its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded devices, etc.
  • the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • the memory 41 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, etc.
  • the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4.
  • the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk equipped on the computer device 4, a smart memory card (Smart Media Card, SMC), and a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc.
  • the memory 41 may also include both the internal storage unit of the computer device 4 and its external storage device.
  • the memory 41 is generally used to store an operating system and various application software installed in the computer device 4, such as program codes for controlling electronic files.
  • the memory 41 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 42 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 42 is generally used to control the overall operation of the computer device 4.
  • the processor 42 is configured to run program codes or process data stored in the memory 41, for example, run program codes for controlling electronic files.
  • the network interface 43 may include a wireless network interface or a wired network interface, and the network interface 43 is generally used to establish a communication connection between the computer device 4 and other electronic devices.
  • the computer-readable storage medium may be non-volatile or volatile; it stores an interface display program that can be executed by at least one processor, so that the at least one processor executes the steps of the above-mentioned artificial intelligence-based interview content refining method.
  • the technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to make a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An artificial intelligence-based interview content refining method, apparatus and device, and a medium. The method comprises: obtaining an interview recording and converting the interview recording into self-introduction text and interview response text (S201); performing text parsing on the self-introduction text, and obtaining basic information of an interviewee (S202); classifying sentences of the interview response text to obtain classified text (S203); using a language extraction model to extract sentences from each type of classified text to obtain extracted sentences, and using a Transformer model to refine the extracted sentences to obtain a refined interview corpus (S204). Thus, the accurate extraction of core content from content of an interview recording having a large amount of data is achieved, and the accuracy of content extraction is improved, which helps to improve the accuracy of intelligent interview evaluation. The basic information of the interviewee and the refined interview corpus are stored in a blockchain and are sent at the same time to a management end for evaluation, which avoids direct semantic recognition that causes evaluation results to fail to meet requirements, which helps to improve the accuracy and efficiency of the intelligent evaluation of interview results.

Description

Artificial intelligence-based interview content refining method, apparatus, device, and medium

This application claims priority to Chinese patent application No. 202010356767.3, filed with the Chinese Patent Office on April 29, 2020 and titled "Artificial Intelligence-based Interview Content Refining Method, Apparatus, Equipment, and Media", the entire contents of which are incorporated herein by reference.

Technical Field

This application relates to the field of artificial intelligence, and in particular to a method, apparatus, device, and medium for refining interview content based on artificial intelligence.

Background
During the peak recruitment season at large enterprises, many candidates take part in interviews; at present, most employers interview candidates on site or by video conference, and usually evaluate a candidate after the interview based on his or her responses. Conventional manual interviews have at least the following problems: (1) different interviewers have different preferences in how they ask questions, and even the same interviewer may judge differently depending on workplace experience, interviewing skill, and emotional state; (2) high labor costs and interview time costs. In view of this, some companies use artificial intelligence-based interview robots to conduct interviews and provide the resulting interview content to decision makers for evaluation. This improves the fairness of interviews, but also introduces a new problem: when there are many candidates, a large amount of interview content is produced, which increases the time cost of decision-making evaluation and makes intelligent interviews inefficient.

Existing solutions mainly obtain key sentences by keyword matching on the interview content, or perform semantic recognition with a Natural Language Processing (NLP) model. With keyword matching, candidates may answer questions in different ways, so preset keywords may fail to match, resulting in low accuracy of the final interview evaluation; with a general-purpose NLP model, the accuracy of semantic recognition also often fails to meet requirements.
Summary of the Invention

The embodiments of this application provide an artificial intelligence-based interview content refining method, apparatus, and medium, so as to improve the accuracy of interview content evaluation in intelligent interviews.

To solve the above technical problem, an embodiment of this application provides an artificial intelligence-based interview content refining method, including:

obtaining an interview recording, and converting the interview recording into interview text, where the interview text includes self-introduction text and interview response text;

performing text parsing on the self-introduction text to obtain basic information of the interviewee;

classifying the sentences of the interview response text according to the interview angles involved, to obtain classified text;

extracting sentences from each class of the classified text through a language extraction model to obtain extracted sentences, and refining the extracted sentences with a Transformer model to obtain a refined interview corpus;

sending the basic information of the interviewee and the refined interview corpus to a management terminal, so that the management terminal determines the interview result according to the basic information of the interviewee and the refined interview corpus.
To solve the above technical problem, an embodiment of this application further provides an artificial intelligence-based interview content refining apparatus, including:

a text acquisition module, configured to obtain an interview recording and convert it into interview text, where the interview text includes self-introduction text and interview response text;

a text parsing module, configured to perform text parsing on the self-introduction text to obtain basic information of the interviewee;

a text classification module, configured to classify the sentences of the interview response text according to the interview angles involved, to obtain classified text;

a corpus extraction module, configured to extract sentences from each class of the classified text through a language extraction model to obtain extracted sentences, and to refine the extracted sentences with a Transformer model to obtain a refined interview corpus;

an information sending module, configured to send the basic information of the interviewee and the refined interview corpus to a management terminal, so that the management terminal determines the interview result according to the basic information of the interviewee and the refined interview corpus.
To solve the above technical problem, an embodiment of this application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and runnable on the processor. When executing the computer-readable instructions, the processor implements the following steps of the artificial intelligence-based interview content refining method:

obtaining an interview recording, and converting the interview recording into interview text, where the interview text includes self-introduction text and interview response text;

performing text parsing on the self-introduction text to obtain basic information of the interviewee;

classifying the sentences of the interview response text according to the interview angles involved, to obtain classified text;

extracting sentences from each class of the classified text through a language extraction model to obtain extracted sentences, and refining the extracted sentences with a Transformer model to obtain a refined interview corpus;

sending the basic information of the interviewee and the refined interview corpus to a management terminal, so that the management terminal determines the interview result according to the basic information of the interviewee and the refined interview corpus.
To solve the above technical problem, an embodiment of this application further provides a computer-readable storage medium storing computer-readable instructions. When executed by a processor, the computer-readable instructions implement the following steps of the artificial intelligence-based interview content refining method:

obtaining an interview recording, and converting the interview recording into interview text, where the interview text includes self-introduction text and interview response text;

performing text parsing on the self-introduction text to obtain basic information of the interviewee;

classifying the sentences of the interview response text according to the interview angles involved, to obtain classified text;

extracting sentences from each class of the classified text through a language extraction model to obtain extracted sentences, and refining the extracted sentences with a Transformer model to obtain a refined interview corpus;

sending the basic information of the interviewee and the refined interview corpus to a management terminal, so that the management terminal determines the interview result according to the basic information of the interviewee and the refined interview corpus.
The artificial intelligence-based interview content refining method, apparatus, device, and medium provided by the embodiments of this application obtain an interview recording and convert it into interview text comprising self-introduction text and interview response text; parse the self-introduction text to obtain basic information of the interviewee; classify the sentences of the interview response text according to the interview angles involved, to obtain classified text; extract sentences from each class of classified text through a language extraction model; and refine the extracted sentences with a Transformer model to obtain a refined interview corpus. In this way, the core content is accurately distilled from a large volume of interview-recording content, improving the accuracy of content extraction and helping to improve the accuracy of intelligent interview evaluation. Finally, the basic information of the interviewee and the refined interview corpus are sent to a management terminal, so that the management terminal determines the interview result from them; this avoids the inaccurate evaluation results caused by direct semantic recognition, and helps to improve the accuracy and efficiency of intelligent interview result evaluation.
Description of the Drawings

To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of this application; those of ordinary skill in the art may derive other drawings from them without creative effort.

Figure 1 is a diagram of an exemplary system architecture to which this application can be applied;

Figure 2 is a flowchart of an embodiment of the artificial intelligence-based interview content refining method of this application;

Figure 3 is a schematic structural diagram of an embodiment of an artificial intelligence-based interview content refining apparatus according to this application;

Figure 4 is a schematic structural diagram of an embodiment of a computer device according to this application.
Detailed Description

Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of this application. The terms used in the specification are only for describing specific embodiments and are not intended to limit the application. The terms "including" and "having" in the specification, claims, and drawing descriptions, and any variations thereof, are intended to cover non-exclusive inclusion. The terms "first", "second", etc. in the specification, claims, and drawings are used to distinguish different objects, not to describe a specific order.

Reference to an "embodiment" herein means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.

The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of this application.
Referring to FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104, for example to receive or send messages.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server that provides various services, for example, a background server that supports the pages displayed on the terminal devices 101, 102, and 103.
It should be noted that the artificial intelligence-based interview content refining method provided by the embodiments of this application is executed by the server; accordingly, the artificial intelligence-based interview content refining apparatus is arranged in the server.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs; the terminal devices 101, 102, and 103 in the embodiments of this application may specifically correspond to application systems in actual production.
Referring to FIG. 2, FIG. 2 shows an artificial intelligence-based interview content refining method provided by an embodiment of this application. The method is described by taking its application to the server in FIG. 1 as an example, with details as follows:
S201: Obtain an interview recording and convert the interview recording into interview text, where the interview text includes self-introduction text and interview response text.
Specifically, during an enterprise's interview-based recruitment, many candidates participate in interviews. Because the number of positions is limited, multiple candidates may interview for the same position. To avoid confusing or forgetting candidate information, in this embodiment the interview process of each candidate is recorded, and the recording is afterwards converted into interview text for subsequent processing. The interview text includes self-introduction text and interview response text.
The self-introduction text refers to the text obtained by converting the speech of the candidate's self-introduction, and the response text refers to the text of the candidate's answers to the interviewer's questions after the self-introduction.
It should be explained that the interviewer mentioned in this embodiment may be a person or a question-answering robot participating in an intelligent interview, which is not specifically limited here.
It should be understood that a typical interview lasts 30 to 40 minutes or even longer, so the total volume of the candidate's answers is relatively large. In view of this, this embodiment takes the self-introduction as the starting point, because the information in the self-introduction can already summarize a large part of the candidate's abilities, while other parts of the interview, such as skill assessment and business acumen assessment, can serve as reference and training data to supplement and verify the self-introduction, yielding a more comprehensive result.
In this embodiment, the interview recording may be converted into interview text using a tool that supports speech-to-text conversion or using a speech-to-text algorithm; no specific limitation is imposed here. For the specific implementation of dividing the interview text into self-introduction text and interview response text, refer to the description of subsequent embodiments; to avoid repetition, it is not repeated here.
S202: Perform text parsing on the self-introduction text to obtain the candidate's basic information.
Specifically, since self-introduction text generally covers categories such as basic personal information, experience, areas of expertise and skills, past honors and awards, and self-evaluation, the content modules involved are fairly similar. To improve efficiency, this embodiment parses the self-introduction text using regular-expression-based text parsing, quickly extracting its content to obtain the candidate's basic information.
The candidate's basic information includes but is not limited to: fixed personal information such as name, household registration, graduate school, major, and years of work experience, as well as personal professional information such as honors received, companies served, work history, and skills mastered.
It should be noted that since the content dimensions contained in self-introduction texts are roughly similar, the candidate's basic information to be obtained is divided into multiple dimensions, and at least one regular expression is set for each dimension to match and parse the self-introduction text; the content matched for a dimension serves as the parsed content of that dimension.
A regular expression describes a string-matching pattern, which can be used to check whether a string contains a certain substring, to replace matched substrings, or to extract substrings that meet a certain condition from a string.
For example, in a specific implementation, text parsing is performed along seven dimensions: name, household registration, graduate school, major, years of work experience, work history, and skills mastered. For the household-registration dimension, keywords containing specific characters can be set for matching, for example matching sentence patterns composed of specific keywords such as "I come from XXX", "I am a XXX native", or "I grew up in XXX".
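As a sketch of the regular-expression matching described above: the patterns, dimension name, and character ranges below are illustrative assumptions, not the embodiment's actual expressions.

```python
import re

# Hypothetical patterns for the "household registration" dimension; real
# deployments would tune these and add patterns for the other six dimensions.
HOMETOWN_PATTERNS = [
    re.compile(r"我来自(?P<value>[\u4e00-\u9fa5]{2,10})"),    # "I come from XXX"
    re.compile(r"我是(?P<value>[\u4e00-\u9fa5]{2,10})人"),    # "I am a XXX native"
    re.compile(r"我在(?P<value>[\u4e00-\u9fa5]{2,10})长大"),  # "I grew up in XXX"
]

def parse_dimension(text, patterns):
    """Return the first captured value for a dimension, or None if no pattern fires."""
    for pattern in patterns:
        match = pattern.search(text)
        if match:
            return match.group("value")
    return None

intro = "大家好，我来自深圳，毕业于某大学。"
print(parse_dimension(intro, HOMETOWN_PATTERNS))  # 深圳
```

Each dimension keeps its own pattern list, so adding a dimension only means adding patterns, not changing the parser.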
S203: Classify the sentences of the interview response text according to the interview angles involved, to obtain classified text.
Specifically, during questioning the interviewer usually asks about work experience, areas of expertise, skills, and so on. In this embodiment these interview angles are preset according to actual needs. After the interview response text is obtained, its sentences are classified according to the interview angles involved, yielding classified text, so that key sentences can subsequently be extracted and refined in a targeted manner according to the category of the classified text, which helps improve the accuracy of content refining.
The interview angles involved refer to the focus of the questions and responses, such as salary requirements, awards received, years of work experience, and professional skills.
Further, in this embodiment, semantic recognition is performed on the interview response text according to the interview angles involved, and sentences are classified according to the semantic recognition result to obtain the classified text. For the specific implementation, refer to the description of subsequent embodiments; to avoid repetition, it is not repeated here.
Classifying sentences according to the semantic recognition result may specifically be: clustering the recognition results to obtain clustering results, computing the Euclidean distance between each clustering result and the word vector corresponding to each interview angle, and then taking the interview angle with the smallest distance as the interview angle corresponding to that clustering result.
S204: Extract sentences from each category of classified text through a language extraction model to obtain extracted sentences, and refine the extracted sentences using a Transformer model to obtain a refined interview corpus.
Specifically, sentences are extracted from each category of classified text through the language extraction model to obtain extracted sentences, and the extracted sentences are then refined using a Transformer model to obtain the refined interview corpus.
Language extraction models include but are not limited to: the ELMo (Embeddings from Language Models) algorithm, OpenAI GPT, and the pre-trained BERT (Bidirectional Encoder Representations from Transformers) model.
Preferably, this embodiment uses an improved OpenAI GPT model as the semantic extraction model. For the specific implementation of sentence extraction, refer to the description of subsequent embodiments; to avoid repetition, it is not repeated here.
It should be noted that the extracted sentences obtained in this embodiment may also be expressed in vector form, so that they can subsequently be quickly input to the Transformer model for refined extraction.
Through its attention mechanism, the Transformer model can quickly extract sentences of higher importance according to their weights.
It should be noted that in the decoding stage, the Transformer model of this embodiment inputs the sum of the generated document feature vectors to the decoder. This autoregressive long short-term memory network predicts the next sentence to be extracted, and the output is connected to the input when decoding the next sentence. The biggest difference between the decoder used in this embodiment and other commonly used decoders is that, in the process of obtaining attention via dot products, if the same index appears twice in succession, the entire extraction process is ended, thereby avoiding the information redundancy caused by extracting similar information multiple times.
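The stopping rule described above can be sketched as follows; a toy predictor stands in for the dot-product attention step, whose details are not specified here.

```python
def extract_indices(predict_next):
    """Autoregressively collect sentence indices until the same index
    is produced twice in succession (the redundancy stopping rule)."""
    selected = []
    previous = None
    while True:
        index = predict_next(selected)
        if index == previous:   # same index twice in a row:
            break               # end the whole extraction process
        selected.append(index)
        previous = index
    return selected

# Toy predictor replaying a fixed script of "attended" sentence indices (assumption).
script = iter([2, 0, 3, 3, 1])
print(extract_indices(lambda _selected: next(script)))  # [2, 0, 3]
```

The second 3 in the script triggers the stop, so sentence 1 is never examined; duplicate-heavy documents terminate early instead of re-extracting similar content.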
It should be understood that in this embodiment there is no necessary logical order between steps S203 to S204 and step S202; they may also be executed in parallel, which is not limited here.
S205: Send the candidate's basic information and the refined interview corpus to the management terminal, so that the management terminal determines the interview result based on the candidate's basic information and the refined interview corpus.
Specifically, the extracted candidate basic information and refined interview corpus are sent to the management terminal, ensuring the accuracy and conciseness of the extracted content, so that users of the management terminal can subsequently determine the evaluation result accurately and quickly based on that content, which helps improve the accuracy and efficiency of intelligent interviews.
In this embodiment, an interview recording is obtained and converted into interview text including self-introduction text and interview response text; the self-introduction text is parsed to obtain the candidate's basic information; the sentences of the interview response text are classified according to the interview angles involved to obtain classified text; sentences are extracted from each category of classified text through a language extraction model, and the extracted sentences are refined using a Transformer model to obtain the refined interview corpus. The core content is thus accurately distilled from voluminous interview records, improving the accuracy of content extraction and, in turn, of intelligent interview evaluation. Finally, the candidate's basic information and the refined interview corpus are sent to the management terminal, so that the management terminal determines the interview result based on them, avoiding the inaccurate evaluation results caused by direct semantic recognition and helping improve the accuracy and efficiency of intelligent interview result evaluation.
In an embodiment, the obtained candidate basic information and refined interview corpus may be stored on a blockchain network. Blockchain storage enables the data to be shared between different platforms and also prevents the data from being tampered with.
Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association using cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
In some optional implementations of this embodiment, converting the interview recording into interview text in step S201 includes:
identifying a question-and-answer start marker contained in the interview recording;
performing text conversion on the interview recording by means of speech-to-text, taking the text converted from the recording content before the question-and-answer start marker as the self-introduction text, and taking the text converted from the recording content after the question-and-answer start marker as the interview response text.
Specifically, before speech-to-text conversion, the interview recording file is traversed to find a speech segment having the same speech information as the preset question-and-answer start marker, which serves as a demarcation point: the text converted from the speech before this segment is taken as the self-introduction text, and the text converted from the speech after it is taken as the interview response text.
Finding a speech segment with the same speech information as the preset question-and-answer start marker may specifically involve performing amplitude normalization, pre-emphasis, and framing-and-windowing on the speech signal to obtain a set of speech frames, and then finding, by traversal and comparison, the speech-frame segment identical to the frames of the preset marker, which is determined to be the speech segment having the same speech information as the preset marker.
The preset question-and-answer start marker is a speech marker signaling that the self-introduction stage has ended and the question-and-answer session has begun, for example a voice prompt such as "Thank you for your introduction; I would now like to ask you a few questions". It may be preset according to the actual situation and is not limited here.
Speech-to-text conversion may use a speech recognition algorithm or a third-party tool with a speech conversion function, without specific limitation. Speech-to-text algorithms include but are not limited to: speech recognition algorithms based on vocal-tract models, speech-template matching recognition algorithms, and artificial-neural-network speech recognition algorithms.
In this embodiment, the interview recording is converted into self-introduction text and interview response text, so that the two types of text are processed separately in subsequent steps, which is more targeted and yields more accurate processing results.
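The demarcation logic above reduces to finding a known sub-sequence in the frame stream and splitting around it; the sketch below uses small integers as stand-ins for normalized, windowed speech frames, since the actual frame representation is not specified.

```python
def split_at_marker(frames, marker_frames):
    """Traverse the frame sequence, find the first sub-sequence equal to the
    preset Q&A start marker, and split the stream around it."""
    n, m = len(frames), len(marker_frames)
    for start in range(n - m + 1):
        if frames[start:start + m] == marker_frames:
            # before the marker -> self-introduction; after -> responses
            return frames[:start], frames[start + m:]
    return frames, []  # no marker found: treat everything as self-introduction

intro, answers = split_at_marker([5, 6, 7, 9, 9, 1, 2], marker_frames=[9, 9])
print(intro, answers)  # [5, 6, 7] [1, 2]
```

Real frames would be compared with a tolerance rather than exact equality; the fallback when no marker is found is an assumed policy.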
In some optional implementations of this embodiment, classifying the sentences of the interview response text according to the interview angles involved in step S203 to obtain the classified text includes:
taking each sentence in the interview response text as a basic sentence, and performing word segmentation on the basic sentence through a preset word segmentation method to obtain basic segmented words;
converting the basic segmented words into word vectors, and clustering the word vectors through a clustering algorithm to obtain the cluster center corresponding to the basic sentence;
for each basic sentence, calculating the Euclidean distance between the cluster center corresponding to the basic sentence and the word vector corresponding to each preset interview angle, taking the preset interview angle with the smallest distance as the target category of the basic sentence, and taking the basic sentence as classified text corresponding to the target category.
Specifically, by performing word segmentation and clustering on each sentence in the interview response text, the cluster center corresponding to each sentence is obtained, and the distance between that cluster center and the word vector corresponding to each preset interview angle is then calculated to determine the category to which each sentence belongs.
The preset word segmentation method includes but is not limited to: a third-party word segmentation tool or a word segmentation algorithm.
Common third-party word segmentation tools include but are not limited to: the Stanford NLP segmenter, the ICTCLAS segmentation system, the ansj segmentation tool, and the HanLP Chinese segmentation tool.
Word segmentation algorithms include but are not limited to: rule-based methods, statistics-based methods, understanding-based methods, and neural-network methods.
Rule-based word segmentation methods mainly include: minimum matching, forward maximum matching, reverse maximum matching, bi-directional maximum matching (BMM), marker segmentation, full-segmentation path selection, and the association-backtracking (AB) method.
Statistics-based word segmentation methods mainly include: the N-gram model, Hidden Markov Model (HMM) sequence labeling, Maximum Entropy Model (MEM) sequence labeling, Maximum Entropy Markov Model (MEMM) sequence labeling, and Conditional Random Field (CRF) sequence labeling.
Preferably, this embodiment uses an improved CRF model for word segmentation. For the specific implementation, refer to the description of subsequent embodiments; to avoid repetition, it is not repeated here.
Understandably, extracting basic segmented words by word segmentation can, on the one hand, effectively filter out meaningless words in the text and, on the other hand, facilitate subsequently using these texts to generate word vectors.
A clustering algorithm, also called cluster analysis, is a statistical analysis method for classifying samples or indicators and an important data mining algorithm. Clustering algorithms include but are not limited to: K-means clustering, mean-shift clustering, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), expectation-maximization clustering based on Gaussian mixture models, agglomerative hierarchical clustering, and graph community detection algorithms.
Preferably, in this embodiment, the K-means clustering algorithm is adopted.
In this embodiment, the category of each sentence in the interview response text is determined through clustering and semantic-similarity calculation, which facilitates subsequent targeted refinement of sentences of different categories.
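A minimal sketch of the nearest-angle classification, under two stated simplifications: the mean of a sentence's word vectors stands in for the K-means cluster center, and the two-dimensional angle vectors are made-up values rather than trained embeddings.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def classify_sentence(word_vectors, angle_vectors):
    """Take the mean of the sentence's word vectors as its cluster center,
    then pick the preset interview angle whose vector is nearest in
    Euclidean distance."""
    dims = len(next(iter(angle_vectors.values())))
    center = [sum(v[d] for v in word_vectors) / len(word_vectors)
              for d in range(dims)]
    return min(angle_vectors, key=lambda angle: dist(center, angle_vectors[angle]))

angles = {"salary": [1.0, 0.0], "skills": [0.0, 1.0]}  # hypothetical angle vectors
print(classify_sentence([[0.1, 0.9], [0.3, 0.7]], angles))  # skills
```

With real K-means, each sentence's tokens would first be clustered and the dominant cluster center used in place of the plain mean.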
In some optional implementations of this embodiment, performing word segmentation on the basic sentence through the preset word segmentation method to obtain the basic segmented words includes:
segmenting the basic sentence using a conditional random field model to obtain initial segmented words;
obtaining the word frequency of each initial segmented word from historical interview response texts;
generating a weight for each initial segmented word based on its word frequency, and taking the initial segmented words labeled with weights as the basic segmented words.
Specifically, the conditional random field model is used to segment the basic sentence to obtain initial segmented words; the word frequency of each initial segmented word is then obtained from historical interview response texts, and a corresponding weight is generated from that frequency, yielding basic segmented words carrying weight information, so that when the basic segmented words are subsequently labeled, the proportion of each better matches the needs of the interview scenario.
A conditional random field (CRF) model is a discriminative probabilistic model and a type of random field, often used to label or analyze sequence data. It represents a Markov random field over a set of output random variables Y conditioned on a set of input random variables X, and performs well in sequence labeling tasks such as word segmentation, part-of-speech tagging, and named entity recognition.
Historical interview response text refers to interview response text generated by interviews that have already taken place; the word frequencies in historical interview response texts can reflect the proportion of certain segmented words in the interview process.
In this embodiment, weights are assigned to the initial segmented words obtained by CRF segmentation, yielding basic segmented words that better fit the intelligent interview scenario and helping improve classification accuracy.
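The frequency-to-weight step can be sketched as below. The CRF segmentation itself is assumed to have already run (the historical texts arrive pre-segmented), and the simple count-over-total normalization is an illustrative choice, not the embodiment's stated formula.

```python
from collections import Counter

def weight_tokens(tokens, historical_texts):
    """Attach a frequency-based weight to each CRF-segmented token,
    using relative frequency in the historical corpus as the weight."""
    counts = Counter(word for text in historical_texts for word in text)
    total = sum(counts.values()) or 1
    return [(token, counts[token] / total) for token in tokens]

# Pre-segmented historical interview answers (toy data).
history = [["项目", "经验", "项目"], ["技能", "项目"]]
print(weight_tokens(["项目", "爱好"], history))  # [('项目', 0.6), ('爱好', 0.0)]
```

Words common in past interviews thus carry more weight than rare ones when the basic segmented words are labeled downstream.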
In some optional implementations of this embodiment, in step S204 the language extraction model is a bidirectional long short-term memory network model that includes a sentence encoder and a document encoder, and extracting sentences from each category of classified text through the language extraction model to obtain the extracted sentences includes:
splitting the text in the classified text by character through the sentence encoder to obtain basic characters;
encoding the basic characters to obtain the encoded content corresponding to the basic characters;
inputting the encoded content into a character encoding layer with initialized weights, mapping each encoding into a character vector through the character encoding layer, and taking each character vector as the sentence encoding result;
concatenating the forward and backward hidden-layer outputs of the sentence encoding result into a hidden-layer vector, and inputting the hidden-layer vector into the document encoder;
weighting the hidden-layer vectors through the document encoder to obtain document feature vectors, decoding the document feature vectors, and taking the decoded output as the extracted sentences.
Specifically, the text in the classified text is split by character and encoded by the sentence encoder to obtain the encoded content; the encoded content is then input into the character encoding layer to obtain the character vector corresponding to each encoding; each character vector serves as the sentence's encoding result and is passed to the document encoder through the hidden layer, where weighting is applied to obtain the extracted sentences.
It is worth noting that, based on the sentence encoding result, the forward and backward hidden-layer outputs corresponding to each character in the model are concatenated into one hidden-layer vector:
h_i = [h_i^+ ; h_i^-]
where the forward direction is denoted by the superscript +, the backward direction by the superscript -, and the i-th character by the subscript i.
其中,长短期记忆网络(Long Short-Term Memory,LSTM),是一种时间递归神经网络,适合于处理和预测时间序列中间隔和延迟相对较长的重要事件。Among them, the Long Short-Term Memory (LSTM) is a time recurrent neural network, which is suitable for processing and predicting important events with relatively long intervals and delays in a time series.
需要说明的是,单向LSTM可以按照人类的阅读顺序从一句话的第一个字记忆到最后一个字,这种LSTM结构只能捕捉到上文信息,无法捕捉到下文信息,而双向LSTM由两个方向不同的LSTM组成,一个LSTM按照句子中词的顺序从前往后读取数据,另一个LSTM从后往前按照句子词序的反方向读取数据,这样第一个LSTM获得上文信息,另一个LSTM获得下文信息,两个LSTM的联合说出就是整个句子的上下文信息,而上下文信息是由整个句子提供的,自然包含比较抽象的语义信息(句子的意思),这种方法的优点是 充分利用了LSTM对具有时序特点的序列数据的处理优势,而且由于我们输入了位置特征,其经过双向LSTM编码后可以抽取出位置特征中包含的实体方向信息。It should be noted that the one-way LSTM can memorize the first word to the last word of a sentence according to the human reading order. This LSTM structure can only capture the above information but not the following information, while the two-way LSTM is composed of Two LSTMs with different directions are composed. One LSTM reads data from front to back according to the order of words in the sentence, and the other LSTM reads data from back to front in the opposite direction of the sentence word order, so that the first LSTM obtains the above information. Another LSTM obtains the following information. The joint statement of the two LSTMs is the context information of the entire sentence, and the context information is provided by the entire sentence, which naturally contains more abstract semantic information (meaning of the sentence). The advantage of this method is It makes full use of the advantages of LSTM in processing sequence data with time series characteristics, and because we input location features, the entity direction information contained in the location features can be extracted after bidirectional LSTM encoding.
In this embodiment, the sentence encoder and the document encoder parse and extract the classified sentences at two different levels of the bidirectional long short-term memory network, improving the accuracy of key sentence extraction.
In some optional implementations of this embodiment, weighting the hidden-layer vectors by the document encoder to obtain the document feature vector includes:
determining the document feature vector with the following formula:
C_i = Σ_{j=1}^{n} b_ij · h_j
where C_i is the i-th document feature vector, j is the index of an embedding code, n is the number of embedding codes, b_ij is the weight of the i-th document feature vector with respect to the j-th hidden-layer vector, and h_j is the j-th hidden-layer vector; the embedding codes are generated from the hidden states of the bidirectional long short-term memory network model.
In this embodiment, the document feature vector is obtained through this weighted calculation, which is beneficial for accurately extracting key sentences.
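The weighted sum above can be sketched directly in NumPy. This is an illustrative toy example: the weights b_ij and hidden-layer vectors h_j below are made-up values, not quantities produced by the document encoder:

```python
import numpy as np

def document_feature(weights, hidden_vectors):
    """Compute C_i = sum_j b_ij * h_j, the weighted sum of
    hidden-layer vectors that forms one document feature vector."""
    return sum(b * h for b, h in zip(weights, hidden_vectors))

# Hypothetical values: n = 3 hidden-layer vectors of dimension 2.
h = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
b_i = np.array([0.5, 0.3, 0.2])   # weights b_ij for one feature vector C_i
C_i = document_feature(b_i, h)
print(C_i)  # [0.7 0.5]
```

In a full attention-style implementation the weights b_ij would themselves be computed from the hidden states and normalized (e.g. via softmax) so that they sum to 1.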
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
Fig. 3 shows a schematic block diagram of an artificial intelligence-based interview content refining apparatus corresponding one-to-one to the artificial intelligence-based interview content refining method of the above embodiments. As shown in Fig. 3, the apparatus includes a text acquisition module 31, a text parsing module 32, a text classification module 33, a corpus extraction module 34, and an information sending module 35. The functional modules are described in detail as follows:
The text acquisition module 31 is configured to acquire an interview recording and convert it into interview text, where the interview text includes a self-introduction text and an interview response text.
The text parsing module 32 is configured to parse the self-introduction text to obtain the interviewee's basic information.
The text classification module 33 is configured to classify the sentences of the interview response text according to the interview angles involved, to obtain classified texts.
The corpus extraction module 34 is configured to extract sentences from each class of classified text through a language extraction model to obtain extracted sentences, and to refine the extracted sentences with a Transformer model to obtain a refined interview corpus.
The information sending module 35 is configured to send the interviewee's basic information and the refined interview corpus to a management terminal, so that the management terminal determines the interview result based on the interviewee's basic information and the refined interview corpus.
Optionally, the text acquisition module 31 includes:
a marker recognition unit, configured to recognize the question-and-answer start marker contained in the interview recording; and
a text determination unit, configured to convert the interview recording into text by speech-to-text conversion, take the text converted from the recording content before the question-and-answer start marker as the self-introduction text, and take the text converted from the recording content after the question-and-answer start marker as the interview response text.
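The marker-based split performed by the text determination unit can be sketched as follows. This is a hedged illustration: the marker string `[QA_START]` and the example transcript are invented for demonstration, and a real system would detect the marker in the audio or transcript rather than receive it as a literal token:

```python
def split_transcript(transcript, qa_start_marker):
    """Split a speech-to-text transcript at the question-and-answer start
    marker: text before it is the self-introduction, text after it is
    the interview response."""
    idx = transcript.find(qa_start_marker)
    if idx == -1:
        return transcript.strip(), ""  # no marker: treat all as introduction
    intro = transcript[:idx]
    response = transcript[idx + len(qa_start_marker):]
    return intro.strip(), response.strip()

intro, response = split_transcript(
    "I am a software engineer with five years of experience. "
    "[QA_START] My biggest strength is debugging under pressure.",
    "[QA_START]")
print(intro)
print(response)
```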
Optionally, the text classification module 33 includes:
a word segmentation unit, configured to take each sentence in the interview response text as a basic sentence and perform word segmentation on the basic sentence by a preset segmentation method to obtain basic word segments;
a clustering unit, configured to convert the basic word segments into word vectors and cluster the word vectors with a clustering algorithm to obtain the cluster center corresponding to each basic sentence; and
a classification unit, configured to, for each basic sentence, calculate the Euclidean distance between the cluster center corresponding to the basic sentence and the word vector corresponding to each preset interview angle, take the preset interview angle with the smallest distance as the target class of the basic sentence, and take the basic sentence as the classified text corresponding to the target class.
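The nearest-angle assignment performed by the classification unit amounts to nearest-centroid classification under Euclidean distance. The sketch below uses hypothetical 2-dimensional angle vectors and angle names; real word vectors would be high-dimensional embeddings:

```python
import numpy as np

def classify_sentence(cluster_center, angle_vectors):
    """Assign a sentence to the preset interview angle whose word vector
    has the smallest Euclidean distance to the sentence's cluster center."""
    distances = {name: np.linalg.norm(cluster_center - vec)
                 for name, vec in angle_vectors.items()}
    return min(distances, key=distances.get)

angles = {  # hypothetical preset interview-angle vectors
    "teamwork":  np.array([1.0, 0.0]),
    "technical": np.array([0.0, 1.0]),
}
center = np.array([0.2, 0.9])  # cluster center of one basic sentence
print(classify_sentence(center, angles))  # technical
```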
Optionally, the word segmentation unit includes:
an initial segmentation subunit, configured to segment the basic sentence with a conditional random field model to obtain initial word segments;
a word frequency acquisition subunit, configured to obtain the frequency of each initial word segment from historical interview response texts; and
a segment weighting subunit, configured to generate a weight for each initial word segment based on its frequency, and take the initial word segments annotated with their weights as the basic word segments.
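The frequency-based weighting can be sketched with the standard library. The normalization scheme below (raw counts normalized over the query segments) is an assumption for illustration; the patent does not specify how frequencies are turned into weights:

```python
from collections import Counter

def weight_segments(segments, historical_texts):
    """Weight each initial word segment by its frequency in historical
    interview response texts (here: raw counts normalized to sum to 1
    over the given segments -- an illustrative choice)."""
    counts = Counter(w for text in historical_texts for w in text.split())
    total = max(sum(counts[w] for w in segments), 1)
    return {w: counts[w] / total for w in segments}

history = ["python testing python", "testing deployment"]
weights = weight_segments(["python", "testing", "deployment"], history)
print(weights)
```

A Chinese-text pipeline would use the CRF segmenter's output tokens instead of `str.split`, but the weighting step is the same.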
Optionally, the corpus extraction module 34 includes:
a splitting unit, configured to split the text in the classified text into characters through the sentence encoder to obtain basic characters;
an encoding unit, configured to encode the basic characters to obtain the encoded content corresponding to each basic character;
a mapping unit, configured to input the encoded content into a character encoding layer with initialized weights, map each code into a character vector through the character encoding layer, and take each character vector as the sentence encoding result;
a splicing unit, configured to splice the forward and backward hidden-layer outputs of the sentence encoding result into hidden-layer vectors and input the hidden-layer vectors into the document encoder; and
a weighting unit, configured to weight the hidden-layer vectors through the document encoder to obtain a document feature vector, decode the document feature vector, and take the decoded output as the extracted sentences.
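The front end of this pipeline (splitting into characters, encoding them as ids, and mapping the ids through a character-embedding layer with initialized weights) can be sketched as below. The vocabulary, embedding size, and random initialization are all assumptions for illustration; a trained system would learn the embedding weights:

```python
import numpy as np

def encode_sentence(sentence, dim=4, seed=0):
    """Sketch of the sentence-encoder front end: split text into basic
    characters, encode each character as an integer id, then map each id
    to a character vector via a randomly initialized embedding layer."""
    chars = list(sentence)                            # basic characters
    vocab = {c: i for i, c in enumerate(dict.fromkeys(chars))}
    rng = np.random.default_rng(seed)
    embedding = rng.normal(size=(len(vocab), dim))    # initialized weights
    ids = [vocab[c] for c in chars]                   # encoded content
    return [embedding[i] for i in ids]                # character vectors

vectors = encode_sentence("abca")
print(len(vectors), vectors[0].shape)  # 4 characters, 4-dim vectors
```

Note that repeated characters map to the same vector, which is the defining property of an embedding lookup.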
Optionally, the weighting unit includes:
a calculation subunit, configured to determine the document feature vector with the following formula:
C_i = Σ_{j=1}^{n} b_ij · h_j
where C_i is the i-th document feature vector, j is the index of an embedding code, n is the number of embedding codes, b_ij is the weight of the i-th document feature vector with respect to the j-th hidden-layer vector, and h_j is the j-th hidden-layer vector; the embedding codes are generated from the hidden states of the bidirectional long short-term memory network model.
Optionally, the artificial intelligence-based interview content refining apparatus further includes:
a storage module, configured to store the interviewee's basic information and the refined interview corpus in a blockchain network.
For the specific limitations of the artificial intelligence-based interview content refining apparatus, refer to the limitations of the artificial intelligence-based interview content refining method above; details are not repeated here. Each module in the above apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
To solve the above technical problem, an embodiment of this application further provides a computer device. Refer to Fig. 4 for details; Fig. 4 is a block diagram of the basic structure of the computer device of this embodiment.
The computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are communicatively connected to one another via a system bus. It should be noted that the figure shows only the computer device 4 with the memory 41, the processor 42, and the network interface 43, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and that its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), and embedded devices.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device may interact with a user through a keyboard, mouse, remote control, touch pad, or voice-control device.
The memory 41 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD card memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and so on. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as its hard disk or internal memory. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 4. Of course, the memory 41 may also include both the internal storage unit of the computer device 4 and its external storage device. In this embodiment, the memory 41 is generally used to store the operating system and various application software installed on the computer device 4, such as program code for controlling electronic files. In addition, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may in some embodiments be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 42 is generally used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to run program code or process data stored in the memory 41, for example, to run the program code for controlling electronic files.
The network interface 43 may include a wireless or wired network interface, and is generally used to establish a communication connection between the computer device 4 and other electronic devices.
This application also provides another implementation, namely a computer-readable storage medium, which may be non-volatile or volatile. The computer-readable storage medium stores an interface display program that can be executed by at least one processor, so that the at least one processor performs the steps of the artificial intelligence-based interview content refining method described above.
From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.
Obviously, the embodiments described above are only some of the embodiments of this application, not all of them. The drawings show preferred embodiments of this application but do not limit its patent scope. This application may be implemented in many different forms; on the contrary, these embodiments are provided so that the disclosure of this application will be understood more thoroughly and comprehensively. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific embodiments or make equivalent replacements of some of their technical features. Any equivalent structure made using the contents of the specification and drawings of this application, whether used directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims (20)

1. An artificial intelligence-based interview content refining method, wherein the method comprises:
acquiring an interview recording and converting the interview recording into interview text, wherein the interview text comprises a self-introduction text and an interview response text;
parsing the self-introduction text to obtain the interviewee's basic information;
classifying the sentences of the interview response text according to the interview angles involved, to obtain classified texts;
extracting sentences from each class of the classified texts through a language extraction model to obtain extracted sentences, and refining the extracted sentences with a Transformer model to obtain a refined interview corpus; and
sending the interviewee's basic information and the refined interview corpus to a management terminal, so that the management terminal determines the interview result based on the interviewee's basic information and the refined interview corpus.
2. The artificial intelligence-based interview content refining method of claim 1, wherein converting the interview recording into interview text comprises:
recognizing the question-and-answer start marker contained in the interview recording; and
converting the interview recording into text by speech-to-text conversion, taking the text converted from the recording content before the question-and-answer start marker as the self-introduction text, and taking the text converted from the recording content after the question-and-answer start marker as the interview response text.
3. The artificial intelligence-based interview content refining method of claim 1, wherein classifying the sentences of the interview response text according to the interview angles involved to obtain classified texts comprises:
taking each sentence in the interview response text as a basic sentence, and performing word segmentation on the basic sentence by a preset segmentation method to obtain basic word segments;
converting the basic word segments into word vectors, and clustering the word vectors with a clustering algorithm to obtain the cluster center corresponding to each basic sentence; and
for each basic sentence, calculating the Euclidean distance between the cluster center corresponding to the basic sentence and the word vector corresponding to each preset interview angle, taking the preset interview angle with the smallest distance as the target class of the basic sentence, and taking the basic sentence as the classified text corresponding to the target class.
4. The artificial intelligence-based interview content refining method of claim 3, wherein performing word segmentation on the basic sentence by a preset segmentation method to obtain basic word segments comprises:
segmenting the basic sentence with a conditional random field model to obtain initial word segments;
obtaining the frequency of each initial word segment from historical interview response texts; and
generating a weight for each initial word segment based on its frequency, and taking the initial word segments annotated with their weights as the basic word segments.
5. The artificial intelligence-based interview content refining method of claim 1, wherein the language extraction model is a bidirectional long short-term memory network model comprising a sentence encoder and a document encoder, and extracting sentences from each class of the classified texts through the language extraction model to obtain extracted sentences comprises:
splitting the text in the classified texts into characters through the sentence encoder to obtain basic characters;
encoding the basic characters to obtain the encoded content corresponding to each basic character;
inputting the encoded content into a character encoding layer with initialized weights, mapping each code into a character vector through the character encoding layer, and taking each character vector as the sentence encoding result;
splicing the forward and backward hidden-layer outputs of the sentence encoding result into hidden-layer vectors, and inputting the hidden-layer vectors into the document encoder; and
weighting the hidden-layer vectors through the document encoder to obtain a document feature vector, decoding the document feature vector, and taking the decoded output as the extracted sentences.
6. The artificial intelligence-based interview content refining method of claim 5, wherein weighting the hidden-layer vectors through the document encoder to obtain a document feature vector comprises:
determining the document feature vector with the following formula:
C_i = Σ_{j=1}^{n} b_ij · h_j
wherein C_i is the i-th document feature vector, j is the index of an embedding code, n is the number of embedding codes, b_ij is the weight of the i-th document feature vector with respect to the j-th hidden-layer vector, and h_j is the j-th hidden-layer vector, wherein the embedding codes are generated from the hidden states of the bidirectional long short-term memory network model.
7. The artificial intelligence-based interview content refining method of claim 1, after refining the extracted sentences with the Transformer model to obtain the refined interview corpus, further comprising: storing the interviewee's basic information and the refined interview corpus in a blockchain network.
8. An artificial intelligence-based interview content refining apparatus, wherein the apparatus comprises:
a text acquisition module, configured to acquire an interview recording and convert the interview recording into interview text, wherein the interview text comprises a self-introduction text and an interview response text;
a text parsing module, configured to parse the self-introduction text to obtain the interviewee's basic information;
a text classification module, configured to classify the sentences of the interview response text according to the interview angles involved, to obtain classified texts;
a corpus extraction module, configured to extract sentences from each class of the classified texts through a language extraction model to obtain extracted sentences, and to refine the extracted sentences with a Transformer model to obtain a refined interview corpus; and
an information sending module, configured to send the interviewee's basic information and the refined interview corpus to a management terminal, so that the management terminal determines the interview result based on the interviewee's basic information and the refined interview corpus.
9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
acquiring an interview recording and converting the interview recording into interview text, wherein the interview text comprises a self-introduction text and an interview response text;
parsing the self-introduction text to obtain the interviewee's basic information;
classifying the sentences of the interview response text according to the interview angles involved, to obtain classified texts;
extracting sentences from each class of the classified texts through a language extraction model to obtain extracted sentences, and refining the extracted sentences with a Transformer model to obtain a refined interview corpus; and
sending the interviewee's basic information and the refined interview corpus to a management terminal, so that the management terminal determines the interview result based on the interviewee's basic information and the refined interview corpus.
10. The computer device of claim 9, wherein converting the interview recording into interview text comprises:
recognizing the question-and-answer start marker contained in the interview recording; and
converting the interview recording into text by speech-to-text conversion, taking the text converted from the recording content before the question-and-answer start marker as the self-introduction text, and taking the text converted from the recording content after the question-and-answer start marker as the interview response text.
11. The computer device of claim 9, wherein classifying the sentences of the interview response text according to the interview angles involved to obtain classified texts comprises:
taking each sentence in the interview response text as a basic sentence, and performing word segmentation on the basic sentence by a preset segmentation method to obtain basic word segments;
converting the basic word segments into word vectors, and clustering the word vectors with a clustering algorithm to obtain the cluster center corresponding to each basic sentence; and
for each basic sentence, calculating the Euclidean distance between the cluster center corresponding to the basic sentence and the word vector corresponding to each preset interview angle, taking the preset interview angle with the smallest distance as the target class of the basic sentence, and taking the basic sentence as the classified text corresponding to the target class.
12. The computer device of claim 11, wherein performing word segmentation on the basic sentence by a preset segmentation method to obtain basic word segments comprises:
segmenting the basic sentence with a conditional random field model to obtain initial word segments;
obtaining the frequency of each initial word segment from historical interview response texts; and
generating a weight for each initial word segment based on its frequency, and taking the initial word segments annotated with their weights as the basic word segments.
13. The computer device of claim 9, wherein the language extraction model is a bidirectional long short-term memory network model comprising a sentence encoder and a document encoder, and extracting sentences from each class of the classified texts through the language extraction model to obtain extracted sentences comprises:
splitting the text in the classified texts into characters through the sentence encoder to obtain basic characters;
encoding the basic characters to obtain the encoded content corresponding to each basic character;
inputting the encoded content into a character encoding layer with initialized weights, mapping each code into a character vector through the character encoding layer, and taking each character vector as the sentence encoding result;
splicing the forward and backward hidden-layer outputs of the sentence encoding result into hidden-layer vectors, and inputting the hidden-layer vectors into the document encoder; and
weighting the hidden-layer vectors through the document encoder to obtain a document feature vector, decoding the document feature vector, and taking the decoded output as the extracted sentences.
  14. The computer device according to claim 13, wherein said weighting the hidden-layer vector through the document encoder to obtain the document feature vector comprises:
    determining the document feature vector by the following formula:
    C_i = Σ_{j=1}^{n} b_ij · h_j
    where C_i is the i-th document feature vector, j is the index of the embedding codes, n is the number of embedding codes, b_ij is the weight of the i-th document feature vector for the j-th hidden-layer vector, and h_j is the j-th hidden-layer vector; the embedding codes are generated based on the hidden states of the bidirectional long short-term memory network model.
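In matrix form, the weighted sum of claim 14 is a single matrix product: stacking the hidden-layer vectors h_j as rows of H and the weights b_ij as rows of B gives C = B @ H. The numpy sketch below assumes the weights b_ij are already available (for example from an attention softmax, which the claim leaves unspecified); the concrete numbers are illustrative only.

```python
import numpy as np

# h_j: n hidden-layer vectors of dimension d, stacked as rows of H.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # n = 3, d = 2

# b_ij: weight of the i-th document feature vector for the j-th
# hidden-layer vector (one row per C_i; a single C_i here).
B = np.array([[0.5, 0.3, 0.2]])

# C_i = sum over j of b_ij * h_j, i.e. C = B @ H.
C = B @ H                           # shape (1, 2)
```

Each row of C is one document feature vector, term-by-term identical to the summation formula above.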
  15. A computer-readable storage medium storing computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the following steps are implemented:
    obtaining an interview recording and converting the interview recording into interview text, wherein the interview text includes a self-introduction text and an interview response text;
    performing text parsing on the self-introduction text to obtain basic information of the interviewee;
    performing sentence classification on the interview response text according to the interview angles involved, to obtain classified text;
    performing sentence extraction from each class of the classified text through a language extraction model to obtain extracted sentences, and refining the extracted sentences by using a Transformer model to obtain a refined interview corpus;
    sending the basic information of the interviewee and the refined interview corpus to a management terminal, so that the management terminal determines an interview result according to the basic information of the interviewee and the refined interview corpus.
  16. The computer-readable storage medium according to claim 15, wherein said converting the interview recording into interview text comprises:
    identifying a question-and-answer start mark contained in the interview recording;
    performing text conversion on the interview recording by means of speech-to-text conversion, using the text converted from the recorded content before the question-and-answer start mark as the self-introduction text, and using the text converted from the recorded content after the question-and-answer start mark as the interview response text.
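The before/after split of claim 16 reduces to partitioning the transcript at the first occurrence of the start mark. A minimal sketch follows; the `[Q&A]` mark string, the function name, and the fallback behavior when the mark is absent are all illustrative assumptions, not part of the claim.

```python
def split_interview_text(transcript, start_mark):
    """Split a speech-to-text transcript into a self-introduction part
    (before the question-and-answer start mark) and a response part
    (after the mark)."""
    before, sep, after = transcript.partition(start_mark)
    if not sep:
        # Mark absent: treat the whole transcript as self-introduction.
        return transcript.strip(), ""
    return before.strip(), after.strip()

intro, responses = split_interview_text(
    "My name is Li Lei. [Q&A] Q1: Tell me about a project...",
    "[Q&A]")
```

`str.partition` splits only at the first occurrence, so later repetitions of the mark inside the answers are preserved in the response text.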
  17. The computer-readable storage medium according to claim 15, wherein said performing sentence classification on the interview response text according to the interview angles involved to obtain the classified text comprises:
    using each sentence in the interview response text as a basic sentence, and performing word segmentation processing on the basic sentence through a preset word segmentation method to obtain basic segmented words;
    converting the basic segmented words into word vectors, and clustering the word vectors through a clustering algorithm to obtain a cluster center corresponding to the basic sentence;
    for each basic sentence, calculating the Euclidean distance between the cluster center corresponding to the basic sentence and the word vector corresponding to each preset interview angle, using the preset interview angle with the smallest distance as the target class of the basic sentence, and using the basic sentence as the classified text corresponding to the target class.
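The nearest-angle assignment of claim 17 can be sketched as follows. For simplicity the cluster center is taken as the mean of the sentence's word vectors, standing in for the full clustering step the claim leaves open; the angle names, the toy 2-D vectors, and the `classify_sentence` helper are assumptions of this sketch.

```python
import numpy as np

def classify_sentence(word_vectors, angle_vectors):
    """Assign a sentence to the preset interview angle whose vector is
    nearest (Euclidean distance) to the sentence's cluster center."""
    # Cluster center: here simply the mean of the sentence's word vectors.
    center = np.mean(word_vectors, axis=0)
    names = list(angle_vectors)
    # Euclidean distance from the center to each preset angle vector.
    dists = [np.linalg.norm(center - angle_vectors[name]) for name in names]
    return names[int(np.argmin(dists))]

angles = {"teamwork": np.array([1.0, 0.0]),
          "skills":   np.array([0.0, 1.0])}
label = classify_sentence(np.array([[0.9, 0.1],
                                    [0.8, 0.0]]), angles)
```

The sentence is then filed under `label` as the classified text for that target class.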
  18. The computer-readable storage medium according to claim 17, wherein said performing word segmentation processing on the basic sentence through a preset word segmentation method to obtain basic segmented words comprises:
    using a conditional random field model to segment the basic sentence to obtain initial segmented words;
    obtaining the word frequency of each initial segmented word from historical interview response texts;
    generating a weight for each initial segmented word based on its word frequency, and using the initial segmented words labeled with weights as the basic segmented words.
  19. The computer-readable storage medium according to claim 15, wherein the language extraction model is a bidirectional long short-term memory (BiLSTM) network model comprising a sentence encoder and a document encoder, and said performing sentence extraction from each class of the classified text through the language extraction model to obtain extracted sentences comprises:
    splitting the text in the classified text into characters through the sentence encoder to obtain basic characters;
    encoding the basic characters to obtain encoded content corresponding to the basic characters;
    inputting the encoded content into a character encoding layer with initialized weights, mapping each encoding into a character vector through the character encoding layer, and using each character vector as a sentence encoding result;
    concatenating the forward and backward hidden-layer outputs of the sentence encoding results into a hidden-layer vector, and inputting the hidden-layer vector into the document encoder;
    weighting the hidden-layer vector through the document encoder to obtain a document feature vector, decoding the document feature vector, and using the decoded output result as the extracted sentences.
  20. The computer-readable storage medium according to claim 19, wherein said weighting the hidden-layer vector through the document encoder to obtain the document feature vector comprises:
    determining the document feature vector by the following formula:
    C_i = Σ_{j=1}^{n} b_ij · h_j
    where C_i is the i-th document feature vector, j is the index of the embedding codes, n is the number of embedding codes, b_ij is the weight of the i-th document feature vector for the j-th hidden-layer vector, and h_j is the j-th hidden-layer vector; the embedding codes are generated based on the hidden states of the bidirectional long short-term memory network model.
PCT/CN2020/118928 2020-04-29 2020-09-29 Artificial intelligence-based interview content refining method, apparatus and device, and medium WO2021218028A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010356767.3 2020-04-29
CN202010356767.3A CN111695338A (en) 2020-04-29 2020-04-29 Interview content refining method, device, equipment and medium based on artificial intelligence

Publications (1)

Publication Number Publication Date
WO2021218028A1 true WO2021218028A1 (en) 2021-11-04

Family

ID=72476872

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118928 WO2021218028A1 (en) 2020-04-29 2020-09-29 Artificial intelligence-based interview content refining method, apparatus and device, and medium

Country Status (2)

Country Link
CN (1) CN111695338A (en)
WO (1) WO2021218028A1 (en)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695338A (en) * 2020-04-29 2020-09-22 平安科技(深圳)有限公司 Interview content refining method, device, equipment and medium based on artificial intelligence
CN113761143A (en) * 2020-11-13 2021-12-07 北京京东尚科信息技术有限公司 Method, apparatus, device and medium for determining answers to user questions
CN112466308B (en) * 2020-11-25 2024-09-06 北京明略软件系统有限公司 Auxiliary interview method and system based on voice recognition
CN113449095A (en) * 2021-07-02 2021-09-28 中国工商银行股份有限公司 Interview data analysis method and device
CN113709028A (en) * 2021-09-29 2021-11-26 五八同城信息技术有限公司 Interview video processing method and device, electronic equipment and readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170357945A1 (en) * 2016-06-14 2017-12-14 Recruiter.AI, Inc. Automated matching of job candidates and job listings for recruitment
CN109919564A (en) * 2018-12-20 2019-06-21 平安科技(深圳)有限公司 Interview optimization method and device, storage medium, computer equipment
CN110223742A (en) * 2019-06-14 2019-09-10 中南大学 The clinical manifestation information extraction method and equipment of Chinese electronic health record data
CN110472647A (en) * 2018-05-10 2019-11-19 百度在线网络技术(北京)有限公司 Artificial intelligence-based second-round interview method, apparatus, and storage medium
CN110543639A (en) * 2019-09-12 2019-12-06 扬州大学 English sentence simplification algorithm based on pre-trained Transformer language model
CN111695338A (en) * 2020-04-29 2020-09-22 平安科技(深圳)有限公司 Interview content refining method, device, equipment and medium based on artificial intelligence


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115062134A (en) * 2022-08-17 2022-09-16 腾讯科技(深圳)有限公司 Knowledge question-answering model training and knowledge question-answering method, device and computer equipment
CN115062134B (en) * 2022-08-17 2022-11-08 腾讯科技(深圳)有限公司 Knowledge question-answering model training and knowledge question-answering method, device and computer equipment
CN116304111A (en) * 2023-04-10 2023-06-23 大连数通云网络科技有限公司 AI call optimization processing method and server based on visual service data
CN116304111B (en) * 2023-04-10 2024-02-20 深圳市兴海物联科技有限公司 AI call optimization processing method and server based on visual service data
CN116229955A (en) * 2023-05-09 2023-06-06 海尔优家智能科技(北京)有限公司 Interactive intention information determining method based on generated pre-training GPT model
CN116229955B (en) * 2023-05-09 2023-08-18 海尔优家智能科技(北京)有限公司 Interactive intention information determining method based on generated pre-training GPT model

Also Published As

Publication number Publication date
CN111695338A (en) 2020-09-22

Similar Documents

Publication Publication Date Title
WO2021218028A1 (en) Artificial intelligence-based interview content refining method, apparatus and device, and medium
CN110795543B (en) Unstructured data extraction method, device and storage medium based on deep learning
CN111241237B (en) Intelligent question-answer data processing method and device based on operation and maintenance service
WO2021121198A1 (en) Semantic similarity-based entity relation extraction method and apparatus, device and medium
WO2022095380A1 (en) Ai-based virtual interaction model generation method and apparatus, computer device and storage medium
CN112131350B (en) Text label determining method, device, terminal and readable storage medium
WO2021218029A1 (en) Artificial intelligence-based interview method and apparatus, computer device, and storage medium
CN112633003B (en) Address recognition method and device, computer equipment and storage medium
CN112328761B (en) Method and device for setting intention label, computer equipment and storage medium
CN111930792B (en) Labeling method and device for data resources, storage medium and electronic equipment
CN112287069B (en) Information retrieval method and device based on voice semantics and computer equipment
WO2021139316A1 (en) Method and apparatus for establishing expression recognition model, and computer device and storage medium
US20140255886A1 (en) Systems and Methods for Content Scoring of Spoken Responses
CN113627797B (en) Method, device, computer equipment and storage medium for generating staff member portrait
CN116402166B (en) Training method and device of prediction model, electronic equipment and storage medium
CN113392265A (en) Multimedia processing method, device and equipment
CN113421551B (en) Speech recognition method, speech recognition device, computer readable medium and electronic equipment
CN107844531B (en) Answer output method and device and computer equipment
CN111126084B (en) Data processing method, device, electronic equipment and storage medium
CN113505786A (en) Test question photographing and judging method and device and electronic equipment
CN113705191A (en) Method, device and equipment for generating sample statement and storage medium
CN110647613A (en) Courseware construction method, courseware construction device, courseware construction server and storage medium
CN115273856A (en) Voice recognition method and device, electronic equipment and storage medium
CN113822040B (en) Subjective question scoring method, subjective question scoring device, computer equipment and storage medium
CN116701593A (en) Chinese question-answering model training method based on GraphQL and related equipment thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20932958

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20932958

Country of ref document: EP

Kind code of ref document: A1