WO2021057146A1 - Voice-based interviewee determination method and device, terminal, and storage medium - Google Patents

Voice-based interviewee determination method and device, terminal, and storage medium

Info

Publication number
WO2021057146A1
WO2021057146A1 (PCT application PCT/CN2020/098891)
Authority
WO
WIPO (PCT)
Prior art keywords
confidence
confidence level
duration
interviewer
question
Prior art date
Application number
PCT/CN2020/098891
Other languages
French (fr)
Chinese (zh)
Inventor
黄竹梅
王志鹏
孙汀娟
周雅君
李恒
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021057146A1 publication Critical patent/WO2021057146A1/en

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/02 — Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/04 — Segmentation; Word boundary detection
    • G10L15/06 — Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

Definitions

  • This application relates to the field of speech recognition technology, and in particular to a method, device, terminal and storage medium for determining interviewers based on speech.
  • the first aspect of the present application provides a voice-based interviewer judgment method, the method includes:
  • the interview result of the interviewer is output according to the emotional stability, reaction speed and self-confidence.
  • the second aspect of the present application provides a voice-based interviewer determination device, the device includes:
  • the acquisition module is used to acquire the answer voice of the interviewer’s multiple questions
  • the slicing module is used to slice the answer voice of each question to obtain multiple voice fragments
  • the calculation module is used to calculate the volume characteristic, the speaking rate characteristic, the duration, and the intermittent duration of each question according to the multiple speech fragments;
  • the first determining module is configured to determine the emotional stability of the interviewer according to the volume characteristics of each question
  • the second determination module is configured to use a pre-built confidence determination model to determine the speaking rate feature, the intermittent duration, and the duration, and determine the interviewer's confidence;
  • the third determining module is configured to use a pre-built confidence level determination model to determine the speaking rate feature and the interruption duration, and determine the interviewer's response speed;
  • the output module is used to output the interview result of the interviewer according to the emotional stability, reaction speed and confidence.
  • a third aspect of the present application provides a terminal, the terminal includes a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:
  • the interview result of the interviewer is output according to the emotional stability, reaction speed and self-confidence.
  • a fourth aspect of the present application provides a computer-readable storage medium having computer-readable instructions stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • the interview result of the interviewer is output according to the emotional stability, reaction speed and self-confidence.
  • the voice-based interviewer determination method, device, terminal, and storage medium described in this application can be applied to fields such as smart government affairs, thereby promoting the construction of smart cities.
  • This application obtains the interviewer's answer speech for each question, slices the answer speech of each question to obtain multiple speech fragments, and extracts the volume feature, speech rate feature, duration, and intermittent duration of each speech fragment. The emotional stability of the interviewer is determined based on the volume feature, and the pre-built confidence determination model and reaction speed determination model are then used to judge the speech rate feature, duration, and intermittent duration to determine the interviewer's confidence and reaction speed; finally, the interview result of the interviewer is output according to the emotional stability, reaction speed, and confidence.
  • This application uses in-depth analysis and mining of the human-computer interaction voice during the interview process to determine multiple characteristics of the interviewer, such as emotional stability, reaction speed, and self-confidence. Through these characteristics, the interviewer can be evaluated objectively and comprehensively. The result is more precise and accurate, which improves the efficiency and quality of the interview judgment.
  • Fig. 1 is a flowchart of a voice-based interviewer judgment method provided by Embodiment 1 of the present application.
  • Fig. 2 is a structural diagram of a voice-based interviewer judging device provided in the second embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a terminal provided in Embodiment 3 of the present application.
  • Fig. 1 is a flowchart of a voice-based interviewer judgment method provided by Embodiment 1 of the present application.
  • the voice-based interviewer determination method can be applied to a terminal.
  • the voice-based interviewer determination function provided by the method of this application can be directly integrated on the terminal.
  • the function may also run on the terminal in the form of a Software Development Kit (SDK).
  • the voice-based interviewer judgment method specifically includes the following steps. According to different needs, the order of the steps in the flowchart can be changed, and some of the steps can be omitted.
  • before the obtaining of the answer voices of the interviewer's multiple questions, the method further includes:
  • the process of constructing the confidence level judgment model and the reaction speed judgment model includes:
  • first salient features with a large degree of discrimination between confidence levels and second salient features with a large degree of discrimination between reaction speed levels are selected from the multiple features, wherein the first salient features include: the speech rate feature, the duration, and the intermittent duration, and the second salient features include: the speech rate feature and the intermittent duration;
  • a reaction speed determination model is constructed based on the plurality of second salient characteristics, the plurality of reaction speed grades, and the characteristic range corresponding to each of the reaction speed grades.
  • the self-confidence, emotional stability, and reaction speed of the sample speech of each question answered by multiple interviewers are labeled, and the four relevant features and the corresponding labeling results are then used as learning objects to establish a learning model. It is found that, from the data distribution of each relevant feature at different degrees of confidence/emotional stability/reaction speed, the data distributions of people with different confidence/emotional stability/reaction speed are distinct and regular; thus the interviewer's confidence, emotional stability, and reaction speed can be quantitatively evaluated through four relevant characteristics of the interviewer: the volume feature, the speech rate feature, the duration, and the intermittent duration.
  • a feature type with a relatively large degree of discrimination is then identified. According to the four relevant features and the confidence levels of the sample speech, first box plots of each relevant feature at different confidence levels and second box plots of each relevant feature at different reaction speed levels are generated. From the first box plots, several first salient features with a greater degree of discrimination between different confidence levels are identified: the speech rate feature, the duration, and the intermittent duration. From the second box plots, the second salient features with a greater degree of discrimination between different reaction speed levels are identified: the speech rate feature and the intermittent duration. Finally, a confidence determination model is constructed based on the three first salient features (speech rate, duration, and intermittent duration), and a reaction speed determination model is constructed based on the two second salient features (speech rate and intermittent duration).
  • the first box plot is generated from the distribution of the feature values of a first salient feature at different confidence levels;
  • the second box plot is generated from the distribution of the feature values of a second salient feature at different reaction speed levels.
  • when training on a salient feature, the feature value range corresponding to that salient feature at each confidence/reaction speed level is determined according to the maximum and minimum values of the salient feature in the box plots of the different confidence/reaction speed levels. After the feature value range corresponding to the salient feature at the different confidence/reaction speed levels has been determined, it is necessary to check whether the feature value ranges conform to extreme value consistency; if they do not, the feature value ranges need to be adjusted.
  • for example, a salient feature corresponds to five confidence/reaction speed levels, and its feature value ranges at the five levels are [a1,b1], [a2,b2], [a3,b3], [a4,b4], [a5,b5].
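  • As an illustration only, the sketch below shows how such per-level feature value ranges could be derived from labeled sample speech (the box-plot extremes per level) and how overlapping ranges could be flagged; the function names, the data layout, and the overlap check are assumptions, not part of the original disclosure.

```python
from collections import defaultdict

def build_level_ranges(samples, feature, levels):
    """Per-level [min, max] range of one salient feature.

    samples: list of dicts, e.g. {"speech_rate": 3.4, "pause": 1.3,
             "duration": 5.6, "level": "B"}  (hypothetical keys);
    feature: which salient feature to summarise, e.g. "speech_rate";
    levels:  the confidence or reaction speed grades, e.g. ["A", "B", "C"].
    """
    values = defaultdict(list)
    for s in samples:
        values[s["level"]].append(s[feature])
    # Box-plot extremes per level; assumes every level has labeled samples.
    return {lvl: (min(values[lvl]), max(values[lvl])) for lvl in levels}

def overlapping_levels(ranges):
    """Adjacent level pairs whose ranges overlap, i.e. candidates for the
    'extreme value consistency' adjustment mentioned above (the exact
    adjustment rule is not given in the text)."""
    items = sorted(ranges.items(), key=lambda kv: kv[1][0])
    return [(a, b) for (a, (_, a_hi)), (b, (b_lo, _))
            in zip(items, items[1:]) if b_lo <= a_hi]
```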
  • S12 Slice the answer speech of each question to obtain multiple speech fragments.
  • the interviewer's answer speech for each question is divided into multiple speech fragments.
  • the answer voice of each question of the interviewer is divided into 28 voice fragments.
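  • As one possible reading of the slicing step (the text does not specify how fragment boundaries are chosen), a fixed-count, equal-length split could look like the following sketch; the fragment count of 28 simply mirrors the example above.

```python
import numpy as np

def slice_answer(waveform: np.ndarray, n_fragments: int = 28):
    """Split one question's answer waveform into equal-length fragments.

    Fixed-count, equal-length slicing is an assumed strategy; any trailing
    samples that do not fill a full fragment are discarded in this sketch.
    """
    fragment_len = len(waveform) // n_fragments
    return [waveform[i * fragment_len:(i + 1) * fragment_len]
            for i in range(n_fragments)]
```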
  • S13 Calculate the volume characteristic, the speaking rate characteristic, the duration, and the intermittent duration of each question according to the multiple speech fragments.
  • the volume feature refers to the size of the interviewer's voice when answering questions.
  • the speaking rate feature refers to the speed of the interviewer in answering questions, and the amount of voice content per unit time.
  • the duration refers to the length of time that the interviewer continuously speaks when answering questions.
  • the intermittent duration refers to the length of time that the interviewer does not speak when answering questions.
  • each voice segment has four relevant features: the volume feature, the speech rate feature, the duration, and the intermittent duration. Averaging each relevant feature over all voice segments of the same question gives the mean of that relevant feature for the question.
  • specifically, the volume features of the multiple speech fragments of each question are averaged to obtain the mean volume feature of the question; the speech rate features of the multiple speech fragments of each question are averaged to obtain the mean speech rate feature of the question; the durations of the multiple speech fragments of each question are averaged to obtain the mean duration of the question; and the intermittent durations of the multiple speech fragments of each question are averaged to obtain the mean intermittent duration of the question. That is, the volume feature, speech rate feature, duration, and intermittent duration obtained from the multiple speech segments all refer to mean values.
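  • A minimal sketch of this per-question averaging, assuming each fragment has already been reduced to the four scalar features (the dictionary keys are placeholder names):

```python
from statistics import mean

FEATURES = ("volume", "speech_rate", "duration", "pause")  # placeholder names

def question_features(fragments):
    """Mean of each relevant feature over all voice fragments of one question.

    fragments: list of dicts such as
        {"volume": 0.42, "speech_rate": 3.4, "duration": 5.6, "pause": 1.3}
    """
    return {f: mean(frag[f] for frag in fragments) for f in FEATURES}
```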
  • the size of the sound can reflect the emotional stability of a person.
  • the determining the emotional stability of the interviewer according to the volume characteristics of each question includes:
  • correspondences between different volume characteristic amplitude values and emotional stability are preset; once the interviewer's volume characteristic amplitude values are determined, the emotional stability of the interviewer can be matched according to the correspondence.
  • suppose the maximum volume feature over all questions is max, the minimum volume feature is min, the average volume feature over all questions is avg, and the volume feature of each question is ai; the volume fluctuation range of each question is then computed from these values.
  • if the average volume fluctuation range is less than 20%, the interviewer's emotional stability is determined to be the first degree of stability, indicating that the interviewer's emotional stability is "high";
  • if the average volume fluctuation range is between 20% and 30%, the emotional stability of the interviewer is determined to be the second degree of stability, indicating that the emotional stability of the interviewer is "medium";
  • if the average volume fluctuation range is greater than 30%, the emotional stability of the interviewer is determined to be the third degree of stability, indicating that the interviewer's emotional stability is "low".
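  • Because the fluctuation formula itself is not reproduced in this text, the sketch below assumes a simple relative deviation |ai - avg| / avg per question, averaged over all questions, before applying the 20%/30% thresholds; both the formula and the function name are assumptions.

```python
def emotional_stability(question_volumes):
    """Map per-question volume features to a stability grade.

    Assumed fluctuation formula: |ai - avg| / avg per question, averaged
    over all questions (the text only names max, min, avg and ai as the
    quantities involved, without reproducing the formula).
    """
    avg = sum(question_volumes) / len(question_volumes)
    mean_fluct = sum(abs(a - avg) / avg for a in question_volumes) / len(question_volumes)
    if mean_fluct < 0.20:
        return "high"    # first degree of stability
    if mean_fluct <= 0.30:
        return "medium"  # second degree of stability
    return "low"         # third degree of stability
```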
  • S15 Use a pre-built confidence level determination model to determine the speech rate feature, interruption duration, and duration, and determine the interviewer's confidence level.
  • the use of a pre-built confidence level determination model to determine the speech rate feature, interruption duration, and duration, and determining the confidence level of the interviewer, includes:
  • the per-question confidence levels are combined to obtain the interviewer's overall confidence judgment result, for example by averaging numerical scores and rounding up, or by taking the level at the center position of the sequence of per-question levels ordered by question number.
  • for example, the confidence levels of five questions are determined as follows: Question 1 - confidence level A, Question 2 - confidence level B, Question 3 - confidence level B, Question 4 - confidence level B, Question 5 - confidence level A. Sorting the confidence levels by question number gives the sequence ABBBA; the level at the center position of ABBBA is B, so the target confidence level is B, which is taken as the final judgment of the interviewer's confidence in the interview process.
  • alternatively, the levels of all questions can be converted into numerical values, and the numerical results are averaged and rounded up to obtain a personal grade. For example, if the average is 4.4, the score after rounding up is 5, and the interviewer's confidence judgment result is grade A.
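  • The two aggregation strategies above could be sketched as follows; the numeric mapping A=5 through E=1 is an assumption that is merely consistent with the "average 4.4, round up to 5, grade A" example.

```python
import math

# Assumed numeric mapping, consistent with "average 4.4, round up to 5, grade A"
# but not stated explicitly in the text.
LEVEL_TO_SCORE = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
SCORE_TO_LEVEL = {v: k for k, v in LEVEL_TO_SCORE.items()}

def overall_by_center(levels_by_question):
    """Order per-question levels by question number and take the center one."""
    ordered = [lvl for _, lvl in sorted(levels_by_question.items())]
    return ordered[len(ordered) // 2]              # e.g. "ABBBA" -> "B"

def overall_by_rounded_average(levels_by_question):
    """Convert levels to scores, average, round up, convert back to a grade."""
    scores = [LEVEL_TO_SCORE[lvl] for lvl in levels_by_question.values()]
    return SCORE_TO_LEVEL[math.ceil(sum(scores) / len(scores))]  # 4.4 -> 5 -> "A"

# overall_by_center({1: "A", 2: "B", 3: "B", 4: "B", 5: "A"})           -> "B"
# overall_by_rounded_average({1: "A", 2: "B", 3: "B", 4: "B", 5: "A"})  -> "A"
```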
  • the use of a pre-built confidence determination model to determine the speech rate characteristics, interruption duration, and duration of each question, and determining the confidence level of each question includes:
  • the confidence level of the target candidate in the confidence level ranking queue is taken as the confidence level of the question.
  • each confidence level in each feature box plot determines a feature range (the range is bounded by the maximum and minimum values of that level). Only when all the features of a question (the speech rate feature, the interruption duration, and the duration) are determined to be of the same level is the confidence level of the question determined to be that level.
  • for example, the speech rate feature of a question's answer speech is 3.4, the interruption duration is 1.3, and the duration is 5.6. The speech rate feature range of level B in the speech rate box plot is [3.2, 4], the interruption duration range of level B in the interruption duration box plot is [0.8, 1.5], and the duration range of level B in the duration box plot is [5.3, 5.7]. Because the speech rate feature, the interruption duration, and the duration all fall within the level-B ranges, the confidence level of this question is initially determined to be level B.
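  • A sketch of this per-question range check, using the level-B ranges from the example above as hypothetical data (ranges for the other levels are not given in the text):

```python
# Hypothetical per-level ranges; only the level-B ranges are given in the text.
RANGES = {
    "speech_rate": {"B": (3.2, 4.0)},
    "pause":       {"B": (0.8, 1.5)},
    "duration":    {"B": (5.3, 5.7)},
}

def levels_matching(feature, value):
    """All levels whose range contains the value for one feature."""
    return {lvl for lvl, (lo, hi) in RANGES[feature].items() if lo <= value <= hi}

def question_confidence(speech_rate, pause, duration):
    """Level shared by all three features; multiple or zero shared levels are
    handled by the candidate-level / neutral-grade procedure described below."""
    common = (levels_matching("speech_rate", speech_rate)
              & levels_matching("pause", pause)
              & levels_matching("duration", duration))
    return next(iter(common)) if len(common) == 1 else common

# question_confidence(3.4, 1.3, 5.6) -> "B"
```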
  • the confidence levels determined from the speech rate feature, the interruption duration, and the duration are referred to as the first, second, and third confidence levels, respectively. For example, the first confidence level is A and B, the second confidence level is A and B, and the third confidence level is A and B; or the first confidence level is A, B, and C, the second confidence level is A, B, and C, and the third confidence level is A, B, and C. If the first, second, and third confidence levels each contain multiple levels and the multiple first, second, and third confidence levels are all the same, then there are multiple candidate confidence levels, for example level A, level B, and level C.
  • in that case the confidence level ranking queue is ABC; based on the law of large numbers, the confidence level of the target candidate (the level at the center of the queue) is determined to be level B, which is taken as the confidence level of the question.
  • the method further includes:
  • for example, the first confidence level is A, B, and D, the second confidence level is A, B, and E, and the third confidence level is A, B, and C. That is, the first, second, and third confidence levels each contain multiple levels and those levels are not all the same, but the first, second, and third confidence levels share the same grades A and B. The shared grades A and B are then used as the candidate confidence grades, and the confidence grade of the question is finally determined to be grade B based on the law of large numbers.
  • the method further includes:
  • the neutral grade refers to the case in which, after all the grades have been traversed, none of them is satisfied.
  • for example, the pre-built confidence determination model determines that the confidence level corresponding to the speech rate feature of the question is grade A, that the confidence level corresponding to the intermittent duration of the question is grade B, and that the confidence level corresponding to the duration of the question is grade A. Because the speech rate feature, the interruption duration, and the duration of the question do not all correspond to the same confidence level, the confidence level of the question is determined to belong neither to grade A nor to grade B; that is, there is no case in which the first, second, and third confidence levels are the same, so the confidence level of the question is determined to be the neutral grade.
  • a question judged to be of the neutral grade most likely belongs to the most common situation, that is, grade C, so the neutral grade can be preset to grade C.
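  • A sketch of the candidate-level selection described above, assuming that the ranking queue is sorted alphabetically, that "based on the law of large numbers" amounts to taking the middle element of the queue, and that the neutral grade defaults to C as suggested:

```python
def confidence_from_feature_levels(first, second, third, neutral="C"):
    """Combine the per-feature confidence levels of one question.

    first/second/third: sets of levels matched by the speech rate feature,
    the interruption duration and the duration, e.g. {"A", "B", "D"}.
    """
    candidates = first & second & third        # grades shared by all three
    if not candidates:
        return neutral                         # neutral grade, preset to C
    queue = sorted(candidates)                 # confidence level ranking queue
    return queue[len(queue) // 2]              # target candidate at the center

# confidence_from_feature_levels({"A","B","C"}, {"A","B","C"}, {"A","B","C"}) -> "B"
# confidence_from_feature_levels({"A","B","D"}, {"A","B","E"}, {"A","B","C"}) -> "B"
# confidence_from_feature_levels({"A"}, {"B"}, {"A"}) -> "C"  (neutral grade)
```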
  • S16 Use a pre-built confidence determination model to determine the speech rate feature and the interruption duration, and determine the interviewer's response speed.
  • the S15 and the S16 are executed in parallel.
  • two threads can be started for synchronous execution at the same time.
  • one thread is used to judge the speech rate feature, interruption duration, and duration using the pre-built confidence determination model, and the other thread is used to judge the speech rate feature and the interruption duration using the pre-built reaction speed determination model. Because the two threads are executed in parallel, the efficiency of judging the interviewer's confidence and reaction speed is improved, the judgment time is shortened, and the efficiency of interview screening is improved.
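  • A minimal sketch of running the two judgments on parallel threads; judge_confidence and judge_reaction_speed are stand-ins for the pre-built determination models, not the actual implementations.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for the pre-built determination models; a real implementation
# would look the features up in per-level ranges as sketched earlier.
def judge_confidence(speech_rate, pause, duration):
    return "B"

def judge_reaction_speed(speech_rate, pause):
    return "B"

def judge_in_parallel(speech_rate, pause, duration):
    """Run the confidence and reaction-speed judgments on two parallel threads."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        conf = pool.submit(judge_confidence, speech_rate, pause, duration)
        speed = pool.submit(judge_reaction_speed, speech_rate, pause)
        return conf.result(), speed.result()
```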
  • S17 Output the interview result of the interviewer according to the emotional stability, reaction speed, and confidence.
  • in the interview process, after the interviewer's answer speech has been analyzed to obtain the interviewer's emotional stability, reaction speed, and confidence, interviewers who meet the interview requirements can be selected according to the focus of the interview position.
  • the voice-based interviewer judgment method described in this application obtains the interviewer's answer voice for each question, slices the answer voice of each question to obtain multiple voice fragments, and extracts the volume feature, speech rate feature, duration, and intermittent duration of each voice fragment; it determines the emotional stability of the interviewer based on the volume feature, then uses the pre-built confidence determination model and reaction speed determination model to judge the speech rate feature, duration, and intermittent duration to determine the confidence and reaction speed of the interviewer, and outputs the interview result of the interviewer according to the emotional stability, reaction speed, and confidence.
  • This application uses in-depth analysis and mining of the human-computer interaction voice during the interview process to determine multiple characteristics of the interviewer, such as emotional stability, reaction speed, and self-confidence. Through these characteristics, the interviewer can be evaluated objectively and comprehensively. The result is more precise and accurate, which improves the efficiency and quality of the interview judgment.
  • this application can be applied in fields such as smart government affairs, so as to promote the development of smart cities.
  • Fig. 2 is a structural diagram of a voice-based interviewer judging device provided in the second embodiment of the present application.
  • the voice-based interviewer determination device 20 may include multiple functional modules composed of computer-readable instruction segments.
  • the computer-readable instructions of each program segment in the voice-based interviewer determination device 20 may be stored in the memory of the terminal and executed by the at least one processor to perform the voice-based interviewer judgment function (see FIG. 1 for details).
  • the voice-based interviewer determination device 20 can be divided into multiple functional modules according to the functions it performs.
  • the functional modules may include: an acquisition module 201, a construction module 202, a slicing module 203, a calculation module 204, a first determination module 205, a second determination module 206, a third determination module 207, and an output module 208.
  • the module referred to in this application refers to a series of computer-readable instruction segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory. In this embodiment, the functions of each module will be described in detail in subsequent embodiments.
  • the obtaining module 201 is used to obtain the answer voices of multiple questions of the interviewer.
  • before the obtaining of the answer voices of the interviewer's multiple questions, the apparatus further includes:
  • the construction module 202 is used to construct a confidence degree judgment model and a reaction speed judgment model.
  • the process of constructing the confidence level judgment model and the reaction speed judgment model includes:
  • first salient features with a large degree of discrimination between confidence levels and second salient features with a large degree of discrimination between reaction speed levels are selected from the multiple features, wherein the first salient features include: the speech rate feature, the duration, and the intermittent duration, and the second salient features include: the speech rate feature and the intermittent duration;
  • a reaction speed determination model is constructed based on the plurality of second salient characteristics, the plurality of reaction speed grades, and the characteristic range corresponding to each of the reaction speed grades.
  • the self-confidence, emotional stability, and reaction speed of the sample speech of each question answered by multiple interviewers are labeled, and the four relevant features and the corresponding labeling results are then used as learning objects to establish a learning model. It is found that, from the data distribution of each relevant feature at different degrees of confidence/emotional stability/reaction speed, the data distributions of people with different confidence/emotional stability/reaction speed are distinct and regular; thus the interviewer's confidence, emotional stability, and reaction speed can be quantitatively evaluated through four relevant characteristics of the interviewer: the volume feature, the speech rate feature, the duration, and the intermittent duration.
  • a feature type with a relatively large degree of discrimination is then identified. According to the four relevant features and the confidence levels of the sample speech, first box plots of each relevant feature at different confidence levels and second box plots of each relevant feature at different reaction speed levels are generated. From the first box plots, several first salient features with a greater degree of discrimination between different confidence levels are identified: the speech rate feature, the duration, and the intermittent duration. From the second box plots, the second salient features with a greater degree of discrimination between different reaction speed levels are identified: the speech rate feature and the intermittent duration. Finally, a confidence determination model is constructed based on the three first salient features (speech rate, duration, and intermittent duration), and a reaction speed determination model is constructed based on the two second salient features (speech rate and intermittent duration).
  • the first box plot is generated from the distribution of the feature values of a first salient feature at different confidence levels;
  • the second box plot is generated from the distribution of the feature values of a second salient feature at different reaction speed levels.
  • when training on a salient feature, the feature value range corresponding to that salient feature at each confidence/reaction speed level is determined according to the maximum and minimum values of the salient feature in the box plots of the different confidence/reaction speed levels. After the feature value range corresponding to the salient feature at the different confidence/reaction speed levels has been determined, it is necessary to check whether the feature value ranges conform to extreme value consistency; if they do not, the feature value ranges need to be adjusted.
  • for example, a salient feature corresponds to five confidence/reaction speed levels, and its feature value ranges at the five levels are [a1,b1], [a2,b2], [a3,b3], [a4,b4], [a5,b5].
  • the slicing module 203 is used to slice the answer speech of each question to obtain multiple speech fragments.
  • the interviewer's answer speech for each question is divided into multiple speech fragments.
  • the answer voice of each question of the interviewer is divided into 28 voice fragments.
  • the calculation module 204 is configured to calculate the volume characteristic, the speaking rate characteristic, the duration, and the intermittent duration of each question according to the multiple speech fragments.
  • the volume feature refers to the size of the interviewer's voice when answering questions.
  • the speaking rate feature refers to the speed of the interviewer in answering questions, and the amount of voice content per unit time.
  • the duration refers to the length of time that the interviewer continuously speaks when answering questions.
  • the intermittent duration refers to the length of time that the interviewer does not speak when answering questions.
  • each voice segment has four relevant features: the volume feature, the speech rate feature, the duration, and the intermittent duration. Averaging each relevant feature over all voice segments of the same question gives the mean of that relevant feature for the question.
  • specifically, the volume features of the multiple speech fragments of each question are averaged to obtain the mean volume feature of the question; the speech rate features of the multiple speech fragments of each question are averaged to obtain the mean speech rate feature of the question; the durations of the multiple speech fragments of each question are averaged to obtain the mean duration of the question; and the intermittent durations of the multiple speech fragments of each question are averaged to obtain the mean intermittent duration of the question. That is, the volume feature, speech rate feature, duration, and intermittent duration obtained from the multiple speech segments all refer to mean values.
  • the first determining module 205 is configured to determine the emotional stability of the interviewer according to the volume characteristics of each question.
  • the size of the sound can reflect the emotional stability of a person.
  • the first determining module 205 determining the emotional stability of the interviewer according to the volume characteristics of each question includes:
  • correspondences between different volume characteristic amplitude values and emotional stability are preset; once the interviewer's volume characteristic amplitude values are determined, the emotional stability of the interviewer can be matched according to the correspondence.
  • suppose the maximum volume feature over all questions is max, the minimum volume feature is min, the average volume feature over all questions is avg, and the volume feature of each question is ai; the volume fluctuation range of each question is then computed from these values.
  • if the average volume fluctuation range is less than 20%, the interviewer's emotional stability is determined to be the first degree of stability, indicating that the interviewer's emotional stability is "high";
  • if the average volume fluctuation range is between 20% and 30%, the emotional stability of the interviewer is determined to be the second degree of stability, indicating that the emotional stability of the interviewer is "medium";
  • if the average volume fluctuation range is greater than 30%, the emotional stability of the interviewer is determined to be the third degree of stability, indicating that the interviewer's emotional stability is "low".
  • the second determining module 206 is configured to use a pre-built confidence level determination model to determine the speech rate feature, interruption duration, and duration, and determine the confidence level of the interviewer.
  • the second determining module 206 uses a pre-built confidence level determination model to determine the speech rate feature, interruption duration, and duration, and determining the interviewer’s confidence level includes:
  • the per-question confidence levels are combined to obtain the interviewer's overall confidence judgment result, for example by averaging numerical scores and rounding up, or by taking the level at the center position of the sequence of per-question levels ordered by question number.
  • for example, the confidence levels of five questions are determined as follows: Question 1 - confidence level A, Question 2 - confidence level B, Question 3 - confidence level B, Question 4 - confidence level B, Question 5 - confidence level A. Sorting the confidence levels by question number gives the sequence ABBBA; the level at the center position of ABBBA is B, so the target confidence level is B, which is taken as the final judgment of the interviewer's confidence in the interview process.
  • alternatively, the levels of all questions can be converted into numerical values, and the numerical results are averaged and rounded up to obtain a personal grade. For example, if the average is 4.4, the score after rounding up is 5, and the interviewer's confidence judgment result is grade A.
  • the use of a pre-built confidence determination model to determine the speech rate characteristics, interruption duration, and duration of each question, and determining the confidence level of each question includes:
  • the confidence level of the target candidate in the confidence level ranking queue is taken as the confidence level of the question.
  • each confidence level in each feature box plot determines a feature range (the range is bounded by the maximum and minimum values of that level). Only when all the features of a question (the speech rate feature, the interruption duration, and the duration) are determined to be of the same level is the confidence level of the question determined to be that level.
  • for example, the speech rate feature of a question's answer speech is 3.4, the interruption duration is 1.3, and the duration is 5.6. The speech rate feature range of level B in the speech rate box plot is [3.2, 4], the interruption duration range of level B in the interruption duration box plot is [0.8, 1.5], and the duration range of level B in the duration box plot is [5.3, 5.7]. Because the speech rate feature, the interruption duration, and the duration all fall within the level-B ranges, the confidence level of this question is initially determined to be level B.
  • the confidence levels determined from the speech rate feature, the interruption duration, and the duration are referred to as the first, second, and third confidence levels, respectively. For example, the first confidence level is A and B, the second confidence level is A and B, and the third confidence level is A and B; or the first confidence level is A, B, and C, the second confidence level is A, B, and C, and the third confidence level is A, B, and C. If the first, second, and third confidence levels each contain multiple levels and the multiple first, second, and third confidence levels are all the same, then there are multiple candidate confidence levels, for example level A, level B, and level C.
  • in that case the confidence level ranking queue is ABC; based on the law of large numbers, the confidence level of the target candidate (the level at the center of the queue) is determined to be level B, which is taken as the confidence level of the question.
  • the device further includes:
  • the judgment module is used to judge whether the multiple levels of the first confidence level, the second confidence level, and the third confidence level have the same level;
  • the judgment module is also used to determine the same grade as the candidate confidence grade if there are the same grades.
  • for example, the first confidence level is A, B, and D, the second confidence level is A, B, and E, and the third confidence level is A, B, and C. That is, the first, second, and third confidence levels each contain multiple levels and those levels are not all the same, but the first, second, and third confidence levels share the same grades A and B. The shared grades A and B are then used as the candidate confidence grades, and the confidence grade of the question is finally determined to be grade B based on the law of large numbers.
  • the third determining module 207 is further configured to determine that the confidence level of the question is the neutral grade.
  • the neutral grade refers to the case in which, after all the grades have been traversed, none of them is satisfied.
  • for example, the pre-built confidence determination model determines that the confidence level corresponding to the speech rate feature of the question is grade A, that the confidence level corresponding to the intermittent duration of the question is grade B, and that the confidence level corresponding to the duration of the question is grade A. Because the speech rate feature, the interruption duration, and the duration of the question do not all correspond to the same confidence level, the confidence level of the question is determined to belong neither to grade A nor to grade B; that is, there is no case in which the first, second, and third confidence levels are the same, so the confidence level of the question is determined to be the neutral grade.
  • a question judged to be of the neutral grade most likely belongs to the most common situation, that is, grade C, so the neutral grade can be preset to grade C.
  • the third determining module 207 is further configured to use a pre-built confidence level determination model to determine the speech rate characteristics and interruption duration, and determine the interviewer's response speed.
  • the second determining module 206 and the third determining module 207 are executed in parallel.
  • two threads can be started for synchronous execution at the same time.
  • one thread is used to judge the speech rate feature, interruption duration, and duration using the pre-built confidence determination model, and the other thread is used to judge the speech rate feature and the interruption duration using the pre-built reaction speed determination model. Because the two threads are executed in parallel, the efficiency of judging the interviewer's confidence and reaction speed is improved, the judgment time is shortened, and the efficiency of interview screening is improved.
  • the output module 208 is configured to output the interview result of the interviewer according to the emotional stability, reaction speed, and confidence.
  • in the interview process, after the interviewer's answer speech has been analyzed to obtain the interviewer's emotional stability, reaction speed, and confidence, interviewers who meet the interview requirements can be selected according to the focus of the interview position.
  • the voice-based interviewer judgment device described in this application obtains the interviewer's answer voice for each question, slices the answer voice of each question to obtain multiple voice fragments, and extracts the volume feature, speech rate feature, duration, and intermittent duration of each voice fragment; it determines the emotional stability of the interviewer based on the volume feature, then uses the pre-built confidence determination model and reaction speed determination model to judge the speech rate feature, duration, and intermittent duration to determine the confidence and reaction speed of the interviewer, and outputs the interview result of the interviewer according to the emotional stability, reaction speed, and confidence.
  • This application uses in-depth analysis and mining of the human-computer interaction voice during the interview process to determine multiple characteristics of the interviewer, such as emotional stability, reaction speed, and self-confidence. Through these characteristics, the interviewer can be evaluated objectively and comprehensively. The result is more precise and accurate, which improves the efficiency and quality of the interview judgment.
  • this application can be applied in fields such as smart government affairs, so as to promote the development of smart cities.
  • the terminal 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
  • the structure of the terminal shown in FIG. 3 does not constitute a limitation of the embodiment of the present application; it may be a bus-type structure or a star structure, and the terminal 3 may also include more or fewer hardware or software components than shown, or a different arrangement of components.
  • the terminal 3 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like.
  • the terminal 3 may also include client equipment.
  • the client equipment includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device, for example, personal computers, tablets, smart phones, digital cameras, and the like.
  • terminal 3 is only an example. If other existing or future electronic products can be adapted to this application, they should also be included in the protection scope of this application and included here by reference.
  • the memory 31 is used to store computer-readable instructions and various data, such as the device installed in the terminal 3, and realizes high-speed, automatic access to programs or data during the operation of the terminal 3.
  • the memory 31 includes volatile and non-volatile memory, for example, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), and compact disc read-only memory (CD-ROM).
  • the computer-readable storage medium may be non-volatile or volatile.
  • the at least one processor 32 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple integrated circuits with the same or different functions, including one or a combination of multiple central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips.
  • the at least one processor 32 is the control core (control unit) of the terminal 3; it uses various interfaces and lines to connect the various components of the entire terminal 3, and executes the various functions of the terminal 3 and processes data by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.
  • the at least one communication bus 33 is configured to implement connection and communication between the memory 31 and the at least one processor 32 and the like.
  • the terminal 3 may also include a power source (such as a battery) for supplying power to various components.
  • the power source may be logically connected to the at least one processor 32 through a power management device, so that functions such as charging, discharging, and power consumption management are realized through the power management device.
  • the power supply may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators.
  • the terminal 3 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the above-mentioned integrated unit implemented in the form of a software function module may be stored in a computer readable storage medium.
  • the above-mentioned software function module is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a terminal, or a network device, etc.) or a processor execute part of the method described in each embodiment of the present application.
  • the at least one processor 32 can execute the operating system of the terminal 3 and various installed applications, computer-readable instructions, and the like, such as the above-mentioned modules.
  • the memory 31 stores computer-readable instructions, and the at least one processor 32 can call the computer-readable instructions stored in the memory 31 to perform related functions.
  • the various modules described in FIG. 2 are computer-readable instructions stored in the memory 31 and executed by the at least one processor 32, so as to realize the functions of the various modules.
  • the memory 31 stores multiple instructions, and the multiple instructions are executed by the at least one processor 32 to implement all or part of the steps in the method described in the present application.
  • the disclosed device and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the modules is only a logical function division, and there may be other division methods in actual implementation.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A voice-based interviewee determination method, a voice-based interviewee determination device (20), a terminal (3), and a storage medium. The voice-based interviewee determination method comprises: obtaining answer voices of a plurality of questions for an interviewee (S11); slicing the answer voice of each question to obtain a plurality of voice segments (S12); calculating the volume characteristic, the speed characteristic, the continuity duration, and the discontinuity duration for each question according to the plurality of voice segments (S13); determining the emotion stability of the interviewee according to the volume characteristic for each question (S14); determining the speed characteristics, the discontinuity durations, and the continuity durations by using a pre-constructed self-confidence degree determination model to determine the self-confidence degree of the interviewee (S15); determining the speed characteristics and the discontinuity durations by using the pre-constructed self-confidence degree determination model to determine the response speed of the interviewee (S16); and outputting an interview result of the interviewee according to the emotion stability, the response speed, and the self-confidence degree (S17). According to the voice-based interviewee determination method, the interviewee can be objectively and comprehensively evaluated, so that an evaluation result is more precise and accurate.

Description

Voice-based interviewer judgment method, device, terminal and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on September 23, 2019, with application number 201910900813.9 and the invention title "Voice-based Interviewer Judgment Method, Device, Terminal, and Storage Medium", the entire content of which is incorporated in this application by reference.
Technical field
This application relates to the field of speech recognition technology, and in particular to a method, device, terminal and storage medium for determining interviewers based on speech.
Background
Recruitment is an indispensable part of every company, and recruitment efficiency is crucial both to the company's next development strategy and to its costs. However, the large number of applicants means that many resumes need to be processed, which brings a large workload.
In the prior art, whether an interviewer meets the recruitment requirements is mainly judged through face-to-face communication between the person conducting the interview and the interviewer, although there are also human-computer interaction approaches that obtain the interviewer's voice and conduct the interview through voice. However, the inventor realized that this kind of human-computer interaction only analyzes the content of the voice to judge whether the interviewer answered the questions correctly, and does not conduct an in-depth analysis of the interviewer's voice, for example, analysis of traits such as the interviewer's emotional stability, reaction speed, and self-confidence, even though these traits are also crucial for job matching.
Therefore, how to quickly and comprehensively evaluate interviewers by analyzing voice characteristics in an interview is a technical problem that needs to be solved urgently.
Summary of the invention
In view of the above, it is necessary to propose a voice-based interviewer judgment method, device, terminal, and storage medium. Through in-depth analysis and mining of the human-computer interaction voice during the interview, multiple characteristics of the interviewer are determined, and through these characteristics the interviewer is objectively and comprehensively evaluated, so that the evaluation result is more precise and accurate.
The first aspect of the present application provides a voice-based interviewer judgment method, the method includes:
Obtain the answer voice of the interviewer's multiple questions;
Slice the answer voice of each question to get multiple voice fragments;
Calculate the volume characteristics, speech rate characteristics, duration, and intermittent duration of each question according to the multiple speech fragments;
Determine the emotional stability of the interviewer according to the volume characteristics of each question;
Use a pre-built confidence judgment model to judge the speaking rate feature, the intermittent duration, and the duration to determine the interviewer's confidence;
Use a pre-built confidence judgment model to judge the speech rate characteristics and the length of the interruption, and determine the interviewer's response speed;
The interview result of the interviewer is output according to the emotional stability, reaction speed and self-confidence.
The second aspect of the present application provides a voice-based interviewer determination device, the device includes:
The acquisition module is used to acquire the answer voice of the interviewer's multiple questions;
The slicing module is used to slice the answer voice of each question to obtain multiple voice fragments;
The calculation module is used to calculate the volume characteristic, the speaking rate characteristic, the duration, and the intermittent duration of each question according to the multiple speech fragments;
The first determining module is configured to determine the emotional stability of the interviewer according to the volume characteristics of each question;
The second determination module is configured to use a pre-built confidence determination model to determine the speaking rate feature, the intermittent duration, and the duration, and determine the interviewer's confidence;
The third determining module is configured to use a pre-built confidence level determination model to determine the speaking rate feature and the interruption duration, and determine the interviewer's response speed;
The output module is used to output the interview result of the interviewer according to the emotional stability, reaction speed and confidence.
A third aspect of the present application provides a terminal, the terminal includes a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:
Obtain the answer voice of the interviewer's multiple questions;
Slice the answer voice of each question to get multiple voice fragments;
Calculate the volume characteristics, speech rate characteristics, duration, and intermittent duration of each question according to the multiple speech fragments;
Determine the emotional stability of the interviewer according to the volume characteristics of each question;
Use a pre-built confidence judgment model to judge the speaking rate feature, the intermittent duration, and the duration to determine the interviewer's confidence;
Use a pre-built confidence judgment model to judge the speech rate characteristics and the length of the interruption, and determine the interviewer's response speed;
The interview result of the interviewer is output according to the emotional stability, reaction speed and self-confidence.
A fourth aspect of the present application provides a computer-readable storage medium having computer-readable instructions stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
Obtain the answer voice of the interviewer's multiple questions;
Slice the answer voice of each question to get multiple voice fragments;
Calculate the volume characteristics, speech rate characteristics, duration, and intermittent duration of each question according to the multiple speech fragments;
Determine the emotional stability of the interviewer according to the volume characteristics of each question;
Use a pre-built confidence judgment model to judge the speaking rate feature, the intermittent duration, and the duration to determine the interviewer's confidence;
Use a pre-built confidence judgment model to judge the speech rate characteristics and the length of the interruption, and determine the interviewer's response speed;
The interview result of the interviewer is output according to the emotional stability, reaction speed and self-confidence.
In summary, the voice-based interviewee determination method, device, terminal, and storage medium described in this application can be applied to fields such as smart government affairs, thereby promoting the construction of smart cities. This application obtains the answer speech of the interviewee for each question, slices the answer speech of each question into multiple speech segments, and extracts the volume feature, speech rate feature, duration, and pause duration of each speech segment. The interviewee's emotional stability is determined from the volume feature, and a pre-built confidence determination model and a pre-built reaction speed determination model are then applied to the speech rate feature, duration, and pause duration to determine the interviewee's confidence and reaction speed. The interview result of the interviewee is output according to the emotional stability, reaction speed, and confidence. By analyzing and mining the speech of the human-computer interaction during the interview in depth, this application determines multiple characteristics of the interviewee, such as emotional stability, reaction speed, and confidence. These characteristics allow the interviewee to be evaluated objectively and comprehensively, making the evaluation more precise and accurate and improving the efficiency and quality of interview assessment.
Description of the Drawings
Fig. 1 is a flowchart of the voice-based interviewee determination method provided in Embodiment One of the present application.
Fig. 2 is a structural diagram of the voice-based interviewee determination device provided in Embodiment Two of the present application.
Fig. 3 is a schematic structural diagram of the terminal provided in Embodiment Three of the present application.
Detailed Description
In order to understand the above objectives, features, and advantages of the present application more clearly, the application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the application and the features in the embodiments can be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of the present application. The described embodiments are only some, rather than all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of this application. The terms used in the specification of the application are only for the purpose of describing specific embodiments and are not intended to limit the application.
Embodiment One
Fig. 1 is a flowchart of the voice-based interviewee determination method provided in Embodiment One of the present application.
In this embodiment, the voice-based interviewee determination method can be applied to a terminal. For a terminal that needs voice-based interviewee determination, the voice-based interviewee determination function provided by the method of this application can be integrated directly on the terminal, or run on the terminal in the form of a software development kit (SDK).
As shown in Fig. 1, the voice-based interviewee determination method specifically includes the following steps. Depending on different needs, the order of the steps in the flowchart may be changed, and some steps may be omitted.
S11: Obtain answer speech of an interviewee for multiple questions.
Multiple questions are set in advance according to the requirements of the position being recruited for. Through human-computer interaction, the speech exchanged between the interviewee and the machine for each question during the interview is obtained, the question speech issued by the machine is then separated from the interviewee's answer speech, and finally the interviewee's answer speech is filtered out.
As an optional embodiment, before the obtaining of the answer speech of the interviewee for multiple questions, the method further includes:
constructing a confidence determination model and a reaction speed determination model.
The construction process of the confidence determination model and the reaction speed determination model includes:
obtaining multiple sample speeches;
extracting multiple features from the multiple sample speeches;
according to the distribution of the multiple features, selecting from the multiple features first salient features that discriminate strongly between confidence grades and second salient features that discriminate strongly between reaction speed grades, where the first salient features include the speech rate feature, duration, and pause duration, and the second salient features include the speech rate feature and pause duration;
determining multiple confidence grades corresponding to the multiple first salient features and the feature range corresponding to each confidence grade, and determining multiple reaction speed grades corresponding to the multiple second salient features and the feature range corresponding to each reaction speed grade;
separately judging whether the feature ranges of the different confidence grades and the feature ranges of the different reaction speed grades satisfy extreme-value consistency;
if the feature ranges of the different confidence grades satisfy extreme-value consistency, constructing a confidence determination model based on the multiple first salient features, the multiple confidence grades, and the feature range corresponding to each confidence grade;
if the feature ranges of the different reaction speed grades satisfy extreme-value consistency, constructing a reaction speed determination model based on the multiple second salient features, the multiple reaction speed grades, and the feature range corresponding to each reaction speed grade.
A large number of experiments were carried out: the sample speech of each question answered by multiple interviewees was annotated with confidence, emotional stability, and reaction speed, and the four relevant features together with the corresponding annotations were then used as training data for a learning model. It was found that, judging from the data distribution of each relevant feature over different degrees of confidence, emotional stability, and reaction speed, the data of people with different confidence, emotional stability, and reaction speed differ clearly and regularly. The interviewee's confidence, emotional stability, and reaction speed can therefore be quantitatively evaluated from four relevant features of the interviewee: the volume feature, speech rate feature, duration, and pause duration.
Observation of how the four features (volume, speech rate, duration, and pause duration) are distributed over different confidence grades and different reaction speed grades then identified the feature types that discriminate most strongly between confidence grades and the feature types that discriminate most strongly between reaction speed grades. From the four relevant features and the confidence grades of the sample speech, a first box plot of each relevant feature over the different confidence grades and a second box plot of each relevant feature over the different reaction speed grades are generated. From the first box plots, several first salient features with strong discrimination between confidence grades were determined: the speech rate feature, duration, and pause duration. From the second box plots, several second salient features with strong discrimination between reaction speed grades were determined: the speech rate feature and pause duration. Finally, the confidence determination model was constructed based on the three first salient features (speech rate feature, duration, and pause duration), and the reaction speed determination model was constructed based on the two second salient features (speech rate feature and pause duration).
The first box plots are generated from the distribution of the values of the first salient features over the different confidence grades, and the second box plots are generated from the distribution of the values of the second salient features over the different reaction speed grades.
In the embodiments of this application, when the salient features are trained, the feature value range of a salient feature at each confidence/reaction speed grade is determined from the maximum and minimum values of that salient feature's box plot at that grade. After the feature value ranges of a salient feature at the different confidence/reaction speed grades have been determined, it is necessary to judge whether the ranges satisfy extreme-value consistency. For example, suppose a salient feature has the ranges [a1, b1], [a2, b2], [a3, b3], [a4, b4], [a5, b5] at five confidence/reaction speed grades, and the grades increase monotonically on this salient feature, i.e. the higher the confidence/reaction speed grade, the larger the maximum and minimum of the corresponding range. If the ranges satisfy a1<=a2<=a3<=a4<=a5 and b1<=b2<=b3<=b4<=b5, it can be determined that the feature ranges of the different confidence/reaction speed grades satisfy extreme-value consistency. The confidence/reaction speed determination model is then generated from the salient features, the multiple confidence/reaction speed grades, and the feature value range corresponding to each grade.
Optionally, if the feature ranges of the different confidence/reaction speed grades do not satisfy extreme-value consistency, the feature value range needs to be adjusted. For instance, for the salient feature in the example above with ranges [a1, b1], [a2, b2], [a3, b3], [a4, b4], [a5, b5] at five grades, where the grades increase monotonically on this feature, if a range does not satisfy a1<=a2<=a3<=a4<=a5 and b1<=b2<=b3<=b4<=b5, the offending bound is replaced with the bound of the next grade. For example, if a1>a2<=a3<=a4<=a5, the value of a1 is changed to the value of a2 so that a1<=a2<=a3<=a4<=a5 holds.
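As an illustration of the range construction and consistency check described above, the following Python sketch derives per-grade feature ranges from labeled samples and enforces extreme-value consistency. The grade labels, the use of plain minima and maxima as the box-plot extremes, and the function names are illustrative assumptions, not the patent's exact procedure.

from collections import defaultdict

GRADES = ["E", "D", "C", "B", "A"]  # assumed ordering from the lowest to the highest grade

def grade_ranges(samples):
    # samples: list of (feature_value, grade) pairs for one salient feature
    by_grade = defaultdict(list)
    for value, grade in samples:
        by_grade[grade].append(value)
    # box-plot extremes per grade: here simply the minimum and maximum of that grade's values
    return {g: (min(vals), max(vals)) for g, vals in by_grade.items()}

def enforce_consistency(ranges):
    # make both bounds non-decreasing from the lowest to the highest grade,
    # replacing an offending bound with the next grade's bound as in the text above
    grades = [g for g in GRADES if g in ranges]
    lo = [ranges[g][0] for g in grades]
    hi = [ranges[g][1] for g in grades]
    for i in range(len(grades) - 2, -1, -1):  # walk downwards so the next bound is already final
        lo[i] = min(lo[i], lo[i + 1])
        hi[i] = min(hi[i], hi[i + 1])
    return {g: (lo[i], hi[i]) for i, g in enumerate(grades)}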
It should be understood that more grades or fewer grades may also be pre-divided; this application does not specifically limit this.
S12: Slice the answer speech of each question to obtain multiple speech segments.
After the interviewee answers each question, the interviewee's answer speech for that question is split into multiple speech segments.
For example, the answer speech of each question of the interviewee is split into 28 speech segments.
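A minimal sketch of this slicing step, assuming each answer recording is cut into a fixed number of nearly equal-length segments (28, as in the example above); the patent does not prescribe a particular slicing rule, so the equal-length split and the use of NumPy are assumptions.

import numpy as np

def slice_answer(samples: np.ndarray, num_slices: int = 28) -> list:
    # cut one answer recording (a 1-D array of audio samples) into num_slices
    # nearly equal-length speech segments
    return np.array_split(samples, num_slices)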
S13: Calculate the volume feature, speech rate feature, duration, and pause duration of each question from the multiple speech segments.
The volume feature refers to how loud the interviewee's voice is when answering a question.
The speech rate feature refers to how fast the interviewee speaks when answering a question, i.e. the amount of speech content per unit time.
The duration refers to the length of time the interviewee speaks continuously when answering a question.
The pause duration refers to the length of time the interviewee does not speak when answering a question.
Each speech segment has four relevant features: the volume feature, speech rate feature, duration, and pause duration. Averaging the relevant features over all speech segments of the same question yields the mean of each relevant feature for that question. Specifically, the volume features of the multiple speech segments of each question are averaged to obtain the mean volume feature of the question; the speech rate features are averaged to obtain the mean speech rate feature; the durations are averaged to obtain the mean duration; and the pause durations are averaged to obtain the mean pause duration. That is, the volume feature, speech rate feature, duration, and pause duration obtained from the multiple speech segments all refer to these means.
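A small sketch of this averaging step, assuming the per-segment features have already been extracted elsewhere and are held in dictionaries with hypothetical keys 'volume', 'rate', 'duration', and 'pause'.

from statistics import mean

def question_features(segments):
    # segments: list of dicts, one per speech segment, holding the four per-segment features;
    # returns the per-question mean of each feature
    return {key: mean(s[key] for s in segments)
            for key in ("volume", "rate", "duration", "pause")}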
S14: Determine the emotional stability of the interviewee according to the volume feature of each question.
Generally speaking, loudness reflects a person's emotional stability: the larger the fluctuation of the voice, the more agitated the person; the smaller the fluctuation, the calmer the person. The interviewee's emotional stability can therefore be determined from the distribution of the interviewee's volume features.
Preferably, the determining of the emotional stability of the interviewee according to the volume feature of each question includes:
obtaining the maximum volume feature and the minimum volume feature among the volume features of the questions;
calculating the average volume feature of all questions;
calculating the volume feature amplitude between the maximum volume feature and the minimum volume feature;
determining the volume fluctuation of each question as the ratio of the absolute difference between the question's volume feature and the average volume feature of all questions to the volume feature amplitude;
determining the interviewee's emotional stability from the average of the volume fluctuations of all questions.
Correspondences between different volume fluctuation ranges and emotional stability are preset; once the interviewee's volume fluctuation is determined, the interviewee's emotional stability can be matched according to the correspondence.
For example, suppose the maximum volume feature over all questions is max, the minimum volume feature is min, the average volume feature of all questions is avg, and the volume feature of a question is ai. The volume fluctuation of that question is |ai-avg|/(max-min), and averaging the volume fluctuations of all questions gives the average volume fluctuation. If the average volume fluctuation is less than 20%, the interviewee's emotional stability is determined to be the first stability level, indicating that the interviewee's emotional stability is "high"; if the average volume fluctuation is between 20% and 30%, the emotional stability is determined to be the second stability level, indicating "medium"; and if the average volume fluctuation is greater than 30%, the emotional stability is determined to be the third stability level, indicating "low".
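The volume-fluctuation rule of this example can be sketched as follows; the |ai-avg|/(max-min) formula and the 20% and 30% thresholds are taken directly from the example, while the function name and the guard against a zero amplitude are illustrative additions.

def emotional_stability(volumes):
    # volumes: the per-question mean volume features a1..an
    vmax, vmin, avg = max(volumes), min(volumes), sum(volumes) / len(volumes)
    span = (vmax - vmin) or 1e-9  # guard against all questions having the same volume
    fluctuation = sum(abs(a - avg) for a in volumes) / len(volumes) / span
    if fluctuation < 0.20:
        return "high"    # first stability level
    if fluctuation <= 0.30:
        return "medium"  # second stability level
    return "low"         # third stability level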
S15: Determine the interviewee's confidence by evaluating the speech rate feature, pause duration, and duration with the pre-built confidence determination model.
The more confident a person is, the faster they speak, the shorter their pauses, and the longer they speak continuously; the less confident a person is, the slower they speak, the longer their pauses, and the shorter they speak continuously.
Preferably, the determining of the interviewee's confidence by evaluating the speech rate feature, pause duration, and duration with the pre-built confidence determination model includes:
evaluating the speech rate feature, pause duration, and duration of each question with the pre-built confidence determination model to determine the confidence grade of each question;
converting the confidence grades obtained for all questions into numerical values;
averaging the confidence grade values of all questions;
rounding the average up to obtain the interviewee's confidence determination result.
For example, suppose the pre-built confidence determination model is applied to the speech rate feature, pause duration, and duration of each question and the confidence grades of 5 questions are determined as: question 1 - grade A, question 2 - grade B, question 3 - grade B, question 4 - grade B, question 5 - grade A. Sorting the confidence grades of the 5 questions by question number gives ABBBA, and the center position of ABBBA is B, so the target confidence grade is B, which serves as the final determination of the interviewee's confidence during the interview.
To avoid being unable to determine the interviewee's confidence result when there is an even number of questions, the grades of all questions can be converted into scores, the scores averaged, and the average rounded up to obtain the personal grade. For example: question 1 - grade A - 5 points, question 2 - grade B - 4 points, question 3 - grade B - 4 points, question 4 - grade B - 4 points, question 5 - grade A - 5 points; the average is 4.4, and rounding up gives 5 points, so the interviewee's confidence determination result is grade A.
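A sketch of the grade-averaging rule from the example above: grades are mapped to scores, averaged across questions, rounded up, and mapped back to a grade. The A=5 and B=4 values follow the worked example; extending the mapping down to E=1 is an assumption.

import math

SCORE = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
GRADE = {v: k for k, v in SCORE.items()}

def overall_confidence(question_grades):
    # average the per-question scores and round up, as in the worked example
    avg = sum(SCORE[g] for g in question_grades) / len(question_grades)
    return GRADE[math.ceil(avg)]

# overall_confidence(["A", "B", "B", "B", "A"])  ->  average 4.4, rounded up to 5, grade "A"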
In an optional embodiment, the evaluating of the speech rate feature, pause duration, and duration of each question with the pre-built confidence determination model to determine the confidence grade of each question includes:
using the pre-built confidence determination model to identify, for the speech rate feature, the pause duration, and the duration respectively, the feature ranges of the confidence grades to which they belong;
taking the confidence grades corresponding to the feature ranges they belong to as the first confidence grade of the speech rate feature, the second confidence grade of the pause duration, and the third confidence grade of the duration;
judging whether there are multiple first confidence grades, second confidence grades, and third confidence grades;
if there are multiple first, second, and third confidence grades and they are all the same, determining the multiple identical grades as candidate confidence grades;
sorting the multiple candidate confidence grades from high to low to obtain a confidence grade ranking queue;
determining, based on the law of large numbers, the target candidate confidence grade of the confidence grade ranking queue as the confidence grade of the question.
In this optional embodiment, each confidence grade in any feature box plot (the speech rate feature box plot, pause duration box plot, or duration box plot) defines a feature range (the range being the maximum and minimum of that grade). Only when all features of a question (speech rate feature, pause duration, and duration) are determined to belong to the same grade is the confidence of the question determined to be that grade. For example, suppose a piece of speech has a speech rate feature of 3.4, a pause duration of 1.3, and a duration of 5.6, the grade B speech rate range in the speech rate box plot is [3.2, 4], the grade B pause duration range in the pause duration box plot is [0.8, 1.5], and the grade B duration range in the duration box plot is [5.3, 5.7]. Since the speech rate feature, pause duration, and duration all fall within the grade B ranges, the confidence grade of this question is initially determined to be grade B.
For example, if the first confidence grades are A and B, the second confidence grades are A and B, and the third confidence grades are A and B, i.e. there are multiple first, second, and third confidence grades and they are all the same, then there are multiple candidate confidence grades: A and B. The confidence grade ranking queue is AB, and based on the law of large numbers the target candidate confidence grade is determined to be B, which is taken as the confidence grade of the question.
As another example, if the first confidence grades are A, B, and C, the second confidence grades are A, B, and C, and the third confidence grades are A, B, and C, i.e. there are multiple first, second, and third confidence grades and they are all the same, then there are multiple candidate confidence grades: A, B, and C. The confidence grade ranking queue is ABC, and based on the law of large numbers the target candidate confidence grade is determined to be B, which is taken as the confidence grade of the question.
It should be understood that, since the feature range of each grade satisfies extreme-value consistency, a broken sequence with a gap in the middle such as ABD or BCE cannot occur.
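The per-question grade decision described in this optional embodiment can be sketched as follows. The concrete feature ranges are invented for illustration (only the grade B values follow the worked example), taking the middle element of the high-to-low candidate queue is one reading of the "law of large numbers" rule that reproduces the AB -> B and ABC -> B examples, and the fallback to grade C anticipates the preset empty grade discussed below.

RANGES = {
    # grade: {"rate": (lo, hi), "pause": (lo, hi), "duration": (lo, hi)} -- illustrative values;
    # only the grade B ranges follow the worked example above
    "A": {"rate": (4.0, 5.0), "pause": (0.2, 0.8), "duration": (5.7, 7.0)},
    "B": {"rate": (3.2, 4.0), "pause": (0.8, 1.5), "duration": (5.3, 5.7)},
    "C": {"rate": (2.5, 3.2), "pause": (1.5, 2.2), "duration": (4.8, 5.3)},
}
GRADE_ORDER = ["A", "B", "C"]  # high to low

def question_grade(rate, pause, duration):
    feats = {"rate": rate, "pause": pause, "duration": duration}
    # a grade is a candidate only if all three features fall inside its ranges
    candidates = [g for g in GRADE_ORDER
                  if all(RANGES[g][f][0] <= v <= RANGES[g][f][1] for f, v in feats.items())]
    if not candidates:
        return "C"  # empty grade, preset to the most general grade
    return candidates[len(candidates) // 2]  # middle of the high-to-low queue (AB -> B, ABC -> B)

# question_grade(3.4, 1.3, 5.6)  ->  "B", matching the worked example above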
In an optional embodiment, if there are multiple first, second, and third confidence grades and they are not all the same, the method further includes:
judging whether the multiple first, second, and third confidence grades have any grade in common;
if there are common grades, determining the common grades as the candidate confidence grades.
For example, suppose the first confidence grades are A, B, and D, the second confidence grades are A, B, and E, and the third confidence grades are A, B, and C, i.e. there are multiple first, second, and third confidence grades and they are not all the same, but the first, second, and third confidence grades share the grades A and B. Then both A and B are taken as candidate confidence grades, and the confidence grade of the question is finally determined to be grade B based on the law of large numbers.
In an optional embodiment, if there is only one first confidence grade, one second confidence grade, and one third confidence grade and they are not all the same, the method further includes:
determining that the confidence grade of the question is the empty grade.
The empty grade is the grade assigned when none of the grades match after all of them have been traversed.
Suppose that for a certain question, the pre-built confidence determination model determines that the confidence grade corresponding to the question's speech rate feature is grade A, the confidence grade corresponding to the question's pause duration is grade B, and the confidence grade corresponding to the question's duration is grade A. Since the question's speech rate feature, pause duration, and duration do not all belong to the same confidence grade, the confidence grade of the question is determined to belong neither to grade A nor to grade B; that is, there is no case in which the first, second, and third confidence grades are all the same, so the confidence grade of the question is determined to be the empty grade.
To make it possible to also compute on questions that fall into the empty grade, and because according to the law of large numbers an empty-grade question most likely belongs to the most general case, namely grade C, the empty grade can be preset to grade C.
S16: Determine the interviewee's reaction speed by evaluating the speech rate feature and pause duration with the pre-built reaction speed determination model.
The faster people react, the higher their overall speech rate feature and the shorter their pauses; the slower people react, the lower their overall speech rate feature and the longer their pauses.
The process of evaluating the speech rate feature and pause duration with the pre-built reaction speed determination model to determine the interviewee's reaction speed is the same as the process of evaluating the speech rate feature, pause duration, and duration with the pre-built confidence determination model to determine the interviewee's confidence; see S15 and its related description for details, which are not repeated here.
In an optional embodiment, S15 and S16 are executed in parallel.
In this optional embodiment, two threads can be started and executed concurrently: one thread evaluates the speech rate feature, pause duration, and duration with the pre-built confidence determination model, and the other thread evaluates the speech rate feature and pause duration with the pre-built reaction speed determination model. Since the two threads run in parallel, the efficiency of determining the interviewee's confidence and reaction speed is improved, the determination time is shortened, and the efficiency of interview screening is improved.
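A minimal sketch of the parallel execution described above, using one thread per model; judge_confidence and judge_reaction_speed are placeholders for the two pre-built determination models, not functions defined by the patent.

from concurrent.futures import ThreadPoolExecutor

def judge_confidence(rate, pause, duration):
    ...  # placeholder for the pre-built confidence determination model

def judge_reaction_speed(rate, pause):
    ...  # placeholder for the pre-built reaction speed determination model

def judge_in_parallel(rate, pause, duration):
    # one thread per model so the two determinations run concurrently
    with ThreadPoolExecutor(max_workers=2) as pool:
        confidence = pool.submit(judge_confidence, rate, pause, duration)
        reaction = pool.submit(judge_reaction_speed, rate, pause)
        return confidence.result(), reaction.result()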
S17: Output the interviewee's interview result according to the emotional stability, reaction speed, and confidence.
During the interview, after the interviewee's emotional stability, reaction speed, and confidence have been analyzed from the interviewee's answer speech to the questions, interviewees who meet the interview requirements can be screened out according to the focus of the position being interviewed for.
For example, a customer service position needs people who are emotionally stable and react quickly in order to cope with a constantly changing market; when screening interviewees for such a position, emotional stability and reaction speed are therefore the focus.
As another example, a marketing position needs people with high confidence in order to give customers a positive impression and promote cooperation between the two parties; when screening interviewees for such a position, confidence is therefore the focus.
In summary, the voice-based interviewee determination method described in this application obtains the answer speech of the interviewee for each question, slices the answer speech of each question into multiple speech segments, extracts the volume feature, speech rate feature, duration, and pause duration of each speech segment, determines the interviewee's emotional stability from the volume feature, then applies the pre-built confidence determination model and reaction speed determination model to the speech rate feature, duration, and pause duration to determine the interviewee's confidence and reaction speed, and outputs the interviewee's interview result according to the emotional stability, reaction speed, and confidence. By analyzing and mining the speech of the human-computer interaction during the interview in depth, this application determines multiple characteristics of the interviewee, such as emotional stability, reaction speed, and confidence. These characteristics allow the interviewee to be evaluated objectively and comprehensively, making the evaluation more precise and accurate and improving the efficiency and quality of interview assessment.
It can be seen from the above embodiments that this application can be applied in fields such as smart government affairs, thereby promoting the development of smart cities.
Embodiment Two
Fig. 2 is a structural diagram of the voice-based interviewee determination device provided in Embodiment Two of the present application.
In some embodiments, the voice-based interviewee determination device 20 may include multiple functional modules composed of computer-readable instruction segments. The computer-readable instructions of each program segment in the voice-based interviewee determination device 20 may be stored in the memory of the terminal and executed by the at least one processor to perform the voice-based interviewee determination function (described in detail with reference to Fig. 1).
In this embodiment, the voice-based interviewee determination device 20 can be divided into multiple functional modules according to the functions it performs. The functional modules may include: an acquisition module 201, a construction module 202, a slicing module 203, a calculation module 204, a first determination module 205, a second determination module 206, a third determination module 207, and an output module 208. A module referred to in this application is a series of computer-readable instruction segments that can be executed by at least one processor and can complete a fixed function, and that is stored in a memory. The functions of the modules are detailed in the subsequent embodiments.
The acquisition module 201 is configured to obtain answer speech of an interviewee for multiple questions.
Multiple questions are set in advance according to the requirements of the position being recruited for. Through human-computer interaction, the speech exchanged between the interviewee and the machine for each question during the interview is obtained, the question speech issued by the machine is then separated from the interviewee's answer speech, and finally the interviewee's answer speech is filtered out.
As an optional embodiment, before the obtaining of the answer speech of the interviewee for multiple questions, the device further includes:
a construction module 202 configured to construct a confidence determination model and a reaction speed determination model.
The construction process of the confidence determination model and the reaction speed determination model includes:
obtaining multiple sample speeches;
extracting multiple features from the multiple sample speeches;
according to the distribution of the multiple features, selecting from the multiple features first salient features that discriminate strongly between confidence grades and second salient features that discriminate strongly between reaction speed grades, where the first salient features include the speech rate feature, duration, and pause duration, and the second salient features include the speech rate feature and pause duration;
determining multiple confidence grades corresponding to the multiple first salient features and the feature range corresponding to each confidence grade, and determining multiple reaction speed grades corresponding to the multiple second salient features and the feature range corresponding to each reaction speed grade;
separately judging whether the feature ranges of the different confidence grades and the feature ranges of the different reaction speed grades satisfy extreme-value consistency;
if the feature ranges of the different confidence grades satisfy extreme-value consistency, constructing a confidence determination model based on the multiple first salient features, the multiple confidence grades, and the feature range corresponding to each confidence grade;
if the feature ranges of the different reaction speed grades satisfy extreme-value consistency, constructing a reaction speed determination model based on the multiple second salient features, the multiple reaction speed grades, and the feature range corresponding to each reaction speed grade.
A large number of experiments were carried out: the sample speech of each question answered by multiple interviewees was annotated with confidence, emotional stability, and reaction speed, and the four relevant features together with the corresponding annotations were then used as training data for a learning model. It was found that, judging from the data distribution of each relevant feature over different degrees of confidence, emotional stability, and reaction speed, the data of people with different confidence, emotional stability, and reaction speed differ clearly and regularly. The interviewee's confidence, emotional stability, and reaction speed can therefore be quantitatively evaluated from four relevant features of the interviewee: the volume feature, speech rate feature, duration, and pause duration.
Observation of how the four features (volume, speech rate, duration, and pause duration) are distributed over different confidence grades and different reaction speed grades then identified the feature types that discriminate most strongly between confidence grades and the feature types that discriminate most strongly between reaction speed grades. From the four relevant features and the confidence grades of the sample speech, a first box plot of each relevant feature over the different confidence grades and a second box plot of each relevant feature over the different reaction speed grades are generated. From the first box plots, several first salient features with strong discrimination between confidence grades were determined: the speech rate feature, duration, and pause duration. From the second box plots, several second salient features with strong discrimination between reaction speed grades were determined: the speech rate feature and pause duration. Finally, the confidence determination model was constructed based on the three first salient features (speech rate feature, duration, and pause duration), and the reaction speed determination model was constructed based on the two second salient features (speech rate feature and pause duration).
The first box plots are generated from the distribution of the values of the first salient features over the different confidence grades, and the second box plots are generated from the distribution of the values of the second salient features over the different reaction speed grades.
In the embodiments of this application, when the salient features are trained, the feature value range of a salient feature at each confidence/reaction speed grade is determined from the maximum and minimum values of that salient feature's box plot at that grade. After the feature value ranges of a salient feature at the different confidence/reaction speed grades have been determined, it is necessary to judge whether the ranges satisfy extreme-value consistency. For example, suppose a salient feature has the ranges [a1, b1], [a2, b2], [a3, b3], [a4, b4], [a5, b5] at five confidence/reaction speed grades, and the grades increase monotonically on this salient feature, i.e. the higher the confidence/reaction speed grade, the larger the maximum and minimum of the corresponding range. If the ranges satisfy a1<=a2<=a3<=a4<=a5 and b1<=b2<=b3<=b4<=b5, it can be determined that the feature ranges of the different confidence/reaction speed grades satisfy extreme-value consistency. The confidence/reaction speed determination model is then generated from the salient features, the multiple confidence/reaction speed grades, and the feature value range corresponding to each grade.
Optionally, if the feature ranges of the different confidence/reaction speed grades do not satisfy extreme-value consistency, the feature value range needs to be adjusted. For instance, for the salient feature in the example above with ranges [a1, b1], [a2, b2], [a3, b3], [a4, b4], [a5, b5] at five grades, where the grades increase monotonically on this feature, if a range does not satisfy a1<=a2<=a3<=a4<=a5 and b1<=b2<=b3<=b4<=b5, the offending bound is replaced with the bound of the next grade. For example, if a1>a2<=a3<=a4<=a5, the value of a1 is changed to the value of a2 so that a1<=a2<=a3<=a4<=a5 holds.
It should be understood that more grades or fewer grades may also be pre-divided; this application does not specifically limit this.
The slicing module 203 is configured to slice the answer speech of each question to obtain multiple speech segments.
After the interviewee answers each question, the interviewee's answer speech for that question is split into multiple speech segments.
For example, the answer speech of each question of the interviewee is split into 28 speech segments.
The calculation module 204 is configured to calculate the volume feature, speech rate feature, duration, and pause duration of each question from the multiple speech segments.
The volume feature refers to how loud the interviewee's voice is when answering a question.
The speech rate feature refers to how fast the interviewee speaks when answering a question, i.e. the amount of speech content per unit time.
The duration refers to the length of time the interviewee speaks continuously when answering a question.
The pause duration refers to the length of time the interviewee does not speak when answering a question.
Each speech segment has four relevant features: the volume feature, speech rate feature, duration, and pause duration. Averaging the relevant features over all speech segments of the same question yields the mean of each relevant feature for that question. Specifically, the volume features of the multiple speech segments of each question are averaged to obtain the mean volume feature of the question; the speech rate features are averaged to obtain the mean speech rate feature; the durations are averaged to obtain the mean duration; and the pause durations are averaged to obtain the mean pause duration. That is, the volume feature, speech rate feature, duration, and pause duration obtained from the multiple speech segments all refer to these means.
The first determination module 205 is configured to determine the emotional stability of the interviewee according to the volume feature of each question.
Generally speaking, loudness reflects a person's emotional stability: the larger the fluctuation of the voice, the more agitated the person; the smaller the fluctuation, the calmer the person. The interviewee's emotional stability can therefore be determined from the distribution of the interviewee's volume features.
Preferably, the first determination module 205 determining the emotional stability of the interviewee according to the volume feature of each question includes:
obtaining the maximum volume feature and the minimum volume feature among the volume features of the questions;
calculating the average volume feature of all questions;
calculating the volume feature amplitude between the maximum volume feature and the minimum volume feature;
determining the volume fluctuation of each question as the ratio of the absolute difference between the question's volume feature and the average volume feature of all questions to the volume feature amplitude;
determining the interviewee's emotional stability from the average of the volume fluctuations of all questions.
Correspondences between different volume fluctuation ranges and emotional stability are preset; once the interviewee's volume fluctuation is determined, the interviewee's emotional stability can be matched according to the correspondence.
For example, suppose the maximum volume feature over all questions is max, the minimum volume feature is min, the average volume feature of all questions is avg, and the volume feature of a question is ai. The volume fluctuation of that question is |ai-avg|/(max-min), and averaging the volume fluctuations of all questions gives the average volume fluctuation. If the average volume fluctuation is less than 20%, the interviewee's emotional stability is determined to be the first stability level, indicating that the interviewee's emotional stability is "high"; if the average volume fluctuation is between 20% and 30%, the emotional stability is determined to be the second stability level, indicating "medium"; and if the average volume fluctuation is greater than 30%, the emotional stability is determined to be the third stability level, indicating "low".
The second determination module 206 is configured to determine the interviewee's confidence by evaluating the speech rate feature, pause duration, and duration with a pre-built confidence determination model.
The more confident a person is, the faster they speak, the shorter their pauses, and the longer they speak continuously; the less confident a person is, the slower they speak, the longer their pauses, and the shorter they speak continuously.
Preferably, the second determination module 206 determining the interviewee's confidence by evaluating the speech rate feature, pause duration, and duration with the pre-built confidence determination model includes:
evaluating the speech rate feature, pause duration, and duration of each question with the pre-built confidence determination model to determine the confidence grade of each question;
converting the confidence grades obtained for all questions into numerical values;
averaging the confidence grade values of all questions;
rounding the average up to obtain the interviewee's confidence determination result.
For example, suppose the pre-built confidence determination model is applied to the speech rate feature, pause duration, and duration of each question and the confidence grades of 5 questions are determined as: question 1 - grade A, question 2 - grade B, question 3 - grade B, question 4 - grade B, question 5 - grade A. Sorting the confidence grades of the 5 questions by question number gives ABBBA, and the center position of ABBBA is B, so the target confidence grade is B, which serves as the final determination of the interviewee's confidence during the interview.
To avoid being unable to determine the interviewee's confidence result when there is an even number of questions, the grades of all questions can be converted into scores, the scores averaged, and the average rounded up to obtain the personal grade. For example: question 1 - grade A - 5 points, question 2 - grade B - 4 points, question 3 - grade B - 4 points, question 4 - grade B - 4 points, question 5 - grade A - 5 points; the average is 4.4, and rounding up gives 5 points, so the interviewee's confidence determination result is grade A.
In an optional embodiment, using the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question and determining the confidence grade of each question includes:
Using the pre-built confidence determination model to identify, for each of the speaking rate feature, the intermittent duration, and the duration, the feature range of the confidence grade to which it belongs;
Determining the confidence grades corresponding to the matching feature ranges as the first confidence grade of the speaking rate feature, the second confidence grade of the intermittent duration, and the third confidence grade of the duration;
Judging whether each of the first confidence grade, the second confidence grade, and the third confidence grade contains multiple grades;
If the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are all the same, determining the multiple identical grades as candidate confidence grades;
Sorting the multiple candidate confidence grades from high to low to obtain a confidence grade ranking queue;
Determining, based on the law of large numbers, the target candidate confidence grade of the confidence grade ranking queue as the confidence grade of the question.
In this optional embodiment, each confidence grade in any feature box plot (the speaking rate feature box plot, the intermittent duration box plot, or the duration box plot) defines a feature range (bounded by the maximum and minimum values of that grade). Only when all features of a question (speaking rate feature, intermittent duration, duration) are determined to belong to the same grade is the confidence of the question determined to be that grade. Exemplarily, suppose the speaking rate feature of an answer voice is 3.4, the intermittent duration is 1.3, and the duration is 5.6; in the speaking rate feature box plot the range of grade B is [3.2, 4], in the intermittent duration box plot the range of grade B is [0.8, 1.5], and in the duration box plot the range of grade B is [5.3, 5.7]. Since the speaking rate feature, the intermittent duration, and the duration all fall within the ranges of grade B, the confidence grade of this question is initially determined to be grade B.
Exemplarily, if the first confidence grade is grades A and B, the second confidence grade is grades A and B, and the third confidence grade is grades A and B, that is, each of the first, second, and third confidence grades contains multiple grades and the multiple grades are all the same, then there are multiple candidate confidence grades: grade A and grade B. The confidence grade ranking queue is AB, and based on the law of large numbers the target candidate confidence grade is determined to be grade B, which is taken as the confidence grade of the question.
As another example, if the first confidence grade is grades A, B, and C, the second confidence grade is grades A, B, and C, and the third confidence grade is grades A, B, and C, that is, each of the first, second, and third confidence grades contains multiple grades and the multiple grades are all the same, then there are multiple candidate confidence grades: grade A, grade B, and grade C. The confidence grade ranking queue is ABC, and based on the law of large numbers the target candidate confidence grade is determined to be grade B, which is taken as the confidence grade of the question.
It should be understood that, because the feature ranges of the grades satisfy extreme-value consistency, a queue with a gap in the middle, such as ABD or BCE, will not occur.
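The per-question grade determination described above could be sketched as follows. Only the grade-B ranges are taken from the example; the other ranges, the choice of the middle element of the queue as the "target candidate", and the fallback to grade C for a null grade (discussed further below) are assumptions for illustration, not a definitive implementation:

```python
# Illustrative grade ranges per feature, keyed by grade from high (A) to low.
# Only the grade-B ranges come from the example above; the rest are made up.
RANGES = {
    "speaking_rate": {"A": (4.0, 4.8), "B": (3.2, 4.0), "C": (2.4, 3.2)},
    "pause":         {"A": (0.2, 0.8), "B": (0.8, 1.5), "C": (1.5, 2.5)},
    "duration":      {"A": (5.7, 6.5), "B": (5.3, 5.7), "C": (4.5, 5.3)},
}
GRADE_ORDER = "ABC"  # high -> low

def grades_for(feature: str, value: float) -> list:
    """All grades whose feature range contains the value (boundaries may overlap)."""
    return [g for g, (lo, hi) in RANGES[feature].items() if lo <= value <= hi]

def question_grade(speaking_rate: float, pause: float, duration: float) -> str:
    first = grades_for("speaking_rate", speaking_rate)
    second = grades_for("pause", pause)
    third = grades_for("duration", duration)
    # Candidate grades are those shared by all three features.
    candidates = sorted(set(first) & set(second) & set(third),
                        key=GRADE_ORDER.index)
    if not candidates:
        return "C"  # null grade, preset to the most general grade C (assumption)
    # Target candidate taken from the high-to-low queue (middle element, an assumption).
    return candidates[len(candidates) // 2]

print(question_grade(3.4, 1.3, 5.6))  # all three features fall in grade B -> "B"
```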
In an optional embodiment, if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are not all the same, the device further includes:
A judging module, configured to judge whether the multiple grades of the first confidence grade, the second confidence grade, and the third confidence grade include identical grades;
The judging module is further configured to, if there are identical grades, determine the identical grades as candidate confidence grades.
Exemplarily, suppose the first confidence grade is grades A, B, and D, the second confidence grade is grades A, B, and E, and the third confidence grade is grades A, B, and C; that is, each of the first, second, and third confidence grades contains multiple grades and the multiple grades are not all the same, but the first, second, and third confidence grades share the grades A and B. The shared grades A and B are both taken as candidate confidence grades, and the confidence grade of the question is finally determined to be grade B based on the law of large numbers.
In an optional embodiment, if the first confidence grade, the second confidence grade, and the third confidence grade each contain a single grade and the grades are not all the same, the third determining module 207 is further configured to determine that the confidence grade of the question is a null grade.
Here, the null grade refers to the grade assigned when none of the grades matches after all of them have been traversed.
Suppose that, for a certain question, the pre-built confidence determination model determines that the confidence grade corresponding to the speaking rate feature of the question is grade A, that the confidence grade corresponding to the intermittent duration is grade B, and that the confidence grade corresponding to the duration is grade A. Since the speaking rate feature, the intermittent duration, and the duration of the question do not all belong to the same confidence grade, the confidence grade of the question is determined to belong neither to grade A nor to grade B; that is, there is no case in which the first, second, and third confidence grades are all the same, so the confidence grade of the question is determined to be the null grade.
To allow questions with a null grade to be included in the calculation, and because, according to the law of large numbers, a null-grade question is most likely to belong to the most general case, namely grade C, the null grade can be preset to grade C.
The third determining module 207 is further configured to use the pre-built confidence determination model to evaluate the speaking rate feature and the intermittent duration, and to determine the response speed of the interviewer.
The faster a person's response speed, the larger the overall speaking rate feature and the shorter the intermittent duration; the slower the response speed, the smaller the overall speaking rate feature and the longer the intermittent duration.
The process of using the pre-built confidence determination model to evaluate the speaking rate feature and the intermittent duration to determine the interviewer's response speed is the same as the process of using the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration to determine the interviewer's confidence; for details, refer to S15 and its related description, which is not elaborated again here.
In an optional embodiment, the second determining module 206 and the third determining module 207 are executed in parallel.
In this optional embodiment, two threads can be started to run concurrently: one thread uses the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration, and the other thread uses the pre-built response speed determination model to evaluate the speaking rate feature and the intermittent duration. Since the two threads run in parallel, the determination of the interviewer's confidence and response speed becomes more efficient, the determination time is shortened, and the efficiency of interview screening is improved.
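A minimal sketch of running the two determinations on parallel threads; the model objects and their evaluate methods are placeholders assumed for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def determine_in_parallel(confidence_model, speed_model,
                          speaking_rate, pause, duration):
    """Run the confidence and response-speed determinations concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Thread 1: confidence from speaking rate feature, intermittent duration, duration.
        conf_future = pool.submit(confidence_model.evaluate,
                                  speaking_rate, pause, duration)
        # Thread 2: response speed from speaking rate feature and intermittent duration.
        speed_future = pool.submit(speed_model.evaluate, speaking_rate, pause)
        return conf_future.result(), speed_future.result()
```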
The output module 208 is configured to output the interview result of the interviewer according to the emotional stability, the response speed, and the confidence.
During the interview process, after the interviewer's emotional stability, response speed, and confidence have been analyzed from the interviewer's spoken answers to the questions, interviewers who meet the interview requirements can be screened out according to the focus of the position being filled.
For example, a customer service position requires people who are emotionally stable and quick to respond in order to cope with an ever-changing market, so when screening interviewers the focus is on emotional stability and response speed.
As another example, a marketing position requires people with a high degree of confidence in order to leave a positive impression on customers and promote cooperation between the two parties, so when screening interviewers the focus is on confidence.
In summary, the voice-based interviewer determination device described in this application obtains the answer voice of each of the interviewer's questions, slices the answer voice of each question into multiple voice segments, and extracts the volume feature, speaking rate feature, duration, and intermittent duration of each voice segment. The interviewer's emotional stability is determined from the volume features, and the pre-built confidence determination model and response speed determination model are then used to evaluate the speaking rate feature, the duration, and the intermittent duration to determine the interviewer's confidence and response speed. The interview result of the interviewer is output according to the emotional stability, response speed, and confidence. Through in-depth analysis and mining of the human-computer interaction voice of the interview process, this application determines multiple characteristics of the interviewer, such as emotional stability, response speed, and confidence; these characteristics allow the interviewer to be evaluated objectively and comprehensively, the evaluation result is more precise and accurate, and the efficiency and quality of interview determination are improved.
It can be seen from the above embodiments that this application can be applied in fields such as smart government affairs, thereby promoting the development of smart cities.
Embodiment Three
Referring to FIG. 3, which is a schematic structural diagram of the terminal provided in Embodiment Three of this application. In a preferred embodiment of this application, the terminal 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
Those skilled in the art should understand that the structure of the terminal shown in FIG. 3 does not constitute a limitation of the embodiments of this application; it may be a bus-type structure or a star structure, and the terminal 3 may also include more or fewer hardware or software components than shown, or a different arrangement of components.
In some embodiments, the terminal 3 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The terminal 3 may also include a client device, which includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote control, a touch panel, a voice-control device, or the like, for example, a personal computer, a tablet computer, a smart phone, or a digital camera.
It should be noted that the terminal 3 is only an example; other existing or future electronic products that can be adapted to this application should also be included in the protection scope of this application and are incorporated herein by reference.
In some embodiments, the memory 31 is used to store computer-readable instructions and various data, such as the devices installed in the terminal 3, and to achieve high-speed, automatic access to programs or data during the operation of the terminal 3. The memory 31 includes volatile and non-volatile memory, for example, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data. The computer-readable storage medium may be non-volatile or volatile.
In some embodiments, the at least one processor 32 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The at least one processor 32 is the control unit of the terminal 3; it connects the components of the entire terminal 3 through various interfaces and lines, and executes various functions of the terminal 3 and processes data by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.
In some embodiments, the at least one communication bus 33 is configured to implement connection and communication between the memory 31 and the at least one processor 32, among others.
Although not shown, the terminal 3 may also include a power supply (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to the at least one processor 32 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include any component such as one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, or a power status indicator. The terminal 3 may also include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not repeated here.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The above integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, or the like) or a processor to execute parts of the methods described in the embodiments of this application.
In a further embodiment, with reference to FIG. 2, the at least one processor 32 may execute the operating device of the terminal 3 as well as the installed applications, computer-readable instructions, and the like, for example, the modules described above.
The memory 31 stores computer-readable instructions, and the at least one processor 32 may call the computer-readable instructions stored in the memory 31 to perform related functions. For example, the modules described in FIG. 2 are computer-readable instructions stored in the memory 31 and executed by the at least one processor 32, thereby realizing the functions of the modules.
In an embodiment of this application, the memory 31 stores multiple instructions, and the multiple instructions are executed by the at least one processor 32 to implement all or part of the steps of the method described in this application.
Specifically, for the implementation of the above instructions by the at least one processor 32, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which is not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division of the modules is only a logical functional division, and there may be other division methods in actual implementation.
The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional modules in the embodiments of this application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that this application is not limited to the details of the above exemplary embodiments, and that this application can be implemented in other specific forms without departing from the spirit or essential characteristics of this application. Therefore, from whatever point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and it is therefore intended that all changes falling within the meaning and scope of equivalents of the claims be embraced in this application. Any reference sign in the claims should not be construed as limiting the claim concerned. In addition, it is obvious that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in the device claims may also be implemented by one unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application and not to limit them. Although this application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements can be made to the technical solutions of this application without departing from the spirit and scope of the technical solutions of this application.

Claims (22)

  1. A voice-based interviewer determination method, wherein the method comprises:
    obtaining the answer voices of an interviewer for multiple questions;
    slicing the answer voice of each question to obtain multiple voice segments;
    calculating, from the multiple voice segments, the volume feature, speaking rate feature, duration, and intermittent duration of each question;
    determining the emotional stability of the interviewer according to the volume feature of each question;
    using a pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration, and determining the confidence of the interviewer;
    using the pre-built confidence determination model to evaluate the speaking rate feature and the intermittent duration, and determining the response speed of the interviewer;
    outputting the interview result of the interviewer according to the emotional stability, the response speed, and the confidence.
  2. The method according to claim 1, wherein, before the obtaining of the answer voices of the interviewer for multiple questions, the method further comprises:
    constructing a confidence determination model and a response speed determination model;
    wherein the process of constructing the confidence determination model and the response speed determination model comprises:
    obtaining multiple sample voices;
    extracting multiple features from the multiple sample voices;
    screening out, from the multiple features according to their distributions, first salient features with high discrimination of confidence and second salient features with high discrimination of response speed, wherein the first salient features comprise: the speaking rate feature, the duration, and the intermittent duration, and the second salient features comprise: the speaking rate feature and the intermittent duration;
    determining multiple confidence grades corresponding to the multiple first salient features and the feature range corresponding to each confidence grade, and determining multiple response speed grades corresponding to the multiple second salient features and the feature range corresponding to each response speed grade;
    respectively judging whether the feature ranges of different confidence grades and the feature ranges of different response speed grades satisfy extreme-value consistency;
    if the feature ranges of the different confidence grades satisfy extreme-value consistency, constructing the confidence determination model based on the multiple first salient features, the multiple confidence grades, and the feature range corresponding to each confidence grade;
    if the feature ranges of the different response speed grades satisfy extreme-value consistency, constructing the response speed determination model based on the multiple second salient features, the multiple response speed grades, and the feature range corresponding to each response speed grade.
  3. The method according to claim 1, wherein the determining of the emotional stability of the interviewer according to the volume feature of each question comprises:
    obtaining the maximum volume feature and the minimum volume feature among the volume features of the questions;
    calculating the average volume feature of all the questions;
    calculating the volume feature amplitude between the maximum volume feature and the minimum volume feature;
    determining the volume fluctuation magnitude of each question according to the ratio of the absolute value of the difference between the volume feature of the question and the average volume feature of all the questions to the volume feature amplitude;
    determining the emotional stability of the interviewer according to the average of the volume fluctuation magnitudes of all the questions.
  4. The method according to any one of claims 1 to 3, wherein the using of the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration and the determining of the confidence of the interviewer comprise:
    using the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question, and determining the confidence grade of each question;
    converting the confidence grades obtained for all the questions into numerical values;
    averaging the confidence grade values of all the questions;
    rounding the average up to obtain the confidence determination result of the interviewer.
  5. The method according to claim 4, wherein the using of the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question and the determining of the confidence grade of each question comprise:
    using the pre-built confidence determination model to identify, for each of the speaking rate feature, the intermittent duration, and the duration, the feature range of the confidence grade to which it belongs;
    determining the confidence grades corresponding to the matching feature ranges as the first confidence grade of the speaking rate feature, the second confidence grade of the intermittent duration, and the third confidence grade of the duration;
    judging whether each of the first confidence grade, the second confidence grade, and the third confidence grade contains multiple grades;
    if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are all the same, determining the multiple identical grades as candidate confidence grades;
    sorting the multiple candidate confidence grades from high to low to obtain a confidence grade ranking queue;
    determining, based on the law of large numbers, the target candidate confidence grade of the confidence grade ranking queue as the confidence grade of the question.
  6. The method according to claim 5, wherein, if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are not all the same, the method further comprises:
    judging whether the multiple grades of the first confidence grade, the second confidence grade, and the third confidence grade include identical grades;
    if there are identical grades, determining the identical grades as candidate confidence grades.
  7. The method according to claim 5, wherein, if the first confidence grade, the second confidence grade, and the third confidence grade each contain a single grade and the grades are not all the same, the method further comprises:
    determining that the confidence grade of the question is a null grade.
  8. A voice-based interviewer determination device, wherein the device comprises:
    an acquisition module, configured to obtain the answer voices of an interviewer for multiple questions;
    a slicing module, configured to slice the answer voice of each question to obtain multiple voice segments;
    a calculation module, configured to calculate, from the multiple voice segments, the volume feature, speaking rate feature, duration, and intermittent duration of each question;
    a first determining module, configured to determine the emotional stability of the interviewer according to the volume feature of each question;
    a second determining module, configured to use a pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration, and determine the confidence of the interviewer;
    a third determining module, configured to use the pre-built confidence determination model to evaluate the speaking rate feature and the intermittent duration, and determine the response speed of the interviewer;
    an output module, configured to output the interview result of the interviewer according to the emotional stability, the response speed, and the confidence.
  9. A terminal, wherein the terminal comprises a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:
    obtaining the answer voices of an interviewer for multiple questions;
    slicing the answer voice of each question to obtain multiple voice segments;
    calculating, from the multiple voice segments, the volume feature, speaking rate feature, duration, and intermittent duration of each question;
    determining the emotional stability of the interviewer according to the volume feature of each question;
    using a pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration, and determining the confidence of the interviewer;
    using the pre-built confidence determination model to evaluate the speaking rate feature and the intermittent duration, and determining the response speed of the interviewer;
    outputting the interview result of the interviewer according to the emotional stability, the response speed, and the confidence.
  10. The terminal according to claim 9, wherein, before the obtaining of the answer voices of the interviewer for multiple questions, the processor further implements the following steps when executing the computer-readable instructions:
    constructing a confidence determination model and a response speed determination model;
    wherein the process of constructing the confidence determination model and the response speed determination model comprises:
    obtaining multiple sample voices;
    extracting multiple features from the multiple sample voices;
    screening out, from the multiple features according to their distributions, first salient features with high discrimination of confidence and second salient features with high discrimination of response speed, wherein the first salient features comprise: the speaking rate feature, the duration, and the intermittent duration, and the second salient features comprise: the speaking rate feature and the intermittent duration;
    determining multiple confidence grades corresponding to the multiple first salient features and the feature range corresponding to each confidence grade, and determining multiple response speed grades corresponding to the multiple second salient features and the feature range corresponding to each response speed grade;
    respectively judging whether the feature ranges of different confidence grades and the feature ranges of different response speed grades satisfy extreme-value consistency;
    if the feature ranges of the different confidence grades satisfy extreme-value consistency, constructing the confidence determination model based on the multiple first salient features, the multiple confidence grades, and the feature range corresponding to each confidence grade;
    if the feature ranges of the different response speed grades satisfy extreme-value consistency, constructing the response speed determination model based on the multiple second salient features, the multiple response speed grades, and the feature range corresponding to each response speed grade.
  11. The terminal according to claim 9, wherein, when the processor executes the computer-readable instructions to implement the determining of the emotional stability of the interviewer according to the volume feature of each question, the implementation specifically comprises:
    obtaining the maximum volume feature and the minimum volume feature among the volume features of the questions;
    calculating the average volume feature of all the questions;
    calculating the volume feature amplitude between the maximum volume feature and the minimum volume feature;
    determining the volume fluctuation magnitude of each question according to the ratio of the absolute value of the difference between the volume feature of the question and the average volume feature of all the questions to the volume feature amplitude;
    determining the emotional stability of the interviewer according to the average of the volume fluctuation magnitudes of all the questions.
  12. The terminal according to any one of claims 9 to 11, wherein, when the processor executes the computer-readable instructions to implement the using of the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration and the determining of the confidence of the interviewer, the implementation specifically comprises:
    using the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question, and determining the confidence grade of each question;
    converting the confidence grades obtained for all the questions into numerical values;
    averaging the confidence grade values of all the questions;
    rounding the average up to obtain the confidence determination result of the interviewer.
  13. The terminal according to claim 12, wherein, when the processor executes the computer-readable instructions to implement the using of the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question and the determining of the confidence grade of each question, the implementation specifically comprises:
    using the pre-built confidence determination model to identify, for each of the speaking rate feature, the intermittent duration, and the duration, the feature range of the confidence grade to which it belongs;
    determining the confidence grades corresponding to the matching feature ranges as the first confidence grade of the speaking rate feature, the second confidence grade of the intermittent duration, and the third confidence grade of the duration;
    judging whether each of the first confidence grade, the second confidence grade, and the third confidence grade contains multiple grades;
    if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are all the same, determining the multiple identical grades as candidate confidence grades;
    sorting the multiple candidate confidence grades from high to low to obtain a confidence grade ranking queue;
    determining, based on the law of large numbers, the target candidate confidence grade of the confidence grade ranking queue as the confidence grade of the question.
  14. The terminal according to claim 13, wherein, if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are not all the same, the processor further implements the following steps when executing the computer-readable instructions:
    judging whether the multiple grades of the first confidence grade, the second confidence grade, and the third confidence grade include identical grades;
    if there are identical grades, determining the identical grades as candidate confidence grades.
  15. The terminal according to claim 13, wherein, if the first confidence grade, the second confidence grade, and the third confidence grade each contain a single grade and the grades are not all the same, the processor further implements the following step when executing the computer-readable instructions:
    determining that the confidence grade of the question is a null grade.
  16. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    obtaining the answer voices of an interviewer for multiple questions;
    slicing the answer voice of each question to obtain multiple voice segments;
    calculating, from the multiple voice segments, the volume feature, speaking rate feature, duration, and intermittent duration of each question;
    determining the emotional stability of the interviewer according to the volume feature of each question;
    using a pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration, and determining the confidence of the interviewer;
    using the pre-built confidence determination model to evaluate the speaking rate feature and the intermittent duration, and determining the response speed of the interviewer;
    outputting the interview result of the interviewer according to the emotional stability, the response speed, and the confidence.
  17. The computer-readable storage medium according to claim 16, wherein, before the obtaining of the answer voices of the interviewer for multiple questions, the computer-readable instructions, when executed by the processor, further implement the following steps:
    constructing a confidence determination model and a response speed determination model;
    wherein the process of constructing the confidence determination model and the response speed determination model comprises:
    obtaining multiple sample voices;
    extracting multiple features from the multiple sample voices;
    screening out, from the multiple features according to their distributions, first salient features with high discrimination of confidence and second salient features with high discrimination of response speed, wherein the first salient features comprise: the speaking rate feature, the duration, and the intermittent duration, and the second salient features comprise: the speaking rate feature and the intermittent duration;
    determining multiple confidence grades corresponding to the multiple first salient features and the feature range corresponding to each confidence grade, and determining multiple response speed grades corresponding to the multiple second salient features and the feature range corresponding to each response speed grade;
    respectively judging whether the feature ranges of different confidence grades and the feature ranges of different response speed grades satisfy extreme-value consistency;
    if the feature ranges of the different confidence grades satisfy extreme-value consistency, constructing the confidence determination model based on the multiple first salient features, the multiple confidence grades, and the feature range corresponding to each confidence grade;
    if the feature ranges of the different response speed grades satisfy extreme-value consistency, constructing the response speed determination model based on the multiple second salient features, the multiple response speed grades, and the feature range corresponding to each response speed grade.
  18. The computer-readable storage medium according to claim 16, wherein, when the computer-readable instructions are executed by the processor to implement the determining of the emotional stability of the interviewer according to the volume feature of each question, the implementation specifically comprises:
    obtaining the maximum volume feature and the minimum volume feature among the volume features of the questions;
    calculating the average volume feature of all the questions;
    calculating the volume feature amplitude between the maximum volume feature and the minimum volume feature;
    determining the volume fluctuation magnitude of each question according to the ratio of the absolute value of the difference between the volume feature of the question and the average volume feature of all the questions to the volume feature amplitude;
    determining the emotional stability of the interviewer according to the average of the volume fluctuation magnitudes of all the questions.
  19. The computer-readable storage medium according to any one of claims 16 to 18, wherein, when the computer-readable instructions are executed by the processor to implement the using of the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration and the determining of the confidence of the interviewer, the implementation specifically comprises:
    using the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question, and determining the confidence grade of each question;
    converting the confidence grades obtained for all the questions into numerical values;
    averaging the confidence grade values of all the questions;
    rounding the average up to obtain the confidence determination result of the interviewer.
  20. The computer-readable storage medium according to claim 19, wherein, when the computer-readable instructions are executed by the processor to implement the using of the pre-built confidence determination model to evaluate the speaking rate feature, the intermittent duration, and the duration of each question and the determining of the confidence grade of each question, the implementation specifically comprises:
    using the pre-built confidence determination model to identify, for each of the speaking rate feature, the intermittent duration, and the duration, the feature range of the confidence grade to which it belongs;
    determining the confidence grades corresponding to the matching feature ranges as the first confidence grade of the speaking rate feature, the second confidence grade of the intermittent duration, and the third confidence grade of the duration;
    judging whether each of the first confidence grade, the second confidence grade, and the third confidence grade contains multiple grades;
    if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are all the same, determining the multiple identical grades as candidate confidence grades;
    sorting the multiple candidate confidence grades from high to low to obtain a confidence grade ranking queue;
    determining, based on the law of large numbers, the target candidate confidence grade of the confidence grade ranking queue as the confidence grade of the question.
  21. The computer-readable storage medium according to claim 20, wherein, if the first confidence grade, the second confidence grade, and the third confidence grade each contain multiple grades and the grades are not all the same, the computer-readable instructions, when executed by the processor, further implement the following steps:
    judging whether the multiple grades of the first confidence grade, the second confidence grade, and the third confidence grade include identical grades;
    if there are identical grades, determining the identical grades as candidate confidence grades.
  22. The computer-readable storage medium according to claim 20, wherein, if the first confidence grade, the second confidence grade, and the third confidence grade each contain a single grade and the grades are not all the same, the computer-readable instructions, when executed by the processor, further implement the following step:
    determining that the confidence grade of the question is a null grade.
PCT/CN2020/098891 2019-09-23 2020-06-29 Voice-based interviewee determination method and device, terminal, and storage medium WO2021057146A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910900813.9A CN110827796B (en) 2019-09-23 2019-09-23 Interviewer judging method and device based on voice, terminal and storage medium
CN201910900813.9 2019-09-23

Publications (1)

Publication Number Publication Date
WO2021057146A1 true WO2021057146A1 (en) 2021-04-01

Family

ID=69548146

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098891 WO2021057146A1 (en) 2019-09-23 2020-06-29 Voice-based interviewee determination method and device, terminal, and storage medium

Country Status (2)

Country Link
CN (1) CN110827796B (en)
WO (1) WO2021057146A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827796B (en) * 2019-09-23 2024-05-24 平安科技(深圳)有限公司 Interviewer judging method and device based on voice, terminal and storage medium
CN112786054A (en) * 2021-02-25 2021-05-11 深圳壹账通智能科技有限公司 Intelligent interview evaluation method, device and equipment based on voice and storage medium
US11824819B2 (en) 2022-01-26 2023-11-21 International Business Machines Corporation Assertiveness module for developing mental model


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818798B (en) * 2017-10-20 2020-08-18 百度在线网络技术(北京)有限公司 Customer service quality evaluation method, device, equipment and storage medium
CN109637520B (en) * 2018-10-16 2023-08-22 平安科技(深圳)有限公司 Sensitive content identification method, device, terminal and medium based on voice analysis
CN110135692A (en) * 2019-04-12 2019-08-16 平安普惠企业管理有限公司 Intelligence grading control method, device, computer equipment and storage medium
CN110135800A (en) * 2019-04-23 2019-08-16 南京葡萄诚信息科技有限公司 A kind of artificial intelligence video interview method and system

Also Published As

Publication number Publication date
CN110827796A (en) 2020-02-21
CN110827796B (en) 2024-05-24

Similar Documents

Publication Publication Date Title
WO2021057146A1 (en) Voice-based interviewee determination method and device, terminal, and storage medium
TW202018533A (en) Data processing model construction method and device, server and client
US11170770B2 (en) Dynamic adjustment of response thresholds in a dialogue system
CN111179935B (en) Voice quality inspection method and device
CN112733042A (en) Recommendation information generation method, related device and computer program product
Bujacz et al. Psychosocial working conditions among high-skilled workers: A latent transition analysis.
CN113190372B (en) Multi-source data fault processing method and device, electronic equipment and storage medium
CN112885376A (en) Method and device for improving voice call quality inspection effect
CN114663223A (en) Credit risk assessment method, device and related equipment based on artificial intelligence
CN113256108A (en) Human resource allocation method, device, electronic equipment and storage medium
CN114242109A (en) Intelligent outbound method and device based on emotion recognition, electronic equipment and medium
US20180114173A1 (en) Cognitive service request dispatching
WO2022178933A1 (en) Context-based voice sentiment detection method and apparatus, device and storage medium
CN113158690A (en) Testing method and device for conversation robot
CN115422094B (en) Algorithm automatic testing method, central dispatching equipment and readable storage medium
CN116523188A (en) Evaluation method and device for innovation capability of enterprise
US11947894B2 (en) Contextual real-time content highlighting on shared screens
CN114925674A (en) File compliance checking method and device, electronic equipment and storage medium
US11475068B2 (en) Automatic question answering method and apparatus, storage medium and server
WO2020007349A1 (en) Intelligent knockout strategy screening method and knockout strategy screening method based on multiple knockout types
CN111522943A (en) Automatic test method, device, equipment and storage medium for logic node
CN110297544A (en) Input information response's method and device, computer system and readable storage medium storing program for executing
CN110598527A (en) Machine learning-based claims insurance policy number identification method and related equipment
CN111881251B (en) AI telephone sales testing method and device, electronic equipment and storage medium
CN113674765A (en) Voice customer service quality inspection method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20869434

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20869434

Country of ref document: EP

Kind code of ref document: A1