CN112435512A - Voice behavior assessment and evaluation method for rail transit simulation training - Google Patents
- Publication number
- CN112435512A (application CN202011261988.9A)
- Authority
- CN
- China
- Prior art keywords
- voice
- standard
- evaluation
- communication information
- keywords
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/04—Electrically-operated educational appliances with audible presentation of the material to be studied
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06398—Performance of employee with respect to a job function
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
- G06Q50/2057—Career enhancement or continuing education service
Abstract
The invention discloses a voice behavior assessment and evaluation method for rail transit simulation training, relating to the technical field of rail transit simulation training. The technical scheme comprises: converting voice communication information into text information; extracting all keywords in each sentence and classifying them; automatically recognizing the voice communication information through a deep neural network to obtain standard voice data, and matching the standard voice data against a voice evaluation database to obtain uniquely matched standard evaluation data; comparing and analyzing the standard voice data against the standard evaluation data to judge whether the semantics and trigger timing of the voice communication information are correct; and performing a comprehensive calculation with a comprehensive membership function to obtain an autonomous evaluation score for the voice communication information. The invention can reliably, reasonably, accurately and quickly perform autonomous, objective voice assessment of trainee voice communication during training, and the trainee can know directly from the evaluation score whether a voice instruction is correct and standard.
Description
Technical Field
The invention relates to the technical field of rail transit simulation training, and in particular to a voice behavior assessment and evaluation method for rail transit simulation training.
Background
Rail transit has become one of the principal modes of transportation in China and an artery of the country's economic development, and it is widely favored by the public for its unique advantages of safety, large capacity, speed, punctuality and comfort.
With the development of rail transit, the demand for trained personnel in a growing number of posts keeps increasing, and current training modes have improved on the traditional ones. In the traditional training process, whether a trainee's spoken instructions are correct and standard is mainly evaluated manually; the result is overly subjective and arbitrary, is easily influenced by the evaluator's professional knowledge, and the trainee cannot learn in time whether a voice instruction was correct and standard. At present, some simulation systems perform speech recognition on the required voice instruction and extract the main keywords of the voice information as the principal factors for recognition and evaluation; however, this approach is easily affected by environmental noise and relies on direct keyword comparison, so the precision of the evaluation results varies widely, the error is large, and high-precision evaluation cannot be achieved. In addition, pushing real-time voice to an instructor for evaluation occupies a large amount of bandwidth, increases the instructor's workload, and offers poor flexibility.
Therefore, how to research and design a voice behavior assessment and evaluation method for rail transit simulation training is a problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to solve the problems that existing full-profession rail transit training simulation systems cannot objectively and accurately evaluate trainee voice during training, and that trainees cannot learn whether their voice instructions are correct and standard. To this end it provides a voice behavior assessment and evaluation method for rail transit simulation training.
The technical purpose of the invention is realized by the following technical scheme: a voice behavior assessment and evaluation method for rail transit simulation training comprises the following steps:
S101: recognizing the trainee's voice communication information during training through an intelligent voice recognition technology and converting it into text information;
S102: judging and extracting all keywords in the whole sentence through a keyword recognition technology, and classifying the extracted keywords as semantics-related, device-related, or professional-term-related;
S103: automatically recognizing the voice communication information through a deep neural network to obtain standard voice data, and matching the standard voice data against a voice evaluation database to obtain uniquely matched standard evaluation data;
S104: comparing and analyzing the standard voice data against the standard evaluation data to judge whether the semantics and trigger timing of the voice communication information are correct;
S105: if the semantics and trigger timing are judged correct, comparing the keywords extracted from the voice communication information against the standard keywords in the standard evaluation data, and obtaining the semantics-, device- and professional-term-related keyword membership functions through a fuzzy control function;
S106: establishing a comprehensive membership function from the weighted semantics-related, device-related and professional-term-related keyword membership functions, and performing a comprehensive calculation with it to obtain the autonomous evaluation score of the voice communication information.
Further, the recognition and conversion of voice communication information is specifically as follows: the TTS engine, SAPI interface and Win32 API interface in the Windows Speech SDK development kit are used to build, under the MFC framework, an application unit that converts speech into text; voice communication information fed into this application unit is automatically converted into text information.
Further, the keyword recognition and extraction specifically comprises:
comparing the voice communication information with the keyword database one by one to respectively obtain semantic related keywords, equipment related keywords and professional term related keywords contained in the voice communication information;
the keyword database is constructed by reading the semantics-, device- and professional-term-related keywords in the voice evaluation database; its structure is specifically as follows:
the device-related keywords are used to assess the accuracy with which trainees describe device information, the semantics-related keywords are used to judge the semantics of the trainees' descriptions, and the professional-term keywords are used to assess the professionalism with which trainees describe professional information.
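As an illustrative sketch of this keyword classification step (the patent does not reproduce the actual database contents, so the keyword lists below are invented placeholders):

```python
# Sketch of keyword extraction and classification (step S102). The three
# keyword lists are illustrative assumptions, not the patent's database.
SEMANTIC_KEYWORDS = {"allowed speed", "move", "arrive", "stop"}
DEVICE_KEYWORDS = {"locomotive", "signal machine"}
TERM_KEYWORDS = {"outbound", "shunting"}

def classify_keywords(text):
    """Return the semantics-, device- and term-related keywords found in text."""
    found = lambda vocab: sorted(k for k in vocab if k in text)
    return {
        "semantic": found(SEMANTIC_KEYWORDS),
        "device": found(DEVICE_KEYWORDS),
        "term": found(TERM_KEYWORDS),
    }

hits = classify_keywords(
    "locomotive allowed speed 5 km/h, move to the outbound signal machine")
print(hits["device"])  # which device-related keywords matched
```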
Further, the standard voice data recognition specifically includes:
the voice communication information is converted into numbers by GBK coding, and each number becomes one element x_i of the input matrix x_t, thereby obtaining the input matrix x_t;
the current hidden-layer value is calculated from the voice communication information input and the hidden value at the previous moment, specifically:
s_t = f(U·x_t + W·s_{t−1})
where the matrix U is the weight coefficient matrix of the input matrix x_t, with dimension n × m, whose values are switched according to the type of work; s is the hidden-layer value vector, with dimension n; W is the weight coefficient matrix of the hidden-layer values, with dimension n × n;
calculating through an output equation to obtain an output function matrix, specifically:
O_t = g(V·s_t)·ξ
where O_t is the output function matrix; g is the algorithm; V is the hidden-layer weight coefficient matrix; s_t is the current hidden-layer value; ξ is the trigger-timing judgment coefficient: when the trigger timing is correct, ξ is assigned 1; when the trigger timing is wrong, ξ is assigned 0;
the matrix O_t represents the GBK codes of the standard voice signal; converting these GBK codes into Chinese characters yields the standard voice data of the voice communication information.
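A minimal sketch of the recurrent update described above, under the assumption of a simple-RNN cell with tanh activations and small illustrative dimensions (the patent does not fix n, m, or the functions f and g):

```python
import numpy as np

# Illustrative simple-RNN step: s_t = f(U x_t + W s_{t-1}); O_t = g(V s_t) * xi.
# Dimensions and random weights are placeholders, not the patent's trained values.
n, m = 4, 3                      # hidden size n, input size m
rng = np.random.default_rng(0)
U = rng.standard_normal((n, m))  # input weight matrix (n x m)
W = rng.standard_normal((n, n))  # hidden-to-hidden weight matrix (n x n)
V = rng.standard_normal((m, n))  # output weight matrix

def step(x_t, s_prev, xi=1.0):
    """One recurrent step; xi = 0 zeroes the output on a wrong trigger timing."""
    s_t = np.tanh(U @ x_t + W @ s_prev)
    o_t = np.tanh(V @ s_t) * xi
    return s_t, o_t

s = np.zeros(n)
s, o = step(np.array([1.0, 0.0, 0.0]), s, xi=0.0)  # wrong trigger -> zero output
```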
Further, the voice evaluation database specifically includes:
the voice evaluation database contains all standard expressions used by all professions, i pieces of voice data in total, and the corresponding voice evaluation database has the following structure:
matching the standard voice data against the voice evaluation database yields the uniquely matched standard semantics and trigger timing i_x, where i_x denotes a standard voice data signal; autonomous, objective evaluation is carried out with this signal as the input signal, with the specific structure:
[number i_x, semantic relevance i_x, device relevance i_x, professional-term relevance i_x, statement i_x]
the number is used to manage the serial number of each piece of data in the voice evaluation database; the semantic relevance, device relevance and professional-term relevance serve as the key factors for autonomous, objective evaluation; the statement gives the standard phrasing of the sentence and serves as the key factor for matching the standard voice data input.
Further, the trigger timing determination specifically includes:
the voice signal triggered by the trainee is judged according to the system state feedback signal and the current environment, and the sentence membership function is obtained from the trigger-timing judgment, specifically:
Y_SEN(x, a) = 1 when x > a, and Y_SEN(x, a) = 0 when x ≤ a
where a is the critical value of trigger-timing membership; when the trigger timing of the voice signal is correct, x > a; when the trigger timing is wrong, x ≤ a.
Further, the semantic related keyword membership function specifically includes:
Y_DC(y, b, c) = c / (|y − b| + c)
where b is the optimal semantic membership value, representing semantics judged completely consistent with the voice evaluation database; its value is the number of semantics-related keywords in the voice evaluation database entry; c = 0.4b is calculated from the system characteristics. As the semantic expression becomes oversimplified or overcomplicated, the semantic membership value decreases; when y reaches b − 2c or b + 2c, the membership drops to 1/3 of its peak value.
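A direct transcription of the semantic membership function above, using the stated rule c = 0.4b:

```python
# Semantic-keyword membership Y_DC(y, b, c) = c / (|y - b| + c), with c = 0.4*b.
# y: semantic keywords in the input; b: semantic keywords in the standard entry.
def y_dc(y, b):
    c = 0.4 * b
    return c / (abs(y - b) + c)

print(y_dc(3, 3))                          # exact match -> peak membership 1.0
print(round(y_dc(3 + 2 * 0.4 * 3, 3), 4))  # at y = b + 2c the value falls to 1/3
```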
Further, the device-related keyword membership function specifically includes:
where d is the saturation point of device-related keyword membership, at which the device membership function takes the value 1; d − e is the device relevance onset point, with e = 0.6d calculated from the system characteristics; when z ∈ (0, d − e), the input voice signal uses no device-related keywords; when z ∈ (d − e, d), the input voice signal uses a gradually increasing number of device-related keywords; when z ∈ (d, +∞), the input voice signal uses other related device keywords in addition to all the device-related keywords of the standard sentence.
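Since the original curve is given only as a figure, the sketch below is one piecewise-linear reading of the description (0 below d − e, rising to saturation at d), with e = 0.6d as stated; the exact shape between d − e and d is an assumption:

```python
# Assumed piecewise-linear device-keyword membership; the patent's figure is
# not reproduced, so the ramp between d-e and d is a guess from the prose.
def y_sc(z, d):
    e = 0.6 * d
    if z <= d - e:
        return 0.0                  # no device-related keywords used
    if z < d:
        return (z - (d - e)) / e    # membership grows toward saturation
    return 1.0                      # all standard device keywords (or more) used

print(y_sc(2, 2))  # at the saturation point z = d the membership is 1.0
```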
Further, the term-of-expertise related keyword membership function is specifically:
where f indicates a completely accurate professional-term expression, at which the membership degree is 1; the term membership decreases continuously as the professional-term expression degrades; g = 0.5f is calculated from the system characteristics.
Further, the comprehensive membership function is specifically:
U = (ε·Y_DC + λ·Y_SC + τ·Y_RT) × 100
where U is the autonomous, objective evaluation score of the trainee's voice recognition and represents the trainee's level of voice standardization during training; ε, λ and τ are the weight coefficients of the semantics-related, device-related and professional-term-related keywords respectively; ε ∈ [0, 1], λ ∈ [0, 1], τ ∈ [0, 1], and ε + λ + τ = 1.
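The comprehensive score can be transcribed directly, with a guard for the stated constraint ε + λ + τ = 1:

```python
# Comprehensive membership score U = (eps*Y_DC + lam*Y_SC + tau*Y_RT) * 100.
def evaluate(y_dc, y_sc, y_rt, eps, lam, tau):
    # The weights must sum to 1, per the constraint stated in the text.
    assert abs(eps + lam + tau - 1.0) < 1e-9, "weights must sum to 1"
    return (eps * y_dc + lam * y_sc + tau * y_rt) * 100

score = evaluate(0.55, 1.0, 0.0, 0.5, 0.3, 0.2)  # the weights used in Example 2
```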
Compared with the prior art, the invention has the following beneficial effects. By matching a deep neural network against the voice evaluation database, completely standard voice data can be obtained. Keyword recognition is then applied to the input voice and compared with the standard voice database, so that the trainee receives a standard, objective voice evaluation: manual evaluation is avoided, the objectivity of the system is increased, the instructor's workload is reduced, and because the comparison is made directly against the voice evaluation database no bandwidth is occupied, improving system performance. The comparison result is evaluated, calculated and converted into a percentile score, which accurately and intuitively shows how standard and accurate the trainee's voice instruction is. The invention can thus reliably, reasonably, accurately and quickly perform autonomous, objective voice assessment of voice communication information during each professional trainee's training in a full-profession rail transit training simulation system, and the trainee can know directly from the autonomous objective evaluation score whether a voice instruction is correct and standard.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is an overall flow chart in an embodiment of the present invention;
FIG. 2 is a diagram illustrating a semantic and trigger timing determination function according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating semantic dependent membership functions in an embodiment of the present invention;
FIG. 4 is a schematic diagram of device dependent membership functions in an embodiment of the present invention;
FIG. 5 is a diagram illustrating term dependent membership functions in an embodiment of the present invention;
FIG. 6 is a schematic diagram of a deep neural network in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the following examples and accompanying fig. 1-6, wherein the exemplary embodiments and descriptions of the present invention are only used for explaining the present invention and are not to be construed as limiting the present invention.
Example 1
A voice behavior assessment and evaluation method for rail transit simulation training is shown in figure 1 and comprises the following steps:
S101: recognizing the trainee's voice communication information during training through an intelligent voice recognition technology and converting it into text information;
S102: judging and extracting all keywords in the whole sentence through a keyword recognition technology, and classifying the extracted keywords as semantics-related, device-related, or professional-term-related;
S103: automatically recognizing the voice communication information through a deep neural network to obtain standard voice data, and matching the standard voice data against a voice evaluation database to obtain uniquely matched standard evaluation data;
S104: comparing and analyzing the standard voice data against the standard evaluation data to judge whether the semantics and trigger timing of the voice communication information are correct;
S105: if the semantics and trigger timing are judged correct, comparing the keywords extracted from the voice communication information against the standard keywords in the standard evaluation data, and obtaining the semantics-, device- and professional-term-related keyword membership functions through a fuzzy control function;
S106: establishing a comprehensive membership function from the weighted semantics-related, device-related and professional-term-related keyword membership functions, and performing a comprehensive calculation with it to obtain the autonomous evaluation score of the voice communication information.
The recognition and conversion of voice communication information is specifically as follows: the TTS engine, SAPI interface and Win32 API interface in the Windows Speech SDK development kit are used to build, under the MFC framework, an application unit that converts speech into text; voice communication information fed into this application unit is automatically converted into text information.
The keyword recognition and extraction is specifically as follows: the voice communication information is compared with the keyword database item by item to obtain the semantics-related, device-related and professional-term-related keywords contained in the voice communication information. The keyword database is constructed by reading the semantics-, device- and professional-term-related keywords in the voice evaluation database; its structure is specifically as follows:
the device-related keywords are used to assess the accuracy with which trainees describe device information, the semantics-related keywords are used to judge the semantics of the trainees' descriptions, and the professional-term keywords are used to assess the professionalism with which trainees describe professional information.
As shown in FIG. 6, the standard voice data recognition is specifically as follows: the voice communication information is converted into numbers by GBK coding, and each number becomes one element x_i of the input matrix x_t, thereby obtaining the input matrix x_t; the current hidden-layer value is calculated from the voice communication information input and the hidden value at the previous moment, specifically:
s_t = f(U·x_t + W·s_{t−1})
where the matrix U is the weight coefficient matrix of the input matrix x_t, with dimension n × m, whose values are switched according to the type of work; s is the hidden-layer value vector, with dimension n; W is the weight coefficient matrix of the hidden-layer values, with dimension n × n.
Calculating through an output equation to obtain an output function matrix, specifically:
O_t = g(V·s_t)·ξ
where O_t is the output function matrix; g is the algorithm; V is the hidden-layer weight coefficient matrix; s_t is the current hidden-layer value; ξ is the trigger-timing judgment coefficient: when the trigger timing is correct, ξ is assigned 1; when the trigger timing is wrong, ξ is assigned 0. The matrix O_t represents the GBK codes of the standard voice signal; converting these GBK codes into Chinese characters yields the standard voice data of the voice communication information.
The voice evaluation database is specifically as follows: it contains all standard expressions used by all professions, i pieces of voice data in total, and the corresponding voice evaluation database has the following structure:
matching the standard voice data against the voice evaluation database yields the uniquely matched standard semantics and trigger timing i_x, where i_x denotes a standard voice data signal; autonomous, objective evaluation is carried out with this signal as the input signal, with the specific structure:
[number i_x, semantic relevance i_x, device relevance i_x, professional-term relevance i_x, statement i_x]
the number is used to manage the serial number of each piece of data in the voice evaluation database; the semantic relevance, device relevance and professional-term relevance serve as the key factors for autonomous, objective evaluation; the statement gives the standard phrasing of the sentence and serves as the key factor for matching the standard voice data input.
As shown in fig. 2, the trigger-timing judgment is specifically as follows: the voice signal triggered by the trainee is judged according to the system state feedback signal and the current environment, and the sentence membership function is obtained from the trigger-timing judgment, specifically:
Y_SEN(x, a) = 1 when x > a, and Y_SEN(x, a) = 0 when x ≤ a
where a is the critical value of trigger-timing membership; when the trigger timing of the voice signal is correct, x > a; when the trigger timing is wrong, x ≤ a.
As shown in fig. 3, the semantic related keyword membership function specifically includes:
Y_DC(y, b, c) = c / (|y − b| + c)
where b is the optimal semantic membership value, representing semantics judged completely consistent with the voice evaluation database; its value is the number of semantics-related keywords in the voice evaluation database entry; c = 0.4b is calculated from the system characteristics. As the semantic expression becomes oversimplified or overcomplicated, the semantic membership value decreases; when y reaches b − 2c or b + 2c, the membership drops to 1/3 of its peak value.
As shown in fig. 4, the device-related keyword membership function specifically includes:
where d is the saturation point of device-related keyword membership, at which the device membership function takes the value 1; d − e is the device relevance onset point, with e = 0.6d calculated from the system characteristics; when z ∈ (0, d − e), the input voice signal uses no device-related keywords; when z ∈ (d − e, d), the input voice signal uses a gradually increasing number of device-related keywords; when z ∈ (d, +∞), the input voice signal uses other related device keywords in addition to all the device-related keywords of the standard sentence.
As shown in fig. 5, the term-of-expertise related keyword membership functions are specifically:
where f indicates a completely accurate professional-term expression, at which the membership degree is 1; the term membership decreases continuously as the professional-term expression degrades; g = 0.5f is calculated from the system characteristics.
The comprehensive membership function is specifically:
U = (ε·Y_DC + λ·Y_SC + τ·Y_RT) × 100
where U is the autonomous, objective evaluation score of the trainee's voice recognition and represents the trainee's level of voice standardization during training; ε, λ and τ are the weight coefficients of the semantics-related, device-related and professional-term-related keywords respectively; ε ∈ [0, 1], λ ∈ [0, 1], τ ∈ [0, 1], and ε + λ + τ = 1.
Example 2
The method is explained with the voice communication information "the locomotive is to move slowly, the allowed speed is 5 km/h, and the locomotive moves to the front of the outbound signal" as the input signal. Suppose the corresponding voice evaluation database entry is "locomotive allowed speed 5 km/h, proceed to the front of the outbound signal". After the system captures the voice signal, the voice information is first converted into text information by the intelligent voice recognition program developed with the TTS engine in the Windows Speech SDK development kit.
The standard keyword database recognition system, with its large vocabulary, then detects whether the input signal contains the keywords; this method has a fast response speed.
The standard keyword database is in fact data read from the standard voice database: the semantics-, device- and professional-term-related keywords in the standard voice database are read to form a new keyword database, whose structure is specifically as follows:
comparison with the keyword database shows that the input signal contains 4 semantics-related keywords, 2 device-related keywords and 2 professional-term-related keywords.
The trigger-timing judgment is made mainly by analyzing the system state and the current environment to judge the voice signal triggered by the trainee. Since the trigger timing is correct in this example, Y_SEN(x, a) = 1.
The output signal matrix is obtained through the deep neural network operation, and the standard output voice signal is obtained through GBK-to-Chinese-character conversion: "locomotive allowed speed 5 km/h, proceed to the front of the outbound signal".
By comparison with the voice evaluation database Q, the entry matching both the standard semantics and the trigger timing can be found; suppose its number is 11253, with the structure:
[11253, semantic relevance, device relevance, professional-term relevance, "locomotive allowed speed 5 km/h, proceed to the front of the outbound signal"]
Comparing the input signal with the voice evaluation database shows that this entry contains 3 semantic-related keywords, 2 device-related keywords and 1 professional-term-related keyword.
For the semantic-related keyword membership function Y_DC: since the evaluation entry contains 3 semantic-related keywords, b = 3, and c = 1.2 is calculated from the system characteristics. Specifically: Y_DC(y, b, c) = 1.2/(|y − 3| + 1.2); with y = 4 keywords in the input signal, this gives Y_DC ≈ 0.55.
For the device-related keyword membership function Y_SC: since there are 2 device-related keywords, d = 2, and e = 1.2 is calculated from the system characteristics; further calculation gives Y_SC = 1.
For the professional-term-related keyword membership function Y_RT: since the evaluation entry contains 1 professional-term-related keyword, f = 1 and g = 1, from which Y_RT = 0.
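The semantic membership value above can be checked numerically with the claim-7 formula; taking y = 4 (the keyword count found in the input signal) is an assumption consistent with the counts stated in this example.

```python
def y_dc(y, b, c):
    # Claim 7: Y_DC(y, b, c) = c / (|y - b| + c), peaking at 1 when y == b.
    return c / (abs(y - b) + c)

# b = 3 semantic keywords in the evaluation entry, c = 1.2, y = 4 in the input.
value = y_dc(4, 3, 1.2)  # 1.2 / 2.2, i.e. roughly 0.55 as in the example
```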
The three membership functions are then integrated: a new comprehensive membership function is established according to the weight ratio of each factor, and the trainee's autonomous voice evaluation score is computed from it. The three factors carry different weight coefficients in different scenes; the weight coefficients of the factor membership functions are denoted ε, λ and τ. Analysis of this scene gives ε = 0.5, λ = 0.3 and τ = 0.2, and the comprehensive membership function is defined as follows:
U = (εY_DC + λY_SC + τY_RT) × 100 = (0.5 × 0.55 + 0.3 × 1 + 0.2 × 0) × 100 = 57.5
The trainee's speech is therefore autonomously and objectively assessed with an evaluation score of 57.5.
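The composite score can be reproduced directly from the values stated in the worked example; the weights and membership values below are exactly those of the embodiment.

```python
# Composite membership score U = (eps*Y_DC + lam*Y_SC + tau*Y_RT) * 100,
# with the weights and membership values stated in the worked example.
eps, lam, tau = 0.5, 0.3, 0.2
y_dc, y_sc, y_rt = 0.55, 1.0, 0.0
U = (eps * y_dc + lam * y_sc + tau * y_rt) * 100  # evaluates to 57.5
```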
The above embodiments further explain the objects, technical solutions and advantages of the present invention in detail. It should be understood that they are merely exemplary embodiments of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions and improvements made within the spirit and principle of the present invention shall fall within its protection scope.
Claims (10)
1. A voice behavior assessment and evaluation method for rail transit simulation training is characterized by comprising the following steps:
S101: recognizing the voice communication information of trainees during the training process through an intelligent speech recognition technology and converting it into text information;
S102: judging and extracting all keywords in the whole sentence through a keyword recognition technology, and classifying the extracted keywords in turn as semantic-related, device-related and professional-term-related;
S103: automatically recognizing the voice communication information through a deep neural network to obtain standard voice data, and matching the standard voice data with a voice evaluation database to obtain uniquely matched standard evaluation data;
S104: comparing and analyzing the standard voice data and the standard evaluation data, and then judging whether the semantics and the trigger timing of the voice communication information are correct;
S105: if both the semantics and the trigger-timing judgment are correct, comparing and analyzing the association between the keywords extracted from the voice communication information and the standard keywords in the standard evaluation data, and obtaining the semantic-, device- and professional-term-related keyword membership functions through a fuzzy control function;
S106: establishing a comprehensive membership function from the weight coefficients of the semantic-related, device-related and professional-term-related keyword membership functions, and computing the autonomous evaluation score of the voice communication information from the comprehensive membership function.
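Read as a computation, steps S105–S106 amount to a per-category membership followed by a weighted sum. The sketch below assumes, purely for brevity, the same c/(|y − b| + c) form for all three categories; the claims specify this form only for the semantic one, and all names are illustrative.

```python
def autonomous_score(counts, targets, widths, weights):
    """S105: membership per keyword category; S106: weighted composite * 100.
    counts  - keywords found in the trainee utterance (semantic, device, term)
    targets - keyword counts in the matched evaluation entry (b, d, f analogues)
    widths  - system-characteristic constants (c, e, g analogues)
    weights - (epsilon, lambda, tau), summing to 1"""
    memberships = [w / (abs(n - t) + w)
                   for n, t, w in zip(counts, targets, widths)]
    return 100 * sum(cf * m for cf, m in zip(weights, memberships))

# A trainee utterance matching the evaluation entry exactly in every
# category has membership 1 everywhere, hence a full score of 100.
score = autonomous_score((3, 2, 1), (3, 2, 1), (1.2, 1.2, 0.5), (0.5, 0.3, 0.2))
```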
2. The method for assessing and evaluating the voice behavior of rail transit simulation training as claimed in claim 1, wherein the recognition and conversion of the voice communication information specifically comprises: establishing, under the MFC framework, an application program unit for converting voice into text using the TTS engine, SAPI interface and Win32 API in the Windows Speech SDK development kit; after the voice communication information is input into the application program unit, it is automatically converted into text information.
3. The method for evaluating the voice behavior assessment of the rail transit simulation training as claimed in claim 1, wherein the keyword recognition and extraction specifically comprises:
comparing the voice communication information with the keyword database one by one to respectively obtain semantic related keywords, equipment related keywords and professional term related keywords contained in the voice communication information;
the keyword database is constructed by reading keywords related to semantics, equipment and professional terms in the voice evaluation database, and the structure specifically comprises the following steps:
the device-related keywords are used for identifying the accuracy with which trainees describe device information, the semantic-related keywords are used for judging the semantics of the trainees' descriptions, and the professional-term keywords are used for identifying how professionally trainees describe professional information.
4. The method for assessing and evaluating the voice behavior of the rail transit simulation training as claimed in claim 1, wherein the standard voice data recognition specifically comprises:
converting the voice communication information into numbers by GBK coding, and taking each number as one element x_i of the input matrix x_t, so as to obtain the input matrix x_t;
The current hidden-layer value is calculated from the voice communication information at the current moment and the hidden-layer value at the previous moment, specifically:
wherein the matrix U represents the weight coefficient matrix of the input matrix x_t, with dimension n × m, its values being switched according to the work type being trained; s represents the hidden-layer value vector, of dimension n; W represents the weight coefficient matrix of the hidden-layer value, with dimension n × n;
calculating through an output equation to obtain an output function matrix, specifically:
O_t = g(V s_t) × ξ
wherein O_t represents the output function matrix; g is the algorithm; V represents the hidden-layer weight coefficient matrix; s_t represents the current hidden-layer value; ξ represents the trigger-timing judgment coefficient: if the trigger timing is correct, ξ is assigned 1; when the trigger timing is wrong, ξ is assigned 0;
the matrix O_t represents the GBK codes of the standard voice signal; converting these GBK codes into Chinese characters yields the standard voice data of the voice communication information.
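Claim 4's recurrence can be sketched in plain Python, assuming the standard simple-RNN form s_t = f(U·x_t + W·s_{t−1}) with f = tanh (the claim does not name f); the weight values here are random placeholders, not the trained, work-type-dependent matrices of the patent.

```python
import math
import random

random.seed(0)
n, m = 4, 3  # hidden size n, input (GBK-coded) vector size m

# Illustrative random weights; in the patent these are switched per work type.
U = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]  # n x m
W = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]  # n x n
V = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]  # m x n

def matvec(A, v):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def step(x_t, s_prev, xi):
    # s_t = tanh(U x_t + W s_prev); O_t = g(V s_t) * xi, with g = identity here.
    s_t = [math.tanh(u + w) for u, w in zip(matvec(U, x_t), matvec(W, s_prev))]
    o_t = [xi * o for o in matvec(V, s_t)]
    return s_t, o_t

s, o = step([1.0] * m, [0.0] * n, xi=0)  # wrong trigger time: output is zeroed
```

Note how the trigger coefficient ξ gates the whole output matrix, so a mistimed utterance produces an all-zero output regardless of its content.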
5. The method for assessing and evaluating the voice behavior of the rail transit simulation training as claimed in claim 1, wherein the voice evaluation database specifically comprises:
the voice evaluation database contains all standard expressions used by every profession, i pieces of voice data in total, and the corresponding voice evaluation database has the following structure:
matching the standard voice data with the voice evaluation database to obtain the uniquely matched standard semantics and trigger time i_x, where i_x represents a standard voice data signal that serves as the input signal for autonomous objective evaluation; the specific structure is as follows:
[number i_x, semantic relevance i_x, device relevance i_x, professional-term relevance i_x, statement i_x]
The number is used for managing the serial number of each piece of data in the voice evaluation database; the semantic relevance, device relevance and professional-term relevance serve as the key factors for autonomous objective evaluation; the statement gives the standard expression of the sentence and serves as the key factor for matching the standard voice data input.
6. The method for assessing and evaluating the voice behavior of the rail transit simulation training as claimed in claim 1, wherein the triggering time judgment specifically comprises:
judging the voice signal triggered by the trainee according to the system state feedback signal and the current environment, and obtaining the sentence membership function through the trigger-timing judgment, specifically:
wherein a represents the critical value of the trigger-timing membership; when the trigger timing of the voice signal is correct, x > a; when the trigger timing is wrong, x ≤ a.
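The equation of the sentence membership function is not reproduced in this text; the simplest form consistent with the claim's description is a step function, sketched here as an assumption:

```python
def y_sen(x, a):
    # Hypothesised step form of the sentence membership function: the claim
    # only states that membership is 1 when x > a (correct trigger timing)
    # and 0 when x <= a (wrong timing).
    return 1 if x > a else 0
```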
7. The voice behavior assessment and evaluation method for rail transit simulation training according to any one of claims 1 to 6, wherein the semantic-related keyword membership function is specifically:
YDC(y,b,c)=c/(|y-b|+c)
wherein b represents the optimal semantic membership value, at which the semantic judgment is completely consistent with the voice evaluation database; its value is the number of semantic-related keywords in the voice evaluation database, and c = 0.4b is calculated from the system characteristics. As the semantic expression becomes over-simplified or over-complicated, the membership value decreases; when y reaches b − 2c or b + 2c, the membership drops to 1/3 of the maximum membership of 1.
8. The method for assessing and evaluating the voice behavior of the rail transit simulation training as claimed in claim 7, wherein the device-related keyword membership functions are specifically:
wherein d represents the saturation point of the device-related keyword membership, at which the device membership function takes the value 1; d − e represents the device-relevance point, with e = 0.6d calculated from the system characteristics; when z ∈ (0, d − e), the input voice signal uses no device-related keywords; when z ∈ (d − e, d), the input voice signal's use of device-related keywords gradually increases; when z ∈ (d, +∞), the input voice signal uses other device-related keywords in addition to all the device-related keywords of the standard sentence.
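The device-membership equation itself is likewise elided in this text; a piecewise ramp matching claim 8's verbal description (zero below d − e, rising on (d − e, d), saturated at 1 from d onward) would look like the following. The linear shape of the ramp is an assumption.

```python
def y_sc(z, d, e):
    """Hypothesised device-related keyword membership (claim 8's description)."""
    if z <= d - e:
        return 0.0                 # no device-related keywords used
    if z < d:
        return (z - (d - e)) / e   # usage gradually increasing
    return 1.0                     # saturated: all standard device keywords present

# Embodiment values: d = 2 device keywords, e = 0.6 * d = 1.2, input z = 2.
value = y_sc(2, 2, 1.2)  # saturated, so membership is 1
```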
9. The method for assessing and evaluating the voice behavior of the rail transit simulation training as claimed in claim 8, wherein the membership function of the related keywords of the professional terms is specifically as follows:
wherein f represents a completely accurate professional-term expression, at which the membership is 1; the membership decreases continuously as the professional-term expression degrades, and g = 0.5f is calculated from the system characteristics.
10. The method for assessing and evaluating the voice behavior of the rail transit simulation training as claimed in claim 9, wherein the comprehensive membership function is specifically:
U=(εYDC+λYSC+τYRT)×100
wherein U represents the autonomous objective evaluation score of the trainee's speech recognition and reflects how standard the trainee's speech is during the training process; ε, λ and τ are respectively the weight coefficients of the semantic-related, device-related and professional-term-related keywords; ε ∈ [0, 1], λ ∈ [0, 1], τ ∈ [0, 1], and ε + λ + τ = 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011261988.9A CN112435512B (en) | 2020-11-12 | 2020-11-12 | Voice behavior assessment and evaluation method for rail transit simulation training |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112435512A true CN112435512A (en) | 2021-03-02 |
CN112435512B CN112435512B (en) | 2023-01-24 |
Family
ID=74701312
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011261988.9A Active CN112435512B (en) | 2020-11-12 | 2020-11-12 | Voice behavior assessment and evaluation method for rail transit simulation training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112435512B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114373373A (en) * | 2022-01-10 | 2022-04-19 | 北京易优联科技有限公司 | Examination method and system for pulmonary function examiner |
CN115547299A (en) * | 2022-11-22 | 2022-12-30 | 中国民用航空飞行学院 | Quantitative evaluation and classification method and device for controlled voice quality division |
CN115953931A (en) * | 2023-03-14 | 2023-04-11 | 成都运达科技股份有限公司 | Rail transit practical training examination objective evaluation system and method |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101620853A (en) * | 2008-07-01 | 2010-01-06 | 邹采荣 | Speech-emotion recognition method based on improved fuzzy vector quantization |
CN102831506A (en) * | 2012-08-28 | 2012-12-19 | 焦玉刚 | Method and system for carrying out service procedure management through voice |
US20140032574A1 (en) * | 2012-07-23 | 2014-01-30 | Emdadur R. Khan | Natural language understanding using brain-like approach: semantic engine using brain-like approach (sebla) derives semantics of words and sentences |
CN107093431A (en) * | 2016-02-18 | 2017-08-25 | 中国移动通信集团辽宁有限公司 | A kind of method and device that quality inspection is carried out to service quality |
CN107273350A (en) * | 2017-05-16 | 2017-10-20 | 广东电网有限责任公司江门供电局 | A kind of information processing method and its device for realizing intelligent answer |
CN108428382A (en) * | 2018-02-14 | 2018-08-21 | 广东外语外贸大学 | It is a kind of spoken to repeat methods of marking and system |
CN108804661A (en) * | 2018-06-06 | 2018-11-13 | 湘潭大学 | Data de-duplication method based on fuzzy clustering in a kind of cloud storage system |
CN109817201A (en) * | 2019-03-29 | 2019-05-28 | 北京金山安全软件有限公司 | Language learning method and device, electronic equipment and readable storage medium |
CN110222183A (en) * | 2019-06-12 | 2019-09-10 | 云南电网有限责任公司大理供电局 | A kind of construction method for appraisal model of customer satisfaction of powering |
CN110322895A (en) * | 2018-03-27 | 2019-10-11 | 亿度慧达教育科技(北京)有限公司 | Speech evaluating method and computer storage medium |
CN110717018A (en) * | 2019-04-15 | 2020-01-21 | 中国石油大学(华东) | Industrial equipment fault maintenance question-answering system based on knowledge graph |
CN110807970A (en) * | 2019-12-02 | 2020-02-18 | 成都运达科技股份有限公司 | CTC (China traffic control) scheduling simulation examination and evaluation system based on scene control |
CN110827807A (en) * | 2019-11-29 | 2020-02-21 | 恒信东方文化股份有限公司 | Voice recognition method and system |
CN111326140A (en) * | 2020-03-12 | 2020-06-23 | 科大讯飞股份有限公司 | Speech recognition result discrimination method, correction method, device, equipment and storage medium |
CN111430044A (en) * | 2020-03-19 | 2020-07-17 | 郑州大学第一附属医院 | Natural language processing system and method of nursing robot |
CN111540380A (en) * | 2020-04-20 | 2020-08-14 | 深圳妙创医学技术有限公司 | Clinical training system and method |
CN111597308A (en) * | 2020-05-19 | 2020-08-28 | 中国电子科技集团公司第二十八研究所 | Knowledge graph-based voice question-answering system and application method thereof |
CN111696557A (en) * | 2020-06-23 | 2020-09-22 | 深圳壹账通智能科技有限公司 | Method, device and equipment for calibrating voice recognition result and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||