CN113807642A - Power dispatching intelligent interaction method based on program-controlled telephone - Google Patents

Power dispatching intelligent interaction method based on program-controlled telephone

Info

Publication number
CN113807642A
CN113807642A (application CN202110708426.2A)
Authority
CN
China
Prior art keywords
voice
user
power dispatching
intelligent interaction
telephone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110708426.2A
Other languages
Chinese (zh)
Inventor
崔建业
马翔
支月媚
吕磊炎
张文杰
何云良
张晖
黄剑峰
吴炳超
皮俊波
李振华
吴华华
杜浩良
方璇
谷炜
宋昕
郑翔
杨靖萍
徐立中
沈曦
吴烨
周东波
张辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Zhejiang Electric Power Co Ltd, Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd filed Critical State Grid Zhejiang Electric Power Co Ltd
Priority to CN202110708426.2A priority Critical patent/CN113807642A/en
Publication of CN113807642A publication Critical patent/CN113807642A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06312Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/02Methods for producing synthetic speech; Speech synthesisers
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/10Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/487Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4936Speech interaction details


Abstract

The invention provides a power dispatching intelligent interaction method based on a program-controlled telephone, comprising the following steps: establishing a connection; voice wake-up; voice input; voice preprocessing; feature extraction; decoding and recognition; lexical analysis; user intention identification; authorization confirmation; user intention execution; result feedback; result text analysis; prosody control; voice synthesis; voice output; and connection release. Building on the traditional program-controlled telephone, the invention combines artificial intelligence technologies such as speech recognition and natural language processing to realize effective interaction between the program-controlled telephone and the power application system, and realizes the intelligent interaction function of a virtual-agent telephone by using the dispatching telephone together with automatic speech recognition and synthesis. It can effectively improve the calling and control efficiency of the program-controlled telephone, complete a series of intelligent search, query and dispatching command tasks, and improve the efficiency of power dispatching command.

Description

Power dispatching intelligent interaction method based on program-controlled telephone
Technical Field
The invention relates to the technical field of speech recognition and natural language processing, in particular to a power dispatching intelligent interaction method based on a program-controlled telephone.
Background
Traditional digital program-controlled switching technology adopts time-division multiplexing and large-scale integrated circuits. A program-controlled digital telephone exchange is small, light and low in power consumption, greatly reduces construction cost, is convenient to maintain and manage, and is highly reliable. A network-management-center mode is adopted for centralized maintenance management and automatic fault diagnosis. Stability on the power-supply side is good: as long as the stored-program-control exchange remains powered, all internal communication runs normally. However, as enterprise scale and business expand, traditional digital program-control technology can no longer meet the needs of coordinated scheduling efficiently. By combining the traditional program-controlled telephone with artificial intelligence technologies such as speech analysis, incoming-call information can be answered intelligently and power-grid information pushed automatically, realizing effective interaction between the program-controlled telephone and the application system, and realizing the intelligent interaction function of a virtual-agent telephone through the dispatching telephone and automatic speech recognition and synthesis.
In view of this, in order to overcome the shortcomings of the prior art, there is an urgent need in the art for a power dispatching intelligent interaction method based on a program-controlled telephone.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an intelligent power dispatching interaction method based on a program-controlled telephone, which can effectively improve the efficiency of coordinated dispatching.
In order to solve the technical problems, the invention provides an intelligent power dispatching interaction method based on a program-controlled telephone, which comprises the following steps:
the method comprises the following steps: establishing connection, wherein a user uses a program-controlled telephone to dial a fixed number to establish connection with the power dispatching intelligent interaction platform;
step two: voice awakening, namely after a user establishes connection with the power dispatching intelligent interaction platform, synchronously executing a monitoring module, and starting a voice recognition module when the voice of the user is monitored to contain awakening words;
step three: voice input, after waking up the voice recognition module, a user carries out voice description according to the self requirement, and the description content is synchronously used as the voice input;
step four: voice preprocessing, namely preprocessing voice input, wherein the processing method comprises pre-emphasis and windowing;
step five: extracting characteristics, namely analyzing a frequency spectrum to obtain characteristic parameters of a time domain and a frequency domain;
step six: decoding and identifying, namely comparing the extracted features with a model library to obtain similar templates, and obtaining an identification result through query;
step seven: lexical analysis, wherein the conditional random field toolkit CRF++ is used for lexical analysis, and the preliminary work of CRF++ lexical analysis comprises: establishing an electric power corpus, determining a feature template, and training and predicting;
step eight: identifying the user intention, and performing classification judgment by using a naive Bayes method;
step ten: authorization confirmation, namely prompting a user to perform authorization confirmation after the power dispatching intelligent interaction platform correctly identifies the intention of the user, wherein the authorization mode is a user job number plus a password;
step eleven: the power dispatching intelligent interaction platform is used for executing the user intention, if the user authorization confirmation is successful, the power dispatching intelligent interaction platform immediately executes the user intention, and if the user authorization confirmation is failed, the user is prompted to execute the authorization confirmation again;
step twelve: the result feedback is carried out, and after the user intention is executed, the power dispatching intelligent interaction platform receives the result feedback in a text form;
step thirteen: analyzing a result text, namely analyzing the text according to a model type power semantic library to enable the power dispatching intelligent interaction platform to obtain a pronunciation prompt;
step fourteen: prosody control, according to the result of text analysis, endowing the synthesized voice with certain characteristics so that the semantics expressed by the voice are clearer and more natural;
step fifteen: voice synthesis, namely selecting matched voice primitives from the power voice database, splicing them into continuous speech, and performing prosody modification on the continuous speech to obtain the finally synthesized voice;
step sixteen: voice output, wherein the power dispatching intelligent interaction platform transmits the voice synthesized in step fifteen to the user through the program-controlled telephone;
step seventeen: connection release, returning to step three to continue voice input if the user has other requirements, and hanging up the telephone and releasing the connection if the requirements are finished.
Optionally, the voice preprocessing method in the fourth step includes pre-emphasis and windowing, and the formula for performing pre-emphasis on the voice signal is as follows:
x′[t] = x[t] − a · x[t−1]
where x[t] denotes the t-th sample of the audio data, and the coefficient a takes values in (0.95, 0.99);
the formula for windowing a speech signal is as follows:
y[n] = x[n] · w[n]
where x[n] is the n-th sample within the current window and w[n] is the corresponding window weight.
Optionally, in step five, the calculation formulas of the mel-frequency cepstrum coefficients are as follows:
Mel(f) = 2595 · lg(1 + f / 700)
w_l(k) = (f(k) − o(l)) / (c(l) − o(l)) for o(l) ≤ f(k) ≤ c(l); w_l(k) = (h(l) − f(k)) / (h(l) − c(l)) for c(l) < f(k) ≤ h(l); w_l(k) = 0 otherwise, with f(k) = k · f_s / N
F(l) = Σ_k w_l(k) · |X(k)|,  l = 1, …, L
MFCC(i) = √(2/L) · Σ_{l=1}^{L} lg F(l) · cos(π · i · (l − 0.5) / L)
where w_l(k) are the filter coefficients of the corresponding triangular filter, o(l), c(l) and h(l) are the lower-limit, center and upper-limit frequencies of the corresponding filter on the actual frequency axis, f_s is the sampling rate, L is the number of filters, and F(l) is the filter output.
Optionally, in step seven, the preliminary work of CRF++ lexical analysis includes:
establishing an electric power corpus, determining a feature template, and training and predicting;
preparing the electric power corpus: the CRF++ electric power corpus is input as plain text, with tokens separated by spaces;
for each time instant (x_t, y_t), only the last column is the output result y, and the remaining columns are the input variables x;
determining the feature template: the feature template is used to extract features of the input variable x, and CRF++ supports user-defined feature templates written in its template syntax to generate various features from the input variables;
training and predicting: training and prediction are completed through the publicly exposed CRF++ interfaces, crf_learn and crf_test.
Optionally, in step eight, the naive Bayes method proceeds as follows in training:
firstly, the joint probability p(X, Y) is learned from the training data and, by the Bayes theorem, written as the product of the prior probability distribution and the conditional probability distribution:
p(X = x, Y = c_k) = p(Y = c_k) · p(X = x | Y = c_k);
the prior probability distribution of the classes, p(Y = c_k), only requires counting the number of samples of each class:
p(Y = c_k) = Σ_{j=1}^{N} I(y_j = c_k) / N,  k = 1, …, K;
while the conditional probability p(X = x | Y = c_k) is difficult to estimate directly:
p(X = x | Y = c_k) = p(X_1 = x_1, …, X_n = x_n | Y = c_k),  k = 1, …, K;
if the i-th dimension X_i can take m_i distinct values, then x has Π_{i=1}^{n} m_i possible combinations in total; naive Bayes assumes that all features are conditionally independent, so that:
p(X = x | Y = c_k) = Π_{i=1}^{n} p(X_i = x_i | Y = c_k);
each factor is estimated by maximum likelihood:
p(X_i = x_i | Y = c_k) = Σ_{j=1}^{N} I(x_j^(i) = x_i, y_j = c_k) / Σ_{j=1}^{N} I(y_j = c_k);
that is, for a given known class c_k, the probability that the i-th dimension of the feature vector takes a specific value x_i is the number of samples of class c_k whose i-th dimension equals x_i, divided by the total number of samples of class c_k;
in prediction, the naive Bayes method finds the class c_k with the largest posterior probability p(Y = c_k | X = x) according to the Bayes formula:
y = arg max_{c_k} p(Y = c_k | X = x);
substituting the Bayes formula gives:
y = arg max_{c_k} [ p(Y = c_k) · Π_i p(X_i = x_i | Y = c_k) ] / [ Σ_k p(Y = c_k) · Π_i p(X_i = x_i | Y = c_k) ];
since the denominator is independent of c_k, it can be omitted, yielding:
y = arg max_{c_k} p(Y = c_k) · p(X = x | Y = c_k),
that is,
y = arg max_{c_k} p(Y = c_k) · Π_{i=1}^{n} p(X_i = x_i | Y = c_k).
Optionally, in step thirteen, the power dispatching model-type semantics include slot, intent and domain;
the domain is the power dispatching domain, the intent is a power dispatching service, and the slots include time, place and operator.
Compared with the prior art, the power dispatching intelligent interaction method based on the program-controlled telephone builds on the traditional program-controlled telephone and combines artificial intelligence technologies such as speech recognition and natural language processing. It realizes effective interaction between the program-controlled telephone and the power application system, realizes the intelligent interaction function of a virtual-agent telephone by using the dispatching telephone together with automatic speech recognition and synthesis, can effectively improve the calling and control efficiency of the program-controlled telephone, completes a series of intelligent search, query and dispatching command tasks, and improves the efficiency of power dispatching command.
Drawings
FIG. 1 is a speech recognition schematic of an embodiment of the present invention;
FIG. 2 is a flow chart of speech recognition according to an embodiment of the present invention;
FIG. 3 is a flow chart of speech synthesis according to an embodiment of the present invention;
fig. 4 is a semantic diagram of a power scheduling model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1 to 4, the present invention discloses an intelligent interaction method for power dispatching based on a program controlled telephone, which is characterized by comprising the following steps:
the method comprises the following steps: establishing connection, wherein a user uses a program-controlled telephone to dial a fixed number to establish connection with the power dispatching intelligent interaction platform;
step two: voice wake-up, wherein, to avoid accidental triggering, after the user establishes connection with the power dispatching intelligent interaction platform the monitoring module runs synchronously, and the speech recognition module is started only when a wake-up word is detected in the user's speech; the wake-up interface is the iFLYTEK voice wake-up SDK (software development kit), and the wake-up word is set to "small power";
step three: voice input, wherein after the speech recognition module wakes up, the user gives a voice description according to his or her own requirement, and the description content is synchronously taken as the voice input;
step four: voice preprocessing, wherein preprocessing the voice input can eliminate the influence of factors such as higher-harmonic distortion and high-frequency attenuation on speech-signal quality; the processing methods include pre-emphasis, windowing and the like, and the formula for pre-emphasizing the speech signal is as follows:
x′[t] = x[t] − a · x[t−1]
where x[t] denotes the t-th sample of the audio data, and the coefficient a takes values in (0.95, 0.99).
The formula for windowing the speech signal is as follows:
y[n] = x[n] · w[n]
where x[n] is the n-th sample within the current window and w[n] is the corresponding window weight;
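A minimal NumPy sketch of the pre-emphasis and windowing operations just described (the frame length, hop size and Hamming window are illustrative choices; the patent does not fix them):

```python
import numpy as np

def pre_emphasis(x, a=0.97):
    """Apply y[t] = x[t] - a * x[t-1]; a is typically chosen in (0.95, 0.99)."""
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]                      # first sample has no predecessor
    y[1:] = x[1:] - a * x[:-1]
    return y

def frame_and_window(x, frame_len=400, hop=160):
    """Split the signal into overlapping frames and apply a Hamming window
    w[n] to each frame (assumes len(x) >= frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    w = np.hamming(frame_len)        # the per-sample weights w[n]
    return np.stack([x[i * hop : i * hop + frame_len] * w
                     for i in range(n_frames)])
```

For a 16 kHz dispatching-telephone signal, 400 samples per frame with a 160-sample hop corresponds to the common 25 ms / 10 ms framing.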
step five: feature extraction, namely obtaining characteristic parameters of the time domain and frequency domain by analyzing the frequency spectrum. The feature extraction method used here is the mel-frequency cepstrum coefficient (MFCC), calculated as follows:
Mel(f) = 2595 · lg(1 + f / 700)
w_l(k) = (f(k) − o(l)) / (c(l) − o(l)) for o(l) ≤ f(k) ≤ c(l); w_l(k) = (h(l) − f(k)) / (h(l) − c(l)) for c(l) < f(k) ≤ h(l); w_l(k) = 0 otherwise, with f(k) = k · f_s / N
F(l) = Σ_k w_l(k) · |X(k)|,  l = 1, …, L
MFCC(i) = √(2/L) · Σ_{l=1}^{L} lg F(l) · cos(π · i · (l − 0.5) / L)
where w_l(k) are the filter coefficients of the corresponding triangular filter, o(l), c(l) and h(l) are the lower-limit, center and upper-limit frequencies of the corresponding filter on the actual frequency axis, f_s is the sampling rate, L is the number of filters, and F(l) is the filter output;
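The mel filterbank and cepstral step above can be sketched in NumPy as follows (the filter count, FFT size and sampling rate are illustrative, not taken from the patent):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=26, n_fft=512, fs=16000):
    """Triangular filters w_l(k): rise from o(l) to the center c(l), fall to h(l)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for l in range(1, n_filters + 1):
        o, c, h = bins[l - 1], bins[l], bins[l + 1]
        for k in range(o, c):                       # rising edge
            fb[l - 1, k] = (k - o) / max(c - o, 1)
        for k in range(c, h):                       # falling edge
            fb[l - 1, k] = (h - k) / max(h - c, 1)
    return fb

def mfcc(frame, fb, n_ceps=13):
    """log filterbank outputs lg F(l), then a DCT with the sqrt(2/L) scale."""
    spec = np.abs(np.fft.rfft(frame, n=2 * (fb.shape[1] - 1)))  # |X(k)|
    F = np.log(fb @ spec + 1e-10)                  # F(l), guarded against log(0)
    L = len(F)
    l = np.arange(L)
    return np.array([np.sqrt(2.0 / L) *
                     np.sum(F * np.cos(np.pi * i * (l + 0.5) / L))
                     for i in range(n_ceps)])
```

The `(l + 0.5)` index with `l` running from 0 is the same DCT term as the `(l − 0.5)` form with `l` running from 1.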
step six: decoding and identifying, namely comparing the extracted features with a model library to obtain the most similar template, and then obtaining an identification result through query;
step seven: lexical analysis, wherein the conditional random field toolkit CRF++ is used for lexical analysis, and the preliminary work of CRF++ lexical analysis comprises: establishing an electric power corpus, determining a feature template, and training and predicting;
preparing an electric power corpus:
The CRF++ electric power corpus is input as plain text, with tokens separated by spaces. For each time instant (x_t, y_t), only the last column is the output result y, and the remaining columns are the input variables x.
Determining a characteristic template:
The feature template is used to extract features of the input variable x, and CRF++ supports user-defined feature templates written in its template syntax to generate various features from the input variables.
Training and predicting:
Training and prediction are completed directly through the publicly exposed CRF++ interfaces, crf_learn and crf_test.
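As a sketch of the corpus and template conventions just described (the file names, the tiny two-sentence corpus, and the B/M/E word-begin/middle/end tags are illustrative, not from the patent):

```python
# Each non-blank line holds the space-separated input columns with the output
# tag y as the LAST column; blank lines separate sentences.
corpus = """变 B
电 M
站 E
停 B
电 E

线 B
路 E
"""

# A CRF++ feature template: U* rows declare unigram features, B declares a
# bigram feature; %x[r,c] refers to the token at relative row r, column c.
template = """U00:%x[-1,0]
U01:%x[0,0]
U02:%x[1,0]
B
"""

with open("train.data", "w", encoding="utf-8") as f:
    f.write(corpus)
with open("template", "w", encoding="utf-8") as f:
    f.write(template)

# Training and prediction then use the CRF++ command-line interfaces, e.g.:
#   crf_learn template train.data model
#   crf_test -m model test.data
```

The actual electric power corpus would replace the toy sentences with segmented dispatching vocabulary (equipment names, substations, operation verbs).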
Step eight: identifying the user intention; user-intention recognition is in fact an ultra-short-text classification problem, and classification is performed with the naive Bayes method. In training, the naive Bayes method first learns the joint probability p(X, Y) and then, by the Bayes theorem, writes it as the product of the prior probability distribution and the conditional probability distribution:
p(X = x, Y = c_k) = p(Y = c_k) · p(X = x | Y = c_k).
The prior probability distribution of the classes, p(Y = c_k), only requires counting the number of samples of each class (maximum likelihood):
p(Y = c_k) = Σ_{j=1}^{N} I(y_j = c_k) / N,  k = 1, …, K.
The conditional probability p(X = x | Y = c_k), however, is difficult to estimate directly:
p(X = x | Y = c_k) = p(X_1 = x_1, …, X_n = x_n | Y = c_k),  k = 1, …, K.
If the i-th dimension X_i can take m_i distinct values, then x has Π_{i=1}^{n} m_i possible combinations in total. Naive Bayes assumes that all features are conditionally independent, so that:
p(X = x | Y = c_k) = Π_{i=1}^{n} p(X_i = x_i | Y = c_k).
Each factor can then be estimated by maximum likelihood:
p(X_i = x_i | Y = c_k) = Σ_{j=1}^{N} I(x_j^(i) = x_i, y_j = c_k) / Σ_{j=1}^{N} I(y_j = c_k),
i.e. for a given known class c_k, the probability that the i-th dimension of the feature vector takes a specific value x_i is the number of samples of class c_k whose i-th dimension equals x_i, divided by the total number of samples of class c_k.
In prediction, the naive Bayes method finds the class c_k with the largest posterior probability p(Y = c_k | X = x) according to the Bayes formula, namely:
y = arg max_{c_k} p(Y = c_k | X = x).
Substituting the Bayes formula gives:
y = arg max_{c_k} [ p(Y = c_k) · Π_i p(X_i = x_i | Y = c_k) ] / [ Σ_k p(Y = c_k) · Π_i p(X_i = x_i | Y = c_k) ].
Since the denominator is independent of c_k, it can be omitted, yielding:
y = arg max_{c_k} p(Y = c_k) · p(X = x | Y = c_k),
that is:
y = arg max_{c_k} p(Y = c_k) · Π_{i=1}^{n} p(X_i = x_i | Y = c_k).
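A minimal multinomial naive Bayes intent classifier along the lines above; the toy intents are illustrative, and since the patent does not specify a smoothing scheme, add-one smoothing is assumed here to keep unseen words from zeroing the product:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesIntent:
    """Counts give MLE estimates of p(Y=c_k) and p(X_i=x_i | Y=c_k); prediction
    maximizes p(Y=c_k) * prod_i p(x_i | c_k), the Bayes denominator being
    constant across classes and therefore dropped."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # for the prior p(Y=c_k)
        self.n = len(labels)
        self.word_counts = defaultdict(Counter)      # per-class word counts
        for words, y in zip(texts, labels):
            self.word_counts[y].update(words)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, words):
        best, best_lp = None, -math.inf
        V = len(self.vocab)
        for c, nc in self.class_counts.items():
            lp = math.log(nc / self.n)               # log prior
            total = sum(self.word_counts[c].values())
            for w in words:
                # add-one smoothed conditional probability
                lp += math.log((self.word_counts[c][w] + 1) / (total + V))
            if lp > best_lp:
                best, best_lp = c, lp
        return best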
step ten: authorization confirmation, namely prompting a user to perform authorization confirmation after the power dispatching intelligent interaction platform correctly identifies the intention of the user, wherein the authorization mode is a user job number plus a password;
step eleven: the power dispatching intelligent interaction platform is used for executing the user intention, if the user authorization confirmation is successful, the power dispatching intelligent interaction platform immediately executes the user intention, and if the user authorization confirmation is failed, the user is prompted to execute the authorization confirmation again;
step twelve: the result feedback is carried out, and after the user intention is executed, the power dispatching intelligent interaction platform receives the result feedback in a text form;
step thirteen: result text analysis, namely analyzing the text according to the power dispatching model-type semantics so that the power dispatching intelligent interaction platform obtains pronunciation prompts. The power dispatching model-type semantics include slot, intent and domain: the domain is the power dispatching domain; the intent is a power dispatching service, such as querying equipment to be maintained or querying equipment operating state; and the slots include time, place, operator and the like.
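The domain/intent/slot decomposition can be illustrated with a hypothetical parse (all slot names and values below are examples, not from the patent):

```python
# Hypothetical parse of a dispatcher utterance such as "query the equipment
# under maintenance at Jinhua substation tomorrow morning" into the
# three-level semantics described above.
frame = {
    "domain": "power_dispatching",
    "intent": "query_equipment_maintenance",   # one power dispatching service
    "slots": {
        "time": "tomorrow morning",
        "place": "Jinhua substation",
        "operator": None,                      # not mentioned in the utterance
    },
}

def missing_slots(frame, required=("time", "place")):
    """Slots the platform would have to ask a follow-up question about."""
    return [s for s in required if not frame["slots"].get(s)]
```

A frame with unfilled required slots would drive a clarification prompt before the intent is executed.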
Step fourteen: prosody control, according to the result of text analysis, endowing the synthesized voice with certain characteristics so that the semantics expressed by the voice are clearer and more natural;
step fifteen: voice synthesis, namely selecting matched voice primitives from the power voice database, splicing them into continuous speech, and performing prosody modification on the continuous speech to obtain the finally synthesized voice;
step sixteen: voice output, wherein the power dispatching intelligent interaction platform transmits the voice synthesized in step fifteen to the user through the program-controlled telephone;
step seventeen: connection release, returning to step three to continue voice input if the user has other requirements, and hanging up the telephone and releasing the connection if the requirements are finished;
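The overall call flow of steps one to seventeen can be sketched as control flow (every method here is a placeholder for the corresponding module described above, not a real API):

```python
def interaction_session(call, platform):
    """One connected call: wake-up, then a loop of request/response turns."""
    platform.wait_for_wake_word(call)                 # step 2
    while True:
        audio = call.record_utterance()               # step 3
        frames = platform.preprocess(audio)           # step 4: pre-emphasis, windowing
        feats = platform.extract_mfcc(frames)         # step 5
        text = platform.decode(feats)                 # step 6
        tokens = platform.lexical_analysis(text)      # step 7: CRF++
        intent = platform.classify_intent(tokens)     # step 8: naive Bayes
        if not platform.authorize(call):              # step 10: job number + password
            continue                                  # prompt for authorization again
        result_text = platform.execute(intent)        # steps 11-12
        reply = platform.synthesize(result_text)      # steps 13-15
        call.play(reply)                              # step 16
        if not call.caller_has_more_requests():
            call.hang_up()                            # step 17: release the connection
            break
```

Each platform method maps onto one of the numbered steps, so the real modules can be dropped in behind this skeleton.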
Based on the traditional program-controlled telephone, the invention combines artificial intelligence technologies such as speech recognition and natural language processing to realize effective interaction between the program-controlled telephone and the power application system, and realizes the intelligent interaction function of a virtual-agent telephone by using the dispatching telephone together with automatic speech recognition and synthesis; it can effectively improve the calling and control efficiency of the program-controlled telephone, complete a series of intelligent search, query and dispatching command tasks, and improve the efficiency of power dispatching command.
It should be noted that, in this document, terms such as "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. The intelligent power dispatching interaction method based on the program-controlled telephone is characterized by comprising the following steps of:
the method comprises the following steps: establishing connection, wherein a user uses a program-controlled telephone to dial a fixed number to establish connection with the power dispatching intelligent interaction platform;
step two: voice awakening, namely after a user establishes connection with the power dispatching intelligent interaction platform, synchronously executing a monitoring module, and starting a voice recognition module when the voice of the user is monitored to contain awakening words;
step three: voice input, wherein after the voice recognition module wakes up, the user gives a voice description according to his or her own requirement, and the description content is synchronously taken as the voice input;
step four: voice preprocessing, namely preprocessing voice input, wherein the processing method comprises pre-emphasis and windowing;
step five: extracting characteristics, namely analyzing a frequency spectrum to obtain characteristic parameters of a time domain and a frequency domain;
step six: decoding and identifying, namely comparing the extracted features with a model library to obtain similar templates, and obtaining an identification result through query;
step seven: lexical analysis, wherein the conditional random field toolkit CRF++ is used for lexical analysis, and the preliminary work of CRF++ lexical analysis comprises: establishing an electric power corpus, determining a feature template, and training and predicting;
step eight: identifying the user intention, and performing classification judgment by using a naive Bayes method;
step ten: authorization confirmation, namely prompting a user to perform authorization confirmation after the power dispatching intelligent interaction platform correctly identifies the intention of the user, wherein the authorization mode is a user job number plus a password;
step eleven: the power dispatching intelligent interaction platform is used for executing the user intention, if the user authorization confirmation is successful, the power dispatching intelligent interaction platform immediately executes the user intention, and if the user authorization confirmation is failed, the user is prompted to execute the authorization confirmation again;
step twelve: the result feedback is carried out, and after the user intention is executed, the power dispatching intelligent interaction platform receives the result feedback in a text form;
step thirteen: analyzing a result text, namely analyzing the text according to a model type power semantic library to enable the power dispatching intelligent interaction platform to obtain a pronunciation prompt;
fourteen steps: rhythm control, according to the result of text analysis, endowing some characteristics to the synthesized voice, so that the semantic expressed by the voice is clearer and more natural;
step fifteen: voice synthesis, namely selecting matched voice primitives from the power voice database, splicing the voice primitives to obtain continuous voice, and performing prosody modification on the continuous voice to obtain finally synthesized voice;
sixthly, the steps are as follows: outputting voice, wherein the power dispatching intelligent interaction platform transmits the voice synthesized in the step fifteen to a user through a program-controlled telephone;
seventeen steps: and (4) releasing the connection, returning to the step three to continue voice input if the user has other requirements, and hanging up the phone and releasing the connection if the requirement is finished.
2. The program-controlled telephone-based power dispatching intelligent interaction method of claim 1, wherein: the voice preprocessing method in step four comprises pre-emphasis and windowing, and the formula for pre-emphasizing the voice signal is as follows:
x'[t] = x[t] - a·x[t-1]
wherein x[t] is the t-th sample of the audio data, and a takes a value in the range (0.95, 0.99);
the formula for windowing the voice signal is as follows:
x'[n] = x[n]·w[n]
wherein x[n] is the n-th sample within the window taken and w[n] is the window weight corresponding to it (for a Hamming window of length N, w[n] = 0.54 - 0.46·cos(2πn/(N-1))).
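A minimal numpy sketch of the two preprocessing operations in claim 2, assuming the common pre-emphasis form x'[t] = x[t] - a·x[t-1] and a Hamming window; the function names are illustrative:

```python
import numpy as np

def pre_emphasis(x, a=0.97):
    # x'[t] = x[t] - a * x[t-1]; a is chosen in (0.95, 0.99)
    return np.append(x[0], x[1:] - a * x[:-1])

def hamming_window(frame):
    # weight each sample of the frame: x'[n] = x[n] * w[n]
    n = np.arange(len(frame))
    w = 0.54 - 0.46 * np.cos(2 * np.pi * n / (len(frame) - 1))
    return frame * w
```

Pre-emphasis boosts the high-frequency content flattened by the glottal pulse; the window tapers the frame edges to reduce spectral leakage before the FFT of step five.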
3. The program-controlled telephone-based power dispatching intelligent interaction method of claim 1, wherein: in step five, the mel-frequency cepstrum coefficients are calculated as follows:
Mel(f) = 2595·lg(1 + f/700)
w_l(k) = (k - o(l))/(c(l) - o(l)) for o(l) ≤ k ≤ c(l); w_l(k) = (h(l) - k)/(h(l) - c(l)) for c(l) < k ≤ h(l); w_l(k) = 0 otherwise
F(l) = Σ_{k=o(l)}^{h(l)} w_l(k)·|X(k)|², l = 1, ..., L
C(n) = Σ_{l=1}^{L} lg F(l) · cos(πn(l - 1/2)/L)
wherein w_l(k) are the filter coefficients of the corresponding filters; o(l), c(l), h(l) are the lower-limit frequency, center frequency and upper-limit frequency of the corresponding filter on the actual frequency axis (the L + 2 boundary frequencies are equally spaced on the mel axis and mapped onto spectrum bins via k = N·f/f_s, where N is the DFT length); f_s is the sampling frequency; L is the number of filters; X(k) is the discrete spectrum of the speech frame; F(l) is the filtering output; and C(n) is the n-th cepstrum coefficient.
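A numpy sketch of the mel filterbank and cepstrum computation behind claim 3, using the standard mel scale Mel(f) = 2595·lg(1 + f/700); the parameter defaults and function names are illustrative assumptions:

```python
import numpy as np

def mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def inv_mel(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(L=26, nfft=512, fs=16000):
    # Boundary frequencies o(l), c(l), h(l): equally spaced on the mel axis,
    # then mapped back to FFT bin indices k = (nfft + 1) * f / fs.
    pts = inv_mel(np.linspace(mel(0.0), mel(fs / 2.0), L + 2))
    bins = np.floor((nfft + 1) * pts / fs).astype(int)
    fb = np.zeros((L, nfft // 2 + 1))
    for l in range(1, L + 1):
        o, c, h = bins[l - 1], bins[l], bins[l + 1]
        for k in range(o, c):
            fb[l - 1, k] = (k - o) / (c - o)      # rising edge of triangle l
        for k in range(c, h + 1):
            fb[l - 1, k] = (h - k) / (h - c)      # falling edge of triangle l
    return fb

def mfcc(power_spectrum, fb, ncep=12):
    # F(l): filter outputs; cepstrum C(n) via DCT of the log filter outputs
    F = np.maximum(fb @ power_spectrum, 1e-10)    # floor avoids log(0)
    L = len(F)
    n = np.arange(1, ncep + 1)[:, None]
    l = np.arange(1, L + 1)[None, :]
    return (np.log(F) * np.cos(np.pi * n * (l - 0.5) / L)).sum(axis=1)
```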
4. The program-controlled telephone-based power dispatching intelligent interaction method of claim 1, wherein: in step seven, the preparatory work for CRF++ lexical analysis comprises the following steps:
building a power corpus, defining a feature template, and training and prediction;
preparing the power corpus: the CRF++ power corpus is input as plain text, segmented into words separated by whitespace;
for each position (x_t, y_t), only the last column is the output result y, and all remaining columns are input variables x;
defining the feature template: the feature template is used to extract features from the input variables x, and CRF++ supports user-defined feature templates, written in its macro language, that generate the various features from the input variables;
training and prediction: training and prediction are completed through the open CRF++ interfaces crf_learn and crf_test.
5. The program-controlled telephone-based power dispatching intelligent interaction method of claim 1, wherein: in step eight, the naive Bayes method proceeds as follows during training:
first the joint probability p(X, Y) is learned from the training data and then decomposed, by the Bayes theorem, into the product of the prior probability distribution and the conditional probability distribution:
p(X = x, Y = c_k) = p(Y = c_k)·p(X = x | Y = c_k);
wherein the class prior distribution p(Y = c_k) is estimated by counting the number of samples of each class:
p(Y = c_k) = Σ_{i=1}^{N} I(y_i = c_k) / N;
while the conditional probability is difficult to estimate directly:
p(X = x | Y = c_k) = p(X_1 = x_1, ..., X_n = x_n | Y = c_k), k = 1, ..., K;
since, if the i-th dimension x_i can take m_i values, then x has Π_{i=1}^{n} m_i possible combinations in total; naive Bayes assumes that all features are conditionally independent, so that:
p(X = x | Y = c_k) = Π_{i=1}^{n} p(X_i = x_i | Y = c_k);
each factor is estimated by maximum likelihood:
p(X_i = x_i | Y = c_k) = Σ_{j=1}^{N} I(x_j^(i) = x_i, y_j = c_k) / Σ_{j=1}^{N} I(y_j = c_k);
that is, for a given known class c_k, the number of samples of class c_k whose i-th feature equals the specific value x_i, divided by the number of all samples of class c_k;
in prediction, the naive Bayes method finds, according to the Bayes formula, the class c_k with the maximum posterior probability p(Y = c_k | X = x):
y = arg max_{c_k} p(Y = c_k | X = x);
substituting the Bayes formula gives:
p(Y = c_k | X = x) = p(Y = c_k)·Π_{i} p(X_i = x_i | Y = c_k) / Σ_{k} p(Y = c_k)·Π_{i} p(X_i = x_i | Y = c_k);
wherein the denominator is independent of c_k and can be omitted, giving:
y = arg max_{c_k} p(Y = c_k)·Π_{i=1}^{n} p(X_i = x_i | Y = c_k).
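The training and prediction procedure of claim 5 can be sketched from scratch; the toy intent data below is illustrative, not from the patent:

```python
from collections import Counter, defaultdict

class NaiveBayes:
    def fit(self, X, y):
        n = len(y)
        self.count = Counter(y)                                  # samples per class
        self.prior = {c: self.count[c] / n for c in self.count}  # p(Y = c_k)
        self.cond = defaultdict(Counter)   # (class, dim) -> feature value counts
        for xs, c in zip(X, y):
            for i, v in enumerate(xs):
                self.cond[(c, i)][v] += 1

    def predict(self, xs):
        # y = argmax_k p(Y = c_k) * prod_i p(X_i = x_i | Y = c_k)
        best, best_p = None, -1.0
        for c in self.prior:
            p = self.prior[c]
            for i, v in enumerate(xs):
                p *= self.cond[(c, i)][v] / self.count[c]   # MLE estimate
            if p > best_p:
                best, best_p = c, p
        return best

nb = NaiveBayes()
nb.fit([["open", "breaker"], ["close", "breaker"],
        ["check", "load"], ["check", "voltage"]],
       ["operate", "operate", "query", "query"])
```

A production version would add Laplace smoothing so that a feature value unseen during training does not zero out the whole product.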
6. The program-controlled telephone-based power dispatching intelligent interaction method of claim 1, wherein: in step thirteen, the model-based power dispatching semantics comprise word slots, intentions and domains;
the domain is the power dispatching domain, the intention is a power dispatching service, and the word slots comprise time, place and operator.
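A minimal illustration of the claim-6 semantics, with a domain, an intent and the time/place/operator word slots; all concrete values and field names here are assumptions for illustration:

```python
# Parsed semantic frame for one dispatching request
frame = {
    "domain": "power_dispatching",
    "intent": "outage_operation",        # the requested dispatching service
    "slots": {
        "time": "09:00",
        "place": "substation No. 3",
        "operator": "job number 1024",
    },
}

def pronunciation_prompt(f):
    # Render the frame into the text the synthesis stages will speak back
    s = f["slots"]
    service = f["intent"].replace("_", " ")
    return f"{s['operator']} requests {service} at {s['place']} at {s['time']}"
```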
CN202110708426.2A 2021-06-25 2021-06-25 Power dispatching intelligent interaction method based on program-controlled telephone Pending CN113807642A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110708426.2A CN113807642A (en) 2021-06-25 2021-06-25 Power dispatching intelligent interaction method based on program-controlled telephone


Publications (1)

Publication Number Publication Date
CN113807642A true CN113807642A (en) 2021-12-17

Family

ID=78942592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110708426.2A Pending CN113807642A (en) 2021-06-25 2021-06-25 Power dispatching intelligent interaction method based on program-controlled telephone

Country Status (1)

Country Link
CN (1) CN113807642A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108268459A (en) * 2016-12-30 2018-07-10 广东精点数据科技股份有限公司 A kind of community's speech filtration system based on naive Bayesian
WO2019144926A1 (en) * 2018-01-26 2019-08-01 上海智臻智能网络科技股份有限公司 Intelligent interaction method and apparatus, computer device and computer-readable storage medium
CN112581939A (en) * 2020-12-06 2021-03-30 中国南方电网有限责任公司 Intelligent voice analysis method applied to power dispatching normative evaluation
CN112599124A (en) * 2020-11-20 2021-04-02 内蒙古电力(集团)有限责任公司电力调度控制分公司 Voice scheduling method and system for power grid scheduling


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN Jing, "A Brief Analysis of the Application of Voice Human-Computer Interaction Technology in Intelligent Dispatching" (浅析语音人机交互技术在智能调度中的应用), Modern SOE Research (现代国企研究), no. 18 *

Similar Documents

Publication Publication Date Title
CN110838289B (en) Wake-up word detection method, device, equipment and medium based on artificial intelligence
CN110570873B (en) Voiceprint wake-up method and device, computer equipment and storage medium
CN1163869C (en) System and method for developing interactive speech applications
CN111489748A (en) Intelligent voice scheduling auxiliary system
CN100401375C (en) Speech-processing system and method
CN108766441B (en) Voice control method and device based on offline voiceprint recognition and voice recognition
CN109741754A (en) A kind of conference voice recognition methods and system, storage medium and terminal
CN109196495A (en) Fine granularity natural language understanding
JPH0394299A (en) Voice recognition method and method of training of voice recognition apparatus
CN107369439A (en) A kind of voice awakening method and device
CN108899013A (en) Voice search method, device and speech recognition system
CN104575497B (en) A kind of acoustic model method for building up and the tone decoding method based on the model
CN110517664A (en) Multi-party speech recognition methods, device, equipment and readable storage medium storing program for executing
CN109036395A (en) Personalized speaker control method, system, intelligent sound box and storage medium
CN103514879A (en) Local voice recognition method based on BP neural network
CN112131359A (en) Intention identification method based on graphical arrangement intelligent strategy and electronic equipment
CN110298463A (en) Meeting room preordering method, device, equipment and storage medium based on speech recognition
CN111429915A (en) Scheduling system and scheduling method based on voice recognition
CN105845139A (en) Off-line speech control method and device
CN109741735A (en) The acquisition methods and device of a kind of modeling method, acoustic model
CN116110405B (en) Land-air conversation speaker identification method and equipment based on semi-supervised learning
CN115910066A (en) Intelligent dispatching command and operation system for regional power distribution network
CN114186108A (en) Multimode man-machine interaction system oriented to electric power material service scene
CN110364147B (en) Awakening training word acquisition system and method
CN113438515A (en) IPTV terminal government affair consultation method and system based on intelligent interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination