CN109253728A

CN109253728A - Voice navigation method, device, computer equipment and storage medium

Info

Publication number: CN109253728A
Application number: CN201811008808.9A
Authority: CN
Inventors: 高梁梁
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2019-01-22

Abstract

The invention discloses a voice navigation method, device, computer equipment and storage medium. The voice navigation method includes: obtaining the measured WIFI signal cluster of a mobile terminal and, based on the measured WIFI signal cluster, obtaining the fixed-point coordinate corresponding to the mobile terminal in a preset navigation map as the starting coordinate; obtaining the voice navigation request sent by the mobile terminal; preprocessing the voice data to be recognized to obtain the corresponding speech vector feature; matching the speech vector feature to obtain the corresponding text data, and matching a preset instruction set based on the text data to obtain a target control instruction; and, based on the target control instruction, obtaining the corresponding target instruction execution result. The method is no longer limited by the precision of the mobile terminal's own positioning equipment and requires no manual instruction input from the user, which improves the efficiency and accuracy of indoor positioning and navigation.

Description

Voice navigation method, device, computer equipment and storage medium

Technical field

The present invention relates to the field of intelligent navigation, and in particular to a voice navigation method, device, computer equipment and storage medium.

Background technique

With the development of communication and positioning technology, navigation with mobile terminals is increasingly used in people's daily life and work. Outdoor navigation with a mobile terminal, for example through applications such as Baidu Map or Amap installed on the terminal, is a relatively mature technology and has made travel much more convenient. However, as buildings become taller and more complex, indoor navigation has also become a problem in urgent need of a solution.

Because of the limited precision of the mobile terminal's own positioning equipment, and because the traditional approach positions the terminal by the latitude and longitude of its geographical location, there is currently no scheme that uses a mobile terminal to perform indoor navigation efficiently and accurately.

Summary of the invention

The embodiments of the present invention provide a voice navigation method, device, computer equipment and storage medium, to solve the problem that accurate indoor navigation currently cannot be performed with a mobile terminal.

A voice navigation method, comprising:

Obtaining the measured WIFI signal cluster of a mobile terminal and, based on the measured WIFI signal cluster, obtaining the fixed-point coordinate corresponding to the mobile terminal in a preset navigation map as the starting coordinate;

Obtaining the voice navigation request sent by the mobile terminal, the voice navigation request including voice data to be recognized;

Preprocessing the voice data to be recognized to obtain the speech vector feature corresponding to the voice data to be recognized;

Matching the speech vector feature based on a Hidden Markov Model to obtain the text data corresponding to the speech vector feature, and matching a preset instruction set based on the text data to obtain a target control instruction;

Based on the starting coordinate, executing the target control instruction, obtaining the target instruction execution result corresponding to the executed target control instruction, sending the execution result to the mobile terminal, and controlling the mobile terminal to display the execution result.

A voice navigation device, comprising:

A measured signal cluster obtaining module, configured to obtain the measured WIFI signal cluster of a mobile terminal and, based on the measured WIFI signal cluster, obtain the fixed-point coordinate corresponding to the mobile terminal in a preset navigation map as the starting coordinate;

A navigation request obtaining module, configured to obtain the voice navigation request sent by the mobile terminal, the voice navigation request including voice data to be recognized;

A speech feature obtaining module, configured to preprocess the voice data to be recognized and obtain the speech vector feature corresponding to the voice data to be recognized;

A control instruction obtaining module, configured to match the speech vector feature based on a Hidden Markov Model to obtain the text data corresponding to the speech vector feature, and to match a preset instruction set based on the text data to obtain a target control instruction;

An execution result display module, configured to execute the target control instruction based on the starting coordinate, obtain the target instruction execution result corresponding to the executed target control instruction, send the execution result to the mobile terminal, and control the mobile terminal to display the execution result.

A computer equipment, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the above voice navigation method when executing the computer program.

A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above voice navigation method when executed by a processor.

With the above voice navigation method, device, computer equipment and storage medium, the starting coordinate of the mobile terminal is obtained from the measured WIFI signal cluster sent by the mobile terminal, and the target control instruction is obtained from the voice data sent by the mobile terminal. The corresponding action is then executed based on the starting coordinate and the target control instruction. A user can thus achieve accurate indoor positioning and navigation simply by entering a voice navigation request on the mobile terminal. The method is no longer limited by the precision of the mobile terminal's own positioning equipment and requires no manual instruction input from the user, which improves the efficiency of indoor positioning and navigation.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the application environment of the voice navigation method in an embodiment of the invention;

Fig. 2 is a flow chart of the voice navigation method in an embodiment of the invention;

Fig. 3 is another flow chart of the voice navigation method in an embodiment of the invention;

Fig. 4 is a schematic diagram of obtaining the WIFI signals of three specified hotspots at fixed-point coordinate (0,0) in an embodiment of the invention;

Fig. 5 is another flow chart of the voice navigation method in an embodiment of the invention;

Fig. 6 is another flow chart of the voice navigation method in an embodiment of the invention;

Fig. 7 is another flow chart of the voice navigation method in an embodiment of the invention;

Fig. 8 is a structural schematic diagram of the preset structure word segmentation tree in an embodiment of the invention;

Fig. 9 is another flow chart of the voice navigation method in an embodiment of the invention;

Fig. 10 is a schematic diagram of the voice navigation device in an embodiment of the invention;

Fig. 11 is a schematic diagram of the computer equipment in an embodiment of the invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are obviously some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

The voice navigation method provided by the embodiments of the present invention can be applied in the application environment shown in Fig. 1. The method is applied in a voice navigation system that includes a client and a server, where the client communicates with the server over a network. The client, also called the user side, is the program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, computer equipment such as personal computers, laptops, smartphones, tablet computers and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.

In one embodiment, as shown in Fig. 2, a voice navigation method is provided. Taking its application to the server in Fig. 1 as an example, the method includes the following steps:

S10. Obtain the measured WIFI signal cluster of the mobile terminal and, based on the measured WIFI signal cluster, obtain the fixed-point coordinate corresponding to the mobile terminal in the preset navigation map as the starting coordinate.

Here, the measured WIFI signal cluster is defined as follows: at its current location, the mobile terminal has one signal strength with respect to each wireless hotspot deployed indoors. The server records all signal strengths that the mobile terminal obtains at the current location, which gives the measured WIFI signal cluster of the mobile terminal at that location.

Specifically, when the server measures the WIFI signal strength between the mobile terminal and each hotspot, it can take several measurements and record the sample mean of all measured WIFI signal strengths, so that the signal strength is recorded more accurately. The sample mean of the signal strength between the mobile terminal at its current location and each hotspot can be computed with the following formula:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x(i)$$

Here, x(i) is the signal strength value obtained in the i-th measurement when the mobile terminal measures the signal strength of the specified hotspot the specified number of times, and n is the specified number of collections, n ≥ 3. The sample means of the signal strength between the mobile terminal at its current location and at least three hotspots form the measured WIFI signal cluster of the mobile terminal at the current location.
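The averaging step can be illustrated with a minimal sketch. The function name `signal_cluster`, the hotspot identifiers and the RSSI readings below are hypothetical illustration values, not taken from the source; the sketch only assumes that repeated readings per hotspot are available as lists.

```python
from statistics import fmean


def signal_cluster(readings):
    """Average repeated RSSI readings per hotspot into one signal cluster.

    `readings` maps a hotspot identifier to the list of RSSI values (dBm)
    measured at the current location. Hotspots are ordered by identifier
    so that every cluster lists them in the same order.
    """
    if len(readings) < 3:
        raise ValueError("at least three hotspots are required")
    return tuple(fmean(readings[ap]) for ap in sorted(readings))


# Hypothetical readings: three samples for each of three hotspots.
readings = {
    "MAC1": [-60.0, -62.0, -61.0],
    "MAC2": [-70.0, -72.0, -74.0],
    "MAC3": [-80.0, -80.0, -83.0],
}
cluster = signal_cluster(readings)
print(cluster)
```

Sorting by identifier is a design choice of this sketch: a cluster is only comparable with a stored fingerprint if both list the hotspots in the same order.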

In step S10, the server obtains the WIFI signal clusters that the signal acquisition terminal collects at each fixed-point coordinate over the preset number of collections. Based on the measured WIFI signal cluster of the mobile terminal, the server compares the "location fingerprint" recorded by each tree in the preset random forest, calculates the Euclidean distance between the measured WIFI signal cluster and each "location fingerprint", and sets the fixed-point coordinate with the smallest Euclidean distance as the starting coordinate of the mobile terminal. The signal acquisition terminal is a mobile terminal equipped with a wireless signal receiving module that collects the training WIFI signal clusters, that is, the signal strength of each indoor hotspot. This way of positioning requires no additional positioning equipment: the server obtains the starting coordinate from the signal strength between the mobile terminal and each specified hotspot, which saves positioning cost, and the positioning is stable and reliable.

S20. Obtain the voice navigation request sent by the mobile terminal, the voice navigation request including voice data to be recognized.

Here, the voice navigation request is the voice request that the user records through the microphone of the mobile terminal. The voice data to be recognized is the digital voice data formed by converting the speech that the microphone records as an analog sound wave signal.

In step S20, the mobile terminal converts the speech recorded by the user into digital voice data through an analog-to-digital converter, forming the voice data to be recognized. This allows the server to analyze the target control instruction from the voice data to be recognized in later steps.

S30. Preprocess the voice data to be recognized to obtain the speech vector feature corresponding to the voice data to be recognized.

Here, the speech vector feature is a representative feature vector that converts the voice data to be recognized into a form a computer can process. It matches or approximates the auditory perception of the human ear, and to some extent enhances the speech signal and suppresses non-speech signals. In this embodiment, the Mel Frequency Cepstrum Coefficient feature (hereinafter the MFCC feature) can be used as the speech vector feature.

Specifically, preprocessing the voice data to be recognized proceeds as follows: the voice data to be recognized is preprocessed to obtain preprocessed voice data; a fast Fourier transform is applied to the preprocessed voice data to obtain the spectrum of the voice data to be recognized, and the power spectrum is obtained from the spectrum; the power spectrum is processed with a mel-scale filter bank to obtain the mel power spectrum of the voice data to be recognized; cepstral analysis is performed on the mel power spectrum to obtain the MFCC feature of the voice data to be recognized, which is the speech vector feature corresponding to the voice data to be recognized.
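The pipeline above (preprocessing, Fourier transform, power spectrum, mel filter bank, cepstral analysis) can be sketched for a single frame with the Python standard library. This is a simplified illustration, not the patent's implementation. It rests on several assumptions the source does not state: the 0.97 pre-emphasis coefficient, the Hamming window, the filter and coefficient counts, and the use of a DCT-II for the cepstral step. A naive DFT stands in for the FFT and yields the same values more slowly.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT of one frame, then |X[k]|^2 / N for the non-negative bins."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        out.append(abs(x_k) ** 2 / n)
    return out


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            bank[m - 1][k] = (k - left) / (center - left)
        for k in range(center, right):
            bank[m - 1][k] = (right - k) / (right - center)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_ceps=13):
    n = len(frame)
    # Preprocessing: pre-emphasis followed by a Hamming window.
    emphasized = [frame[0]] + [frame[t] - 0.97 * frame[t - 1] for t in range(1, n)]
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
                for t, s in enumerate(emphasized)]
    # Fourier transform -> power spectrum -> mel power spectrum.
    power = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10) for row in bank]
    # Cepstral analysis: DCT-II of the log mel spectrum, keep the first n_ceps terms.
    return [sum(log_mel[j] * math.cos(math.pi * i * (2 * j + 1) / (2 * n_filters))
                for j in range(n_filters))
            for i in range(n_ceps)]


# Hypothetical input: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = mfcc_frame(frame, sample_rate=8000)
print(len(features))
```

A real recognizer would split the recording into overlapping frames and compute one such vector per frame.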

In step S30, the server extracts the MFCC feature of the voice data to be recognized, which represents the voice data well. This allows the server to obtain the target control instruction more accurately from the MFCC feature in later steps and to meet the user's needs.

S40. Match the speech vector feature based on a Hidden Markov Model to obtain the text data corresponding to the speech vector feature, and match the preset instruction set based on the text data to obtain the target control instruction.

Here, the Hidden Markov Model (hereinafter the HMM model) is a kind of Markov chain whose states cannot be observed directly but can be inferred from a sequence of observation vectors. Each observation vector reflects the states through a probability density distribution, and each observation vector is generated by a state sequence with the corresponding probability density distribution. The hidden Markov model is therefore a doubly stochastic process: a hidden Markov chain with a certain number of states, together with a set of observable random functions. Since the 1980s, the HMM model has been widely used in speech recognition with great success.

The Hidden Markov Model has three parameters (initial vector, observation vector and transition matrix), where the observation vector is the speech feature vector (the MFCC feature in this embodiment). Before the Hidden Markov Model is used for recognition, it must first be trained, that is, the three HMM parameters above are estimated from training samples by an estimation algorithm that iterates repeatedly. The Hidden Markov Model is then used to calculate the probability of which sound in the training set the current input most likely is.

The text data is the word sequence combination corresponding to the voice data, obtained after the Hidden Markov Model performs speech matching on the speech vector feature.

The preset instruction set is the set, recorded by the server, of the target control instructions corresponding to preset keywords. For example, one group of data in the preset instruction set is: the preset keywords are the verbs "go" and "to", the target control instruction corresponding to these two keywords is to trigger the navigation function, and the noun that follows either keyword is set as the end point of the target navigation route. Multiple preset keywords can be set in the preset instruction set according to application requirements. For example, the target control instruction corresponding to "where am I" and "where is this" is to obtain the current location.

The target control instruction is the instruction obtained by matching the text keywords in the text data against the preset keywords in the preset instruction set.
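A minimal sketch of matching recognized text against a preset instruction set follows. The keyword table, the instruction names `NAVIGATE` and `GET_CURRENT_LOCATION`, and the function `match_instruction` are hypothetical. English keywords are used for readability, whereas the source's keywords are Chinese verbs.

```python
# Hypothetical preset instruction set: preset keyword -> target control instruction.
PRESET_INSTRUCTIONS = {
    "go to": "NAVIGATE",
    "where am i": "GET_CURRENT_LOCATION",
    "where is this": "GET_CURRENT_LOCATION",
}


def match_instruction(text):
    """Return (instruction, destination) for the recognized text.

    For a navigation keyword, the words after the keyword are treated as the
    end point of the route. Returns (None, None) when no keyword matches.
    """
    normalized = text.lower().strip(" ?.!")
    for keyword, instruction in PRESET_INSTRUCTIONS.items():
        pos = normalized.find(keyword)
        if pos == -1:
            continue
        if instruction == "NAVIGATE":
            destination = normalized[pos + len(keyword):].strip()
            return instruction, destination or None
        return instruction, None
    return None, None


print(match_instruction("Go to meeting room 3"))
print(match_instruction("Where am I?"))
```

Plain substring search is the simplest possible matcher. A production system would more likely segment the text into words first, so that a keyword embedded in an unrelated word does not trigger an instruction.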

Specifically, the server matches the speech vector feature based on the Hidden Markov Model and obtains the text data corresponding to the speech vector feature as follows:

First, the decoder uses a decoding algorithm to find the word sequence $w_{1:L} = w_1, w_2, \ldots, w_L$ that most probably generated the MFCC feature Y. Mathematically, the decoder solves for the word sequence w that maximizes the posterior probability P(w | Y), that is:

$$w_{best} = \arg\max_{w} P(w \mid Y)$$

Here, P(w | Y) is the posterior probability and $w_{best}$ is the word sequence that maximizes it. However, modeling P(w | Y) directly is very difficult, so the formula is rewritten with Bayes' theorem as:

$$w_{best} = \arg\max_{w} \frac{P(Y \mid w)\,P(w)}{P(Y)}$$

Here, P(Y) is the observation probability, P(w) is the prior probability, and P(Y | w) is the likelihood. Because the observation probability P(Y) is constant for a given observation sequence, the formula simplifies further to:

$$w_{best} = \arg\max_{w} P(Y \mid w)\,P(w)$$

The prior probability P(w) is determined by the language model (LM), and the likelihood P(Y | w) is determined by the acoustic model (AM). The phoneme (phone, also called a sub-word) is the basic acoustic unit of the acoustic model. For example, the English word "bat" consists of the three phonemes /b/ /ae/ /t/; in Chinese, the units are initials and finals.

Taking English recognition as an example, for a specific word w the corresponding acoustic model is spliced together from multiple phoneme models, where the phonemes are obtained by looking up a pronunciation dictionary and applying its rules. The parameters of these phoneme models (such as emission probabilities and transition probabilities) are estimated by training on a data set composed of speech waveforms and their corresponding transcripts. The language model is usually an N-gram model, in which the probability of each word depends only on the preceding N-1 words. The parameters of the N-gram model are obtained by computing the probabilities of N-tuples in a training text corpus.
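As an illustration of estimating N-gram parameters from a text corpus, the sketch below trains a bigram model (N = 2) by maximum-likelihood counting. The three-sentence corpus, the sentence boundary markers `<s>` and `</s>`, and the function name `train_bigram` are assumptions made for the example. No smoothing is applied, so unseen pairs get probability 0.

```python
from collections import Counter


def train_bigram(sentences):
    """Estimate P(word | prev) = count(prev, word) / count(prev as context)."""
    context_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(tokens[:-1])          # every token that has a successor
        pair_counts.update(zip(tokens, tokens[1:]))  # adjacent word pairs

    def prob(prev, word):
        if context_counts[prev] == 0:
            return 0.0
        return pair_counts[(prev, word)] / context_counts[prev]

    return prob


# Hypothetical training corpus.
corpus = ["go to the lobby", "go to the elevator", "where is the lobby"]
prob = train_bigram(corpus)
print(prob("go", "to"), prob("the", "lobby"))
```

In this corpus "to" always follows "go", so that probability is 1, while "lobby" follows "the" in two of three cases.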

Modern decoders complete the decoding process with weighted finite-state transducers (WFST). The currently popular speech recognition toolkit Kaldi, for example, is based on WFST. Multiple optimal word sequences are saved in a word lattice, a convenient and efficient structure.

The speech matching process is illustrated with an example. The user says "hello" (你好) to the mobile terminal, and the system receives the speech waveform of "hello". Feature extraction converts the waveform into multi-dimensional (for example, 39-dimensional) speech vector features, that is, MFCC features. The acoustic model receives these features and, through multiple HMM models, obtains the corresponding sub-words (actually initials and finals) /n/ /i/ /h/ /ao/. By looking up the pronunciation dictionary, the phonemes are spliced into candidate words, such as 你 ("you") or 尼 ("nun") for /n/ /i/, and 好 ("good") or 号 ("number") for /h/ /ao/. The language model then applies grammar rules and decodes with the Viterbi algorithm to obtain the optimal sequence 你好 ("hello"), which is output as text to form the text data. This completes the speech recognition and matching process, and the matching result is the text data corresponding to the speech vector feature.
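The decoding step can be sketched as a small Viterbi search that picks, among homophone candidates for each syllable, the word sequence maximizing P(Y | w) P(w). All probabilities below are invented for illustration. The acoustic scores are set equal because homophones sound the same, so the bigram language model decides. The floor value for unseen bigrams is likewise an assumption.

```python
import math


def viterbi(candidates, acoustic, bigram, floor=1e-6):
    """Return the most probable word sequence.

    candidates: list of candidate word lists, one list per position.
    acoustic[(pos, word)]: likelihood P(Y_pos | word) from the acoustic model.
    bigram[(prev, word)]: language model probability P(word | prev).
    Scores are summed in the log domain to avoid numeric underflow.
    """
    # best[word] = (log score of the best path ending in word, that path)
    best = {"<s>": (0.0, [])}
    for pos, words in enumerate(candidates):
        step = {}
        for word in words:
            emit = math.log(acoustic[(pos, word)])
            score, path = max(
                ((prev_score + math.log(bigram.get((prev, word), floor)), prev_path)
                 for prev, (prev_score, prev_path) in best.items()),
                key=lambda item: item[0],
            )
            step[word] = (score + emit, path + [word])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


# Hypothetical candidates for the syllables "ni" and "hao".
candidates = [["你", "尼"], ["好", "号"]]
acoustic = {(0, "你"): 0.5, (0, "尼"): 0.5, (1, "好"): 0.5, (1, "号"): 0.5}
bigram = {
    ("<s>", "你"): 0.6, ("<s>", "尼"): 0.1,
    ("你", "好"): 0.7, ("你", "号"): 0.05,
    ("尼", "好"): 0.05, ("尼", "号"): 0.05,
}
best_sequence = viterbi(candidates, acoustic, bigram)
print("".join(best_sequence))
```

Production decoders search a WFST or word lattice rather than a flat candidate list, but the dynamic-programming idea is the same.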

In step S40, the server obtains the text data based on the HMM model. After analyzing the text keywords carried in the text data, the server matches each text keyword to the corresponding preset keyword in the preset instruction set and obtains the target control instruction corresponding to that preset keyword. The target control instruction is then triggered to complete the voice navigation request made by the user. In this embodiment, the server only needs to analyze the speech vector feature to trigger the corresponding target control instruction, which raises the level of automation of voice navigation.

S50. Based on the starting coordinate, execute the target control instruction, obtain the target instruction execution result corresponding to the executed target control instruction, send the execution result to the mobile terminal, and control the mobile terminal to display the execution result.

Specifically, based on the target control instruction and the current location of the mobile terminal (that is, the starting coordinate), the server triggers the action corresponding to the target control instruction and obtains the target instruction execution result of that action. For example, when the target control instruction is to obtain the current location, the server triggers the positioning action to obtain the fixed-point coordinate where the mobile terminal is currently located on the preset map. When the target control instruction is to obtain the target navigation route, the server triggers the navigation function to obtain the shortest navigation route from the starting coordinate to the end coordinate.
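The source does not say how the shortest navigation route is computed. As one possible approach, the sketch below runs Dijkstra's algorithm over a graph whose nodes are fixed-point coordinates and whose edge weights are walking distances. The graph layout, the distances and the function name `shortest_route` are hypothetical.

```python
import heapq


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm over a weighted graph of fixed-point coordinates.

    graph: {point: {neighbor: distance}}.
    Returns (total_distance, [points on the route]),
    or (inf, []) when the goal cannot be reached.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, point, path = heapq.heappop(queue)
        if point == goal:
            return dist, path
        if point in visited:
            continue
        visited.add(point)
        for neighbor, weight in graph.get(point, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical walkable connections between fixed points, distances in meters.
graph = {
    (0, 0): {(3, 0): 3.0, (0, 3): 3.0},
    (3, 0): {(0, 0): 3.0, (3, 3): 3.0},
    (0, 3): {(0, 0): 3.0, (3, 3): 3.0, (0, 6): 3.0},
    (3, 3): {(3, 0): 3.0, (0, 3): 3.0},
    (0, 6): {(0, 3): 3.0},
}
distance, route = shortest_route(graph, (0, 0), (0, 6))
print(distance, route)
```

Here the starting coordinate from step S10 would be `start` and the end point parsed from the voice request would be `goal`.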

In step S50, the server sends the execution result of the target control instruction to the mobile terminal, so that the mobile terminal interface displays the execution result of the voice navigation request and the user can view it directly.

In the voice navigation method of steps S10 to S50, the server obtains the starting coordinate of the mobile terminal from the measured WIFI signal cluster sent by the mobile terminal, and obtains the target control instruction from the voice data sent by the mobile terminal. The corresponding action is then executed based on the starting coordinate and the target control instruction. A user can thus achieve accurate indoor positioning and navigation simply by entering a voice navigation request on the mobile terminal. The method is no longer limited by the precision of the mobile terminal's own positioning equipment and requires no manual instruction input from the user, which improves the efficiency of indoor positioning and navigation.

In one embodiment, as shown in Fig. 3, step S10, that is, obtaining the measured WIFI signal cluster of the mobile terminal and, based on the measured WIFI signal cluster, obtaining the fixed-point coordinate corresponding to the mobile terminal in the preset navigation map as the starting coordinate, specifically includes the following steps:

S11. Obtain the WIFI signal strength between the mobile terminal at its current location and at least three specified hotspots.

Here, a wireless hotspot (Wireless Access Point, hereinafter AP) is a terminal that provides wireless local area network (WLAN) access to Internet services in public places. The mobile terminal itself carries a wireless module and can send and receive wireless information to and from the hotspots set up in the surrounding environment.

The idea behind positioning by WIFI signal strength in this embodiment is that, at different indoor positions, the mobile terminal has different signal strengths with respect to the different indoor hotspots. The indoor position of the mobile terminal is determined from the signal strengths collected for the different hotspots and from the indoor layout of the hotspots. The more hotspots are deployed indoors, the more WIFI signal strength records the server obtains through the mobile terminal, and the more accurately the server can position the mobile terminal. Tests show that deploying five or six hotspots indoors positions the mobile terminal fairly accurately. In general, an indoor environment that is positioned by WIFI signal strength needs one AP every 3 meters and at least three APs indoors. Taking three specified indoor APs as an example, the mobile terminal has three groups of WIFI signals at its current location, and these three groups form the measured WIFI signal cluster of the current location. For ease of description, this embodiment deploys three hotspots indoors so that the server can locate the position of the mobile terminal.

Each AP carries a unique factory serial number written by the manufacturer, that is, its MAC (Medium Access Control) address, which distinguishes different APs. In this embodiment, the MAC address of each AP can be used as the identifier of the specified hotspot.

WIFI signal strength is also called the Received Signal Strength Indicator (hereinafter RSSI). In a CDMA network, RSSI ranges from -110 dBm to -20 dBm. In general, RSSI < -95 dBm indicates that current network coverage is very poor, with almost no signal; -95 dBm < RSSI < -90 dBm indicates that coverage is very weak; RSSI > -90 dBm indicates that coverage is good. A threshold of -90 dBm is therefore typically used for a rough judgment of the current coverage level.
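The thresholds above can be written as a small helper. The function name `coverage_level` and the returned labels are hypothetical. The source leaves the exact boundary values -95 dBm and -90 dBm unassigned, so this sketch assumes each boundary belongs to the better category.

```python
def coverage_level(rssi_dbm):
    """Classify network coverage from an RSSI value in dBm.

    Assumption: a value exactly on a boundary falls into the better category.
    """
    if rssi_dbm < -95:
        return "very poor"
    if rssi_dbm < -90:
        return "very weak"
    return "good"


for value in (-100, -92, -60):
    print(value, coverage_level(value))
```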

Specifically, this embodiment uses the "location fingerprint" approach to associate the fixed-point coordinate of each preset indoor fixed point with a "location fingerprint" of RSSI values, so that each fixed-point coordinate corresponds to a unique fingerprint. The fingerprint can be one-dimensional (the RSSI of one AP) or multi-dimensional (the RSSI of multiple APs). This embodiment deploys at least two APs indoors, so the "fingerprint" of each fixed-point coordinate is multi-dimensional.

In step S11, the server records the WIFI signal strength between the mobile terminal and each of at least three specified indoor hotspots, so that in later steps the server can compare these WIFI signal strengths with the "location fingerprints" to confirm the fixed-point coordinate where the mobile terminal is located.

S12. Based on the WIFI signal strength, obtain the measured WIFI signal cluster of the mobile terminal at the current location.

Specifically, when the server measures the WIFI signal strength between the mobile terminal and each hotspot, it can take several measurements and record the sample mean of all measured WIFI signal strengths, so that the signal strength is recorded more accurately. The sample mean of the signal strength for each hotspot can be computed with the following formula:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x(i)$$

Here, x(i) is the signal strength value obtained in the i-th measurement when the mobile terminal measures the signal strength of the specified hotspot the specified number of times, and n is the specified number of collections.

The process of obtaining a measured WIFI signal cluster is illustrated below, as shown in Fig. 4:

1. At the fixed-point coordinate (0,0) of a preset fixed point, the signal acquisition terminal measures, the preset number of times (which may be set to 10), the signal strength with respect to each of the specified hotspots MAC1, MAC2 and MAC3, and records the values:

The measured signal strengths collected between fixed-point coordinate (0,0) and MAC1 are, in order:

X1, X2, X3, X4 ... X9 and X10. Substituting X1 to X10 into the formula (1/n) ∑ x(i), where x(i) = {X1, X2, X3, X4 ... X10} and n = 10, gives the sample mean X of the measured signal strength between fixed-point coordinate (0,0) and MAC1.

Similarly, the sample mean of the measured signal strength between fixed-point coordinate (0,0) and MAC2 is Y, and the sample mean of the measured signal strength between fixed-point coordinate (0,0) and MAC3 is Z.

2. Combining the sample means of the measured signal strength between fixed-point coordinate (0,0) and each specified hotspot gives (X, Y, Z) as the measured WIFI signal cluster corresponding to fixed-point coordinate (0,0).

In this embodiment, the server obtains the measured WIFI signal cluster that the signal acquisition terminal collects at each fixed-point coordinate over the preset number of collections. This provides the technical basis for the later comparison with the training WIFI signal clusters to determine the indoor location where the signal was collected.

S13. Calculate the Euclidean distance between the measured WIFI signal cluster and each decision tree in the preset random forest, obtain the target decision tree with the shortest Euclidean distance, and take the fixed-point coordinate corresponding to the target decision tree in the preset navigation map as the starting coordinate of the mobile terminal.

Here, a decision tree is one of the class clusters that make up the preset random forest. Each decision tree corresponds to one fixed-point coordinate and to the training WIFI signal cluster of that fixed-point coordinate.

Specifically, the server compares the measured WIFI signal cluster that the mobile terminal obtains at its current location with the "location fingerprint" recorded by each tree in the preset random forest, calculates the Euclidean distance between the measured WIFI signal cluster and each "location fingerprint", and sets the fixed-point coordinate of the target decision tree with the smallest Euclidean distance as the starting coordinate of the mobile terminal. The Euclidean distance comes from the formula for the distance between two points x1 and x2 in N-dimensional Euclidean space:

$$d(x_1, x_2) = \sqrt{\sum_{i=1}^{N} \left(X_{1i} - X_{2i}\right)^2}$$

In this formula, i indexes the hotspots and N is the number of hotspots, $X_{1i}$ is the measured WIFI signal strength between the mobile terminal at its current location and the i-th hotspot, and $X_{2i}$ is the training WIFI signal strength for the i-th hotspot.

The following example illustrates how the Euclidean distance between the measured WIFI signal cluster and each decision tree in the preset random forest is calculated:

The measured WIFI signal cluster of the mobile terminal at the current location is (X11, X12, X13). This measured cluster is compared with the training WIFI signal cluster of each decision tree, where the training WIFI signal cluster corresponding to fixed-point coordinate (0,0) is (X21, X22, X23).

Substituting the measured WIFI signal cluster (X11, X12, X13) and the training WIFI signal cluster (X21, X22, X23) into the formula gives the Euclidean distance d1.

The Euclidean distances d2...dx between the measured WIFI signal cluster and every other decision tree are obtained in the same way. If d1 is the smallest of all the Euclidean distances, the decision tree corresponding to d1, whose training WIFI signal cluster is (X21, X22, X23), is determined to be the target decision tree.
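The lookup in step S13 can be sketched as a nearest-fingerprint search. This is a minimal illustration, not the patented implementation: the function names, the dictionary layout and the signal strength values (in dBm) are assumptions made for the example.

```python
import math

def euclidean_distance(measured, trained):
    """Euclidean distance between two WIFI signal clusters of equal length."""
    if len(measured) != len(trained):
        raise ValueError("clusters must cover the same hotspots")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(measured, trained)))

def locate(measured, fingerprints):
    """Return the fixed-point coordinate whose training cluster is closest."""
    return min(
        fingerprints,
        key=lambda coord: euclidean_distance(measured, fingerprints[coord]),
    )

# Hypothetical training clusters for three hotspots (MAC1, MAC2, MAC3).
fingerprints = {
    (0, 0): (-40.0, -62.0, -71.0),
    (0, 5): (-55.0, -48.0, -66.0),
    (5, 0): (-70.0, -60.0, -45.0),
}

# Hypothetical measured cluster sent by the mobile terminal.
start = locate((-42.0, -60.0, -70.0), fingerprints)
```

With these values the measured cluster lies at distance 3 from the fingerprint of (0, 0), which is smaller than the distance to either other fingerprint, so (0, 0) is taken as the starting point coordinate.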

The way of locating the starting point coordinate provided by step S13 requires no additional positioning equipment. The server can obtain the starting point coordinate from the signal strength between the mobile terminal and each specified wireless hotspot, which saves positioning cost, and the positioning method is stable and reliable.

In steps S11 to S13, the server records the WIFI signal strengths between the mobile terminal and at least three specified indoor wireless hotspots, so that it can subsequently compare those WIFI signal strengths against the "location fingerprints" and confirm the fixed-point coordinate where the mobile terminal is located. The server can obtain the starting point coordinate from the signal strength between the mobile terminal and each specified wireless hotspot, which saves positioning cost, and the positioning method is stable and reliable.

In one embodiment, the preset navigation map includes at least three preset fixed points, and each preset fixed point corresponds to one fixed-point coordinate. As shown in Figure 5, before step S10, that is, before the step of obtaining the measured WIFI signal cluster of the mobile terminal, the voice navigation method further includes the following steps:

S101. At each preset fixed point, obtain over the preset number of acquisitions the training WIFI signal cluster formed by the WIFI signal strengths between the signal acquisition terminal and each specified wireless hotspot, and obtain the fixed-point coordinate corresponding to the preset fixed point.

Here, the training WIFI signal cluster is the multidimensional standard signal cluster formed, during the training stage in which standard signal strengths are obtained, by the tested WIFI signal strengths between the mobile terminal at each fixed-point coordinate and each specified wireless hotspot.

Specifically, when measuring the standard signal strength between the mobile terminal and each wireless hotspot, the measurement can be repeated several times and the sample mean of all measured signal strengths recorded, so that the standard signal strength is recorded more accurately. The sample mean of the standard signal strength for each wireless hotspot can be computed by the following formula:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x(i)$$

Here, x(i) is the standard signal strength value obtained in the i-th measurement when the mobile terminal measures the standard signal strength of a specified wireless hotspot the specified number of times, and n is the specified number of acquisitions.

The process of building a training WIFI signal cluster is illustrated as follows, as shown in Figure 4:

1. The standard signal strengths collected between fixed-point coordinate (0,0) and MAC1 are, in order, A1, A2, A3, A4 ... A9 and A10. Substituting A1 to A10 into the formula 1/n*∑x(i), where x(i) takes the values {A1, A2, A3, A4 ... A10} and n=10, gives A as the sample mean of the standard signal strength between fixed-point coordinate (0,0) and MAC1. Similarly, the sample mean of the standard signal strength between fixed-point coordinate (0,0) and MAC2 is B, and that between fixed-point coordinate (0,0) and MAC3 is C.

2. Combine the sample means of the standard signal strength between fixed-point coordinate (0,0) and each specified wireless hotspot, so that the training WIFI signal cluster corresponding to fixed-point coordinate (0,0) is (A, B, C).

In this embodiment, the server obtains, over the preset number of acquisitions, the training WIFI signal cluster of the signal acquisition terminal at each fixed-point coordinate. This lays the technical foundation for later comparing the measured WIFI signal clusters that the client sends from different fixed-point coordinates against the training WIFI signal clusters.

S102. Store each fixed-point coordinate in association with its training WIFI signal cluster, to form the decision tree corresponding to that fixed-point coordinate.

In step S102, the server stores each fixed-point coordinate in association with the training WIFI signal cluster corresponding to it, forming the decision tree of that fixed-point coordinate. This allows the server to subsequently locate the client based on the training WIFI signal clusters, simply and quickly.
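Steps S101 and S102 can be sketched together as averaging the repeated readings per hotspot and storing the result under the fixed-point coordinate. This is a minimal sketch: the function name, the dictionary used as the store and the readings (in dBm, four per hotspot rather than ten) are assumptions made for illustration.

```python
from statistics import fmean

def training_cluster(samples_by_hotspot):
    """Average the repeated readings for each hotspot into one training cluster."""
    return tuple(fmean(readings) for readings in samples_by_hotspot)

# Hypothetical repeated readings collected at fixed-point coordinate (0, 0).
samples = [
    [-40.0, -42.0, -41.0, -41.0],  # MAC1
    [-60.0, -62.0, -61.0, -61.0],  # MAC2
    [-70.0, -72.0, -70.0, -72.0],  # MAC3
]

cluster = training_cluster(samples)  # plays the role of (A, B, C)

# Associated storage: the fixed-point coordinate is the key, the cluster the value.
decision_trees = {(0, 0): cluster}
```

Keying the store by coordinate means the lookup in the positioning step only has to iterate over the stored clusters and return the matching key.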

In steps S101 to S102, the server obtains, over the preset number of acquisitions, the training WIFI signal cluster of the signal acquisition terminal at each fixed-point coordinate, which lays the technical foundation for later comparing the measured WIFI signal clusters that the client sends from different fixed-point coordinates against the training WIFI signal clusters. The server stores each fixed-point coordinate in association with its corresponding training WIFI signal cluster to form the decision tree of that fixed-point coordinate, which allows the server to subsequently locate the client based on the decision trees, simply and quickly.

In one embodiment, as shown in Figure 6, step S30, that is, pre-processing the voice data to be recognized to obtain the speech vector features corresponding to the voice data to be recognized, specifically includes the following steps:

S31. Pre-process the voice data to be recognized to obtain pre-processed voice data.

In a specific embodiment, step S31, pre-processing the voice data to be recognized to obtain pre-processed voice data, specifically includes the following steps:

S311: Apply pre-emphasis to the voice data to be recognized. The calculation formula for pre-emphasis is $s'_n = s_n - a \cdot s_{n-1}$, where $s_n$ is the signal amplitude in the time domain, $s_{n-1}$ is the signal amplitude at the moment immediately preceding $s_n$, $s'_n$ is the signal amplitude in the time domain after pre-emphasis, and a is the pre-emphasis coefficient, with a value range of 0.9 < a < 1.0.

Here, pre-emphasis is a signal processing technique that compensates the high-frequency components of the input signal at the transmitting end. As the signal rate increases, the signal is heavily degraded during transmission, so the degraded signal must be compensated for the receiving end to obtain a reasonably good waveform. The idea of pre-emphasis is to boost the high-frequency components of the signal at the transmitting end of the transmission line, compensating for the excessive attenuation of those components during transmission so that the receiving end obtains a better waveform. Pre-emphasis has no effect on noise, so it can effectively improve the output signal-to-noise ratio.

In this embodiment, pre-emphasis is applied to the voice data to be recognized using the formula $s'_n = s_n - a \cdot s_{n-1}$, where $s_n$ is the signal amplitude in the time domain, that is, the amplitude of the speech expressed by the voice data in the time domain, $s_{n-1}$ is the signal amplitude at the moment immediately preceding $s_n$, $s'_n$ is the signal amplitude in the time domain after pre-emphasis, and a is the pre-emphasis coefficient with a value range of 0.9 < a < 1.0; a value of 0.97 gives a relatively good pre-emphasis effect here. This pre-emphasis removes the interference caused by the vocal cords, lips and so on during phonation, effectively compensates the suppressed high-frequency part of the voice data to be recognized, highlights its high-frequency formants and strengthens its signal amplitude, which helps in extracting the speech vector features.
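The pre-emphasis formula can be sketched directly in code. The function name and the choice to pass the first sample through unchanged (it has no predecessor) are assumptions of this sketch; the coefficient 0.97 is the value mentioned in the text.

```python
def pre_emphasis(signal, a=0.97):
    """Apply s'_n = s_n - a * s_(n-1); the first sample is kept unchanged."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - a * signal[n - 1] for n in range(1, len(signal))
    ]

# A constant (purely low-frequency) signal is attenuated to about 0.03
# after the first sample.
emphasized = pre_emphasis([1.0, 1.0, 1.0, 1.0])
```

The constant input shows the high-pass character of the filter: slowly varying content is suppressed, so rapidly varying (high-frequency) content becomes relatively more prominent.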

S312: Divide the pre-emphasized voice data to be recognized into frames.

Specifically, after pre-emphasis the voice data to be recognized should also be divided into frames. Framing is the speech processing technique of cutting a whole speech signal into several segments. Each frame is 10-30 ms long, and roughly 1/2 of the frame length is used as the frame shift. The frame shift is the offset between the starts of two adjacent frames; with a shift of about half the frame length, adjacent frames overlap, which avoids excessive change between adjacent frames. Framing divides the voice data to be recognized into several segments of voice data, a finer subdivision that makes the speech vector features easier to extract.
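Framing with a half-frame shift can be sketched as follows. The function name, the 16 kHz sampling rate implied by the sample counts and the decision to drop a trailing partial frame are assumptions of this sketch.

```python
def split_frames(signal, frame_len, frame_shift):
    """Cut a signal into overlapping frames; a trailing partial frame is dropped."""
    if frame_len <= 0 or frame_shift <= 0:
        raise ValueError("frame_len and frame_shift must be positive")
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, frame_shift)
    ]

# Assuming 16 kHz audio: a 25 ms frame is 400 samples, half of that is the shift.
frames = split_frames(list(range(1000)), frame_len=400, frame_shift=200)
```

Here each frame shares its second half with the first half of the next frame, so every sample other than those at the edges of the signal appears in two frames.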

S313: Apply windowing to the framed voice data to be recognized to obtain the pre-processed voice data. The calculation formula for windowing is

$$s'_n = \left(0.54 - 0.46\cos\frac{2\pi n}{N-1}\right) \cdot s_n$$

where N is the window length, n is the time, $s_n$ is the signal amplitude in the time domain, and $s'_n$ is the signal amplitude in the time domain after windowing.

Specifically, after the voice data to be recognized has been divided into frames, discontinuities appear at the start and end of every frame, so the more frames there are, the larger the error relative to the original voice data to be recognized. Windowing solves this problem: it makes the framed voice data to be recognized continuous and lets each frame exhibit the characteristics of a periodic function. Windowing means processing the voice data to be recognized with a window function. The Hamming window can be chosen as the window function, in which case the windowing formula is

$$s'_n = \left(0.54 - 0.46\cos\frac{2\pi n}{N-1}\right) \cdot s_n$$

where N is the Hamming window length, n is the time, $s_n$ is the signal amplitude in the time domain, and $s'_n$ is the signal amplitude in the time domain after windowing. Windowing the voice data to be recognized yields the pre-processed voice data and makes the framed signal continuous in the time domain, which helps in extracting the speech vector features of the voice data to be recognized.
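A Hamming window applied to one frame can be sketched as below, using the standard Hamming coefficients 0.54 and 0.46. The function name and the handling of frames shorter than two samples are assumptions of this sketch.

```python
import math

def hamming_window(frame):
    """Multiply each sample by the Hamming weight 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    n_len = len(frame)
    if n_len < 2:
        return list(frame)  # the weight is undefined for N < 2; pass through
    return [
        (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1))) * s
        for n, s in enumerate(frame)
    ]

# On a constant frame the output is the window itself:
# small at both ends (0.08) and 1.0 in the middle.
windowed = hamming_window([1.0] * 5)
```

Tapering both ends of the frame toward a small value is what reduces the discontinuity between the end of one frame and the start of the next.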

Steps S311-S313 above pre-process the voice data to be recognized. They provide the basis for extracting its speech vector features and make the extracted speech vector features more representative of the voice data to be recognized.

S32: Apply the fast Fourier transform to the pre-processed voice data to obtain the spectrum of the voice data to be recognized, and obtain the power spectrum of the voice data to be recognized from the spectrum.

Here, the fast Fourier transform (FFT) is the general name for efficient, fast methods of computing the discrete Fourier transform on a computer. Using such an algorithm greatly reduces the number of multiplications a computer needs to compute the discrete Fourier transform; the more sampling points are transformed, the more significant the saving in computation.

Specifically, the fast Fourier transform is applied to the pre-processed voice data, converting it from signal amplitudes in the time domain to signal amplitudes in the frequency domain (the spectrum). The formula for calculating the spectrum is

$$s(k) = \sum_{n=1}^{N} s(n)\, e^{-2\pi i k n / N}, \quad 1 \le k \le N$$

where N is the frame size, s(k) is the signal amplitude in the frequency domain, s(n) is the signal amplitude in the time domain, n is the time, and i is the imaginary unit. Once the spectrum of the pre-processed voice data has been obtained, its power spectrum can be derived directly from the spectrum; the power spectrum of the pre-processed voice data is referred to below as the power spectrum of the voice data to be recognized. The formula for calculating the power spectrum of the voice data to be recognized is

$$P(k) = \frac{|s(k)|^2}{N}, \quad 1 \le k \le N$$

where N is the frame size and s(k) is the signal amplitude in the frequency domain. Converting the pre-processed voice data from time-domain amplitudes to frequency-domain amplitudes, and then obtaining the power spectrum of the voice data to be recognized from those frequency-domain amplitudes, provides an important technical basis for extracting the speech vector features from the power spectrum.
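The spectrum and power spectrum of a single frame can be sketched with a small recursive radix-2 FFT. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the frame length must be a power of two, and indices run from 0 to N-1 rather than from 1 to N as in the text.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a non-zero power of two."""
    n = len(x)
    if n == 0:
        raise ValueError("empty frame")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("frame length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def power_spectrum(frame):
    """Power of each frequency bin: |s(k)|^2 / N."""
    n = len(frame)
    return [abs(c) ** 2 / n for c in fft(frame)]

# One cycle of a cosine sampled four times: the energy lands in bins 1 and 3.
spectrum = power_spectrum([1.0, 0.0, -1.0, 0.0])
```

Splitting the frame into even and odd samples at each level is what reduces the multiplication count from the order of N squared for a direct discrete Fourier transform to the order of N log N.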

S33: Process the power spectrum of the voice data to be recognized with a Mel-scale filter bank to obtain the Mel power spectrum of the voice data to be recognized.

Here, processing the power spectrum of the voice data to be recognized with a Mel-scale filter bank is a Mel-frequency analysis of the power spectrum, and Mel-frequency analysis is an analysis based on human auditory perception. Observation shows that the human ear behaves like a filter bank that attends only to certain frequency components (human hearing is nonlinear in frequency); that is, the range of sound frequencies the ear responds to is limited. These filters are not uniformly distributed along the frequency axis: there are many filters in the low-frequency region, where they are densely distributed, but in the high-frequency region the filters become fewer and are sparsely distributed. It follows that a Mel-scale filter bank has high resolution in the low-frequency part, which matches the auditory characteristics of the human ear; this is the physical meaning of the Mel scale.

In this embodiment, the power spectrum of the voice data to be recognized is processed with a Mel-scale filter bank to obtain its Mel power spectrum. The Mel-scale filter bank divides the frequency-domain signal into bands so that each band finally corresponds to one value; if the number of filters is 22, 22 energy values corresponding to the Mel power spectrum of the voice data to be recognized are obtained. Applying Mel-frequency analysis to the power spectrum retains the frequency components most closely related to the characteristics of the human ear, so the resulting Mel power spectrum reflects the features of the voice data to be recognized well.
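A Mel-scale filter bank can be sketched with triangular filters whose centres are evenly spaced on the Mel axis. Several things here are assumptions not stated in the text: the Hz-to-Mel conversion 2595*log10(1 + f/700) (a commonly used convention), the triangular filter shape, the use of the one-sided spectrum (bins 0 to fft_size/2), the 16 kHz sampling rate, and all function names.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank(num_filters, fft_size, sample_rate):
    """Triangular filters over bins 0..fft_size//2, evenly spaced in Mel."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((fft_size + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for f in range(1, num_filters + 1):
        left, centre, right = bins[f - 1], bins[f], bins[f + 1]
        row = [0.0] * (fft_size // 2 + 1)
        for k in range(left, centre):    # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):   # falling edge, peak of 1.0 at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank

def mel_power_spectrum(power, bank):
    """One energy value per filter: the filter-weighted sum of the power bins."""
    return [sum(w * p for w, p in zip(row, power)) for row in bank]

bank = mel_filter_bank(num_filters=22, fft_size=512, sample_rate=16000)
mel_energies = mel_power_spectrum([1.0] * 257, bank)  # flat test spectrum
```

Because equal steps in Mel correspond to growing steps in Hz, the low-frequency filters are narrow and closely packed while the high-frequency filters are wide, which mirrors the dense-then-sparse distribution described above.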

S34: Perform cepstral analysis on the Mel power spectrum to obtain the MFCC features of the voice data to be recognized.

Here, the cepstrum is the inverse Fourier transform of the logarithm of a signal's Fourier transform spectrum. Since a general Fourier spectrum is a complex spectrum, the cepstrum is also called the complex cepstrum.

Specifically, cepstral analysis is performed on the Mel power spectrum, and the MFCC features of the voice data to be recognized are obtained by analyzing the cepstrum result. The features contained in the Mel power spectrum of the voice data to be recognized have too high a dimension to be used directly; cepstral analysis on the Mel power spectrum converts them into features that are easy to use (MFCC feature vectors for training or recognition). As speech vector features, the MFCC features serve as coefficients that distinguish different speech: they reflect the differences between utterances and can be used to identify and distinguish the voice data to be recognized.

In a specific embodiment, step S34, performing cepstral analysis on the Mel power spectrum to obtain the MFCC features of the voice data to be recognized, includes the following steps:

S341: Take the logarithm of the Mel power spectrum to obtain the Mel power spectrum to be transformed.

Specifically, following the definition of the cepstrum, the logarithm (log) of the Mel power spectrum is taken to obtain the Mel power spectrum to be transformed, m.

S342: Apply the discrete cosine transform to the Mel power spectrum to be transformed to obtain the MFCC features of the voice data to be recognized.

Specifically, the discrete cosine transform (DCT) is applied to the Mel power spectrum to be transformed, m, to obtain the MFCC features of the corresponding voice data to be recognized. The 2nd to 13th coefficients are generally taken as the speech vector features, which reflect the differences between voice data. The formula for the discrete cosine transform of the Mel power spectrum to be transformed, m, is

$$C(i) = \sum_{j=0}^{N-1} m_j \cos\left[\frac{\pi i}{N}\left(j + \frac{1}{2}\right)\right], \quad i = 0, 1, 2, \ldots, N-1$$

where N is the frame length, m is the Mel power spectrum to be transformed, and j is the independent variable of the Mel power spectrum to be transformed. Because the Mel filters overlap, the energy values obtained with the Mel-scale filters are correlated. The discrete cosine transform compresses and abstracts the Mel power spectrum to be transformed, m, reducing its dimension and yielding the speech vector features. Compared with the Fourier transform, the result of the discrete cosine transform has no imaginary part, which is a clear computational advantage.
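Steps S341 and S342 can be sketched as a logarithm followed by an unnormalized DCT-II, keeping the 2nd to 13th coefficients. The function name, the small floor applied before the logarithm and the use of the number of Mel energy values as the transform length are assumptions of this sketch.

```python
import math

def mfcc_from_mel(mel_energies, first=1, last=13):
    """Log, then DCT-II; returns the 2nd..13th coefficients by default."""
    n = len(mel_energies)
    # Floor the energies so that log() never receives zero (an assumption).
    log_mel = [math.log(max(e, 1e-10)) for e in mel_energies]
    cepstrum = [
        sum(log_mel[j] * math.cos(math.pi * i * (j + 0.5) / n) for j in range(n))
        for i in range(n)
    ]
    return cepstrum[first:last]

# A perfectly flat Mel spectrum has no spectral shape, so every coefficient
# after the first (which carries overall energy) is zero.
features = mfcc_from_mel([math.e] * 22)
```

Dropping the first coefficient removes the overall loudness term, and truncating after the 13th discards fine spectral detail, which is how 22 correlated energies become 12 compact features.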

Steps S31-S34 perform feature extraction on the voice data to be recognized based on the training technique. The speech vector features finally obtained represent the voice data to be recognized well, so that the speech vector features obtained by training give more accurate matching results in subsequent voice matching.

It should be noted that although the features extracted above are MFCC features, the speech vector features should not be understood as limited to MFCC features alone. Any speech feature obtained with the training technique can serve as a speech vector feature for recognition and model training, as long as it effectively reflects the characteristics of the voice data. In this embodiment, the voice data to be recognized is pre-processed to obtain the corresponding pre-processed voice data. Pre-processing allows the speech vector features of the voice data to be recognized to be extracted better, so that the extracted features are more representative of that voice data and can be used for speech recognition.

In one embodiment, as shown in Figure 7, step S40, that is, matching the preset instruction set based on the text data to obtain the target control instruction, specifically includes the following steps:

S41. Segment the text data into words according to the preset word segmentation rule, and tag the segmentation result with parts of speech.

Here, the preset word segmentation rule can be the longest-word-first principle. For a short sentence T1 that needs to be segmented, start from the first character A and find in the pre-stored dictionary the longest word X1 that begins with A; then remove X1 from T1, leaving T2, and apply the same segmentation principle to T2. The result after segmentation is "X1/X2/...". For example, for the text data "I need to go to the mall", the result after segmentation is "I", "need", "go", "mall".
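The longest-word-first rule can be sketched as forward maximum matching. The function name, the dictionary contents, the maximum word length and the fallback of emitting a single character when nothing matches are assumptions of this sketch; the input is an assumed Chinese rendering of the example sentence, since the rule operates on text without spaces.

```python
def segment(text, dictionary, max_word_len=4):
    """Forward maximum matching: at each position take the longest dictionary word."""
    words = []
    pos = 0
    while pos < len(text):
        longest = min(max_word_len, len(text) - pos)
        for size in range(longest, 0, -1):
            candidate = text[pos:pos + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                pos += size
                break
    return words

# Hypothetical dictionary; "我需要去商场" is assumed to correspond to
# "I need to go to the mall".
dictionary = {"我", "需要", "去", "商场", "商"}
tokens = segment("我需要去商场", dictionary)
```

Note that "商场" (mall) is chosen over the shorter dictionary entry "商" because longer candidates are tried first.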

Further, the server also needs to tag the segmentation result with parts of speech. For example, the part-of-speech tagging of the segmentation result for the text data "I need to go to the mall" can be: "I/pronoun", "need/verb", "go/verb" and "mall/noun".

Preferably, the server tags the segmentation result with parts of speech as follows. It determines and labels the part of speech of each segmented word either according to the mapping between words and parts of speech in a general-purpose dictionary (for example, in the general-purpose dictionary "mall" and "amusement park" are nouns, and "go" and "to" are verbs), or according to a preset mapping between words and parts of speech (for example, in the preset mapping "mall" and "amusement park" are business nouns, and "go" and "to" are business verbs). Part-of-speech tagging can use the general-purpose dictionary mapping alone, the preset mapping alone, or both mappings together; when both are used, the preset mapping takes priority over the general-purpose dictionary mapping.
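The priority between the two mappings can be sketched as a lookup that consults the preset mapping first and falls back to the general-purpose dictionary. The function name, the lexicon contents and the "unknown" label for unmapped words are assumptions of this sketch.

```python
def tag_parts_of_speech(tokens, general_lexicon, preset_lexicon):
    """Tag each token; the preset mapping overrides the general dictionary."""
    tagged = []
    for token in tokens:
        if token in preset_lexicon:
            pos = preset_lexicon[token]
        else:
            pos = general_lexicon.get(token, "unknown")
        tagged.append((token, pos))
    return tagged

# Hypothetical lexicons based on the examples in the text.
general_lexicon = {"I": "pronoun", "need": "verb", "go": "verb", "mall": "noun"}
preset_lexicon = {"go": "business verb", "mall": "business noun"}

tagged = tag_parts_of_speech(
    ["I", "need", "go", "mall"], general_lexicon, preset_lexicon
)
```

Passing an empty preset lexicon reproduces tagging with the general-purpose dictionary alone, so the same function covers all three usage modes described above.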

In step S41, the server segments the text data according to the preset word segmentation rule and tags each segmented word with its part of speech, which makes it possible to subsequently match text keywords based on each segmented word and its part of speech, and in turn to match the target control instruction.

S42. According to the order and part of speech of each segmented word corresponding to the text data, build the segmented words into a preset-structure word segmentation tree.

Here, as shown in Figure 8, the preset-structure word segmentation tree includes multiple levels of nodes. The first-level node is the text data itself and the second-level nodes are word segmentation phrases. Every node level after the second is obtained by dividing the phrases of the level above according to part of speech; that is, each node after the second level is a next-level segmented word or phrase of its parent node.

Specifically, the process of building the segmented words corresponding to the text data into the preset-structure word segmentation tree, according to their order and parts of speech, includes:

A1. Among the segmented words corresponding to the text data, find the target words of each preset part of speech (for example, nouns and verbs);

A2. According to the order of the target words in the text data, determine the word segmentation phrase corresponding to each second-level node (preferably, A2 includes: taking the words that precede the next target word as the phrase of the previous target word, and taking the last target word together with the words after it as the last phrase);

A3. If a phrase cannot be segmented further, determine that phrase to be the last-level node of its node branch;

A4. If a phrase can be segmented further, find the target words of each preset part of speech in that phrase and, according to the order of those target words, determine the words or phrases corresponding to the next-level nodes of that phrase;

A5. Repeat steps A3 and A4 until the word corresponding to the last-level node of every node branch has been determined. For example, for "I go to the playground to play soccer", the constructed preset-structure word segmentation tree is as shown in Figure 8.

In step S42, the server builds the segmented words corresponding to the text data into the preset-structure word segmentation tree, which makes it possible to subsequently match the target control instruction based on that tree.

S43. Parse the text keywords corresponding to the text data based on the preset-structure word segmentation tree, and match the preset instruction set based on the text keywords to obtain the target control instruction.

Specifically, after the preset-structure word segmentation tree of the text data has been built, the server can use it to calculate the distance between each word of the first preset part of speech (for example, a verb) and each word of the second preset part of speech (for example, a noun); the distance is the number of nodes separating the two words. For each word of the first preset part of speech, the server finds the nearest word of the second preset part of speech and combines the two, in the order in which they appear in the text data, into a text keyword.

Further, because the preset keywords in the preset instruction set are generally verbs that trigger some action, the server can extract the verb from a text keyword and match it against the preset keywords to obtain the target verb. The server takes the noun that follows the verb in the text keyword as the action object of the target verb, combines the target verb and the action object into the target control instruction, and triggers execution of the target control instruction.
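The pairing of a trigger verb with its action object can be sketched as below. This is a deliberate simplification: it measures distance by position in the flat tagged word list rather than by nodes in the word segmentation tree, and the function name, the instruction label `NAVIGATE_TO` and the preset keyword table are hypothetical.

```python
def match_instruction(tagged, preset_keywords):
    """Pair the first matching trigger verb with the nearest noun that follows it."""
    nouns = [(i, word) for i, (word, pos) in enumerate(tagged) if pos == "noun"]
    for i, (word, pos) in enumerate(tagged):
        if pos != "verb" or word not in preset_keywords:
            continue
        following = [(j, noun) for j, noun in nouns if j > i]
        if following:
            _, action_object = min(following, key=lambda item: item[0] - i)
            return preset_keywords[word], action_object
    return None  # no trigger verb with an action object was found

# Tagged result for "I need to go to the mall", as in the example above.
tagged = [("I", "pronoun"), ("need", "verb"), ("go", "verb"), ("mall", "noun")]

# Hypothetical preset instruction set: trigger verb -> instruction name.
preset_keywords = {"go": "NAVIGATE_TO"}

instruction = match_instruction(tagged, preset_keywords)
```

"need" is skipped because it is not a preset keyword, so the result pairs "go" with "mall". A tree-based distance would matter for longer sentences with several verbs and nouns, where flat positions can pair a verb with a noun that belongs to a different phrase.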

In step S43, the server parses the text keywords corresponding to the text data based on the preset-structure word segmentation tree, matches the preset instruction set based on the text keywords to obtain the target control instruction, and triggers the target control instruction. The matching process is simple and fast and saves the user the time of triggering the target control instruction manually.

In steps S41 to S43, the server segments the text data according to the preset word segmentation rule and tags each segmented word with its part of speech, which makes it possible to subsequently match text keywords based on each segmented word and its part of speech, and in turn to match the target control instruction. The server parses the text keywords corresponding to the text data based on the preset-structure word segmentation tree, matches the preset instruction set based on the text keywords to obtain the target control instruction, and triggers the target control instruction. The matching process is simple and fast and saves the user the time of triggering the target control instruction manually.

In one embodiment, as shown in Figure 9, after step S50, that is, after the step of controlling the mobile terminal to display the execution result, the voice navigation method further includes the following steps:

S501. If the execution result is the generation of a target navigation route, obtain the current positioning coordinate of the mobile terminal at every preset time interval.

Here, the target navigation route is the recommended shortest travel route from the starting point coordinate to the end point coordinate generated for the mobile terminal, with the position from which the mobile terminal initiated the voice navigation request as the starting point coordinate and the destination given by voice input as the end point coordinate.

The preset time is the period at which it is checked whether the current position of the mobile terminal deviates from the target navigation route; in this embodiment it may be set to 60 seconds. The current positioning coordinate is the fixed-point coordinate corresponding to the position where the mobile terminal is currently located on the preset navigation map.

In step S501, by obtaining the current positioning coordinate at every preset time interval, the server can detect in good time whether the mobile terminal is moving along the target navigation route, which enhances the practicality and reliability of the voice navigation method.

S502. Match the current positioning coordinate against the target navigation route, and remind the mobile terminal to correct its route when the distance between the current positioning coordinate and the nearest position on the target navigation route exceeds a threshold.

Here, the threshold is the farthest acceptable deviation distance between the current positioning coordinate and the nearest position on the target navigation route. When the distance between the current positioning coordinate and the nearest position on the target navigation route exceeds the threshold, the mobile terminal has gone beyond the farthest acceptable deviation distance and is not moving toward the end point coordinate along the target navigation route. To correct the travel route of the mobile terminal in good time, the server should then send a route correction reminder to the mobile terminal.
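The deviation check in step S502 can be sketched as the distance from the current coordinate to the nearest point on a route made of straight segments. The function names, the representation of the route as a list of at least two 2D coordinates, and the sample values are assumptions of this sketch.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)  # a and b coincide
    # Projection of p onto the segment, clamped to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def needs_correction(position, route, threshold):
    """True when the nearest point of the route is farther away than threshold."""
    nearest = min(
        distance_to_segment(position, a, b) for a, b in zip(route, route[1:])
    )
    return nearest > threshold

# Hypothetical route with one turn, and a threshold of 3 map units.
route = [(0, 0), (10, 0), (10, 10)]
on_route = needs_correction((5, 1), route, threshold=3.0)
off_route = needs_correction((5, 6), route, threshold=3.0)
```

In a polling setup, the server would call such a check once per preset time interval and send the route correction reminder whenever it returns true.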

In step S502, when the server determines that the mobile terminal has gone beyond the farthest acceptable deviation distance and is not moving toward the end point coordinate along the target navigation route, it can send a route correction reminder to the mobile terminal so that the travel route is corrected in good time, which improves the flexibility and adaptability of the voice navigation method.

In steps S501 to S502, by obtaining the current positioning coordinate at every preset time interval, the server can detect in good time whether the mobile terminal is moving along the target navigation route, which enhances the practicality and reliability of the voice navigation method. When the server determines that the mobile terminal has gone beyond the farthest acceptable deviation distance and is not moving toward the end point coordinate along the target navigation route, it can send a route correction reminder to the mobile terminal so that the travel route is corrected in good time, which improves the flexibility and adaptability of the voice navigation method.

In the voice navigation method provided by this embodiment, the server obtains the starting point coordinate of the mobile terminal from the measured WIFI signal cluster that the mobile terminal sends, and obtains the target control instruction from the voice data that the mobile terminal sends, so that it can execute the corresponding action based on the starting point coordinate and the target control instruction. The user therefore only needs to input a voice navigation request through the mobile terminal to obtain accurate indoor positioning and navigation. The method is no longer limited by the accuracy of the mobile terminal's own positioning equipment and requires no manually entered instructions, which improves the efficiency of indoor positioning and navigation.

Further, the server records the WIFI signal strengths between the mobile terminal and at least three specified indoor wireless hotspots, so that it can subsequently compare those WIFI signal strengths against the "location fingerprints" and confirm the fixed-point coordinate where the mobile terminal is located. The server can obtain the starting point coordinate from the signal strength between the mobile terminal and each specified wireless hotspot, which saves positioning cost, and the positioning method is stable and reliable. The server obtains, over the preset number of acquisitions, the training WIFI signal cluster of the signal acquisition terminal at each fixed-point coordinate, which lays the technical foundation for later comparing the measured WIFI signal clusters that the client sends from different fixed-point coordinates against the training WIFI signal clusters. The server stores each fixed-point coordinate in association with its corresponding training WIFI signal cluster to form the decision tree of that fixed-point coordinate, which allows the server to subsequently locate the client based on the decision trees, simply and quickly. The server performs feature extraction on the voice data to be recognized based on the training technique, and the speech vector features finally obtained represent the voice data to be recognized well, so that the speech vector features obtained by training give more accurate matching results in subsequent voice matching. The server segments the text data according to the preset word segmentation rule and tags each segmented word with its part of speech, which makes it possible to subsequently match text keywords based on each segmented word and its part of speech, and in turn to match the target control instruction. The server parses the text keywords corresponding to the text data based on the preset-structure word segmentation tree, matches the preset instruction set based on the text keywords to obtain the target control instruction, and triggers the target control instruction; the matching process is simple and fast and saves the user the time of triggering the target control instruction manually. By obtaining the current positioning coordinate at every preset time interval, the server can detect in good time whether the mobile terminal is moving along the target navigation route, which enhances the practicality and reliability of the voice navigation method. When the server determines that the mobile terminal has gone beyond the farthest acceptable deviation distance and is not moving toward the end point coordinate along the target navigation route, it can send a route correction reminder to the mobile terminal so that the travel route is corrected in good time, which improves the flexibility and adaptability of the voice navigation method.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.

In one embodiment, a voice navigation device is provided, which corresponds one-to-one to the voice navigation method in the above embodiments. As shown in Figure 10, the voice navigation device includes a measured signal cluster obtaining module 10, a navigation request obtaining module 20, a voice feature obtaining module 30, a control instruction obtaining module 40, and an execution result display module 50. The functional modules are described in detail as follows:

The measured signal cluster obtaining module 10 is configured to obtain the measured WIFI signal cluster of the mobile terminal and, based on the measured WIFI signal cluster, obtain the fixed-point coordinate corresponding to the mobile terminal in the preset navigation map as the starting-point coordinate.

The navigation request obtaining module 20 is configured to obtain the voice navigation request sent by the mobile terminal, the voice navigation request including voice data to be identified.

The voice feature obtaining module 30 is configured to pre-process the voice data to be identified and obtain the speech feature vector corresponding to the voice data to be identified.

The control instruction obtaining module 40 is configured to match the speech feature vector based on a hidden Markov model to obtain text data corresponding to the speech feature vector, and to match a preset instruction set based on the text data to obtain the target control instruction.

The execution result display module 50 is configured to execute the target control instruction based on the starting-point coordinate, obtain the target instruction execution result corresponding to the target control instruction, send the execution result to the mobile terminal, and control the mobile terminal to display the execution result.

Preferably, the measured signal cluster obtaining module 10 includes a signal strength obtaining unit, a measured signal cluster obtaining unit, and a starting-point coordinate obtaining unit.

The signal strength obtaining unit is configured to obtain the WIFI signal strengths between the mobile terminal at its current location and at least three designated hotspots.

The measured signal cluster obtaining unit is configured to obtain, based on the WIFI signal strengths, the measured WIFI signal cluster of the mobile terminal at the current location.

The starting-point coordinate obtaining unit is configured to calculate the Euclidean distance between the measured WIFI signal cluster and each decision tree in a preset random forest, obtain the target decision tree with the shortest Euclidean distance, and take the fixed-point coordinate corresponding to the target decision tree in the preset navigation map as the starting-point coordinate of the mobile terminal.
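The shortest-Euclidean-distance lookup described above can be illustrated with a minimal sketch. It assumes that each "decision tree" is simply a stored pairing of a fixed-point coordinate with one training signal vector, with one strength value per designated hotspot in a fixed order. The names `FINGERPRINTS`, `euclidean`, and `locate`, and all signal values, are hypothetical and do not come from the original disclosure.

```python
import math

# Hypothetical store: fixed-point coordinate -> training signal strengths (dBm),
# one value per designated hotspot, always in the same hotspot order.
FINGERPRINTS = {
    (0.0, 0.0): [-40.0, -70.0, -65.0],
    (5.0, 0.0): [-60.0, -45.0, -70.0],
    (0.0, 5.0): [-62.0, -72.0, -42.0],
}


def euclidean(a, b):
    """Euclidean distance between two signal-strength vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("signal clusters must cover the same hotspots")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate(measured, fingerprints=FINGERPRINTS):
    """Return the fixed-point coordinate whose training vector is closest."""
    if len(measured) < 3:
        raise ValueError("at least three designated hotspots are required")
    return min(fingerprints, key=lambda coord: euclidean(measured, fingerprints[coord]))


# The measured vector is closest to the record stored for (5.0, 0.0).
start_coordinate = locate([-58.0, -47.0, -71.0])
```

In this sketch the returned coordinate plays the role of the starting-point coordinate; a real deployment would need to decide how to handle hotspots that are missing from a measurement.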

Preferably, the voice navigation device further includes a training signal cluster forming module and a decision tree forming module.

The training signal cluster forming module is configured to obtain, at each preset fixed point and for a preset number of collections, the training WIFI signal cluster formed from the WIFI signal strengths between the signal acquisition terminal and each designated hotspot, and to obtain the fixed-point coordinate corresponding to the preset fixed point.

The decision tree forming module is configured to store the fixed-point coordinate in association with the training WIFI signal cluster, so as to form the decision tree corresponding to the fixed-point coordinate.
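As an illustration of how the training data might be assembled, the following sketch collects repeated readings at each preset fixed point and stores one training vector per fixed-point coordinate. Averaging the repeated readings is an assumption made here for simplicity; the original text only states that the readings form the training WIFI signal cluster. The function names and sample values are hypothetical.

```python
from statistics import mean


def build_training_cluster(samples):
    """Reduce repeated readings to one vector (assumed here: per-hotspot mean).

    `samples` holds one list per collection, each with one signal strength
    per designated hotspot in a fixed order.
    """
    if not samples:
        raise ValueError("no samples collected")
    width = len(samples[0])
    if any(len(sample) != width for sample in samples):
        raise ValueError("every sample must cover the same hotspots")
    return [mean(column) for column in zip(*samples)]


def build_store(survey):
    """Associate each fixed-point coordinate with its training cluster."""
    return {coord: build_training_cluster(samples) for coord, samples in survey.items()}


# Hypothetical survey: two collections at each of two preset fixed points.
survey = {
    (0.0, 0.0): [[-40.0, -70.0, -60.0], [-42.0, -68.0, -62.0]],
    (5.0, 0.0): [[-60.0, -44.0, -70.0], [-62.0, -46.0, -72.0]],
}
store = build_store(survey)
```

Keeping the raw samples instead of their mean would also fit the description; the mean is used only to keep the later distance comparison simple.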

Preferably, the voice feature obtaining module 30 includes a voice data obtaining unit, a data spectrum obtaining unit, a Mel power spectrum obtaining unit, and a voice feature obtaining unit.

The voice data obtaining unit is configured to pre-process the voice data to obtain pre-processed voice data.

The data spectrum obtaining unit is configured to apply a fast Fourier transform (FFT) to the pre-processed voice data to obtain the frequency spectrum of the voice data, and to obtain the power spectrum of the voice data from the frequency spectrum.

The Mel power spectrum obtaining unit is configured to process the power spectrum of the voice data with a Mel-scale filter bank to obtain the Mel power spectrum of the voice data.

The voice feature obtaining unit is configured to perform cepstral analysis on the Mel power spectrum to obtain the speech feature vector of the voice data.
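The four units above follow the usual Mel-frequency cepstral coefficient (MFCC) pipeline. The sketch below processes a single frame using only the Python standard library. The specific choices (pre-emphasis factor 0.97, Hamming window, 20 triangular filters, 12 coefficients, an unnormalized DCT-II as the cepstral step) are common defaults assumed for illustration and are not specified in the original text. All function names are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale over bins 0..n_fft/2."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for i in range(1, n_filters + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):       # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):       # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_ceps=12, pre_emphasis=0.97):
    n = len(frame)
    if n < 2 or n & (n - 1):
        raise ValueError("frame length must be a power of two")
    # Pre-processing: pre-emphasis followed by a Hamming window.
    emphasized = [frame[0]] + [frame[i] - pre_emphasis * frame[i - 1] for i in range(1, n)]
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(emphasized)]
    # FFT -> one-sided frequency spectrum -> power spectrum.
    spectrum = fft(windowed)[: n // 2 + 1]
    power = [abs(c) ** 2 / n for c in spectrum]
    # Mel-scale filter bank -> Mel power spectrum.
    mel_power = [sum(w * p for w, p in zip(row, power))
                 for row in mel_filterbank(n_filters, n, sample_rate)]
    # Cepstral analysis: logarithm followed by a DCT-II.
    log_mel = [math.log(max(e, 1e-10)) for e in mel_power]
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, v in enumerate(log_mel))
            for c in range(n_ceps)]
```

A complete front end would split the recording into overlapping frames and apply `mfcc_frame` to each one; the single-frame version is kept here to make the four stages easy to follow.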

Preferably, the control instruction obtaining module 40 includes a part-of-speech tagging unit, a word-segment tree building unit, and a target control instruction obtaining unit.

The part-of-speech tagging unit is configured to segment the text data according to preset word segmentation rules and to tag the segmentation result with parts of speech.

The word-segment tree building unit is configured to build the word segments corresponding to the text data into a word-segment tree of a preset structure, according to the order and part of speech of each word segment.

The target control instruction obtaining unit is configured to parse the text keyword corresponding to the text data based on the word-segment tree of the preset structure, and to match the preset instruction set based on the text keyword to obtain the target control instruction.
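A heavily simplified sketch of this segmentation, tagging, and matching flow is given below. It assumes whitespace-separated English text, a tiny hand-written lexicon, and a two-level tree whose root is the first verb and whose children are the nouns that follow it. The original disclosure does not define the segmentation rules, the tree structure, or the instruction set, so `LEXICON`, `INSTRUCTIONS`, and every function name here are hypothetical.

```python
# Hypothetical lexicon: word -> part of speech (v = verb, n = noun, other tags ignored).
LEXICON = {
    "navigate": "v", "go": "v", "find": "v",
    "to": "p", "the": "d", "nearest": "a",
    "restroom": "n", "elevator": "n", "exit": "n",
}

# Hypothetical preset instruction set: action keyword -> control instruction.
INSTRUCTIONS = {
    "navigate": "GENERATE_ROUTE",
    "go": "GENERATE_ROUTE",
    "find": "SHOW_LOCATION",
}


def segment_and_tag(text):
    """Split on whitespace and tag each word; unknown words get the tag 'x'."""
    return [(word, LEXICON.get(word, "x")) for word in text.lower().split()]


def build_tree(tagged):
    """Root is the first verb; nouns that follow it become its children."""
    tree = None
    for word, pos in tagged:
        if pos == "v" and tree is None:
            tree = {"action": word, "objects": []}
        elif pos == "n" and tree is not None:
            tree["objects"].append(word)
    return tree


def match_instruction(text):
    """Return (control instruction, target keyword), or None if nothing matches."""
    tree = build_tree(segment_and_tag(text))
    if tree is None or not tree["objects"]:
        return None
    command = INSTRUCTIONS.get(tree["action"])
    if command is None:
        return None
    return command, tree["objects"][0]


result = match_instruction("Navigate to the nearest restroom")
```

For Chinese input, the whitespace split would have to be replaced by a dictionary-based or statistical segmenter; the matching step would otherwise stay the same.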

Preferably, the voice navigation device further includes a current coordinate obtaining module and a route correction reminding module.

The current coordinate obtaining module is configured to obtain the current positioning coordinate of the mobile terminal at preset intervals if the execution result is the generation of a target navigation route.

The route correction reminding module is configured to match the current positioning coordinate against the target navigation route and, when the distance between the current positioning coordinate and the nearest position on the target navigation route exceeds a threshold, remind the mobile terminal to correct its route.
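The deviation check can be sketched as a point-to-polyline distance test. The sketch assumes that the target navigation route is a list of two-dimensional waypoints joined by straight segments and that the threshold uses the same units as the coordinates. These representations, and the names `distance_to_route` and `needs_correction`, are assumptions for illustration only.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_route(p, route):
    """Distance from p to the nearest position on the route polyline."""
    if not route:
        raise ValueError("route must contain at least one waypoint")
    if len(route) == 1:
        return math.hypot(p[0] - route[0][0], p[1] - route[0][1])
    return min(point_segment_distance(p, a, b) for a, b in zip(route, route[1:]))


def needs_correction(p, route, threshold):
    """True when the current coordinate is farther from the route than allowed."""
    return distance_to_route(p, route) > threshold


# Hypothetical route with one turn; the point (5, 3) is 3 units off the first leg.
route = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
off_route = needs_correction((5.0, 3.0), route, threshold=2.0)
```

A server could call `needs_correction` each time it receives a new positioning coordinate and send the reminder only when the result is true.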

For the specific limitations of the voice navigation device, reference may be made to the limitations of the voice navigation method above, which are not repeated here. Each module in the above voice navigation device may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in, or independent of, a processor of the computer equipment in hardware form, or may be stored in a memory of the computer equipment in software form, so that the processor can call and execute the operations corresponding to each module.

In one embodiment, computer equipment is provided. The computer equipment may be a server, and its internal structure may be as shown in Figure 11. The computer equipment includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer equipment provides computing and control capabilities. The memory of the computer equipment includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer equipment stores data related to the voice navigation method. The network interface of the computer equipment communicates with external terminals through a network connection. The computer program, when executed by the processor, implements a voice navigation method.

In one embodiment, computer equipment is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the voice navigation method of the above embodiments, for example steps S10 to S50 shown in Fig. 2. Alternatively, when executing the computer program, the processor implements the functions of the modules/units of the voice navigation device in the above embodiments, for example the functions of modules 10 to 50 shown in Figure 10. To avoid repetition, details are not described again here.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the voice navigation method of the above embodiments, for example steps S10 to S50 shown in Fig. 2. Alternatively, when executed by a processor, the computer program implements the functions of the modules/units of the voice navigation device in the above device embodiments, for example the functions of modules 10 to 50 shown in Figure 10. To avoid repetition, details are not described again here.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is merely an example. In practical applications, the above functions may be allocated to different functional units or modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.

The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims

1. A voice navigation method, characterized by comprising:

obtaining a measured WIFI signal cluster of a mobile terminal and, based on the measured WIFI signal cluster, obtaining the fixed-point coordinate corresponding to the mobile terminal in a preset navigation map as a starting-point coordinate;

obtaining a voice navigation request sent by the mobile terminal, the voice navigation request including voice data to be identified;

pre-processing the voice data to be identified to obtain a speech feature vector corresponding to the voice data to be identified;

matching the speech feature vector based on a hidden Markov model to obtain text data corresponding to the speech feature vector, and matching a preset instruction set based on the text data to obtain a target control instruction;

executing the target control instruction based on the starting-point coordinate, obtaining a target instruction execution result corresponding to the target control instruction, sending the execution result to the mobile terminal, and controlling the mobile terminal to display the execution result.

2. The voice navigation method according to claim 1, characterized in that obtaining the measured WIFI signal cluster of the mobile terminal and, based on the measured WIFI signal cluster, obtaining the fixed-point coordinate corresponding to the mobile terminal in the preset navigation map as the starting-point coordinate comprises:

obtaining the WIFI signal strengths between the mobile terminal at its current location and at least three designated hotspots;

obtaining, based on the WIFI signal strengths, the measured WIFI signal cluster of the mobile terminal at the current location;

calculating the Euclidean distance between the measured WIFI signal cluster and each decision tree in a preset random forest, obtaining the target decision tree with the shortest Euclidean distance, and taking the fixed-point coordinate corresponding to the target decision tree in the preset navigation map as the starting-point coordinate of the mobile terminal.

3. The voice navigation method according to claim 2, characterized in that the preset navigation map includes at least three preset fixed points, each preset fixed point corresponding to one fixed-point coordinate;

before the step of obtaining the measured WIFI signal cluster of the mobile terminal, the voice navigation method further comprises:

obtaining, at each preset fixed point and for a preset number of collections, the training WIFI signal cluster formed from the WIFI signal strengths between a signal acquisition terminal and each designated hotspot, and obtaining the fixed-point coordinate corresponding to the preset fixed point;

storing the fixed-point coordinate in association with the training WIFI signal cluster, so as to form the decision tree corresponding to the fixed-point coordinate.

4. The voice navigation method according to claim 1, characterized in that pre-processing the voice data to be identified to obtain the speech feature vector corresponding to the voice data to be identified comprises:

pre-processing the voice data to be identified to obtain pre-processed voice data;

applying a fast Fourier transform (FFT) to the pre-processed voice data to obtain the frequency spectrum of the voice data to be identified, and obtaining the power spectrum of the voice data to be identified from the frequency spectrum;

processing the power spectrum of the voice data to be identified with a Mel-scale filter bank to obtain the Mel power spectrum of the voice data to be identified;

performing cepstral analysis on the Mel power spectrum to obtain the speech feature vector of the voice data to be identified.

5. The voice navigation method according to claim 1, characterized in that matching the preset instruction set based on the text data to obtain the target control instruction comprises:

segmenting the text data according to preset word segmentation rules and tagging the segmentation result with parts of speech;

building the word segments corresponding to the text data into a word-segment tree of a preset structure, according to the order and part of speech of each word segment;

parsing the text keyword corresponding to the text data based on the word-segment tree of the preset structure, and matching the preset instruction set based on the text keyword to obtain the target control instruction.

6. The voice navigation method according to claim 1, characterized in that, after the step of controlling the mobile terminal to display the execution result, the voice navigation method further comprises:

if the execution result is the generation of a target navigation route, obtaining the current positioning coordinate of the mobile terminal at preset intervals;

matching the current positioning coordinate against the target navigation route and, when the distance between the current positioning coordinate and the nearest position on the target navigation route exceeds a threshold, reminding the mobile terminal to correct its route.

7. A voice navigation device, characterized by comprising:

a measured signal cluster obtaining module, configured to obtain a measured WIFI signal cluster of a mobile terminal and, based on the measured WIFI signal cluster, obtain the fixed-point coordinate corresponding to the mobile terminal in a preset navigation map as a starting-point coordinate;

a navigation request obtaining module, configured to obtain a voice navigation request sent by the mobile terminal, the voice navigation request including voice data to be identified;

a voice feature obtaining module, configured to pre-process the voice data to be identified and obtain a speech feature vector corresponding to the voice data to be identified;

a control instruction obtaining module, configured to match the speech feature vector based on a hidden Markov model to obtain text data corresponding to the speech feature vector, and to match a preset instruction set based on the text data to obtain a target control instruction;

an execution result display module, configured to execute the target control instruction based on the starting-point coordinate, obtain a target instruction execution result corresponding to the target control instruction, send the execution result to the mobile terminal, and control the mobile terminal to display the execution result.

8. The voice navigation device according to claim 7, characterized in that the measured signal cluster obtaining module comprises:

a signal strength obtaining unit, configured to obtain the WIFI signal strengths between the mobile terminal at its current location and at least three designated hotspots;

a measured signal cluster obtaining unit, configured to obtain, based on the WIFI signal strengths, the measured WIFI signal cluster of the mobile terminal at the current location;

a starting-point coordinate obtaining unit, configured to calculate the Euclidean distance between the measured WIFI signal cluster and each decision tree in a preset random forest, obtain the target decision tree with the shortest Euclidean distance, and take the fixed-point coordinate corresponding to the target decision tree in the preset navigation map as the starting-point coordinate of the mobile terminal.

9. Computer equipment, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the voice navigation method according to any one of claims 1 to 6.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the voice navigation method according to any one of claims 1 to 6.