CN108896962B - Iterative positioning method based on sound position fingerprint

Iterative positioning method based on sound position fingerprint

Info

Publication number
CN108896962B
CN108896962B (granted publication of application CN201810611952.5A)
Authority
CN
China
Prior art keywords
sound
positioning
cluster
reference point
sound source
Prior art date
Legal status
Active
Application number
CN201810611952.5A
Other languages
Chinese (zh)
Other versions
CN108896962A (en)
Inventor
杨鹏 (Yang Peng)
徐静 (Xu Jing)
孙昊 (Sun Hao)
王硕朋 (Wang Shuopeng)
张晓萌 (Zhang Xiaomeng)
Current Assignee
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date
Filing date
Publication date
Application filed by Hebei University of Technology filed Critical Hebei University of Technology
Priority to CN201810611952.5A
Publication of CN108896962A
Application granted
Publication of CN108896962B

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves

Abstract

The invention discloses an iterative positioning method based on sound position fingerprints and relates to technology for determining the position of a signal source by means of sound waves. The method is a scene-analysis method that uses the sound time difference of arrival as the position fingerprint and comprises two stages: A. the off-line acquisition stage of the sound position fingerprints: two-dimensional space coordinates are acquired → the sound position fingerprint database is constructed and clustered with a sound-position-based clustering method into cluster 1, cluster 2, ..., cluster K, with K cluster centers in total; B. the on-line positioning stage of the sound position fingerprints: the computer extracts the sound position feature vector of the sound source to be positioned → cluster selection → iterative positioning, i.e., calculation of the position of the sound source to be positioned → positioning result, i.e., the final positioning result of the sound source to be positioned is output. The invention overcomes the defects of existing sound source positioning methods: high model dependence and low positioning accuracy in unstructured spaces.

Description

Iterative positioning method based on sound position fingerprint
Technical Field
The technical scheme of the invention relates to a technology for determining the position of a signal source by using sound waves, in particular to an iterative positioning method based on sound position fingerprints.
Background
With the progress of internet technology and the spread of portable mobile devices, demand for location-based services is increasing. Outdoor positioning technologies such as GPS are mature and widely applied in fields such as outdoor navigation and satellite monitoring, where they provide users with accurate and fast positioning services of great practical importance. However, GPS signals penetrate poorly, are susceptible to refraction and reflection by buildings, and can hardly achieve accurate positioning in complex indoor environments. It is therefore imperative to find a fast, accurate and timely indoor positioning method.
Currently, the commonly used indoor positioning technologies fall into two categories: positioning methods based on a signal propagation model and positioning methods based on position fingerprints. The accuracy of model-based methods depends on the model; since the signal is easily influenced by the external environment during propagation and the parameter estimates carry large errors, the constructed propagation model is inaccurate and the positioning accuracy is low. Fingerprint-based methods offer high positioning accuracy and low model dependence, and have attracted wide attention from researchers in recent years. A fingerprint-based method operates in two stages: an off-line sampling stage and an on-line positioning stage. To achieve high accuracy, prior-art fingerprint methods usually need to collect a large number of samples in the off-line stage, which not only costs considerable manpower and material resources but also lengthens the search of the position fingerprint database in the on-line stage and increases the complexity of the positioning method.
In positioning methods based on sound position fingerprints, sound source localization is the process in which, after a mobile robot in the positioning system emits a sound signal at an unknown point, a central processing platform extracts features of the signal, including signal strength, signal-to-noise ratio and sound time difference of arrival, and determines the spatial position of the emitting robot by comparing these features with the position features of known points. Advances in artificial intelligence and speech signal processing have given methods based on sound position fingerprints important applications in industry, audio/video conferencing, human-machine voice interaction and other fields.
CN104865555B discloses an indoor sound source positioning method based on sound position fingerprints which determines the grid size according to the indoor area and the required positioning accuracy; this easily leads to redundant reference-point layouts, increases the cost of constructing the database, brings large system overhead, and hinders large-scale application of sound source positioning technology. The Wangshu master's thesis "Research on a Distributed Microphone Array Positioning Method" (China Master's Theses Full-text Database, Information Science and Technology, 2013, No. 09) introduces a method that constructs the database using the energy of the signals received by the microphones as the sound position fingerprint; because the indoor environment is complex and signal reflection and diffraction are difficult to estimate, using the energy ratio of the signals received by the microphone array as the fingerprint yields poor positioning accuracy. The Wuxie master's thesis "Research on a Mobile Robot Sound Source Localization Method Based on Time Delay Estimation" (China Master's Theses Full-text Database, Information Science and Technology, 2014) introduces a method based on geometric-model localization; its defect is that such a method is unsuitable for indoor environments, since a propagation model of the sound signal is difficult to estimate correctly in the unstructured space of an indoor environment.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide an iterative positioning method based on sound position fingerprints. It is a scene-analysis method that adopts a large distributed microphone array and uses the sound time difference of arrival as the position fingerprint, overcoming the defects of existing sound source positioning methods: high model dependence and low positioning accuracy in unstructured spaces.
The technical scheme adopted by the invention for solving the technical problem is as follows: the iterative positioning method based on the sound position fingerprint comprises the following specific steps:
A. Off-line acquisition stage of the sound position fingerprints: constructing the sound position fingerprint database and clustering it:
First step, arrangement of the positioning scene:
(1.1) arranging a distributed microphone array consisting of four array elements M0, M1, M2 and M3 on the localization map, wherein the microphone M0 is a reference microphone;
(1.2) arranging 5 reference points at the four vertices and the center of the positioning map of step (1.1) of the first step;
thereby completing the placement of the positioning scene;
Second step, acquiring the sound position fingerprints:
(2.1) driving the mobile robot to release a position sound at each reference point selected in step (1.2) of the first step; the onset time at which each microphone of the distributed microphone array arranged in step (1.1) of the first step begins to receive the position sound is computed with a dual-parameter, dual-threshold voice endpoint detection method, and the time differences between the reference microphone M0 and the other microphones, i.e., the sound time differences of arrival, are extracted as the sound position feature vector of the reference point; the sound position feature vector acquired at the i-th reference point at time t is recorded as $R_i^t = [r_{i1}^t, r_{i2}^t, \ldots, r_{im}^t, \ldots, r_{iM}^t]$, where $r_{im}^t$ represents the m-th sound position feature obtained at the i-th reference point at time t, and M represents the number of sound position features contained in each fingerprint;
(2.2) the mobile robot collects signals T times at each reference point, and the mean of the T collected sound position feature vectors is stored as the sound position feature vector of that reference point; the sound position feature vector of the i-th reference point is thus expressed as $R_i = [r_{i1}, r_{i2}, \ldots, r_{im}, \ldots, r_{iM}]$, where $r_{im} = \frac{1}{T}\sum_{t=1}^{T} r_{im}^t$ is the m-th sound position feature of the i-th reference point;
(2.3) let $L_i = [x_i, y_i]$ be the two-dimensional space coordinates of the i-th reference point; the sound position feature vector $R_i$ of the i-th reference point obtained in step (2.2) of the second step and the corresponding two-dimensional space coordinates are combined into a group of sound position fingerprints, denoted $F_i = [R_i, L_i] = [r_{i1}, r_{i2}, \ldots, r_{im}, \ldots, r_{iM}, x_i, y_i]$, where $x_i$ is the abscissa and $y_i$ the ordinate of the i-th reference point;
thereby completing the acquisition of the sound position fingerprint;
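As an illustration of the second step, the following is a minimal Python sketch of extracting a TDOA fingerprint. The patent names a dual-parameter, dual-threshold voice endpoint detection method but does not fix its parameters; short-time energy and zero-crossing rate as the two parameters, and the frame sizes and thresholds below, are assumptions for illustration only.

```python
import numpy as np

def onset_time(x, fs, frame=256, hop=128, e_lo=0.01, e_hi=0.1, z_hi=0.25):
    """Sketch of a dual-parameter (energy + zero-crossing rate),
    dual-threshold endpoint detector; all thresholds are illustrative."""
    n_frames = 1 + (len(x) - frame) // hop
    for k in range(n_frames):
        seg = x[k * hop: k * hop + frame]
        energy = np.mean(seg ** 2)
        zcr = np.mean(np.abs(np.diff(np.sign(seg)))) / 2
        # trigger on strong energy, or moderate energy with a high ZCR
        if energy > e_hi or (energy > e_lo and zcr > z_hi):
            return k * hop / fs            # onset time in seconds
    return None

def tdoa_fingerprint(mics, fs):
    """TDOA feature vector: onsets of M1..M3 relative to reference M0.
    Assumes the position sound is detected on every channel."""
    t = [onset_time(x, fs) for x in mics]  # mics: list of 4 channels
    return np.array([t[m] - t[0] for m in range(1, len(mics))])

def reference_fingerprint(collections, fs):
    """Step (2.2): mean of T repeated TDOA vectors at one reference point."""
    return np.mean([tdoa_fingerprint(c, fs) for c in collections], axis=0)
```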
Third step, constructing the sound position fingerprint database and clustering it:
(3.1) combining the sound position fingerprints of all reference points obtained in step (2.3) of the second step to form the sound position fingerprint database in its initial state, recorded as $F = [F_1\ F_2\ \ldots\ F_i\ \ldots\ F_I]^T$, where $F_i$ is the sound position fingerprint of the i-th reference point;
(3.2) clustering the initial-state sound position fingerprint database of step (3.1) of the third step with a sound-position-based clustering method and defining cluster centers; specifically: the positioning map of step (1.1) of the first step is divided into non-overlapping triangular positioning areas formed by adjacent reference points, and the areas are numbered clockwise as area $Z_1$, ..., area $Z_K$; reference points in the same positioning area belong to the same cluster, so the sound position fingerprints of the reference points in one cluster form the sound position fingerprint set $F\_Z_k = [F\_Z_{k1}, F\_Z_{k2}, \ldots, F\_Z_{kn}, \ldots, F\_Z_{kN}]$, where $F\_Z_{kn}$ is the sound position fingerprint of the n-th reference point in cluster k, K is the number of clusters, and N is the number of reference points in cluster k; once all reference points have been assigned to their clusters, a cluster center is defined for each cluster, forming cluster 1, cluster 2, ..., cluster k, ..., cluster K, with K cluster centers in total; the feature vector $\bar{R}_k$ of each cluster center is the mean of the feature vectors of all reference points in that cluster, which yields the final sound position fingerprint database;
thus, a sound position fingerprint database is constructed, and the sound position fingerprint database is clustered;
thus finishing the off-line collection of the sound position fingerprints, and constructing and clustering a sound position fingerprint database;
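A compact sketch of the third step follows; the triangular map partition is assumed to be given as index lists (how the triangles are enumerated is not constrained here), and the variable names are illustrative.

```python
import numpy as np

def build_clustered_db(R, L, cluster_members):
    """R: (I, M) reference-point feature vectors; L: (I, 2) coordinates;
    cluster_members[k]: indices of the reference points in area Z_k."""
    db = []
    for k, members in enumerate(cluster_members):
        F_k = np.hstack([R[members], L[members]])  # fingerprints F_i = [R_i, L_i]
        center = R[members].mean(axis=0)           # cluster-center feature vector
        db.append({"cluster": k, "fingerprints": F_k, "center": center})
    return db
```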
B. On-line positioning stage of the sound position fingerprints: obtaining the final positioning result of the sound source to be positioned:
Fourth step, cluster selection, i.e., determining the cluster in which the sound source to be positioned lies:
(4.1) the mobile robot releases a sound signal at the position of the sound source to be positioned; the distributed microphone array arranged in step (1.1) of the first step captures the sound signal and uploads it to the computer, and the computer extracts the sound position feature vector of the sound source to be positioned, $R_x = [r_1, r_2, \ldots, r_m, \ldots, r_M]$, where $r_m$ is the m-th sound position feature of the sound source to be positioned;
(4.2) using the Euclidean distance, the similarity between the sound position feature vector $R_x$ of the sound source to be positioned and the feature vector $\bar{R}_k$ of each cluster center from step (3.2) of the third step is computed as
$d_k = \lVert R_x - \bar{R}_k \rVert = \sqrt{\sum_{m=1}^{M} (r_m - \bar{r}_{km})^2}$,
where $d_k$ is the Euclidean distance between $R_x$ and $\bar{R}_k$;
(4.3) the cluster in which the sound source to be positioned lies is then determined by the formula $\arg\min_k d_k$;
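Cluster selection reduces the on-line search to a single cluster; a sketch, reusing the database structure assumed in the sketch above:

```python
import numpy as np

def select_cluster(r_x, db):
    """Steps (4.2)-(4.3): nearest cluster center in feature space."""
    d = [np.linalg.norm(r_x - entry["center"]) for entry in db]  # d_k
    return int(np.argmin(d))                                     # arg min_k d_k
```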
Fifth step, iterative positioning, i.e., calculating the position of the sound source to be positioned:
(5.1) the iterative process is started with the sound position fingerprint set $F\_Z_k = [F\_Z_{k1}, \ldots, F\_Z_{kn}, \ldots, F\_Z_{kN}]$ of the reference points in the cluster from step (3.2) of the third step as the initial input, where $F\_Z_{kn}$ is the sound position fingerprint of the n-th adjacent reference point in cluster k;
(5.2) the adjacent virtual reference points are continuously updated as the iteration proceeds; after the l-th iteration, the sound position fingerprint set of the adjacent virtual reference points is expressed as $F\_Z_k^{l} = [F\_Z_{k1}^{l}, \ldots, F\_Z_{kn}^{l}, \ldots, F\_Z_{kN}^{l}]$, where $F\_Z_{kn}^{l}$ is the sound position fingerprint of the n-th adjacent virtual reference point after l iterations;
(5.3) after the (l+1)-th iteration, the sound position fingerprint $F\_Z_{kn}^{l+1}$ of the n-th adjacent virtual reference point is computed as:
$F\_Z_{kn}^{l+1} = \frac{1}{2}\Big(F\_Z_{kn}^{l} + \sum_{n=1}^{N} w_{kn}^{l}\, F\_Z_{kn}^{l}\Big)$ (1)
In formula (1), $w_{kn}^{l}$ is the weight coefficient of the n-th adjacent virtual reference point after the l-th iteration, computed as:
$w_{kn}^{l} = \dfrac{1/(d_{kn}^{l} + \varepsilon)}{\sum_{n=1}^{N} 1/(d_{kn}^{l} + \varepsilon)}$ (2)
In formula (2), $d_{kn}^{l}$ is the Euclidean distance between the n-th adjacent virtual reference point after the l-th iteration and the sound source to be positioned; the small random number $\varepsilon$ prevents the denominator from being 0;
(5.4) the coordinates of the sound source to be positioned are computed with the weighted K-nearest-neighbor algorithm:
$(\hat{x}, \hat{y}) = \sum_{n=1}^{N} w_{kn}\,(x_{kn}, y_{kn})$ (3)
In formula (3), $(x_{kn}, y_{kn})$ are the position coordinates of the n-th adjacent virtual reference point in cluster k generated by the latest iteration, and $w_{kn}$ is the weight coefficient of that point, computed as:
$w_{kn} = \dfrac{1/(d_{kn} + \varepsilon)}{\sum_{n=1}^{N} 1/(d_{kn} + \varepsilon)}$ (4)
In formula (4), $d_{kn}$ is the Euclidean distance between the n-th adjacent virtual reference point generated by the latest iteration in cluster k and the sound source to be positioned, computed as:
$d_{kn} = \sqrt{\sum_{m=1}^{M} (r_m - r_{knm})^2}$ (5)
In formula (5), $r_m$ is the m-th sound position feature of the sound source to be positioned, and $r_{knm}$ is the m-th sound position feature of the n-th adjacent virtual reference point after the latest iteration;
completing iterative positioning through the steps (5.1) - (5.4) in the fifth step, and calculating the position of the sound source to be positioned;
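A sketch of one iteration of the fifth step follows. Formula (1) is reconstructed here as a contraction of each virtual fingerprint halfway toward the weighted centroid of the current virtual points, which matches the description of the virtual points gradually approaching the sound source but is an assumption rather than the literal patent formula; $\varepsilon$ is fixed to a small constant.

```python
import numpy as np

def iterate_once(F, r_x, eps=1e-6):
    """One iteration inside the selected cluster.

    F:   (N, M+2) virtual-reference-point fingerprints [r_1..r_M, x, y]
    r_x: (M,) feature vector of the sound source to be positioned
    Returns the updated fingerprints and the WKNN position estimate.
    """
    d = np.linalg.norm(F[:, :-2] - r_x, axis=1)           # d_kn, formula (5)
    w = 1.0 / (d + eps)
    w /= w.sum()                                          # formula (2)
    F_new = 0.5 * (F + (w[:, None] * F).sum(axis=0))      # formula (1), assumed form
    d_new = np.linalg.norm(F_new[:, :-2] - r_x, axis=1)
    w_new = 1.0 / (d_new + eps)
    w_new /= w_new.sum()                                  # formula (4)
    est = (w_new[:, None] * F_new[:, -2:]).sum(axis=0)    # formula (3)
    return F_new, est
```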
Sixth step, calculating the positioning error of the sound source to be positioned:
the calculation method of the positioning error of the sound source to be positioned is expressed as follows:
Figure BDA0001695813340000048
in the formula (6), x represents the abscissa of the actual physical position of the sound source to be positioned, y represents the ordinate of the actual physical position of the sound source to be positioned,
Figure BDA0001695813340000049
to indicate a waitThe abscissa of the actual test position of the localized sound source,
Figure BDA00016958133400000410
a vertical coordinate representing an actual test position of a sound source to be positioned;
calculating the positioning error of the sound source to be positioned according to the formula (6);
Seventh step, outputting the final positioning result of the sound source to be positioned:
(7.1) continuously repeating the steps of the fifth step and the sixth step, and storing a positioning result and a positioning error obtained by calculation after each iteration;
(7.2) when the positioning error has stabilized, the iterative process ends and the final positioning result of the sound source to be positioned is output;
therefore, the on-line positioning stage of the sound position fingerprint is completed, and the final positioning result of the sound source to be positioned is obtained.
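A sketch of the outer loop of the sixth and seventh steps, reusing iterate_once from the sketch above: the true position is only available in off-line evaluation, so the stopping rule below treats the change between successive estimates as the sign that the positioning error has stabilized, which is an interpretive assumption.

```python
import numpy as np

def locate(F0, r_x, true_pos=None, tol=1e-3, max_rounds=50):
    """Repeat the fifth/sixth steps, keep each estimate (and, when the
    true position is known, the error of formula (6)), stop on stability."""
    F, estimates, errors = F0.astype(float).copy(), [], []
    for _ in range(max_rounds):
        F, est = iterate_once(F, r_x)
        estimates.append(est)
        if true_pos is not None:
            errors.append(np.linalg.norm(est - true_pos))  # formula (6)
        if len(estimates) > 1 and \
           np.linalg.norm(estimates[-1] - estimates[-2]) < tol:
            break                                          # error stable
    return estimates[-1], errors
```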
In the iterative positioning method based on sound position fingerprints, the positioning map, the dual-parameter, dual-threshold voice endpoint detection method, the mobile robot used by the method, the formula $\arg\min_k d_k$ and the weighted K-nearest-neighbor algorithm are all well known in the art.
The invention has the beneficial effects that: compared with the prior art, the invention has the prominent substantive characteristics as follows:
(1) The method is a scene-analysis method that adopts a large distributed microphone array and uses the time difference of arrival as the position fingerprint. In the off-line sampling stage, the density of reference points arranged on the positioning map is reduced to lessen the workload of processing the off-line data; then, when the sound position fingerprint database is constructed, the reference points are clustered with a sound-position-based clustering method and a cluster center is defined for each cluster, reducing the amount of computation needed to search the database during on-line positioning. In the on-line stage, the sound position feature vector measured at the sound source to be positioned is compared with the cluster-center feature vectors in the database obtained off-line, the cluster center with the shortest Euclidean distance to the sound source is selected, and the iterative method combined with the weighted K-nearest-neighbor algorithm achieves accurate positioning within the cluster. This further reduces the positioning complexity and overcomes the defects of existing sound source positioning methods: high model dependence and low positioning accuracy in unstructured spaces.
(2) Compared with the technology of an indoor sound source positioning method based on sound position fingerprints disclosed in CN104865555B, the method has the following substantial distinguishing features and significant progress:
the method for calculating the time difference of arrival is completely different from CN104865555B, the method adopts a double-parameter double-threshold voice endpoint detection technology to calculate the time difference of arrival, and the time difference of arrival in CN104865555B is calculated through a generalized cross-correlation function; the database clustering algorithm and the iterative positioning algorithm of the method are also technologies which are not available in CN 104865555B; CN104865555B is an early technical result of the present applicant team, so the applicant can explicitly point out: although the method of the present invention is further developed and obtained on the basis of these results, the method has the following substantial distinguishing features and significant progress in the technical scheme and research focus compared with CN 104865555B:
1) The research focus of CN104865555B is how to use sound information for positioning; its grid size is determined by the indoor area and the required positioning accuracy, which easily causes redundant reference-point layouts, increases the cost of constructing the database, brings large system overhead, and hinders large-scale application of sound source positioning technology. The present method constructs the database with a small number of reference points, and its research focus is how to reduce the complexity of the sound source positioning method and achieve high-precision sound source positioning at low fingerprint density.
2) The generalized cross-correlation function adopted in CN104865555B is suitable only for environments with a single acoustic signal; since indoor environments are usually very complex, false peaks easily arise where there is reflection and diffraction, so the technology disclosed in CN104865555B cannot meet the accuracy requirements of sound source localization. The present invention adopts a dual-parameter, dual-threshold voice endpoint detection technique which, as repeatedly verified by experiment, achieves satisfactory endpoint detection accuracy even at low signal-to-noise ratios and can therefore be used in indoor environments.
3) CN104865555B describes only the computation of the sound time difference of arrival with a generalized cross-correlation function. The present method not only introduces its own way of computing the time difference of arrival, but also emphasizes filtering of the sound signals and constructs the database from the mean feature vector of repeated measurements, improving the quality of the database and the positioning accuracy of the system.
(3) Compared with the technology disclosed in the Wangshu thesis "Research on a Distributed Microphone Array Positioning Method" (hereinafter, thesis F), the method has the following substantive distinguishing features and significant progress. Thesis F constructs the database using the energy of the signals received by the microphones as the sound position fingerprint. Because the propagation of sound signals is very unstable and the indoor environment is complex, signal reflection and diffraction are difficult to estimate, so using the energy ratio of the received signals as the fingerprint degrades the positioning accuracy; although thesis F optimizes the sound signal by estimating the background noise energy, complete environmental compensation remains difficult. In contrast, the present invention constructs the database with the sound time difference of arrival as the position feature of each reference point; according to extensive prior research by the inventors, the time differences with which sound reaches the microphone channels are only slightly influenced by the environment, and filtering the sound signals beforehand and averaging the fingerprints over repeated tests when constructing the database effectively overcome the defects of the fingerprint used in thesis F. Repeated comparative experiments and comprehensive data analysis show that the positioning accuracy of the invention, which uses the time difference of arrival as the position fingerprint, is much higher than that of thesis F, which uses the received-signal energy ratio.
(4) Compared with the technology disclosed in the Wuxie thesis "Research on a Mobile Robot Sound Source Localization Method Based on Time Delay Estimation" (hereinafter, thesis J), the present invention has the following substantive distinguishing features and significant progress. The method of the invention is based on sound position fingerprint positioning, while the method of thesis J is based on geometric-model positioning. The geometric-model method of thesis J requires a suitable signal propagation model, since its positioning accuracy depends heavily on the constructed signal model (the more accurate the model, the more accurate the positioning); because a propagation model of the acoustic signal is difficult to estimate correctly in the unstructured space of an indoor environment, geometric-model positioning is unsuitable for indoor environments. The method of the invention, based on sound position fingerprints, is a typical scene-analysis method and resolves the problems of high model dependence and low positioning accuracy in unstructured spaces. In addition, the measuring device of the method differs substantially from that of thesis J: thesis J uses a small microphone array that can only estimate the azimuth of the sound source, whereas the method adopts a large distributed microphone array, suits more diverse positioning environments, and can accurately estimate the position of the sound source.
Compared with the prior art, the invention has the following remarkable improvements:
(1) the invention greatly reduces the number of reference points, reduces the workload in the off-line acquisition stage and improves the operability of the positioning method based on the sound position fingerprints in the actual application process.
(2) According to the invention, the reference points are clustered by adopting a position-based clustering method in the stage of constructing the sound position fingerprint database offline, so that the time for searching the database in the stage of online positioning is reduced, and the sound source positioning efficiency is effectively improved.
(3) The invention adopts an iterative algorithm to generate virtual adjacent reference points that gradually approach the position of the sound source to be positioned, and finally realizes positioning with the weighted K-nearest-neighbor algorithm, reducing the complexity of the method and improving the positioning accuracy compared with existing position-fingerprint methods.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a schematic block diagram of the overall design of the iterative localization method based on sound location fingerprint of the present invention.
FIG. 2 is a schematic block diagram of the flow of the offline acquisition phase in the iterative localization method based on sound location fingerprint of the present invention.
FIG. 3 is a schematic block diagram of the flow of the online location phase in the iterative location method based on sound location fingerprints according to the present invention.
Detailed Description
The embodiment shown in FIG. 1 shows that the overall design of the iterative positioning method based on sound position fingerprints of the invention consists of two major steps: A. off-line acquisition, i.e., the off-line acquisition stage of the sound position fingerprints: the two-dimensional space coordinates $[x_1, y_1], [x_2, y_2], \ldots, [x_i, y_i]$ and the corresponding sound position fingerprints $[r_{11}, r_{12}, \ldots, r_{1M}], [r_{21}, r_{22}, \ldots, r_{2M}], \ldots, [r_{i1}, r_{i2}, \ldots, r_{iM}]$ are acquired → database → cluster 1, cluster 2, ..., i.e., the sound position fingerprint database is constructed and clustered with the sound-position-based clustering method into cluster 1, cluster 2, ..., cluster K, with K cluster centers in total; B. on-line positioning, i.e., the on-line positioning stage of the sound position fingerprints: the computer extracts the sound position feature vector of the sound source to be positioned, $[r_1, r_2, \ldots, r_M]$ → cluster selection → cluster 1, cluster 2, ..., i.e., the cluster in which the sound source lies is determined → iterative positioning, i.e., the position of the sound source is calculated → positioning result, i.e., the final positioning result of the sound source to be positioned is output.
The embodiment shown in fig. 2 shows that the process of the offline acquisition stage in the iterative positioning method based on the sound location fingerprint of the present invention is that the offline acquisition stage of the sound location fingerprint: arranging a positioning scene → acquiring sound position fingerprints → clustering a sound position fingerprint database by adopting a sound position-based clustering method, and constructing the sound position fingerprint database.
The embodiment shown in FIG. 3 shows the flow of the on-line positioning stage of the iterative positioning method based on sound position fingerprints, in which the final positioning result of the sound source to be positioned is obtained: cluster selection, i.e., determining the cluster in which the sound source lies → iterative positioning, i.e., calculating the position of the sound source → calculating the positioning error of the sound source → outputting the final positioning result: the fifth and sixth steps are repeated, the positioning result and positioning error computed after each iteration are stored, and once the positioning error is stable the iterative process ends and the final positioning result of the sound source to be positioned is output.
Examples
The iterative positioning method based on the sound position fingerprint of the embodiment specifically comprises the following steps:
A. Off-line acquisition stage of the sound position fingerprints: constructing the sound position fingerprint database and clustering it:
First step, arrangement of the positioning scene:
(1.1) arranging a distributed microphone array consisting of four array elements M0, M1, M2 and M3 on the localization map, wherein the microphone M0 is a reference microphone;
(1.2) establishing a horizontal coordinate reference system with the north-south direction as the horizontal axis and the east-west direction as the vertical axis, and arranging 5 reference points at the four vertices and the center of the positioning map of step (1.1) of the first step;
thereby completing the placement of the positioning scene;
Second step, acquiring the sound position fingerprints:
(2.1) driving the mobile robot to release a position sound at each reference point selected in step (1.2) of the first step; the onset time at which each microphone of the distributed microphone array arranged in step (1.1) of the first step begins to receive the position sound is computed with a dual-parameter, dual-threshold voice endpoint detection method, and the sound time differences of arrival, i.e., the differences between the times at which the reference microphone M0 and the other microphones begin to receive the sound signal, are extracted as the sound position feature vector of the reference point; the sound position feature vector acquired at the i-th reference point at time t is recorded as $R_i^t = [r_{i1}^t, r_{i2}^t, \ldots, r_{im}^t, \ldots, r_{iM}^t]$, where $r_{im}^t$ represents the m-th sound position feature obtained at the i-th reference point at time t, and M represents the number of sound position features contained in each fingerprint;
(2.2) in order to reduce the influence of noise and obstacles on the sound signal measurement, the mobile robot collects signals T times at each reference point, and the mean of the T collected sound position feature vectors is stored as the sound position feature vector of that reference point; the sound position feature vector of the i-th reference point is thus expressed as $R_i = [r_{i1}, r_{i2}, \ldots, r_{im}, \ldots, r_{iM}]$, where $r_{im} = \frac{1}{T}\sum_{t=1}^{T} r_{im}^t$ is the m-th sound position feature of the i-th reference point;
(2.3) let $L_i = [x_i, y_i]$ be the two-dimensional space coordinates of the i-th reference point; the sound position feature vector $R_i$ of the i-th reference point obtained in step (2.2) of the second step and the corresponding two-dimensional space coordinates are combined into a group of sound position fingerprints, denoted $F_i = [R_i, L_i] = [r_{i1}, r_{i2}, \ldots, r_{im}, \ldots, r_{iM}, x_i, y_i]$, where $x_i$ is the abscissa and $y_i$ the ordinate of the i-th reference point;
thereby completing the acquisition of the sound position fingerprint;
Third step, constructing the sound position fingerprint database and clustering it:
(3.1) combining the sound position fingerprints of all reference points obtained in step (2.3) of the second step to form the sound position fingerprint database in its initial state, recorded as $F = [F_1\ F_2\ \ldots\ F_i\ \ldots\ F_I]^T$, where $F_i$ is the sound position fingerprint of the i-th reference point;
(3.2) the set of the position fingerprints of all reference points constitutes the position fingerprint database; in order to reduce the complexity of on-line positioning, the initial-state sound position fingerprint database of step (3.1) of the third step is clustered with a sound-position-based clustering method and cluster centers are defined; specifically: the positioning map of step (1.1) of the first step is divided into non-overlapping triangular positioning areas formed by adjacent reference points, and the areas are numbered clockwise as area $Z_1$, ..., area $Z_K$; reference points in the same positioning area belong to the same cluster, so the sound position fingerprints of the reference points in one cluster form the sound position fingerprint set $F\_Z_k = [F\_Z_{k1}, F\_Z_{k2}, \ldots, F\_Z_{kn}, \ldots, F\_Z_{kN}]$, where $F\_Z_{kn}$ is the sound position fingerprint of the n-th reference point in cluster k, K is the number of clusters, and N is the number of reference points in cluster k; once all reference points have been assigned to their clusters, a cluster center is defined for each cluster, forming cluster 1, cluster 2, ..., cluster k, ..., cluster K, with K cluster centers in total; the feature vector $\bar{R}_k$ of each cluster center is the mean of the feature vectors of all reference points in that cluster, which yields the final sound position fingerprint database;
thus, a sound position fingerprint database is constructed, and the sound position fingerprint database is clustered;
thus finishing the off-line collection of the sound position fingerprints, and constructing and clustering a sound position fingerprint database;
B. On-line positioning stage of the sound position fingerprints: obtaining the final positioning result of the sound source to be positioned:
Fourth step, cluster selection, i.e., determining the cluster in which the sound source to be positioned lies:
(4.1) the mobile robot releases a sound signal at the position of the sound source to be positioned; the distributed microphone array arranged in step (1.1) of the first step captures the sound signal and uploads it to the computer, and the computer extracts the sound position feature vector of the sound source to be positioned, $R_x = [r_1, r_2, \ldots, r_m, \ldots, r_M]$, where $r_m$ is the m-th sound position feature of the sound source to be positioned;
(4.2) using the Euclidean distance, the similarity between the sound position feature vector $R_x$ of the sound source to be positioned and the feature vector $\bar{R}_k$ of each cluster center from step (3.2) of the third step is computed as $d_k = \lVert R_x - \bar{R}_k \rVert = \sqrt{\sum_{m=1}^{M} (r_m - \bar{r}_{km})^2}$, where $d_k$ is the Euclidean distance between $R_x$ and $\bar{R}_k$; the Euclidean distance expresses the similarity between the position features of the point to be positioned and each cluster center: the shorter the Euclidean distance, the more similar the position features, the shorter the distance between the point and the cluster center, and the more likely the point lies in that cluster; conversely, the larger the Euclidean distance, the more the position features differ, the greater the distance between the point and the cluster center, and the less likely the point lies in that cluster;
(4.3) the cluster in which the sound source to be positioned lies is then determined by the formula $\arg\min_k d_k$;
Fifth step, iterative positioning, i.e., calculating the position of the sound source to be positioned:
(5.1) the iterative process is started with the sound position fingerprint set $F\_Z_k = [F\_Z_{k1}, \ldots, F\_Z_{kn}, \ldots, F\_Z_{kN}]$ of the reference points in the cluster from step (3.2) of the third step as the initial input, where $F\_Z_{kn}$ is the sound position fingerprint of the n-th adjacent reference point in cluster k;
(5.2) the adjacent virtual reference points are continuously updated as the iteration proceeds; after the l-th iteration, the sound position fingerprint set of the adjacent virtual reference points is expressed as $F\_Z_k^{l} = [F\_Z_{k1}^{l}, \ldots, F\_Z_{kn}^{l}, \ldots, F\_Z_{kN}^{l}]$, where $F\_Z_{kn}^{l}$ is the sound position fingerprint of the n-th adjacent virtual reference point after l iterations;
(5.3) after the (l+1)-th iteration, the sound position fingerprint $F\_Z_{kn}^{l+1}$ of the n-th adjacent virtual reference point is computed as:
$F\_Z_{kn}^{l+1} = \frac{1}{2}\Big(F\_Z_{kn}^{l} + \sum_{n=1}^{N} w_{kn}^{l}\, F\_Z_{kn}^{l}\Big)$ (1)
In formula (1), $w_{kn}^{l}$ is the weight coefficient of the n-th adjacent virtual reference point after the l-th iteration, computed as:
$w_{kn}^{l} = \dfrac{1/(d_{kn}^{l} + \varepsilon)}{\sum_{n=1}^{N} 1/(d_{kn}^{l} + \varepsilon)}$ (2)
In formula (2), $d_{kn}^{l}$ is the Euclidean distance between the n-th adjacent virtual reference point after the l-th iteration and the sound source to be positioned; the small random number $\varepsilon$ prevents the denominator from being 0;
(5.4) the coordinates of the sound source to be positioned are computed with the weighted K-nearest-neighbor algorithm:
$(\hat{x}, \hat{y}) = \sum_{n=1}^{N} w_{kn}\,(x_{kn}, y_{kn})$ (3)
In formula (3), $(x_{kn}, y_{kn})$ are the position coordinates of the n-th adjacent virtual reference point in cluster k generated by the latest iteration, and $w_{kn}$ is the weight coefficient of that point, computed as:
$w_{kn} = \dfrac{1/(d_{kn} + \varepsilon)}{\sum_{n=1}^{N} 1/(d_{kn} + \varepsilon)}$ (4)
In formula (4), $d_{kn}$ is the Euclidean distance between the n-th adjacent virtual reference point generated by the latest iteration in cluster k and the sound source to be positioned, computed as:
$d_{kn} = \sqrt{\sum_{m=1}^{M} (r_m - r_{knm})^2}$ (5)
In formula (5), $r_m$ is the m-th sound position feature of the sound source to be positioned, and $r_{knm}$ is the m-th sound position feature of the n-th adjacent virtual reference point after the latest iteration;
completing iterative positioning through the steps (5.1) - (5.4) in the fifth step, and calculating the position of the sound source to be positioned;
Sixth step, calculating the positioning error of the sound source to be positioned:
the calculation method of the positioning error of the sound source to be positioned is expressed as follows:
Figure BDA0001695813340000102
in the formula (6), x represents the abscissa of the actual physical position of the sound source to be positioned, y represents the ordinate of the actual physical position of the sound source to be positioned,
Figure BDA0001695813340000103
an abscissa representing the actual test position of the sound source to be located,
Figure BDA0001695813340000104
a vertical coordinate representing an actual test position of a sound source to be positioned;
calculating the positioning error of the sound source to be positioned according to the formula (6);
Seventh step, outputting the final positioning result of the sound source to be positioned:
(7.1) continuously repeating the steps of the fifth step and the sixth step, and storing a positioning result and a positioning error obtained by calculation after each iteration;
(7.2) when the positioning error has stabilized, the iterative process ends and the final positioning result of the sound source to be positioned is output;
therefore, the on-line positioning stage of the sound position fingerprint is completed, and the final positioning result of the sound source to be positioned is obtained.
In the iterative positioning method based on sound position fingerprints, the positioning map, the dual-parameter, dual-threshold voice endpoint detection method, the mobile robot used by the method, the formula $\arg\min_k d_k$ and the weighted K-nearest-neighbor algorithm are all well known in the art.

Claims (1)

1. The iterative positioning method based on the sound position fingerprint is characterized by comprising the following specific steps:
A. Off-line acquisition stage of the sound position fingerprints: constructing the sound position fingerprint database and clustering it:
First step, arrangement of the positioning scene:
(1.1) arranging a distributed microphone array consisting of four array elements M0, M1, M2 and M3 on the localization map, wherein the microphone M0 is a reference microphone;
(1.2) arranging 5 reference points at the four vertices and the center of the positioning map of step (1.1) of the first step;
thereby completing the placement of the positioning scene;
Second step, acquiring the sound position fingerprints:
(2.1) driving the mobile robot to release a position sound at each reference point selected in step (1.2) of the first step; the onset time at which each microphone of the distributed microphone array arranged in step (1.1) of the first step begins to receive the position sound is computed with a dual-parameter, dual-threshold voice endpoint detection method, and the time differences between the reference microphone M0 and the other microphones, i.e., the sound time differences of arrival, are extracted as the sound position feature vector of the reference point; the sound position feature vector acquired at the i-th reference point at time t is recorded as $R_i^t = [r_{i1}^t, r_{i2}^t, r_{i3}^t]$, where $r_{im}^t$, $1 \le m \le M$, represents the m-th sound position feature obtained at the i-th reference point at time t, and M = 3 represents the number of sound position features contained in each fingerprint;
(2.2) the mobile robot collects signals S times at each reference point, and the mean of the S collected sound position feature vectors is stored as the sound position feature vector of that reference point; the sound position feature vector of the i-th reference point is thus expressed as $R_i = [r_{i1}, r_{i2}, r_{i3}]$, where $r_{im} = \frac{1}{S}\sum_{s=1}^{S} r_{im}^s$ is the m-th sound position feature of the i-th reference point and $r_{im}^s$ represents the m-th sound position feature acquired at the i-th reference point in the s-th collection;
(2.3) let $L_i = [x_i, y_i]$ be the two-dimensional space coordinates of the i-th reference point; the sound position feature vector $R_i$ of the i-th reference point obtained in step (2.2) of the second step and the corresponding two-dimensional space coordinates are combined into a group of sound position fingerprints, denoted $F_i = [R_i, L_i] = [r_{i1}, r_{i2}, r_{i3}, x_i, y_i]$, where $x_i$ is the abscissa and $y_i$ the ordinate of the i-th reference point;
thereby completing the acquisition of the sound position fingerprint;
Third step, constructing the sound position fingerprint database and clustering it:
(3.1) combining the sound position fingerprints of all reference points obtained in step (2.3) of the second step to form the sound position fingerprint database in its initial state, recorded as $F = [F_1\ F_2\ \ldots\ F_i\ \ldots\ F_I]^T$, where $F_i$ is the sound position fingerprint of the i-th reference point;
(3.2) clustering the initial-state sound position fingerprint database of step (3.1) of the third step with a sound-position-based clustering method and defining cluster centers; specifically: the positioning map of step (1.1) of the first step is divided into non-overlapping triangular positioning areas formed by adjacent reference points, and the areas are numbered clockwise as area $Z_1$, ..., area $Z_K$; reference points in the same positioning area belong to the same cluster, so the sound position fingerprints of the reference points in one cluster form the sound position fingerprint set $F\_Z_k = [F\_Z_{k1}, F\_Z_{k2}, \ldots, F\_Z_{kn}, \ldots, F\_Z_{kN}]$, where $F\_Z_{kn}$ is the sound position fingerprint of the n-th reference point in cluster k, K is the number of clusters, and N is the number of reference points in cluster k; once all reference points have been assigned to their clusters, a cluster center is defined for each cluster, forming cluster 1, cluster 2, ..., cluster k, ..., cluster K, with K cluster centers in total; the feature vector $\bar{R}_k$ of each cluster center is the mean of the feature vectors of all reference points in that cluster, which yields the final sound position fingerprint database;
thus, a sound position fingerprint database is constructed, and the sound position fingerprint database is clustered;
thus finishing the off-line collection of the sound position fingerprints, and constructing and clustering a sound position fingerprint database;
B. On-line positioning stage of the sound position fingerprints: obtaining the final positioning result of the sound source to be positioned:
Fourth step, cluster selection, i.e., determining the cluster in which the sound source to be positioned lies:
(4.1) the mobile robot releases a sound signal at the position of the sound source to be positioned; the distributed microphone array arranged in step (1.1) of the first step captures the sound signal and uploads it to the computer, and the computer extracts the sound position feature vector of the sound source to be positioned, $R_x = [r_1, r_2, r_3]$, where $r_m$ is the m-th sound position feature of the sound source to be positioned;
(4.2) using the Euclidean distance, the similarity between the sound position feature vector $R_x$ of the sound source to be positioned and the feature vector $\bar{R}_k$ of each cluster center from step (3.2) of the third step is computed as
$d_k = \lVert R_x - \bar{R}_k \rVert = \sqrt{\sum_{m=1}^{M} (r_m - \bar{r}_{km})^2}$,
where $d_k$ is the Euclidean distance between $R_x$ and $\bar{R}_k$;
(4.3) the cluster in which the sound source to be positioned lies is then determined by the formula $\arg\min_k d_k$;
Fifth step, iterative positioning, i.e., calculating the position of the sound source to be positioned:
(5.1) the iterative process is started with the sound position fingerprint set of the cluster in which the sound source to be positioned lies, determined in step (4.3), as the initial input, where $F\_Z_{kn}$ is the sound position fingerprint of the n-th adjacent reference point in cluster k;
(5.2) the adjacent virtual reference points are continuously updated as the iteration proceeds; after the l-th iteration, the sound position fingerprint set of the adjacent virtual reference points is expressed as $F\_Z_k^{l} = [F\_Z_{k1}^{l}, \ldots, F\_Z_{kn}^{l}, \ldots, F\_Z_{kN}^{l}]$, where $F\_Z_{kn}^{l}$ is the sound position fingerprint of the n-th adjacent virtual reference point after l iterations;
(5.3) after the (l+1)-th iteration, the sound position fingerprint $F\_Z_{kn}^{l+1}$ of the n-th adjacent virtual reference point is computed as:
$F\_Z_{kn}^{l+1} = \frac{1}{2}\Big(F\_Z_{kn}^{l} + \sum_{n=1}^{N} w_{kn}^{l}\, F\_Z_{kn}^{l}\Big)$ (1)
In formula (1), $w_{kn}^{l}$ is the weight coefficient of the n-th adjacent virtual reference point after the l-th iteration, computed as:
$w_{kn}^{l} = \dfrac{1/(d_{kn}^{l} + \varepsilon)}{\sum_{n=1}^{N} 1/(d_{kn}^{l} + \varepsilon)}$ (2)
In formula (2), $d_{kn}^{l}$ is the Euclidean distance between the n-th adjacent virtual reference point after the l-th iteration and the sound source to be positioned; the small random number $\varepsilon$ prevents the denominator from being 0;
(5.4) the coordinates of the sound source to be positioned are computed with the weighted K-nearest-neighbor algorithm:
$(\hat{x}, \hat{y}) = \sum_{n=1}^{N} w_{kn}\,(x_{kn}, y_{kn})$ (3)
In formula (3), $(x_{kn}, y_{kn})$ are the position coordinates of the n-th adjacent virtual reference point in cluster k generated by the latest iteration, and $w_{kn}$ is the weight coefficient of that point, computed as:
$w_{kn} = \dfrac{1/(d_{kn} + \varepsilon)}{\sum_{n=1}^{N} 1/(d_{kn} + \varepsilon)}$ (4)
In formula (4), $d_{kn}$ is the Euclidean distance between the n-th adjacent virtual reference point generated by the latest iteration in cluster k and the sound source to be positioned, computed as:
$d_{kn} = \sqrt{\sum_{m=1}^{M} (r_m - r_{knm})^2}$ (5)
In formula (5), $r_m$ is the m-th sound position feature of the sound source to be positioned, and $r_{knm}$ is the m-th sound position feature of the n-th adjacent virtual reference point after the latest iteration;
completing iterative positioning through the steps (5.1) - (5.4) in the fifth step, and calculating the position of the sound source to be positioned;
Sixth step, calculating the positioning error of the sound source to be positioned:
the calculation method of the positioning error of the sound source to be positioned is expressed as follows:
Figure FDA0003164443740000033
in the formula (6), x represents undeterminedThe abscissa of the actual physical position of the localized sound source, y represents the ordinate of the actual physical position of the sound source to be localized,
Figure FDA0003164443740000034
an abscissa representing the actual test position of the sound source to be located,
Figure FDA0003164443740000035
a vertical coordinate representing an actual test position of a sound source to be positioned;
calculating the positioning error of the sound source to be positioned according to the formula (6);
Seventh step, outputting the final positioning result of the sound source to be positioned:
(7.1) continuously repeating the steps of the fifth step and the sixth step, and storing a positioning result and a positioning error obtained by calculation after each iteration;
(7.2) when the positioning error has stabilized, the iterative process ends and the final positioning result of the sound source to be positioned is output;
therefore, the on-line positioning stage of the sound position fingerprint is completed, and the final positioning result of the sound source to be positioned is obtained.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810611952.5A CN108896962B (en) 2018-06-14 2018-06-14 Iterative positioning method based on sound position fingerprint


Publications (2)

Publication Number Publication Date
CN108896962A CN108896962A (en) 2018-11-27
CN108896962B (en) 2022-02-08

Family

ID=64345170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810611952.5A Active CN108896962B (en) 2018-06-14 2018-06-14 Iterative positioning method based on sound position fingerprint

Country Status (1)

Country Link
CN (1) CN108896962B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11226396B2 (en) * 2019-06-27 2022-01-18 Gracenote, Inc. Methods and apparatus to improve detection of audio signatures
CN110673125B (en) * 2019-09-04 2020-12-25 珠海格力电器股份有限公司 Sound source positioning method, device, equipment and storage medium based on millimeter wave radar
CN112566056B (en) * 2020-12-07 2022-06-24 浙江德清知路导航研究院有限公司 Electronic equipment indoor positioning system and method based on audio fingerprint information
CN112684412B (en) * 2021-01-12 2022-09-13 中北大学 Sound source positioning method and system based on pattern clustering
US20220317272A1 (en) * 2021-03-31 2022-10-06 At&T Intellectual Property I, L.P. Using Scent Fingerprints and Sound Fingerprints for Location and Proximity Determinations
CN114046968A (en) * 2021-10-04 2022-02-15 北京化工大学 Two-step fault positioning method for process equipment based on acoustic signals


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011087757A1 (en) * 2010-01-13 2011-07-21 Rovi Technologies Corporation Rolling audio recognition
WO2015127858A1 (en) * 2014-02-27 2015-09-03 华为技术有限公司 Indoor positioning method and apparatus
CN104865555A (en) * 2015-05-19 2015-08-26 河北工业大学 Indoor sound source localization method based on sound position fingerprints
CN107222851A (en) * 2017-04-07 2017-09-29 南京邮电大学 A kind of method of utilization difference secret protection Wifi Fingerprint indoor locating system privacies

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Ma, Rui et al.; "An Improved WiFi Indoor Positioning Algorithm by Weighted Fusion"; Sensors, vol. 15, no. 9, 2015-08-31 *
Liu Chunyan et al.; "Constrained KNN indoor positioning model based on a geometric clustering fingerprint database" (基于几何聚类指纹库的约束KNN室内定位模型); Geomatics and Information Science of Wuhan University, vol. 39, no. 11, 2014-11 *
Yang Peng et al.; "Improved fingerprint positioning method for robot auditory systems" (机器人听觉系统中指纹定位改进方法); Chinese Journal of Sensors and Actuators (传感技术学报), vol. 31, no. 10, 2018-10 *

Also Published As

Publication number Publication date
CN108896962A (en) 2018-11-27

Similar Documents

Publication Publication Date Title
CN108896962B (en) Iterative positioning method based on sound position fingerprint
CN107102296B (en) Sound source positioning system based on distributed microphone array
CN110012428B (en) Indoor positioning method based on WiFi
CN108989976B (en) Fingerprint positioning method and system in intelligent classroom
CN105223546B (en) Indoor orientation method based on received signal strength and reference point locations double focusing class
CN102932738A (en) Improved positioning method of indoor fingerprint based on clustering neural network
CN108882172B (en) Indoor moving trajectory data prediction method based on HMM model
CN108627798B (en) WLAN indoor positioning algorithm based on linear discriminant analysis and gradient lifting tree
CN110536257B (en) Indoor positioning method based on depth adaptive network
CN109195110B (en) Indoor positioning method based on hierarchical clustering technology and online extreme learning machine
CN110927669B (en) CS multi-sound-source positioning method and system for wireless acoustic sensor network
CN107167770A (en) A kind of microphone array sound source locating device under the conditions of reverberation
CN112101243A (en) Human body action recognition method based on key posture and DTW
CN109100685A (en) A kind of passive acoustic direction blending algorithm of two-sided quaternary cross battle array
WO2022242018A1 (en) Indoor target positioning method based on improved cnn model
CN109164416B (en) Sound source positioning method of three-plane five-element microphone array
CN113132931B (en) Depth migration indoor positioning method based on parameter prediction
CN109600711B (en) Indoor positioning method based on channel response frequency domain and spatial domain combined processing
CN110333484B (en) Indoor area level positioning method based on environmental background sound perception and analysis
Ding et al. Microphone array acoustic source localization system based on deep learning
CN114679683A (en) Indoor intelligent positioning method based on derivative fingerprint migration
CN113194401B (en) Millimeter wave indoor positioning method and system based on generative countermeasure network
CN114998539A (en) Smart city sensor terrain positioning and mapping method
CN113660723A (en) Indoor fingerprint positioning method based on neural network
Yang et al. A Review of Sound Source Localization Research in Three-Dimensional Space

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant