CN107766372A - Method, device and system for maintaining an animal database - Google Patents
Method, device and system for maintaining an animal database
- Publication number
- CN107766372A (application CN201610694221.2A)
- Authority
- CN
- China
- Prior art keywords
- sound
- voiceprint
- species
- sub-database
- animal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
- G06F16/432—Query formulation
- G06F16/433—Query formulation using audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
Abstract
The invention provides a method, device and system for maintaining an animal database. The pre-established animal database includes at least one species sub-database, and each species sub-database includes a voiceprint library and a multimedia information library. When ambient sound is acquired, first voiceprint information is extracted from each acquired source sound signal and matched against the voiceprint models in the voiceprint library of each species sub-database. If the match succeeds, the source sound signal is stored in the corresponding multimedia information library. Second voiceprint information of each source sound signal is then analyzed, and the source sound signals in each species sub-database are clustered. The animal sound database can thus be maintained without manual operation, which greatly improves the efficiency of collecting animal data in each region, enables automatic counting of the animals in the collection region, and facilitates building a personalized database for each individual animal.
Description
Technical field
The present invention relates to the fields of ecological monitoring and acoustic signal processing, and more particularly to a method, device and system for maintaining an animal database.
Background
As the ideas of animal protection and of harmonious coexistence between humans and animals become more widespread, monitoring animal behavior and studying animal habits are receiving increasing attention. Sound is an important means of information exchange between animals. Studying animal sound signals makes it possible to understand communication between individual animals, such as species identification, individual identification and mate choice. Collecting animal sound signals can also help to discover new species, assess the status hierarchy within an animal population, analyze the growth and development of animals, and so on. The study of animal vocalization is therefore essential to the study of animal behavior. Humans have human language and animals have their own; each species, and each individual within a species, has unique call characteristics. How to use the different calls of each animal to classify the animals in a region has become a major problem in biology and statistics.
Summary of the invention
The invention provides a method, device and system for maintaining an animal database, which solve the problem of how to use animal sounds to classify the animals in a given area automatically.
To solve the above technical problem, the invention provides a method for maintaining an animal database, including:
Collecting the sounds emitted by different sound sources to obtain at least one source sound signal;
Extracting first voiceprint information from the source sound signal;
Matching the extracted first voiceprint information against the voiceprint models in the voiceprint library of each species sub-database in the animal database; when the match succeeds, storing the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
Analyzing second voiceprint information of each source sound signal in the species sub-database, clustering the source sound signals in the species sub-database according to the analysis result, and determining the sounds of the different individuals in the species sub-database; wherein the animal database includes at least one species sub-database, and each species sub-database includes a voiceprint library and a multimedia information library.
Further, collecting the sounds emitted by different sound sources includes: detecting whether the ambient sound contains active sound; if active sound is detected, collecting the active sound and performing sound segmentation.
Further, collecting the active sounds emitted by different sound sources includes: performing sound source localization on the active sound, so that the collection direction points to the direction from which the active sound originates.
Further, collecting the sounds emitted by different sound sources also includes: taking photographs and/or recording video at a set time interval to obtain at least one image signal and/or video signal.
Further, when the match succeeds, the method also includes: storing the image signal and/or video signal collected at the same time as the source sound signal in the multimedia information library of the species sub-database, in association with that source sound signal.
Further, after the source sound signal corresponding to the successfully matched first voiceprint information is stored in the multimedia information library of the corresponding species sub-database, the method also includes: recording the acquisition time of the source sound signal in the multimedia information library.
Further, detecting the ambient sound and collecting the sounds emitted by different sound sources to obtain at least one source sound signal includes: a user actively detecting the ambient sound and sending the obtained source sound signal to a server.
Further, after the source sound signal corresponding to the successfully matched first voiceprint information is stored in the multimedia information library of the corresponding species sub-database, the method also includes: updating the corresponding voiceprint library with the successfully matched first voiceprint information.
Further, before the ambient sound is detected, the method also includes obtaining the voiceprint models, including: collecting vocalization samples of the animal species corresponding to each species sub-database; and extracting the first voiceprint information from the vocalization samples and generating the corresponding voiceprint models.
Further, matching the extracted first voiceprint information against the voiceprint models in the voiceprint library of each species sub-database in the animal database includes: if the first voiceprint information does not match the voiceprint models in the voiceprint library of any species sub-database, creating a new species sub-database and storing the first voiceprint information in the multimedia information library of the new species sub-database.
Further, analyzing the second voiceprint information of each source sound signal in the species sub-database and clustering the source sound signals in the species sub-database according to the analysis result includes:
Comparing, pairwise, the second voiceprint information of the sound signal segments formed by all source sound signals in the multimedia information library of the species sub-database, and selecting the two sound signal segments whose second voiceprint information differs the least;
Comparing the difference between the second voiceprint information of these two sound signal segments with a preset value;
If the difference between the second voiceprint information of the two sound signal segments is less than the preset value, merging the two sound signal segments into one sound signal segment;
Repeating all of the above steps, starting from the pairwise comparison of the second voiceprint information of all sound signal segments in the multimedia information library of the species sub-database, until the difference between the second voiceprint information of any two sound signal segments is greater than the preset value; the remaining sound signal segments then correspond to different individual animals and are not merged further.
Further, after the source sound signals in the species sub-database are clustered, the method also includes: determining the number of individual animals in the species sub-database, and the correspondence between each individual animal and the information in the multimedia information library.
Further, after the source sound signals in the species sub-database are clustered, when a new source sound signal is stored in the multimedia information library of the species sub-database, the method also includes: taking the set of all source sound signals corresponding to each individual animal in the existing clustering result as one initial cluster input, taking each new source sound signal as one initial cluster input, and performing clustering;
Or, performing the clustering operation again on the new source sound signal together with all existing source sound signals in the species sub-database.
Further, after the source sound signals in the species sub-database are clustered, when a new source sound signal is stored in the multimedia information library of the species sub-database, the method also includes: performing individual animal identification on the new source sound signal based on all existing sound signals in the species sub-database; if the new source sound signal belongs to an existing individual animal in the species sub-database, labeling the new source sound signal as belonging to that individual; if the new source sound signal does not belong to any existing individual animal in the species sub-database, creating a new individual animal identifier.
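The individual identification of a newly stored source sound signal can be illustrated with a minimal sketch. It assumes that each signal has already been reduced to a single numeric feature and that a difference function and preset value are available; the names `assign_individual`, `mean_gap` and `preset` are hypothetical and do not come from the patent text.

```python
def assign_individual(new_feature, individuals, gap, preset):
    """Label a new signal with an existing individual or create a new one.

    individuals maps an individual id (int) to a list of feature values.
    Returns the id that the new signal was assigned to.
    """
    best_id, best_gap = None, None
    for ind_id, features in individuals.items():
        g = gap(features, [new_feature])
        if best_gap is None or g < best_gap:
            best_id, best_gap = ind_id, g

    if best_id is not None and best_gap < preset:
        # The new signal belongs to an existing individual.
        individuals[best_id].append(new_feature)
        return best_id

    # No existing individual is close enough: create a new identifier.
    new_id = max(individuals, default=0) + 1
    individuals[new_id] = [new_feature]
    return new_id


def mean_gap(a, b):
    """Toy difference measure (assumption): distance between feature means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


known = {1: [1.0, 1.2], 2: [5.0, 5.2]}
close_id = assign_individual(1.1, known, mean_gap, preset=1.0)
fresh_id = assign_individual(9.0, known, mean_gap, preset=1.0)
```

In this toy run the first signal joins individual 1 and the second receives a new identifier; a real system would use voiceprint features such as MFCC vectors rather than a single number.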
To solve the above technical problem, the invention also provides a device for maintaining an animal database, including:
A sound acquisition module, configured to collect the sounds emitted by different sound sources to obtain at least one source sound signal;
An extraction module, configured to extract the corresponding first voiceprint information from the source sound signal;
A matching module, configured to match the extracted first voiceprint information against the voiceprint models in the voiceprint library of each species sub-database in the animal database;
A classification module, configured to store, when the matching module reports a successful match, the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
A clustering module, configured to analyze second voiceprint information of each source sound signal in the species sub-database, cluster the source sound signals in the species sub-database according to the analysis result, and determine the sounds of the different individuals in the species sub-database.
Further, the sound acquisition module includes a judging sub-module and a collection sub-module. The judging sub-module is configured to judge whether the ambient sound contains active sound; the collection sub-module is configured to collect the active sound and perform sound segmentation when the judging sub-module judges that active sound is present.
Further, the sound acquisition module also includes a localization sub-module, configured to perform sound source localization on the active sound so that the collection direction of the collection sub-module points to the direction from which the active sound originates.
Further, the device also includes an image acquisition module, configured to take photographs and/or record video at a set time interval while the sound acquisition module collects the sounds emitted by different sound sources, obtaining at least one image signal and/or video signal.
Further, the image acquisition module also includes a saving sub-module, configured to store the image signal and/or video signal collected at the same time as the source sound signal successfully matched by the matching module in the multimedia information library of the species sub-database, in association with that source sound signal.
Further, the device also includes a time recording module, configured to record the acquisition time of the source sound signal in the multimedia information library after the classification module stores the successfully matched source sound signal in the corresponding multimedia information library.
Further, the device also includes an update module, configured to update the corresponding voiceprint library with the successfully matched first voiceprint information after the classification module stores the successfully matched source sound signal in the corresponding multimedia information library.
Further, the device also includes a voiceprint model acquisition module, configured to obtain the voiceprint models before the sound acquisition module detects the ambient sound, specifically by: collecting vocalization samples of the animal species corresponding to each species sub-database; and extracting the first voiceprint information from the vocalization samples and generating the corresponding voiceprint models.
Further, the classification module includes a creation sub-module, configured to create a new species sub-database when the first voiceprint information does not match the voiceprint models in the voiceprint library of any species sub-database, and to store the first voiceprint information in the multimedia information library of the new species sub-database.
Further, the clustering module includes a comparison sub-module, an analysis sub-module and a clustering sub-module. The comparison sub-module is configured to compare, pairwise, the second voiceprint information of the sound signal segments formed by all source sound signals in the multimedia information library of the species sub-database. The analysis sub-module is configured to compare with a preset value the difference between the two sound signal segments that the comparison sub-module finds to have the smallest difference in second voiceprint information. The clustering sub-module is configured to merge the two sound signal segments into one sound signal segment when the analysis sub-module determines that the difference between their second voiceprint information is less than the preset value.
To solve the above technical problem, the invention also provides a system for maintaining an animal database, including an animal database and the above device for maintaining an animal database. The animal database includes at least one species sub-database; each species sub-database includes a voiceprint library and a multimedia information library, and the voiceprint library includes at least one voiceprint model. The species sub-databases in the animal database each correspond to a different animal species.
Beneficial effects of the invention:
The invention provides a method, device and system for maintaining an animal database. The pre-established animal database includes at least one species sub-database, and each species sub-database includes a voiceprint library and a multimedia information library. When ambient sound is acquired, first voiceprint information is extracted from the acquired source sound signal and matched against the voiceprint models in the voiceprint library of each species sub-database; if the match succeeds, the source sound signal is stored in the corresponding multimedia information library. The source sound signals in each species sub-database are then clustered, and the source sound signals emitted by the same individual animal are merged. The animal sound database can thus be maintained without manual operation, which greatly improves the efficiency of collecting animal data in each region, enables automatic counting of the animals in the collection region, and facilitates building a personalized database for each individual animal.
Brief description of the drawings
Fig. 1 is a flowchart of a method for maintaining an animal database provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a clustering method provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of a method for maintaining an animal database provided by Embodiment 2 of the present invention;
Fig. 4 is a flowchart of a method for maintaining an animal database provided by Embodiment 3 of the present invention;
Fig. 5 is a flowchart of a method for maintaining an animal database provided by Embodiment 4 of the present invention;
Fig. 6 is a flowchart of a method for maintaining an animal database provided by Embodiment 5 of the present invention;
Fig. 7 is a schematic diagram of a device for maintaining an animal database provided by Embodiment 6 of the present invention.
Detailed description
The inventive concept is as follows. Exploiting the differences between the vocalization characteristics of various animals, voiceprint information is extracted from vocalization samples of each animal and used to build voiceprint models, which form the voiceprint libraries of the different species sub-databases in the animal database. The ambient sound is then detected and acquired, and the voiceprint information of each acquired source sound signal is extracted and matched against the voiceprint models in the voiceprint library of each species sub-database. If the match succeeds, the source sound signal is deemed to have been emitted by an animal of the same species as that species sub-database, and it is stored in the corresponding multimedia information library. The source sound signals in each multimedia information library are then clustered to determine the sounds of different individuals. In this way each animal species can be classified and counted effectively, which enables automatic counting of animals and facilitates building a personalized database for each individual animal, without capturing the animals and with minimal disturbance to their lives.
Specific implementations of the invention are further described below with reference to the accompanying drawings.
Embodiment 1
This embodiment provides a method for maintaining an animal database. Referring to Fig. 1, the method includes:
S101: Collect the sounds emitted by different sound sources to obtain at least one source sound signal;
S102: Extract first voiceprint information from the source sound signal;
S103: Match the extracted first voiceprint information against the voiceprint models in the voiceprint library of each species sub-database in the animal database; when the match succeeds, store the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
S104: Analyze second voiceprint information of each source sound signal in the same species sub-database, cluster the source sound signals in the species sub-database according to the analysis result, and determine the sounds of the different individuals in the species sub-database; wherein the animal database includes at least one species sub-database, and each species sub-database includes a voiceprint library and a multimedia information library.
The animal database in this embodiment is established as needed, specifically according to the animal species to be classified and counted. The animal database includes at least one species sub-database, and each species sub-database corresponds to one animal species, so different species sub-databases correspond to different animal species. Each species sub-database includes a voiceprint library and a multimedia information library. The voiceprint library is formed as follows: vocalization samples of the animal corresponding to the species sub-database are collected, the first voiceprint information in the samples is extracted and used as voiceprint models, and these voiceprint models make up the library. The multimedia information library stores the collected source sound signals together with the corresponding image and video signals; that is, when the animal database is first established, the multimedia information library is empty. The structure of the animal database is shown in Table 1:
Table 1
The structure of the multimedia information library in each species sub-database of the animal database is shown in Table 2. Each entry in the table corresponds to the multimedia signals and related information of one individual of the animal species collected at one time. On initialization the multimedia information library is empty.
Table 2
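The layout described for Tables 1 and 2 can be sketched as a small set of record types. The tables themselves are not reproduced in this text, so every field name below (`sound_path`, `acquired_at`, `individual_id` and so on) is an assumption chosen for illustration, as are the sample file name and timestamp.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MediaEntry:
    """One entry of a multimedia information library (hypothetical fields)."""
    sound_path: str                      # stored source sound signal
    acquired_at: str                     # acquisition time, e.g. an ISO 8601 string
    location: Optional[str] = None       # label or coordinates of the collecting location
    image_path: Optional[str] = None     # image captured with the sound, if any
    video_path: Optional[str] = None     # video captured with the sound, if any
    individual_id: Optional[int] = None  # filled in later by clustering


@dataclass
class SpeciesSubDatabase:
    """One species: its voiceprint library plus its multimedia information library."""
    species: str
    voiceprint_models: List[List[float]] = field(default_factory=list)
    media: List[MediaEntry] = field(default_factory=list)  # empty on initialization


@dataclass
class AnimalDatabase:
    sub_databases: List[SpeciesSubDatabase] = field(default_factory=list)


# Illustrative data only: one species with one model and one stored clip.
db = AnimalDatabase([SpeciesSubDatabase("species A", voiceprint_models=[[0.1, 0.2]])])
db.sub_databases[0].media.append(
    MediaEntry(sound_path="clip_0001.wav", acquired_at="2016-08-19T06:30:00")
)
```

Keeping `individual_id` optional reflects the description above: entries are stored at matching time, and the individual they belong to is only known after clustering.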
Detecting the ambient sound means periodically monitoring the surrounding sound, mainly in order to capture animal calls; once an animal call is detected, it can be collected. Note that the vocalizing source is not necessarily a single individual animal; there may well be several. Sound segmentation can be used to distinguish sounds that do not come from the same source: it finds the time points in the continuous sound at which the vocalizing source changes, so that the signals belonging to different individual animals are separated in time. Sound segmentation can be implemented with methods such as model-based Bayesian algorithms and the cross likelihood ratio algorithm.
Further, to make animal sounds easier to obtain, detecting the ambient sound includes: detecting whether the ambient sound contains active sound, and if so, collecting the active sound. Here, active sound means sound whose signal intensity reaches a certain level. The animal species present in the surrounding environment are generally known, so the vocalization intensity range of each species can be determined; that is, only sounds whose intensity reaches the minimum vocalization intensity of the animals need to be collected. This avoids collecting too many non-animal sounds, which would impair the accuracy of matching.
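The intensity test for active sound can be sketched as a frame-energy threshold. The patent text does not say how intensity is measured, so the use of per-frame RMS, the frame length and the threshold `min_rms` are all assumptions made for this example.

```python
import math


def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return the RMS of each."""
    rms = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms


def active_segments(samples, frame_len, min_rms):
    """Return (start, end) sample indices of runs of frames with RMS >= min_rms."""
    segments = []
    run_start = None
    for i, level in enumerate(frame_rms(samples, frame_len)):
        if level >= min_rms and run_start is None:
            run_start = i * frame_len              # an active run begins
        elif level < min_rms and run_start is not None:
            segments.append((run_start, i * frame_len))
            run_start = None                       # the active run ended
    if run_start is not None:
        # The signal ended while still active: close the run at the last full frame.
        n_frames = len(samples) // frame_len
        segments.append((run_start, n_frames * frame_len))
    return segments


# Quiet background, a loud burst, then quiet again.
signal = [0.01] * 40 + [0.5, -0.5] * 20 + [0.01] * 40
print(active_segments(signal, frame_len=10, min_rms=0.1))  # prints [(40, 80)]
```

In practice `min_rms` would be derived from the lowest vocalization intensity expected among the species of interest, as the paragraph above describes.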
In addition, when sound is collected, sound source localization can be performed so that the collection direction points to the direction from which the sound originates. Specifically, the sound pickup equipment can locate the sound using, for example, a microphone array, so that the pickup direction points to where the sound is.
While sound is collected, photographs and/or video can also be taken to collect the corresponding image signal and/or video signal; these can be collected at a set acquisition interval. Similarly, when the image signal and/or video signal is collected, the collection direction can point to the direction from which the sound originates, that is, the direction of the sound source, determined in the same way as in sound source localization.
The first voiceprint information is a kind of feature information of a sound signal. In this embodiment it includes, but is not limited to, MFCC (Mel-Frequency Cepstral Coefficients), LPCC (Linear Prediction Cepstral Coefficients) and other voiceprint information that can distinguish different animal species. Extracting the first voiceprint information from the source sound signal means extracting parameters such as MFCC and LPCC from the sounds collected from each sound source; different animal species have different MFCC and LPCC parameters.
After the first voiceprint information of each source sound signal is extracted, it is matched against the voiceprint models in the voiceprint library of each species sub-database. Matching can either succeed or fail. Success means that the first voiceprint information matches a voiceprint model in some voiceprint library; the corresponding source sound signal is then deemed to have been emitted by an animal of the species corresponding to that voiceprint library. Failure means that no voiceprint library matching the first voiceprint information can be found. In this case, a new species sub-database can be created based on the unmatched first voiceprint information. This species sub-database likewise includes a voiceprint library and a multimedia information library: a voiceprint model is trained from the first voiceprint information of the source sound signal and saved in the voiceprint library of the species sub-database, and the corresponding source sound signal is stored in its multimedia information library, together with the corresponding image signal and/or video signal, acquisition time, collecting location and other information.
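The two matching outcomes can be sketched as follows. The patent text does not specify how a voiceprint is compared with a voiceprint model, so this example assumes that both are plain feature vectors, that similarity is Euclidean distance, and that a match requires the distance to be within `max_distance`; the function names and the `unknown-N` naming scheme are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_or_create(database, voiceprint, signal_id, max_distance):
    """Store signal_id under the best-matching species, or under a new one.

    database maps a species name to {"models": [...], "media": [...]}.
    Returns the name of the species sub-database that received the signal.
    """
    best_name, best_dist = None, float("inf")
    for name, sub in database.items():
        for model in sub["models"]:
            d = distance(voiceprint, model)
            if d < best_dist:
                best_name, best_dist = name, d

    if best_name is not None and best_dist <= max_distance:
        database[best_name]["media"].append(signal_id)  # match succeeded
        return best_name

    # Match failed: open a new species sub-database seeded with this voiceprint.
    new_name = "unknown-%d" % (len(database) + 1)
    database[new_name] = {"models": [list(voiceprint)], "media": [signal_id]}
    return new_name


database = {"frog": {"models": [[1.0, 1.0]], "media": []}}
first = match_or_create(database, [1.1, 0.9], "clip-1", max_distance=0.5)
second = match_or_create(database, [9.0, 9.0], "clip-2", max_distance=0.5)
```

Here the first clip lands in the existing sub-database and the second opens a new one. A production system would more likely score the voiceprint against trained statistical models than against single vectors.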
After the first voiceprint information is successfully matched, the corresponding source sound signal is stored in the multimedia information library associated with the voiceprint library that it matched; each source sound signal in a multimedia information library is saved separately. If an image signal and/or video signal was also collected when the source sound signal was collected, it is saved in association with the source sound signal collected at the same time. Further, the image signal and/or video signal collected in the same direction as the source sound signal can be saved in association with it. In the most preferred mode of this embodiment, the image signal and/or video signal collected at the same time and in the same direction as the source sound signal is saved in association with it.
In addition, when the source sound signal and the corresponding image signal and/or video signal are saved, the time at which the signals were collected can also be recorded. Besides the acquisition time, the collecting location can be recorded; it can be defined as the position of the vocalizing individual animal. The location can be recorded in various ways: by labels, or by establishing a coordinate system in advance and recording the location as coordinates or a coordinate range. The specific collecting location can be determined from the positions of the collecting devices, such as the position of the sound pickup equipment or of the photographing or video equipment.
When the first voiceprint information of a source sound signal successfully matches a voiceprint model in some voiceprint library, that voiceprint library can be updated with the matched first voiceprint information; that is, a corresponding voiceprint model is trained from the first voiceprint information.
Besides classifying animals into species, this embodiment can also distinguish individuals. Each species sub-database now holds different source sound signals belonging to the same species. An individual animal does not vocalize only once; when one individual vocalizes several times, particularly at different times, its sounds are stored in the multimedia information library of the same species sub-database as different source sound signals. The same multimedia information library will therefore hold multiple source sound signals from the same individual animal. To make it easy to count the individuals of a species, and to build a multimedia database for each individual in support of behavioral studies of each animal, the second voiceprint information of the source sound signals in the multimedia information library of each species sub-database can be analyzed, and the source sound signals then clustered according to the analysis result.
Specifically, the clustering process can be described by the following steps. The second voiceprint information of the sound signal segments formed by the source sound signals in each multimedia information library is compared pairwise. From the comparison results, the two items of second voiceprint information with the smallest difference are selected, and the difference value is compared with a preset value. There are two cases. First, if the difference value is greater than the preset value, the comparison stops and processing ends. Second, if the difference value is less than the preset value, the sound signal segments corresponding to the two items of second voiceprint information are merged, and all sound signal segments, including the merged one, are compared pairwise again; the steps are repeated from the pairwise comparison until no two sound signal segments in the multimedia information library differ by less than the preset value. When analysis and processing end, the sound signal segments of each individual animal have been merged, so the number of individual animals in the corresponding species sub-database can easily be counted from the merged segments. Each identified individual animal is labeled. Based on the correspondence between source sound signals and individual animals obtained by clustering, which is also the correspondence between entries in the multimedia information library and individual animals, the individual animal corresponding to each entry is recorded in the "individual animal" column of the multimedia information library shown in Table 2.
Further, referring to Fig. 2, the clustering is implemented as follows:
S104a. Estimate the Gaussian model (μ, Σ) of the feature sample set of each sound signal segment. For the feature sample set {X_i, i = 1, ..., k} of a sound signal segment, the mean μ and covariance Σ of the sample set are calculated according to the following formulas:
$$\mu = \frac{1}{k}\sum_{i=1}^{k} X_i, \qquad \Sigma = \frac{1}{k}\sum_{i=1}^{k} (X_i - \mu)(X_i - \mu)^T$$
where k is the number of sample points in the sound signal segment, i is the sample point index, each feature sample X_i is a column vector, and (*)^T denotes transposition. The features in this embodiment include, but are not limited to, parameters such as MFCC or LPCC.
S104b. Calculate the generalized likelihood ratio (GLR) distance between every pair of sound signal segments. The GLR distance is the logarithm of the ratio between the product of the likelihoods of the two individual sound signal segment sample sets and the likelihood of the sample set formed by mixing the two segments; that is, the sum of the two individual log-likelihoods minus the log-likelihood of the mixed sample set. The log-likelihood of a sample set is the sum of the log-likelihoods of all sample points in the set with respect to its Gaussian model:
$$\log L(X; \mu, \Sigma) = -\frac{1}{2}\sum_{i=1}^{k}\left[ d\log(2\pi) + \log|\Sigma| + (X_i - \mu)^T \Sigma^{-1} (X_i - \mu) \right]$$
where d is the dimension of the sample vectors and |*| denotes the matrix determinant.
S104c. Calculate the similarity value ΔBIC of the two sound signal segments with the smallest GLR distance. ΔBIC is obtained by subtracting 0.5 λ (d + 0.5 d (d + 1)) log N from the GLR distance, where λ is the penalty factor, d is the sample vector dimension, and N is the total number of samples in the two sound signal segments. If ΔBIC is less than 0, the two sound signal segments are merged, the Gaussian model of the merged segment is estimated, and the procedure returns to S104b; otherwise, the clustering is complete.
Besides the generalized likelihood ratio distance, the similarity between sound signal segments can also be calculated with a symmetric divergence algorithm.
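Steps S104a to S104c can be sketched in plain Python as below. This is only an illustrative sketch under stated assumptions: the function names (`fit_gaussian`, `glr_distance`, `delta_bic`, `cluster_segments`) are hypothetical, the covariance uses the 1/k maximum-likelihood normalization, each segment is a list of feature vectors, and the log-likelihood uses the closed form that holds when a segment is scored against its own estimated Gaussian.

```python
import math

def fit_gaussian(samples):
    """S104a: mean and covariance (1/k normalization, an assumption) of one segment."""
    k, d = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / k for j in range(d)]
    cov = [[sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in samples) / k
            for b in range(d)] for a in range(d)]
    return mu, cov

def log_det(matrix, floor=1e-12):
    """log|matrix| by Gaussian elimination with partial pivoting; tiny pivots are floored."""
    m = [row[:] for row in matrix]
    n, total = len(m), 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        if abs(pivot) < floor:
            total += math.log(floor)
            continue
        total += math.log(abs(pivot))
        for r in range(c + 1, n):
            f = m[r][c] / pivot
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return total

def log_likelihood(samples):
    """Log-likelihood of a segment under its own Gaussian.
    With the 1/k covariance estimate the quadratic terms sum to k*d."""
    k, d = len(samples), len(samples[0])
    _, cov = fit_gaussian(samples)
    return -0.5 * k * (d * math.log(2 * math.pi) + log_det(cov) + d)

def glr_distance(a, b):
    """S104b: sum of the individual log-likelihoods minus that of the mixed set."""
    return log_likelihood(a) + log_likelihood(b) - log_likelihood(a + b)

def delta_bic(a, b, lam=1.0):
    """S104c: GLR distance minus the penalty 0.5*lam*(d + 0.5*d*(d+1))*log(N)."""
    d, n = len(a[0]), len(a) + len(b)
    return glr_distance(a, b) - 0.5 * lam * (d + 0.5 * d * (d + 1)) * math.log(n)

def cluster_segments(segments, lam=1.0):
    """Merge the closest pair while its delta-BIC is negative.
    Returns one sorted list of input segment indices per inferred individual."""
    clusters = [([i], list(seg)) for i, seg in enumerate(segments)]
    while len(clusters) > 1:
        _, i, j = min((glr_distance(clusters[i][1], clusters[j][1]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if delta_bic(clusters[i][1], clusters[j][1], lam) >= 0:
            break  # even the closest pair looks like two different individuals
        merged = (sorted(clusters[i][0] + clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return [ids for ids, _ in clusters]
```

The number of lists returned by `cluster_segments` is the estimated number of animal individuals. A larger `lam` makes merging more likely and therefore yields fewer individuals.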
For a species multimedia information library on which clustering has already been performed, when new source sound signals are saved into the library, quick clustering is performed: the set of all sound signals corresponding to each animal individual in the existing clustering result is taken as one initial cluster input, each new source sound signal is taken as one initial cluster input, and clustering is then carried out.
Alternatively, for a species multimedia information library on which clustering has already been performed, when new source sound signals are saved into the library, all existing sound signals in the library together with the new source sound signals can be taken as the initial cluster inputs and clustered.
As a further alternative, when a new source sound signal is saved into such a library, animal individual identification can be performed on the new source sound signal based on all existing sound signals in the library. If the new source sound signal is identified as belonging to an existing animal individual in the library, the "affiliated animal individual" information of the item to which the new source sound signal belongs is set to the label of that animal individual. If the new source sound signal is identified as not belonging to any existing animal individual in the library, a new animal individual label is created, and the "affiliated animal individual" information of the item to which the new source sound signal belongs is set to that label.
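The identification alternative described above might look like the following minimal sketch. The text does not specify how identification is carried out, so the decision rule here (Euclidean distance between the mean feature vector of the new signal and the mean feature vector of each known individual, compared with a hypothetical `max_distance`) is purely an assumption, as are the function names and the `individual_N` label format.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    k = len(vectors)
    return [sum(v[j] for v in vectors) / k for j in range(len(vectors[0]))]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_individual(individuals, new_features, max_distance):
    """individuals: dict mapping label -> feature vectors pooled from that
    individual's stored signals. The dict is updated in place and the label
    assigned to the new signal is returned."""
    probe = mean_vector(new_features)
    best_label, best_dist = None, float("inf")
    for label, feats in individuals.items():
        dist = euclidean(probe, mean_vector(feats))
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_label is None or best_dist > max_distance:
        # No existing individual is close enough: create a new, unused label.
        n = len(individuals) + 1
        while f"individual_{n}" in individuals:
            n += 1
        best_label = f"individual_{n}"
        individuals[best_label] = []
    individuals[best_label].extend(new_features)
    return best_label
```

The returned label is what would be written into the "affiliated animal individual" field of the new item. A likelihood-based score against each individual's Gaussian model could replace the distance rule without changing the surrounding flow.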
When clustering the source sound signals saved in the multimedia information library of a species sub-library, a condition can also be set; for example, clustering is performed only when the total amount of saved source sound signals reaches a preset threshold.
In this embodiment, the first voiceprint information is used to match a source sound signal against the voiceprint models in the voiceprint information library of each species sub-library, whereas the second voiceprint information is used to cluster the source sound signals in the multimedia information library of each species sub-library. Because they serve different purposes (the first distinguishes species, the second distinguishes different individuals of the same species), the first and second voiceprint information may be different parameters. They may of course also be the same parameter; there is no strict limitation, as long as the selected parameters achieve the required effect.
Embodiment two
This embodiment provides a method for maintaining an animal database which, as shown in Fig. 3, includes:
S200. Initialization: build the animal database according to the set animal species to be monitored. The animal database includes species sub-libraries, and each species sub-library includes a voiceprint information library and a multimedia information library. An example structure of the animal database is shown in Table 1.
When the voiceprint information library is established under a species sub-library, vocalization samples of the animal species are collected, and the first voiceprint information of the samples is extracted to build voiceprint models, which form the voiceprint information library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information that can distinguish different animal species, such as MFCC or LPCC.
An example structure of the multimedia information library is shown in Table 3. Each item in the table corresponds to the multimedia signals and related information of one individual of the animal species collected at a certain time. At initialization, the multimedia information library is empty.
Table 3
S201. A single fixed sound pickup device continuously picks up and detects the surrounding sound. When detection determines that an active sound has appeared, sound segmentation starts and continues until detection determines that the sound at the current moment is no longer an active sound, yielding at least one source sound signal. An active sound is a sound whose signal energy reaches a certain value. Sound segmentation finds the time points in a continuous sound at which the vocalizing source changes, so that the signals belonging to different animal individuals are separated in time. Specifically, sound segmentation can be implemented with methods such as a model-based Bayesian algorithm or a cross likelihood ratio algorithm.
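The activity detection part of this step (deciding when an active sound starts and stops) can be illustrated with a simple frame-energy detector. This is a sketch under assumptions: `frame_len` and `energy_threshold` are hypothetical parameters standing in for the "certain value" of signal energy, and the code covers only activity detection, not the Bayesian or cross likelihood ratio segmentation between different individuals.

```python
def active_segments(samples, frame_len, energy_threshold):
    """Return (start, end) sample index ranges whose mean frame energy
    reaches energy_threshold. Trailing samples that do not fill a whole
    frame are ignored."""
    segments = []
    start = None
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy >= energy_threshold:
            if start is None:
                start = f * frame_len          # an active sound begins
        elif start is not None:
            segments.append((start, f * frame_len))  # the active sound ended
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments
```

Each returned range would then be handed to the sound segmentation stage, which looks for changes of vocalizing source inside it.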
While sound segmentation is in progress, photographs are taken continuously at a set time interval, yielding at least one image signal.
When an active sound is detected, a microphone array can also be used to determine the direction of the sound source, so that the pickup direction and the photographing direction point toward the position of the sound.
S202. Extract the first voiceprint information from each source sound signal obtained by segmentation and match it against the voiceprint models in each voiceprint information library of the animal database. If the match succeeds, create a new item in the multimedia information library of the corresponding animal species, save the source sound signal as the sound signal of that item, and also save the corresponding multimedia information such as the image signals and collection time. If the match fails, either discard the source sound signal together with the corresponding image signals, collection time and other information, or create a new species sub-library and save the unmatched source sound signal together with the corresponding image signals, collection time and other information in the new species sub-library.
In step S202, if the match succeeds, the first voiceprint information of the source sound signal can also be used to update the voiceprint model in the voiceprint information library of the corresponding animal species.
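The branching in step S202 can be sketched as follows. The data layout (a dict of species sub-libraries, each with a `"voiceprints"` list and a `"multimedia"` list), the distance-with-threshold matching rule and the `unknown_species_N` naming are all assumptions made for illustration; the text does not prescribe a particular matching score.

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def file_source_signal(database, voiceprint, item, threshold, create_new=False):
    """database: dict species -> {"voiceprints": [model vectors], "multimedia": [items]}.
    Returns the species the item was filed under, or None if it was discarded."""
    best_species, best = None, float("inf")
    for species, bank in database.items():
        for model in bank["voiceprints"]:
            dist = distance(voiceprint, model)
            if dist < best:
                best_species, best = species, dist
    if best_species is not None and best <= threshold:
        # Match succeeded: create a new item in that species' multimedia library.
        database[best_species]["multimedia"].append(item)
        return best_species
    if create_new:
        # Match failed and the "new species sub-library" option is chosen.
        # Whether a voiceprint model is trained at this point is not stated
        # for this step, so the voiceprint library is left empty here.
        name = f"unknown_species_{len(database) + 1}"
        database[name] = {"voiceprints": [], "multimedia": [item]}
        return name
    return None  # match failed and the signal is discarded
```

Here `item` would carry the source sound signal, the associated image signals and the collection time.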
S203. For each multimedia information library in which a new item has been created, cluster all sound signals saved in it by analyzing their second voiceprint information, determine the number of animal individuals in the multimedia information library, and label each identified individual. According to the correspondence between sound signals and animal individuals obtained by clustering (that is, the correspondence between items in the multimedia information library and animal individuals), record the animal individual corresponding to each item in the "affiliated animal individual" column of the multimedia information library shown in Table 3.
Clustering groups similar sound signals into one class according to the similarity between sound signals; the different sound signal segments grouped into one class are considered to come from the same animal individual. Referring to Fig. 2, the clustering is implemented as follows:
S104a. Estimate the Gaussian model (μ, Σ) of the feature sample set of each sound signal segment formed by the source sound signals. For the feature sample set {X_i, i = 1, ..., k} of a sound signal segment, the mean μ and covariance Σ of the sample set are calculated according to the following formulas:
$$\mu = \frac{1}{k}\sum_{i=1}^{k} X_i, \qquad \Sigma = \frac{1}{k}\sum_{i=1}^{k} (X_i - \mu)(X_i - \mu)^T$$
where k is the number of sample points in the sound signal segment, i is the sample point index, each feature sample X_i is a column vector, and (*)^T denotes transposition. The features in this embodiment include, but are not limited to, parameters such as MFCC or LPCC.
S104b. Calculate the generalized likelihood ratio (GLR) distance between every pair of sound signal segments. The GLR distance is the logarithm of the ratio between the product of the likelihoods of the two individual sound signal segment sample sets and the likelihood of the sample set formed by mixing the two segments; that is, the sum of the two individual log-likelihoods minus the log-likelihood of the mixed sample set. The log-likelihood of a sample set is the sum of the log-likelihoods of all sample points in the set with respect to its Gaussian model:
$$\log L(X; \mu, \Sigma) = -\frac{1}{2}\sum_{i=1}^{k}\left[ d\log(2\pi) + \log|\Sigma| + (X_i - \mu)^T \Sigma^{-1} (X_i - \mu) \right]$$
where d is the dimension of the sample vectors and |*| denotes the matrix determinant.
S104c. Calculate the similarity value ΔBIC of the two sound signal segments with the smallest GLR distance. ΔBIC is obtained by subtracting 0.5 λ (d + 0.5 d (d + 1)) log N from the GLR distance, where λ is the penalty factor, d is the sample vector dimension, and N is the total number of samples in the two sound signal segments. If ΔBIC is less than 0, the two sound signal segments are merged, the Gaussian model of the merged segment is estimated, and the procedure returns to S104b; otherwise, the clustering is complete.
Besides the generalized likelihood ratio distance, the similarity between sound signal segments can also be calculated with a symmetric divergence algorithm.
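The text does not define the symmetric divergence. One common reading is the symmetric Kullback-Leibler divergence between the Gaussian models of two segments, sketched below. To keep the example short it assumes diagonal covariances (independent feature dimensions), which is a simplification not stated in the text; the function names are hypothetical.

```python
def diag_gaussian(samples, var_floor=1e-9):
    """Per-dimension mean and variance (1/k normalization) of one segment."""
    k, d = len(samples), len(samples[0])
    mean = [sum(x[j] for x in samples) / k for j in range(d)]
    var = [max(sum((x[j] - mean[j]) ** 2 for x in samples) / k, var_floor)
           for j in range(d)]
    return mean, var

def symmetric_kl(model_p, model_q):
    """KL(p||q) + KL(q||p) for two diagonal Gaussians given as (mean, var).
    The log-determinant terms of the two directions cancel."""
    (mp, vp), (mq, vq) = model_p, model_q
    total = 0.0
    for m1, v1, m2, v2 in zip(mp, vp, mq, vq):
        total += 0.5 * (v1 / v2 + v2 / v1 - 2.0
                        + (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2))
    return total
```

The value is zero for identical models and grows as the means or variances diverge, so it can be used in place of the GLR distance when choosing the closest pair of segments. The merge threshold would then have to be chosen separately.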
For a multimedia information library on which clustering has already been performed, when new source sound signals are saved into the library, quick clustering is performed: the set of all sound signals corresponding to each animal individual in the existing clustering result is taken as one initial cluster input, each new source sound signal is taken as one initial cluster input, and clustering is then carried out.
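How the initial inputs for quick clustering are assembled can be sketched as below. The data layout (a dict from individual label to that individual's segments, each segment a list of feature vectors) and the function name are assumptions made for illustration.

```python
def quick_cluster_inputs(existing_clusters, new_signals):
    """existing_clusters: dict label -> list of segments already attributed
    to that individual; new_signals: list of new segments.
    Returns a list of (label_or_None, pooled_feature_vectors) pairs, one per
    initial cluster input."""
    inputs = []
    for label, segments in existing_clusters.items():
        # All signals of one known individual form a single initial cluster.
        pooled = [vec for seg in segments for vec in seg]
        inputs.append((label, pooled))
    for seg in new_signals:
        # Every new source sound signal starts as its own cluster.
        inputs.append((None, list(seg)))
    return inputs
```

Because each known individual enters as one input, the number of pairwise comparisons depends on the number of individuals plus the number of new signals rather than on the total number of stored signals.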
Alternatively, for a multimedia information library on which clustering has already been performed, when new source sound signals are saved into the library, all existing source sound signals in the library together with the new source sound signals can be taken as the initial cluster inputs and clustered.
As a further alternative, when a new source sound signal is saved into such a library, animal individual identification can be performed on the new source sound signal based on all existing sound signals in the library. If the new source sound signal is identified as belonging to an existing animal individual in the library, the "affiliated animal individual" information of the item to which the new source sound signal belongs is set to the label of that animal individual. If the new source sound signal is identified as not belonging to any existing animal individual in the library, a new animal individual label is created, and the "affiliated animal individual" information of the item to which the new source sound signal belongs is set to that label.
When clustering the source sound signals saved in the multimedia information library of a species sub-library, a condition can also be set; for example, clustering is performed only when the total amount of saved source sound signals reaches a preset threshold.
Embodiment three
This embodiment describes another method for maintaining an animal database. The implementation flow of the method is shown in Fig. 4, and the method includes:
S300. Initialization: build the animal database according to the set animal species to be monitored. The animal database includes species sub-libraries, and each species sub-library includes a voiceprint information library and a multimedia information library. An example structure of the animal database is shown in Table 1.
When the voiceprint information library is established under a species sub-library, vocalization samples of the animal species are collected, and the first voiceprint information of the samples is extracted to build voiceprint models, which form the voiceprint information library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information that can distinguish different animal species, such as MFCC or LPCC.
An example structure of the multimedia information library is shown in Table 4. Each item in the table corresponds to the multimedia signals and related information of one individual of the animal species collected at a certain time. At initialization, the multimedia information library is empty.
Table 4
S301. A single fixed sound pickup device continuously picks up and detects the surrounding sound. When detection determines that an active sound has appeared, sound segmentation starts and continues until detection determines that the sound at the current moment is no longer an active sound, yielding at least one source sound signal. An active sound is a sound whose signal energy reaches a certain value. Sound segmentation finds the time points in a continuous sound at which the vocalizing source changes, so that the signals belonging to different animal individuals are separated in time. Specifically, sound segmentation can be implemented with methods such as a model-based Bayesian algorithm or a cross likelihood ratio algorithm.
While sound segmentation is in progress, video is recorded.
When an active sound is detected, a microphone array can also be used to determine the direction of the sound source, so that the pickup direction and the video recording direction point toward the position of the sound.
Using the segmentation time points obtained by sound segmentation, the recorded video is split correspondingly in time, yielding at least one video segment, and each video segment is associated with the source sound signal of the corresponding time.
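Splitting the recording at the segmentation time points and pairing each piece with the source sound signal of the same time span can be sketched as follows. Times are plain numbers in seconds, and the function names and the dictionary layout are assumptions for illustration; actual video cutting would be done by a media tool and is not shown.

```python
def split_by_cut_points(start, end, cut_points):
    """Return consecutive (t0, t1) spans covering [start, end], split at every
    cut point that lies strictly inside the recording."""
    bounds = [start] + sorted(t for t in cut_points if start < t < end) + [end]
    return list(zip(bounds[:-1], bounds[1:]))

def associate(source_signals, video_spans):
    """source_signals: list of (signal_id, t0, t1). Returns a dict mapping each
    signal_id to the video spans that overlap it in time."""
    result = {}
    for signal_id, t0, t1 in source_signals:
        result[signal_id] = [(v0, v1) for v0, v1 in video_spans
                             if v0 < t1 and t0 < v1]
    return result
```

Since the video is cut at the same time points as the sound, each source sound signal normally maps to exactly one video span.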
S302. Extract the first voiceprint information from each source sound signal obtained by segmentation and match it against the voiceprint models in each voiceprint information library of the animal database. If the match succeeds, create a new item in the multimedia information library of the corresponding animal species, save the source sound signal as the sound signal of that item, and also save the corresponding multimedia information such as the video segment and collection time. If the match fails, either discard the source sound signal together with the corresponding video segment, collection time and other information, or create a new species sub-library and save the unmatched source sound signal together with the corresponding video segment, collection time and other information in the new species sub-library.
In step S302, if the match succeeds, the first voiceprint information of the source sound signal can also be used to update the voiceprint model in the voiceprint information library of the corresponding animal species.
S303. For each multimedia information library in which a new item has been created, cluster all sound signals saved in it by analyzing their second voiceprint information, determine the number of animal individuals in the multimedia information library, and label each identified individual. According to the correspondence between sound signals and animal individuals obtained by clustering (that is, the correspondence between items in the multimedia information library and animal individuals), record the animal individual corresponding to each item in the "affiliated animal individual" column of the multimedia information library shown in Table 4.
The clustering is performed in the same way as in embodiments one and two and is not described again here.
Embodiment four
This embodiment provides another method for maintaining an animal database which, referring to Fig. 5, includes:
S400. Initialization: build the animal database on the server according to the set animal species to be monitored. The animal database includes species sub-libraries, and each species sub-library includes a voiceprint information library and a multimedia information library. An example structure of the animal database is shown in Table 1.
When the voiceprint information library is established under a species sub-library, vocalization samples of the animal species are collected, and the first voiceprint information of the samples is extracted to build voiceprint models, which form the voiceprint information library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information that can distinguish different animal species, such as MFCC or LPCC.
An example structure of the multimedia information library is shown in Table 5. Each item in the table corresponds to the multimedia signals and related information of one individual of the animal species collected at a certain time and a certain place. At initialization, the multimedia information library is empty.
Table 5
S401. Clients distributed at different locations continuously pick up and detect the surrounding sound. When detection determines that an active sound has appeared, sound segmentation starts and continues until detection determines that the sound at the current moment is no longer an active sound, yielding at least one source sound signal. An active sound is a sound whose signal energy reaches a certain value. While sound segmentation is in progress, photographs are taken continuously at a set time interval, yielding at least one image signal. The source sound signals obtained by segmentation and the time-associated image signals, together with the corresponding collection time and collection location, are sent to the server. Sound segmentation finds the time points in a continuous sound at which the vocalizing source changes, so that the signals belonging to different animal individuals are separated in time. Specifically, sound segmentation can be implemented with methods such as a model-based Bayesian algorithm or a cross likelihood ratio algorithm.
When the client determines that an active sound has appeared, it may also skip sound segmentation. Instead, it saves the sound signal from the moment an active sound is detected until the moment the sound is determined to be no longer active, while taking photographs continuously during this period to obtain at least one image signal. The sound signal and the image signals, together with the corresponding collection time and collection location, are sent to the server. The server then segments the sound signal to obtain at least one source sound signal and the time-associated image signals.
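The information a client sends to the server in either variant can be pictured as a small record. The sketch below is only one possible layout: the field names, the use of file names in place of raw signal data and the JSON encoding are all assumptions, not something the text specifies.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List

@dataclass
class ClientUpload:
    client_id: str
    collected_at: str            # collection time, e.g. an ISO 8601 string
    location: str                # collection location
    sound_file: str              # source sound signal, or the unsegmented recording
    image_files: List[str] = field(default_factory=list)
    already_segmented: bool = False  # False: the server still has to segment

def encode_upload(upload):
    """Serialize an upload for transmission to the server."""
    return json.dumps(asdict(upload), sort_keys=True)

def decode_upload(payload):
    """Rebuild the upload record on the server side."""
    return ClientUpload(**json.loads(payload))
```

The `already_segmented` flag lets the server tell the two variants apart: when it is false, the server performs sound segmentation before extracting the first voiceprint information.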
S402. On the server, extract the first voiceprint information from each received source sound signal, or from each source sound signal obtained by segmentation after receipt, and match it against the voiceprint models in each voiceprint information library of the animal database. If the match succeeds, create a new item in the multimedia information library of the corresponding animal species, save the source sound signal as the sound signal of that item, and also save the corresponding multimedia information such as the image signals, collection time and collection location. If the match fails, either discard the source sound signal together with the corresponding image signals, collection time and other information, or create a new species sub-library and save the unmatched source sound signal together with the corresponding image signals, collection time and other information in the new species sub-library.
In step S402, if the match succeeds, the first voiceprint information of the source sound signal can also be used to update the voiceprint model in the voiceprint information library of the corresponding animal species.
S403. On the server, for each multimedia information library in which a new item has been created, cluster all sound signals saved in it by analyzing their second voiceprint information, determine the number of animal individuals in the multimedia information library, and label each identified individual. According to the correspondence between sound signals and animal individuals obtained by clustering (that is, the correspondence between items in the multimedia information library and animal individuals), record the animal individual corresponding to each item in the "affiliated animal individual" column of the multimedia information library shown in Table 5.
Embodiment five
This embodiment provides another method for maintaining an animal database which, referring to Fig. 6, includes:
S600. Initialization: build the animal database on the server according to the set animal species to be monitored. The animal database includes animal species sub-libraries, and each species sub-library includes a species voiceprint information library and a species multimedia information library. An example structure of the animal database is shown in Table 1.
When the species voiceprint information library is established under a species sub-library, vocalization samples of the animal species are collected, and the first voiceprint information of the samples is extracted to build voiceprint models, which form the species voiceprint information library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information that can distinguish different animal species, such as MFCC or LPCC.
An example structure of the species multimedia information library is shown in Table 5. Each item in the table corresponds to the multimedia signals and related information of one individual of the animal species collected at a certain time and a certain place. At initialization, the species multimedia information library is empty.
S601. A client user actively records an animal sound signal to obtain a source sound signal and sends it to the server together with the corresponding collection time and collection location. The user can also photograph the animal and send the images, associated with the source sound signal, to the server at the same time. For example, an explorer in the field records a stretch of animal sound, takes some pictures of the animal, and sends them to the server.
S602. On the server, extract the first voiceprint information from each received source sound signal and match it against the voiceprint models in each species voiceprint information library of the animal database. If the match succeeds, create a new item in the multimedia information library of the corresponding animal species, save the source sound signal as the sound signal of that item, and also save the corresponding multimedia information such as the image signals, collection time and collection location. If the match fails, create a new species sub-library in the animal database, train a voiceprint model from the source sound signal and save it in the species voiceprint information library of that species sub-library, create a new item in the multimedia information library of that species sub-library, save the source sound signal as the sound signal of that item, and also save the corresponding multimedia information such as the image signals, collection time and collection location.
In step S602, if the match succeeds, the first voiceprint information of the source sound signal can also be used to update the voiceprint model in the species voiceprint information library of the corresponding animal species. A condition can be set for updating the voiceprint model; for example, the model is updated only when the corresponding animal species is not one of the animal species to be monitored that were set at initialization.
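The conditional update can be illustrated with the sketch below. The text does not say how a voiceprint model is updated, so representing the model as a running mean of voiceprint vectors is purely an assumption, as are the function names.

```python
def update_model(mean, count, voiceprint):
    """Fold one more voiceprint vector into a running-mean model.
    Returns the new (mean, count)."""
    new_count = count + 1
    new_mean = [m + (v - m) / new_count for m, v in zip(mean, voiceprint)]
    return new_mean, new_count

def maybe_update(species, monitored_species, mean, count, voiceprint):
    """Update only models of species that were NOT in the initial monitored
    set, following the example condition in the text."""
    if species in monitored_species:
        return mean, count  # models built at initialization are left alone
    return update_model(mean, count, voiceprint)
```

Under this condition, models trained from curated vocalization samples stay fixed, while models of species sub-libraries created later from a single recording are refined as more matching recordings arrive.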
S603. On the server, for each species multimedia information library in which a new item has been created, cluster all sound signals saved in it by analyzing their second voiceprint information, determine the number of animal individuals in the species multimedia information library, and label each identified individual. According to the correspondence between sound signals and animal individuals obtained by clustering (that is, the correspondence between items in the species multimedia information library and animal individuals), record the label of the animal individual corresponding to each item in the "affiliated animal individual" column of the multimedia information library shown in Table 5.
Embodiment six
This embodiment provides a device for maintaining an animal database which, referring to Fig. 7, includes:
a sound collection module 51, configured to collect sounds emitted by different sound sources to obtain at least one source sound signal;
an extraction module 52, configured to extract the corresponding first voiceprint information from the source sound signal;
a matching module 53, configured to match the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library of the animal database;
a classification module 54, configured to, when the matching module matches successfully, save the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-library;
a clustering module 56, configured to analyze the second voiceprint information of each source sound signal in the same species sub-library and, according to the analysis result, cluster the source sound signals in that species sub-library to determine the sounds of the different individuals in it.
The device also includes a voiceprint model acquisition module 59. The animal database in this embodiment is established as required, specifically according to the animal species to be classified and counted. The animal database includes at least one species sub-library, and each species sub-library corresponds to one animal species, so different species sub-libraries correspond to different animal species. Each species sub-library includes a voiceprint information library and a multimedia information library. The voiceprint model acquisition module 59 is used to obtain voiceprint models: it collects vocalization samples of the animal species corresponding to each species sub-library and extracts the first voiceprint information of the samples as voiceprint models, and these voiceprint models form the voiceprint information library. The multimedia information library is used to save the collected source sound signals and the corresponding image signals and video signals; that is, when the animal database has just been established, the multimedia information library is empty.
Detecting the surrounding sound means monitoring the surrounding sound periodically, mainly in order to capture animal calls; once an animal call is detected, it can be collected. Note that the call does not necessarily come from a single animal individual; several individuals may be calling. Sound segmentation can be used to separate sounds that do not come from the same sound source. Specifically, the time points at which the vocalizing source changes are found in the continuous sound, so that the signals belonging to different animal individuals are separated in time. Sound segmentation can be implemented with methods such as a model-based Bayesian algorithm or a cross likelihood ratio algorithm.
The sound collection module 51 includes a judging submodule 511 and a collection submodule 512. The judging submodule 511 judges whether there is an active sound in the surrounding sound; if so, the collection submodule 512 collects the active sound. An active sound here is a sound whose signal intensity reaches a certain level. For a given environment the animal species present are generally known, and the vocal range of each species can be determined, so only sounds that reach the minimum vocalization intensity of the animals need to be collected. This avoids collecting too many non-animal sounds, which would reduce the accuracy of matching.
The sound collection module 51 also includes an orientation submodule 513, which locates the direction of the active sound so that the collection direction points toward the position from which the sound is emitted. Specifically, the orientation submodule 513 can be a microphone array or any other device capable of sound localization.
The device also includes an image capture module 55, which takes photographs and/or records video while the sound is being collected, thereby collecting the corresponding image signals and/or video signals. Image signals and video signals can be collected at timed intervals, that is, at a set collection interval. Similarly, when image signals and/or video signals are collected, the collection direction can be kept consistent with that of the sound collection, pointing toward the position from which the sound is emitted (the direction of the sound source), which is determined in the same way as in sound source localization.
The classifying module 54 includes a new-creation submodule 541. After the first voiceprint information of each source sound signal has been extracted, the matching module 53 matches this first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library. Matching can either succeed or fail. A match succeeds when the voiceprint information matches any one voiceprint model in some voiceprint information library; the source sound signal corresponding to that voiceprint information is then deemed to have been emitted by an animal of the species associated with the species sub-library to which that voiceprint information library belongs. A match fails when no matching voiceprint model can be found in any voiceprint information library. This generally happens in one of two cases: first, the voiceprint information does not belong to an animal and may be the first voiceprint information of some other, non-biological sound; second, no corresponding voiceprint model was added for this first voiceprint information when the animal database was established, in other words, the animal was not added and was not originally within the scope of classification and statistics. The new-creation submodule 541 establishes a new species sub-library based on the voiceprint information that failed to match. The new species sub-library likewise includes a voiceprint information library and a multimedia information library; after training, the first voiceprint information is stored in the voiceprint information library as a voiceprint model for subsequent classification and statistics.
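The matching and classification flow described above can be sketched as follows. This is only an illustrative sketch: representing a voiceprint as a feature vector, scoring with cosine similarity, the 0.9 threshold, and the names `SpeciesSubLibrary` and `classify` are assumptions made for the example and are not specified by this embodiment. Model training is replaced by storing the vector directly, and the sketch does not attempt to filter out non-animal sounds.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class SpeciesSubLibrary:
    name: str
    voiceprint_models: list = field(default_factory=list)  # voiceprint information library
    multimedia: list = field(default_factory=list)         # multimedia information library


def cosine(a, b):
    """Cosine similarity, used here as a stand-in (assumed) matching score."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def classify(database, voiceprint, source_signal, threshold=0.9):
    """Store source_signal under the best-matching species, or create a new sub-library."""
    best_lib, best_score = None, threshold
    for lib in database:
        for model in lib.voiceprint_models:
            score = cosine(voiceprint, model)
            if score >= best_score:
                best_lib, best_score = lib, score
    if best_lib is None:
        # Match failed: create a new species sub-library (hypothetical naming scheme).
        best_lib = SpeciesSubLibrary(name=f"unknown-{len(database) + 1}")
        database.append(best_lib)
    # The voiceprint is kept as a model; real training is outside the scope of this sketch.
    best_lib.voiceprint_models.append(np.asarray(voiceprint, dtype=float))
    best_lib.multimedia.append(source_signal)
    return best_lib
```

Each source sound signal is appended as its own entry, mirroring the requirement that source sound signals be saved separately in the multimedia information library.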
After the voiceprint information is matched successfully, the classifying module 54 saves the source sound signal corresponding to the first voiceprint information in the multimedia information library that corresponds to the matched voiceprint information library; each source sound signal is saved separately in that multimedia information library. The image collecting module 55 also includes a saving submodule 551: if an image signal and/or video signal was captured while the source sound signal was being collected, the saving submodule saves that image signal and/or video signal in association with the concurrently collected source sound signal. Further, the image signal and/or video signal captured in the same direction as the source sound signal may be saved in association with it. In the most preferred mode of this embodiment, the image signal and/or video signal captured at the same time and in the same direction as the source sound signal is saved in association with it.
The device further includes a logging module 57 for recording the acquisition time when the source sound signal and the corresponding image signal and/or video signal are saved. In addition to the acquisition time, the acquisition location may also be recorded. The acquisition location may be defined as the position of the vocalizing animal individual. The location can be recorded in various ways: by marking, or by establishing a coordinate system in advance and recording the location as a coordinate or a coordinate range. The specific acquisition location may be determined jointly from the information in the image signal and/or video signal and from the collection direction. Alternatively, the acquisition location may be defined as the position of each collecting device, such as the position of the sound pickup device or the position where the photograph or video is taken.
This embodiment also provides an update module 58. When the first voiceprint information of a source sound signal is successfully matched with a voiceprint model in some voiceprint information library, the update module updates that library with the matched first voiceprint information; that is, the matched voiceprint information is trained and then added to the voiceprint information library as a voiceprint model. In this way the vocalization of each animal species can be tracked in real time, and matching becomes more targeted.
In addition to classifying different animal species, this embodiment yields, within each species sub-library, different source sound signals that belong to the same species. An animal individual does not vocalize only once; when an individual emits sound several times, every one of those sounds comes from that single individual, and sounds emitted at different times in particular are stored in the multimedia information library of the same species sub-library as different source sound signals. Multiple source sound signals of the same animal individual are therefore saved in the same multimedia information library. To make it easy to count the individuals of a species, and to build a multimedia database for each individual that supports behavioral study of that individual, the device for maintaining an animal database in this embodiment also includes a cluster module 56, which analyzes the second voiceprint information of the source sound signals in the multimedia information library of each species sub-library and then clusters those source sound signals according to the analysis result.
Specifically, the cluster module 56 includes a comparison submodule 561, an analysis submodule 562, and a cluster submodule 563. The comparison submodule 561 compares, pair by pair, the second voiceprint information of the voice signal fragments formed from the source sound signals in each multimedia information library. The analysis submodule 562 selects, from the comparison results, the two pieces of second voiceprint information with the smallest gap and compares that gap value with a preset value. There are two possible outcomes: first, if the gap value is greater than the preset value, comparison stops and processing ends; second, if the gap value is less than the preset value, the cluster submodule 563 merges the voice signal fragments corresponding to the two pieces of second voiceprint information into one voice signal fragment. All voice signal fragments, including the newly merged fragment, are then compared pairwise again, and the above steps are repeated until no two voice signal fragments in the multimedia information library have second voiceprint information whose gap value is less than the preset value.
This embodiment also includes a statistical module 50. When the clustering process ends, the voice signal fragments remaining in the multimedia information library of each species sub-library were each emitted by a different animal individual. The statistical module 50 therefore directly counts the voice signal fragments in the multimedia information library of each species sub-library and uses that count to determine how many animal individuals the corresponding species sub-library contains.
Further, referring to Fig. 2, the specific implementation steps of clustering are as follows:
S104a: Estimate the Gaussian model (μ, Σ) of the feature sample set of each voice signal fragment. For the feature sample set {X_i, i = 1, ..., k} of a voice signal fragment, calculate the mean μ and covariance Σ of the sample set according to the following formulas:
$$\mu = \frac{1}{k}\sum_{i=1}^{k} X_i, \qquad \Sigma = \frac{1}{k}\sum_{i=1}^{k} (X_i - \mu)(X_i - \mu)^T$$
where k is the number of sample points in the voice signal fragment, i is the sample point index, each feature sample X_i is a column vector, and (*)^T denotes transposition. The features in this embodiment include, but are not limited to, parameters such as MFCC or LPCC.
S104b: Calculate the generalized likelihood ratio (GLR) distance between every pair of voice signal fragments. The GLR distance is the ratio of the product of the likelihoods of the two individual fragment sample sets to the likelihood of the sample set obtained by merging the two fragments; in the log domain, this is the sum of the two fragments' individual log-likelihoods minus the log-likelihood of the merged sample set. The log-likelihood of a sample set is the sum of the log-likelihoods of all sample points in the voice signal fragment sample set under its Gaussian model:
$$\log L = -\frac{1}{2}\sum_{i=1}^{k}\left[\, d\log(2\pi) + \log|\Sigma| + (X_i - \mu)^T \Sigma^{-1} (X_i - \mu) \right]$$
where d is the dimension of the sample vectors and |*| denotes the matrix determinant.
S104c: Calculate the similarity value ΔBIC of the two voice signal fragments that have the minimum GLR distance. ΔBIC is obtained by subtracting 0.5λ(d + 0.5d(d+1)) log N from the GLR distance, where λ is the penalty factor, d is the sample vector dimension, and N is the total number of samples in the two voice signal fragments. If ΔBIC is less than 0, merge the two voice signal fragments, estimate the Gaussian model of the merged voice signal fragment, and return to S104b; otherwise, clustering is complete.
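Steps S104a to S104c can be sketched in code as follows. This is a minimal sketch under stated assumptions rather than the implementation of this embodiment: each fragment is assumed to be a NumPy array with one feature vector (for example an MFCC frame) per row, a small ridge term is added to the covariance for numerical stability, and the function names are hypothetical.

```python
import numpy as np


def log_likelihood(x):
    """Log-likelihood of the rows of x under a Gaussian (mu, Sigma) fitted to x (S104a)."""
    k, d = x.shape
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / k + 1e-6 * np.eye(d)  # ridge term is an assumption
    _, logdet = np.linalg.slogdet(cov)
    mahal = np.einsum("ij,jk,ik->", centered, np.linalg.inv(cov), centered)
    return -0.5 * (k * d * np.log(2 * np.pi) + k * logdet + mahal)


def glr_distance(a, b):
    """GLR distance in the log domain: separate fits versus one merged fit (S104b)."""
    return log_likelihood(a) + log_likelihood(b) - log_likelihood(np.vstack([a, b]))


def delta_bic(a, b, lam=1.0):
    """GLR distance minus the penalty 0.5 * lam * (d + 0.5 * d * (d + 1)) * log N (S104c)."""
    d = a.shape[1]
    n = len(a) + len(b)
    return glr_distance(a, b) - 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)


def cluster(fragments, lam=1.0):
    """Merge the closest pair while its delta-BIC is negative; return the remaining fragments."""
    fragments = [np.asarray(f, dtype=float) for f in fragments]
    while len(fragments) > 1:
        pairs = [(glr_distance(fragments[i], fragments[j]), i, j)
                 for i in range(len(fragments))
                 for j in range(i + 1, len(fragments))]
        _, i, j = min(pairs)
        if delta_bic(fragments[i], fragments[j], lam) >= 0:
            break  # closest pair is judged to come from different individuals
        merged = np.vstack([fragments[i], fragments[j]])
        fragments = [f for n, f in enumerate(fragments) if n not in (i, j)] + [merged]
    return fragments
```

The number of fragments returned corresponds to the number of animal individuals counted for the species sub-library. Recomputing every pairwise distance after each merge keeps the sketch close to the described steps; a practical implementation would likely cache distances.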
Besides the GLR distance, the similarity between voice signal fragments can also be calculated with a symmetric divergence algorithm.
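One possible reading of the symmetric divergence is the symmetric Kullback-Leibler divergence between the Gaussian models of two fragments, sketched below. The embodiment does not name the exact divergence, so this choice, the ridge term, and the function name `symmetric_kl` are assumptions for illustration only.

```python
import numpy as np


def symmetric_kl(a, b):
    """KL(a||b) + KL(b||a) between Gaussians fitted to the rows of a and of b."""
    d = a.shape[1]
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    # bias=True gives the 1/k covariance estimate; the ridge term is an assumption.
    cov_a = np.cov(a, rowvar=False, bias=True) + 1e-6 * np.eye(d)
    cov_b = np.cov(b, rowvar=False, bias=True) + 1e-6 * np.eye(d)
    inv_a, inv_b = np.linalg.inv(cov_a), np.linalg.inv(cov_b)
    diff = mu_a - mu_b
    # The log-determinant terms of the two directed divergences cancel in the sum.
    return float(0.5 * (np.trace(inv_b @ cov_a) + np.trace(inv_a @ cov_b) - 2 * d
                        + diff @ (inv_a + inv_b) @ diff))
```

A smaller value indicates more similar fragments, so it could replace the GLR distance when selecting the closest pair; the merge threshold would then have to be chosen separately.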
When clustering the source sound signals saved in the multimedia information library of a species sub-library, a condition may also be imposed, for example performing the cluster operation only when the total quantity of saved source sound signals reaches a preset threshold.
In addition, this embodiment provides a system for maintaining an animal database, comprising an animal database and the above-described device for maintaining an animal database. The animal database includes at least one species sub-library; each species sub-library includes a voiceprint information library and a multimedia information library, and the voiceprint information library includes at least one voiceprint model. Each species sub-library in the animal database corresponds to a different animal species.
A person of ordinary skill in the art will appreciate that all or part of the steps of the above method may be completed by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Alternatively, all or part of the steps of the above embodiments may be implemented using one or more integrated circuits; accordingly, each module/unit in the above embodiments may be implemented in the form of hardware or in the form of a software function module. The present invention is not limited to any particular combination of hardware and software.
The above is a further detailed description of the present invention in conjunction with specific embodiments, and the specific implementation of the present invention shall not be regarded as limited to these descriptions. A person of ordinary skill in the technical field of the invention may make a number of simple deductions or substitutions without departing from the inventive concept, all of which shall be regarded as falling within the protection scope of the present invention.
Claims (25)
- 1. A method for maintaining an animal database, characterized by comprising: collecting sounds emitted by different sound sources to obtain at least one source sound signal; extracting first voiceprint information from the source sound signal; matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library in the animal database; when the match succeeds, storing the source sound signal corresponding to the matched first voiceprint information in the multimedia information library of the corresponding species sub-library; and analyzing second voiceprint information of each source sound signal in the species sub-library, performing a cluster operation on the source sound signals in the species sub-library according to the analysis result, and determining the sounds of different individuals in the species sub-library; wherein the animal database includes at least one species sub-library, and each species sub-library includes a voiceprint information library and a multimedia information library.
- 2. The method for maintaining an animal database according to claim 1, characterized in that collecting sounds emitted by different sound sources comprises: detecting whether the surrounding sound contains an active sound; and if an active sound is detected, collecting the active sound and performing sound segmentation.
- 3. The method for maintaining an animal database according to claim 2, characterized in that collecting the active sound comprises: performing sound source orientation on the active sound so that the collection direction points toward the position from which the active sound is emitted.
- 4. The method for maintaining an animal database according to claim 1, characterized in that, while collecting the sounds emitted by the different sound sources, the method further comprises: taking photographs and/or recording video at a set time interval to obtain at least one image signal and/or video signal.
- 5. The method for maintaining an animal database according to claim 4, characterized in that, when the match succeeds, the method further comprises: correspondingly storing the image signal and/or video signal captured at the same time as the source sound signal in the multimedia information library of the species sub-library.
- 6. The method for maintaining an animal database according to claim 1, characterized in that, after storing the source sound signal corresponding to the matched first voiceprint information in the multimedia information library of the corresponding species sub-library, the method further comprises: recording the acquisition time of the source sound signal in the multimedia information library.
- 7. The method for maintaining an animal database according to any one of claims 1-6, characterized in that collecting sounds emitted by different sound sources to obtain at least one source sound signal comprises: a user actively detecting the surrounding sound and sending the obtained source sound signal to a server.
- 8. The method for maintaining an animal database according to any one of claims 1-6, characterized in that, after storing the source sound signal corresponding to the matched first voiceprint information in the multimedia information library of the corresponding species sub-library, the method further comprises: updating the corresponding voiceprint information library with the matched first voiceprint information.
- 9. The method for maintaining an animal database according to any one of claims 1-6, characterized in that, before detecting the surrounding sound, the method further comprises obtaining the voiceprint models, including: collecting vocalization samples of the animal species corresponding to each species sub-library; and extracting the first voiceprint information from the vocalization samples and generating the corresponding voiceprint models.
- 10. The method for maintaining an animal database according to any one of claims 1-6, characterized in that matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library in the animal database comprises: if the first voiceprint information does not match the voiceprint models in the voiceprint information library of any species sub-library, creating a new species sub-library and storing the first voiceprint information in the multimedia information library of the new species sub-library.
- 11. The method for maintaining an animal database according to any one of claims 1-6, characterized in that analyzing the second voiceprint information of each source sound signal in the species sub-library and performing a cluster operation on the source sound signals in the species sub-library according to the analysis result comprises: comparing pairwise the second voiceprint information of the voice signal fragments formed from all source sound signals in the multimedia information library of the species sub-library, and selecting the two voice signal fragments whose second voiceprint information has the smallest gap; comparing the gap between the second voiceprint information of the two voice signal fragments with a preset value; if the gap is less than the preset value, merging the two voice signal fragments into one voice signal fragment; and repeating all of the above steps, starting from the pairwise comparison of the second voiceprint information of all voice signal fragments in the multimedia information library of the species sub-library, until the gap between the second voiceprint information of any two voice signal fragments is greater than the preset value, at which point the voice signal fragments correspond to different animal individuals and are not merged further.
- 12. The method for maintaining an animal database according to claim 11, characterized in that, after performing the cluster operation on the source sound signals in the species sub-library, the method further comprises: determining the number of animal individuals in the species sub-library, and the correspondence between each animal individual and the animal individual information in the multimedia information library.
- 13. The method for maintaining an animal database according to claim 11, characterized in that, after performing the cluster operation on the source sound signals in the species sub-library, when a new source sound signal is stored in the multimedia information library of the species sub-library, the method further comprises: taking the set of all source sound signals corresponding to each animal individual in the existing clustering result as one initial cluster input, taking each new source sound signal as one initial cluster input, and performing clustering; or performing the cluster operation again on the new source sound signal together with all existing source sound signals in the species sub-library.
- 14. The method for maintaining an animal database according to claim 11, characterized in that, after performing the cluster operation on the source sound signals in the species sub-library, when a new source sound signal is stored in the multimedia information library of the species sub-library, the method further comprises: performing animal individual identification on the new source sound signal based on all existing voice signals in the species sub-library; if the new source sound signal belongs to an existing animal individual in the species sub-library, labeling the new source sound signal as belonging to that animal individual; and if the new source sound signal does not belong to any existing animal individual in the species sub-library, creating a new animal individual label.
- 15. A device for maintaining an animal database, characterized by comprising: a sound acquisition module for collecting sounds emitted by different sound sources to obtain at least one source sound signal; an extraction module for extracting corresponding first voiceprint information from the source sound signal; a matching module for matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library in the animal database; a classifying module for storing, when the matching module matches successfully, the source sound signal corresponding to the matched first voiceprint information in the multimedia information library of the corresponding species sub-library; and a cluster module for analyzing second voiceprint information of each source sound signal in the species sub-library, performing a cluster operation on the source sound signals in the species sub-library according to the analysis result, and determining the sounds of different individuals in the species sub-library.
- 16. The device for maintaining an animal database according to claim 15, characterized in that the sound acquisition module includes a judging submodule and a collection submodule; the judging submodule is used to judge whether the surrounding sound contains an active sound; and the collection submodule is used to collect the active sound and perform sound segmentation when the judging submodule judges that an active sound is present.
- 17. The device for maintaining an animal database according to claim 16, characterized in that the sound acquisition module further includes an orientation submodule for performing sound source orientation on the active sound so that the collection direction of the collection submodule points toward the position from which the active sound is emitted.
- 18. The device for maintaining an animal database according to claim 15, characterized by further comprising an image collecting module for taking photographs and/or recording video at a set time interval while the sound acquisition module collects the sounds emitted by the different sound sources, to obtain at least one image signal and/or video signal.
- 19. The device for maintaining an animal database according to claim 18, characterized in that the image collecting module further includes a saving submodule for correspondingly storing, in the multimedia information library of the species sub-library, the image signal and/or video signal captured at the same time as the source sound signal that the matching module matched successfully.
- 20. The device for maintaining an animal database according to claim 15, characterized by further comprising a time recording module for recording the acquisition time of the source sound signal in the multimedia information library after the classifying module stores the successfully matched source sound signal in the corresponding multimedia information library.
- 21. The device for maintaining an animal database according to any one of claims 15-20, characterized by further comprising an update module for updating the corresponding voiceprint information library with the matched first voiceprint information after the classifying module stores the successfully matched source sound signal in the corresponding multimedia information library.
- 22. The method for maintaining an animal database according to any one of claims 15-20, characterized by further comprising a voiceprint model acquisition module for obtaining the voiceprint models before the sound acquisition module detects the surrounding sound, specifically including: collecting vocalization samples of the animal species corresponding to each species sub-library; and extracting the first voiceprint information from the vocalization samples and generating the corresponding voiceprint models.
- 23. The method for maintaining an animal database according to any one of claims 15-20, characterized in that the classifying module includes a new-creation submodule for creating a new species sub-library when the first voiceprint information does not match the voiceprint models in the voiceprint information library of any species sub-library, and storing the first voiceprint information in the multimedia information library of the new species sub-library.
- 24. The device for maintaining an animal database according to any one of claims 15-20, characterized in that the cluster module includes a comparison submodule, an analysis submodule, and a cluster submodule; the comparison submodule is used to compare pairwise the second voiceprint information of the voice signal fragments formed from all source sound signals in the multimedia information library of the species sub-library; the analysis submodule is used to compare with a preset value the gap between the two voice signal fragments that the comparison submodule found to have the smallest second voiceprint information gap; and the cluster submodule is used to merge the two voice signal fragments into one voice signal fragment when the analysis result of the analysis submodule is that the second voiceprint information gap of the two voice signal fragments is less than the preset value.
- 25. A system for maintaining an animal database, characterized by comprising an animal database and the device for maintaining an animal database according to any one of claims 15-24, wherein the animal database includes at least one species sub-library, each species sub-library includes a voiceprint information library and a multimedia information library, and the voiceprint information library includes at least one voiceprint model; the animal species corresponding to the species sub-libraries in the animal database are different from one another.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610694221.2A CN107766372A (en) | 2016-08-19 | 2016-08-19 | Method, device and system for maintaining an animal database |
PCT/CN2017/094405 WO2018032946A1 (en) | 2016-08-19 | 2017-07-25 | Method, device, and system for maintaining animal database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610694221.2A CN107766372A (en) | 2016-08-19 | 2016-08-19 | Method, device and system for maintaining an animal database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107766372A (en) | 2018-03-06 |
Family ID=61196345
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610694221.2A, Pending, CN107766372A (en) | Method, device and system for maintaining an animal database | 2016-08-19 | 2016-08-19 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107766372A (en) |
WO (1) | WO2018032946A1 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101976564A (en) * | 2010-10-15 | 2011-02-16 | 中国林业科学研究院森林生态环境与保护研究所 | Method for identifying insect voice |
CN103117061B (en) * | 2013-02-05 | 2016-01-20 | 广东欧珀移动通信有限公司 | A kind of voice-based animals recognition method and device |
CN105161093B (en) * | 2015-10-14 | 2019-07-09 | 科大讯飞股份有限公司 | A kind of method and system judging speaker's number |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112472355A (en) * | 2021-01-06 | 2021-03-12 | 南京北数游电子玩具有限公司 | Wild animal sperm collection equipment based on database |
CN113448975A (en) * | 2021-05-26 | 2021-09-28 | 科大讯飞股份有限公司 | Method, device and system for updating character image library and storage medium |
CN113448975B (en) * | 2021-05-26 | 2023-01-17 | 科大讯飞股份有限公司 | Method, device and system for updating character image library and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2018032946A1 (en) | 2018-02-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20180306 |