CN107766372A - Method, device and system for maintaining an animal database - Google Patents

Method, device and system for maintaining an animal database

Info

Publication number
CN107766372A
Authority
CN
China
Prior art keywords
sound
voiceprint
species
sub-database
animal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610694221.2A
Other languages
Chinese (zh)
Inventor
彭科
程光伟
朱长宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201610694221.2A (CN107766372A)
Priority to PCT/CN2017/094405 (WO2018032946A1)
Publication of CN107766372A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/432 Query formulation
    • G06F16/433 Query formulation using audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

Abstract

The invention provides a method, device and system for maintaining an animal database. The pre-established animal database includes at least one species sub-database, and each species sub-database includes a voiceprint information library and a multimedia information library. When surrounding sound is captured, first voiceprint information is extracted from each acquired source sound signal and matched against the voiceprint models in the voiceprint information library of each species sub-database. If the match succeeds, the source sound signal is saved in the corresponding multimedia information library. Second voiceprint information of each source sound signal is then analyzed, and the source sound signals in each species sub-database are clustered. The animal sound database can thus be maintained without manual operation, which greatly improves the efficiency of collecting animal data across regions, and it also facilitates automatic counting of the animals in the collection area and the building of individualized databases for animal individuals.

Description

Method, device and system for maintaining an animal database
Technical field
The present invention relates to the fields of ecological monitoring and acoustic signal processing, and more particularly to a method, device and system for maintaining an animal database.
Background
As the ideas of animal protection and of harmonious coexistence between humans and animals become increasingly widespread, monitoring animal behavior and studying animal habits are receiving growing attention. Sound is an important means of communication between animals. Studying animal sound signals helps us understand communication between animal individuals, such as species identification, individual identification and mate choice. Collecting animal sound signals can also help to discover new species, assess status hierarchies within animal populations, and analyze animal growth and development. Research on animal vocalization is therefore essential to the study of animal behavior. Humans have human languages and animals have their own; each species, and each individual within a species, has unique call characteristics. How to use the different calls made by each animal to classify the animals in a region has become a major problem in biology and statistics.
Summary of the invention
The invention provides a method, device and system for maintaining an animal database, which solve the problem of how to use animal sounds to automatically classify the animals within a given area.
To solve the above technical problem, the invention provides a method for maintaining an animal database, including:
collecting the sounds emitted by different sound sources to obtain at least one source sound signal;
extracting first voiceprint information from the source sound signal;
matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-database in the animal database; when the match succeeds, saving the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
analyzing second voiceprint information of each source sound signal in the species sub-database, and clustering the source sound signals in the species sub-database according to the analysis result to determine the sounds of the different individuals in the species sub-database; wherein the animal database includes at least one species sub-database, and each species sub-database includes a voiceprint information library and a multimedia information library.
Further, collecting the sounds emitted by different sound sources includes: detecting whether there is an active sound in the surrounding sound; if an active sound is detected, collecting the active sound and performing sound segmentation.
Further, collecting the active sounds emitted by different sound sources includes: performing sound source localization on the active sound so that the collection direction points to the direction from which the active sound is emitted.
Further, when the sounds emitted by different sound sources are collected, the method also includes: taking photographs and/or recording video at a set time interval to obtain at least one image signal and/or video signal.
Further, when the match succeeds, the method also includes: saving the image signal and/or video signal collected at the same time as the source sound signal in the multimedia information library of the species sub-database, in association with that source sound signal.
Further, after the source sound signal corresponding to the successfully matched first voiceprint information is saved in the multimedia information library of the corresponding species sub-database, the method also includes: recording the acquisition time of the source sound signal in the multimedia information library.
Further, detecting the surrounding sound and collecting the sounds emitted by different sound sources to obtain at least one source sound signal includes: the user actively detecting the surrounding sound and sending the obtained source sound signal to a server.
Further, after the source sound signal corresponding to the successfully matched first voiceprint information is saved in the multimedia information library of the corresponding species sub-database, the method also includes: updating the corresponding voiceprint information library with the successfully matched first voiceprint information.
Further, before the surrounding sound is detected, the method also includes obtaining the voiceprint models, including: collecting vocalization samples of the animal species corresponding to each species sub-database; extracting first voiceprint information from the vocalization samples, and generating the corresponding voiceprint models.
Further, matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-database in the animal database includes: if the first voiceprint information matches none of the voiceprint models in the voiceprint information libraries of the species sub-databases, creating a new species sub-database and saving the first voiceprint information in the multimedia information library of the new species sub-database.
Further, analyzing the second voiceprint information of each source sound signal in the species sub-database and clustering the source sound signals in the species sub-database according to the analysis result includes:
comparing, pairwise, the second voiceprint information of the sound signal segments formed by all the source sound signals in the multimedia information library of the species sub-database, and selecting the two sound signal segments whose second voiceprint information differs the least;
comparing the difference between the second voiceprint information of these two sound signal segments with a preset value;
if the difference between the second voiceprint information of the two sound signal segments is less than the preset value, merging the two sound signal segments into one sound signal segment;
repeating all of the above steps, starting from the pairwise comparison of the second voiceprint information of all sound signal segments in the multimedia information library of the species sub-database, until the difference between the second voiceprint information of any two sound signal segments is greater than the preset value; the remaining sound signal segments then correspond to different animal individuals and are not merged further.
Further, after the source sound signals in the species sub-database are clustered, the method also includes: determining the number of animal individuals in the species sub-database, and the correspondence between each animal individual and the information in the multimedia information library.
Further, after the source sound signals in the species sub-database have been clustered, when a new source sound signal is saved in the multimedia information library of the species sub-database, the method also includes: taking the set of all source sound signals corresponding to each animal individual in the existing clustering result as one initial clustering input, taking each new source sound signal as one initial clustering input, and performing clustering;
or, performing the clustering operation again on the new source sound signal together with all existing source sound signals in the species sub-database.
Further, after the source sound signals in the species sub-database have been clustered, when a new source sound signal is saved in the multimedia information library of the species sub-database, the method also includes: performing animal individual identification on the new source sound signal based on all existing sound signals in the species sub-database; if the new source sound signal belongs to an existing animal individual in the species sub-database, marking the new source sound signal as belonging to that animal individual; if the new source sound signal does not belong to any existing animal individual in the species sub-database, creating a new animal individual label.
To solve the above technical problem, the invention also provides a device for maintaining an animal database, including:
a sound acquisition module, configured to collect the sounds emitted by different sound sources to obtain at least one source sound signal;
an extraction module, configured to extract the corresponding first voiceprint information from the source sound signal;
a matching module, configured to match the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-database in the animal database;
a classification module, configured to, when the matching module reports a successful match, save the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
a clustering module, configured to analyze the second voiceprint information of each source sound signal in the species sub-database, and to cluster the source sound signals in the species sub-database according to the analysis result to determine the sounds of the different individuals in the species sub-database.
Further, the sound acquisition module includes a judging submodule and a collection submodule. The judging submodule is configured to judge whether there is an active sound in the surrounding sound; the collection submodule is configured to, when the judging submodule judges that there is an active sound, collect the active sound and perform sound segmentation.
Further, the sound acquisition module also includes an orientation submodule, configured to perform sound source localization on the active sound so that the collection direction of the collection submodule points to the direction from which the active sound is emitted.
Further, the device also includes an image acquisition module, configured to take photographs and/or record video at a set time interval while the sound acquisition module collects the sounds emitted by different sound sources, to obtain at least one image signal and/or video signal.
Further, the image acquisition module also includes a saving submodule, configured to save the image signal and/or video signal collected at the same time as the source sound signal successfully matched by the matching module in the multimedia information library of the species sub-database, in association with that source sound signal.
Further, the device also includes a time recording module, configured to record the acquisition time of the source sound signal in the multimedia information library after the classification module saves the successfully matched source sound signal in the corresponding multimedia information library.
Further, the device also includes an update module, configured to update the corresponding voiceprint information library with the successfully matched first voiceprint information after the classification module saves the successfully matched source sound signal in the corresponding multimedia information library.
Further, the device also includes a voiceprint model acquisition module, configured to obtain the voiceprint models before the sound acquisition module detects the surrounding sound, specifically by: collecting vocalization samples of the animal species corresponding to each species sub-database; extracting first voiceprint information from the vocalization samples, and generating the corresponding voiceprint models.
Further, the classification module includes a creation submodule, configured to create a new species sub-database when the first voiceprint information matches none of the voiceprint models in the voiceprint information libraries of the species sub-databases, and to save the first voiceprint information in the multimedia information library of the new species sub-database.
Further, the clustering module includes a comparison submodule, an analysis submodule and a clustering submodule. The comparison submodule is configured to compare, pairwise, the second voiceprint information of the sound signal segments formed by all the source sound signals in the multimedia information library of the species sub-database. The analysis submodule is configured to compare the smallest difference in second voiceprint information found by the comparison submodule, that between two sound signal segments, with a preset value. The clustering submodule is configured to merge the two sound signal segments into one sound signal segment when the analysis submodule finds that the difference between their second voiceprint information is less than the preset value.
To solve the above technical problem, the invention also provides a system for maintaining an animal database, including an animal database and the above device for maintaining an animal database. The animal database includes at least one species sub-database; each species sub-database includes a voiceprint information library and a multimedia information library; the voiceprint information library includes at least one voiceprint model; and the species sub-databases in the animal database each correspond to a different animal species.
Beneficial effects of the invention:
The invention provides a method, device and system for maintaining an animal database. The pre-established animal database includes at least one species sub-database, and each species sub-database includes a voiceprint information library and a multimedia information library. When surrounding sound is captured, first voiceprint information is extracted from each acquired source sound signal and matched against the voiceprint models in the voiceprint information library of each species sub-database. If the match succeeds, the source sound signal is saved in the corresponding multimedia information library. The source sound signals in each species sub-database are then clustered, and the source sound signals emitted by the same animal individual are merged. The animal sound database can thus be maintained without manual operation, which greatly improves the efficiency of collecting animal data across regions, and it also facilitates automatic counting of the animals in the collection area and the building of individualized databases for animal individuals.
Brief description of the drawings
Fig. 1 is a flowchart of a method for maintaining an animal database provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a clustering method provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of a method for maintaining an animal database provided by Embodiment 2 of the present invention;
Fig. 4 is a flowchart of a method for maintaining an animal database provided by Embodiment 3 of the present invention;
Fig. 5 is a flowchart of a method for maintaining an animal database provided by Embodiment 4 of the present invention;
Fig. 6 is a flowchart of a method for maintaining an animal database provided by Embodiment 5 of the present invention;
Fig. 7 is a schematic diagram of a device for maintaining an animal database provided by Embodiment 6 of the present invention.
Detailed description of the embodiments
The inventive concept is as follows. Exploiting the differences between the vocalization characteristics of various animals, voiceprint information is extracted from vocalization samples of each animal and used to build voiceprint models, which form the voiceprint information libraries of the different species sub-databases in the animal database. The surrounding sound is then detected and captured, and the voiceprint information of each acquired source sound signal is extracted and matched against the voiceprint models in the voiceprint information library of each species sub-database. If the match succeeds, the source sound signal is deemed to have been emitted by an animal of the same species as that species sub-database, and it is saved in the corresponding multimedia information library. The source sound signals in each multimedia information library are then clustered to determine the sounds of different individuals. In this way each animal species can be classified and counted effectively, which facilitates automatic counting of animals and the building of individualized databases for animal individuals, without capturing the animals and with as little disturbance to their lives as possible.
Specific implementations of the invention are further described below with reference to the accompanying drawings.
Embodiment 1
This embodiment provides a method for maintaining an animal database. Referring to Fig. 1, the method includes:
S101: collect the sounds emitted by different sound sources to obtain at least one source sound signal;
S102: extract first voiceprint information from the source sound signal;
S103: match the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-database in the animal database; when the match succeeds, save the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
S104: analyze the second voiceprint information of each source sound signal in the same species sub-database, and cluster the source sound signals in the species sub-database according to the analysis result to determine the sounds of the different individuals in the species sub-database; wherein the animal database includes at least one species sub-database, and each species sub-database includes a voiceprint information library and a multimedia information library.
The animal database in this embodiment is built as needed, specifically according to the animal species to be classified and counted. The animal database includes at least one species sub-database, and each species sub-database corresponds to one animal species, so different species sub-databases correspond to different animal species. Each species sub-database includes a voiceprint information library and a multimedia information library. The voiceprint information library is built by collecting vocalization samples of the animal corresponding to the species sub-database and extracting first voiceprint information from the samples as voiceprint models; the library consists of these voiceprint models. The multimedia information library stores the collected source sound signals together with the corresponding image signals and video signals; that is, when the animal database is first built, the multimedia information library is empty. The structure of the animal database is shown in Table 1:
Table 1
In the animal database, the structure of the multimedia information library in each species sub-database is shown in Table 2. Each entry in the table corresponds to the multimedia signals and related information collected at a given time for a given individual of the animal species. At initialization, the multimedia information library is empty.
Table 2
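The database layout described above can be sketched in Python as follows. This is only an illustration: the class and field names (`AnimalDatabase`, `SpeciesSubDatabase`, `MediaEntry`, `individual_id` and so on) are assumptions, since the text specifies only that each species sub-database holds a voiceprint information library and a multimedia information library, and that the multimedia information library starts empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MediaEntry:
    """One entry of a multimedia information library (hypothetical fields)."""
    sound: bytes                          # source sound signal
    acquired_at: str                      # acquisition time
    location: Optional[str] = None        # acquisition location, if recorded
    image: Optional[bytes] = None         # image captured with the sound
    video: Optional[bytes] = None         # video captured with the sound
    individual_id: Optional[int] = None   # filled in later by clustering


@dataclass
class SpeciesSubDatabase:
    species: str
    # Voiceprint information library: one feature vector per model (assumed form).
    voiceprint_models: List[List[float]] = field(default_factory=list)
    # Multimedia information library: empty when the database is first built.
    media: List[MediaEntry] = field(default_factory=list)


@dataclass
class AnimalDatabase:
    sub_databases: Dict[str, SpeciesSubDatabase] = field(default_factory=dict)

    def add_species(self, species: str, models: List[List[float]]) -> SpeciesSubDatabase:
        """Create one sub-database per species; each species may appear only once."""
        if species in self.sub_databases:
            raise ValueError(f"species already present: {species}")
        sub = SpeciesSubDatabase(species, list(models))
        self.sub_databases[species] = sub
        return sub
```

Keying the sub-databases by species name is one simple way to enforce that different sub-databases correspond to different species.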
Detecting the surrounding sound means monitoring the surrounding sound periodically, mainly to capture animal calls; once an animal call is captured, it can be collected. Note that the sound is not necessarily produced by a single animal individual; several individuals may well be vocalizing. Sound segmentation can be used to separate sounds that do not come from the same sound source: specifically, it finds the time points in the continuous sound at which the vocalizing source changes, so that the signals belonging to different animal individuals are separated in time. Sound segmentation can be implemented with methods such as model-based Bayesian algorithms and the cross likelihood ratio algorithm.
Further, to simplify the capture of animal sounds, detecting the surrounding sound includes: detecting whether there is an active sound in the surrounding sound and, if so, collecting the active sound. Here, an active sound is a sound whose signal intensity reaches a certain level. In a given environment the animal species present are generally known, so the vocalization intensity range of each species can be determined. In other words, only sounds whose signal intensity reaches the minimum animal vocalization intensity need to be collected, which avoids collecting too many non-animal sounds that would affect the correctness of matching.
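The intensity test for active sound can be sketched as a frame-by-frame energy check. The text does not say how intensity is measured, so the RMS measure, the frame length and the function names below are assumptions made for illustration.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def active_segments(samples, frame_len, threshold):
    """Return (start, end) sample indices of runs of frames whose RMS >= threshold.

    `threshold` stands in for the minimum vocalization intensity of the
    species expected in the area (an assumed calibration value).
    """
    segments = []
    start = None
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        loud = rms(samples[i:i + frame_len]) >= threshold
        if loud and start is None:
            start = i                      # an active run begins
        elif not loud and start is not None:
            segments.append((start, i))    # the active run ends
            start = None
    if start is not None:                  # signal ended while still active
        segments.append((start, (len(samples) // frame_len) * frame_len))
    return segments
```

Only the returned ranges would be passed on to sound segmentation and voiceprint extraction; quieter stretches are discarded.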
In addition, when sound is collected, sound source localization can be performed so that the collection direction points to the direction from which the sound is emitted. Specifically, the pickup device can use means such as a microphone array to localize the sound and point the pickup direction toward it.
While sound is collected, photographs and/or video can also be taken to collect the corresponding image signals and/or video signals, which can be captured at a set acquisition interval. Similarly, when image signals and/or video signals are collected, the capture direction can be pointed toward the sound source; the direction is determined in the same way as in sound source localization.
The first voiceprint information is a kind of feature information of the sound signal. In this embodiment it includes, but is not limited to, MFCC (Mel-Frequency Cepstral Coefficients), LPCC (Linear Prediction Cepstral Coefficients) and other voiceprint information that can distinguish different animal species. Extracting the first voiceprint information from a source sound signal means extracting parameters such as MFCC and LPCC from the collected sound emitted by each sound source; different animal species have different MFCC and LPCC parameters.
After the first voiceprint information of each source sound signal is extracted, it is matched against the voiceprint models in the voiceprint information library of each species sub-database. Matching has two possible outcomes: success or failure. Success means that the first voiceprint information matches a voiceprint model in some voiceprint information library; the source sound signal corresponding to that first voiceprint information is then deemed to have been emitted by an animal of the species corresponding to that voiceprint information library. Failure means that no voiceprint information library matches the first voiceprint information. In that case a new species sub-database can be created based on the unmatched first voiceprint information. This species sub-database likewise includes a voiceprint information library and a multimedia information library: a voiceprint model is trained from the first voiceprint information of the source sound signal and saved in the voiceprint information library of the new species sub-database, and the corresponding source sound signal is saved in its multimedia information library, together with the corresponding image signal and/or video signal, acquisition time, acquisition location and other information.
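The match-or-create logic can be sketched as follows. The text does not specify the form of a voiceprint model or the matching criterion, so this sketch substitutes a deliberately simple stand-in: each model is a centroid feature vector, and a match succeeds when the nearest centroid lies within an assumed distance `max_distance`. The function names and the `unknown-N` naming scheme are likewise hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def match_species(features, models, max_distance):
    """Return the name of the closest model, or None if none is close enough."""
    c = centroid(features)
    best_name, best_dist = None, float("inf")
    for name, model in models.items():
        dist = math.dist(c, model)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


def classify(features, signal, models, media, max_distance=1.0):
    """Store `signal` under the matched species, creating a new one on failure.

    `models` maps species name -> centroid (voiceprint information library);
    `media` maps species name -> list of stored signals (multimedia library).
    """
    name = match_species(features, models, max_distance)
    if name is None:
        # Matching failed: create a new species sub-database from this voiceprint.
        name = f"unknown-{len(models) + 1}"
        models[name] = centroid(features)
        media[name] = []
    media[name].append(signal)
    return name
```

A real system would more likely score the features against a statistical model per species; the control flow around success and failure would stay the same.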
After the first voiceprint information is matched successfully, the corresponding source sound signal is saved in the multimedia information library associated with the matched voiceprint information library. Each source sound signal in a multimedia information library is saved separately. If an image signal and/or video signal was captured when the source sound signal was collected, it is saved in association with the source sound signal collected at the same time. Alternatively, the image signal and/or video signal captured in the same direction as the source sound signal can be saved in association with it. Most preferably in this embodiment, the image signal and/or video signal captured at the same time and in the same direction as the source sound signal is saved in association with it.
In addition, when the source sound signal and the corresponding image signal and/or video signal are saved, the acquisition time can also be recorded. Besides the acquisition time, the acquisition location can be recorded. The acquisition location can be defined as the position of the vocalizing animal individual, and it can be recorded in various ways: for example, as a label, or, with a coordinate system established in advance, as coordinates or a coordinate range. The acquisition location can be determined from the position of each collection device, such as the position of the pickup device or of the camera.
When the first voiceprint information of a source sound signal matches a voiceprint model in some voiceprint information library, that library can be updated with the successfully matched first voiceprint information; that is, the corresponding voiceprint model is trained from this first voiceprint information.
Besides classifying different animal species, this embodiment yields, in each species sub-database, different source sound signals belonging to the same species. An animal individual does not vocalize only once. When one individual vocalizes several times, all of those sounds come from the same individual, yet sounds made at different times are saved as different source sound signals in the multimedia information library of the same species sub-database. The same multimedia information library will therefore hold several source sound signals from the same animal individual. To make it easier to count the individuals of a species, and to build a multimedia database for each individual in support of behavioral research on each animal, the second voiceprint information of the source sound signals in the multimedia information library of each species sub-database can be analyzed and the source sound signals clustered according to the analysis result.
Specifically, the clustering process can be described as the following steps. The second voiceprint information of the sound signal segments formed by the source sound signals in each multimedia information library is compared pairwise. From the comparison results, the two items of second voiceprint information with the smallest difference are selected, and their difference is compared with a preset value. There are two cases. First, if the difference is greater than the preset value, comparison stops and the process ends. Second, if the difference is less than the preset value, the sound signal segments corresponding to the two items of second voiceprint information are merged, and all sound signal segments, including the merged one, are again compared pairwise; the steps above are repeated, starting from the pairwise comparison, until no two sound signal segments in the multimedia information library differ by less than the preset value. When the process ends, the sound signal segments of each animal individual have been merged, so the number of animal individuals in the corresponding species sub-database can easily be counted from the merged segments. Each identified animal individual is labeled. The clustering gives the correspondence between source sound signals and animal individuals in the species sub-database, that is, between entries in the multimedia information library and animal individuals, and the animal individual corresponding to each entry is recorded in the "affiliated animal individual" column of the multimedia information library shown in Table 2.
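The merge loop described above can be sketched generically as follows. The way the difference between two items of second voiceprint information is measured is an assumption here (Euclidean distance between the centroids of the merged segments), as are the function names; the sketch also stops when the smallest difference equals the preset value, a case the text leaves open.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def cluster_individuals(voiceprints, preset):
    """Greedy pairwise merging; returns one list of entry indices per individual."""
    clusters = [[i] for i in range(len(voiceprints))]

    def rep(cluster):
        # Representative voiceprint of a (possibly merged) segment.
        return centroid([voiceprints[i] for i in cluster])

    while len(clusters) > 1:
        # Compare all segments pairwise and pick the closest pair.
        gap, a, b = min(
            (math.dist(rep(clusters[a]), rep(clusters[b])), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if gap >= preset:
            break                          # remaining segments are different individuals
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

The number of returned index lists is the estimated number of individuals, and each list gives the multimedia library entries to label with that individual.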
Further, referring to Fig. 2, the specific clustering steps are as follows:
S104a: Estimate the Gaussian model (μ, Σ) of the feature sample set of each sound signal segment. For the feature sample set {X_i, i = 1, ..., k} of a sound signal segment, the mean μ and covariance Σ of the sample set are calculated according to the following formula:
where k is the number of sample points in the sound signal segment, i is the sample point index, each feature sample X_i is a column vector, and (*)^T denotes transposition. The features in this embodiment include, but are not limited to, parameters such as MFCC or LPCC.
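Assuming the usual maximum-likelihood estimates, the mean and covariance referred to in step S104a can be written as below. The normalization by k (rather than k - 1) is an assumption; the symbols are those defined for this step.

```latex
\mu = \frac{1}{k}\sum_{i=1}^{k} X_i ,
\qquad
\Sigma = \frac{1}{k}\sum_{i=1}^{k} \left(X_i - \mu\right)\left(X_i - \mu\right)^{T}
```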
S104b: Calculate the generalized likelihood ratio distance between every pair of sound signal segments. The generalized likelihood ratio distance is the ratio of the product of the likelihoods of the two individual segments' sample sets to the likelihood of the sample set obtained by mixing the two segments, expressed in the log domain. The log-likelihood of a sample set is the sum of the log-likelihoods, under the Gaussian model, of all sample points in the segment's sample set:
where d is the dimension of the sample vectors and |*| denotes the matrix determinant.
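Under the same assumption of maximum-likelihood Gaussian estimates, the log-likelihood of a segment with k sample points, and the generalized likelihood ratio distance between two segments, can be sketched as follows. The subscripts a, b and a∪b, and the names L and D_GLR, are notation introduced here for illustration; the sign convention is chosen so that the distance is non-negative and small for similar segments.

```latex
L(X) = \sum_{i=1}^{k} \log \mathcal{N}\!\left(X_i;\, \mu, \Sigma\right)
     = -\frac{k}{2}\left( d \log(2\pi) + \log\lvert\Sigma\rvert + d \right)

D_{\mathrm{GLR}}(a, b) = L(X_a) + L(X_b) - L(X_{a \cup b})
  = \frac{k_a + k_b}{2}\log\lvert\Sigma_{a \cup b}\rvert
  - \frac{k_a}{2}\log\lvert\Sigma_a\rvert
  - \frac{k_b}{2}\log\lvert\Sigma_b\rvert
```

The constant terms cancel in the distance because the mixed sample set has k_a + k_b sample points.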
S104c: Calculate the similarity value ΔBIC for the two sound signal segments with the minimum generalized likelihood ratio distance. ΔBIC is obtained by subtracting 0.5λ(d + 0.5d(d + 1)) log N from the generalized likelihood ratio distance, where λ is the penalty threshold, d is the dimension of the sample vectors, and N is the total number of samples in the two sound signal segments. If ΔBIC is less than 0, the two sound signal segments are merged, the Gaussian model of the merged segment is estimated, and execution returns to S104b; otherwise, clustering is complete.
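Steps S104a to S104c can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions: covariances use the maximum-likelihood (1/k) estimate, the generalized likelihood ratio distance is taken as the log-determinant form, and the function names and toy one-dimensional data are invented for the example. Each segment needs enough distinct samples for its covariance to be non-singular.

```python
import math

def mean_cov(samples):
    """S104a: ML mean vector and covariance matrix of a list of d-dim samples."""
    k, d = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / k for j in range(d)]
    cov = [[sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in samples) / k
            for b in range(d)] for a in range(d)]
    return mu, cov

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, result = len(m), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return result

def half_k_logdet(samples):
    """(k / 2) * log|Sigma| for one sample set."""
    _, cov = mean_cov(samples)
    return 0.5 * len(samples) * math.log(det(cov))

def glr_distance(a, b):
    """S104b: GLR distance between two segments (assumed log-determinant form)."""
    return half_k_logdet(a + b) - half_k_logdet(a) - half_k_logdet(b)

def delta_bic(a, b, lam=1.0):
    """S104c: GLR distance minus 0.5 * lam * (d + 0.5 d (d + 1)) * log N."""
    d, n = len(a[0]), len(a) + len(b)
    return glr_distance(a, b) - 0.5 * lam * (d + 0.5 * d * (d + 1)) * math.log(n)

def bic_cluster(segments, lam=1.0):
    clusters = [list(s) for s in segments]
    while len(clusters) > 1:
        _, i, j = min((glr_distance(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        if delta_bic(clusters[i], clusters[j], lam) >= 0:
            break  # closest pair is judged to be two individuals: done
        merged = clusters[i] + clusters[j]
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append(merged)
    return clusters

# Toy 1-D features: seg_a and seg_b are alike, seg_c is far away.
seg_a = [[0.0], [1.0], [2.0], [1.0], [0.0], [2.0]]
seg_b = [[1.0], [0.0], [2.0], [1.0], [2.0], [0.0]]
seg_c = [[20.0], [21.0], [22.0], [21.0], [20.0], [22.0]]
result = bic_cluster([seg_a, seg_b, seg_c])
```

With real MFCC or LPCC vectors the same code applies unchanged, since `mean_cov` and `det` work for any dimension d.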
Besides the generalized likelihood ratio distance, the similarity between sound signal segments can also be calculated with a symmetric divergence algorithm.
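The text does not say which symmetric divergence is meant. One common choice, assumed here purely for illustration, is the symmetrized Kullback-Leibler divergence between the Gaussian models (μ_1, Σ_1) and (μ_2, Σ_2) of two segments:

```latex
D_{\mathrm{sym}} =
  \frac{1}{2}\,\mathrm{tr}\!\left[\left(\Sigma_1 - \Sigma_2\right)\left(\Sigma_2^{-1} - \Sigma_1^{-1}\right)\right]
  + \frac{1}{2}\left(\mu_1 - \mu_2\right)^{T}\left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)\left(\mu_1 - \mu_2\right)
```

Like the generalized likelihood ratio distance, this value is zero for identical models and grows as the two models diverge.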
For a species multimedia information library on which clustering has already been performed, fast clustering is carried out when new source sound signals are saved into the library: the set of all sound signals corresponding to each animal individual in the existing clustering result is used as one initial cluster input, each new source sound signal is used as one initial cluster input, and clustering is then performed.
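The initial input for fast clustering can be sketched as follows. The dictionary layout and the function name are hypothetical; the point is only that each known individual enters as one pre-merged cluster while each new signal enters as a singleton.

```python
def fast_clustering_input(existing_result, new_signals):
    """Build the initial clusters for fast clustering.

    existing_result: dict mapping an individual label to its list of signals.
    new_signals: list of newly saved source sound signals.
    """
    seeds = [list(signals) for signals in existing_result.values()]
    seeds.extend([signal] for signal in new_signals)
    return seeds

existing = {"individual-1": ["s1", "s2"], "individual-2": ["s3"]}
seeds = fast_clustering_input(existing, ["s4", "s5"])
```

Starting from these seeds, far fewer pairwise comparisons are needed than when every stored signal starts as its own cluster.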
For a species multimedia information library on which clustering has already been performed, the following may also be done when new source sound signals are saved into the library: all existing sound signals in the library, together with the new source sound signals, are used as the initial cluster input, and clustering is performed.
For a species multimedia information library on which clustering has already been performed, the following may also be done when a new source sound signal is saved into the library: animal individual identification is performed on the new source sound signal based on all existing sound signals in the library. If the new source sound signal is identified as belonging to an existing animal individual in the library, the "affiliated animal individual" information of the entry to which the new source sound signal belongs is set to the label of that animal individual. If the new source sound signal is identified as not belonging to any existing animal individual in the library, a new animal individual label is created, and the "affiliated animal individual" information of the entry to which the new source sound signal belongs is set to that label.
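The identification variant can be sketched as a nearest-individual test against a preset value. The gap function, the scalar signals, the preset value, and the label format are assumptions for the example; the text does not prescribe how identification is carried out.

```python
def identify_individual(new_signal, individuals, gap, preset):
    """Return the label assigned to new_signal, creating a new individual if needed.

    individuals: dict label -> list of signals already assigned to that individual.
    gap: hypothetical function (signal, signals_of_individual) -> float, smaller is closer.
    """
    best_label, best_gap = None, None
    for label, signals in individuals.items():
        g = gap(new_signal, signals)
        if best_gap is None or g < best_gap:
            best_label, best_gap = label, g
    if best_label is None or best_gap >= preset:
        # No existing individual is close enough: create a new label.
        best_label = "individual-%d" % (len(individuals) + 1)
        individuals[best_label] = []
    individuals[best_label].append(new_signal)
    return best_label

def gap_to_mean(signal, signals):
    return abs(signal - sum(signals) / len(signals))

individuals = {"individual-1": [1.0, 1.2], "individual-2": [5.0]}
first = identify_individual(1.1, individuals, gap_to_mean, preset=1.0)
second = identify_individual(9.0, individuals, gap_to_mean, preset=1.0)
```

The returned label is the value to record in the "affiliated animal individual" field of the new entry.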
When the source sound signals saved in the multimedia information library of a species sub-database are clustered, a condition may also be set; for example, clustering is performed only when the total number of saved source sound signals reaches a preset threshold.
In this embodiment, the first voiceprint information is used to match a source sound signal against the voiceprint models in the voiceprint library of each species sub-database, and the second voiceprint information is used to cluster the source sound signals in the multimedia information library of each species sub-database. The two are used in different scenarios: the first voiceprint information distinguishes species, while the second voiceprint information distinguishes individuals within the same species. The first and second voiceprint information may therefore be different parameters. They may also be the same parameter; there is no strict limitation, as long as the selected parameters achieve the required effect.
Embodiment two
This embodiment provides a method for maintaining an animal database. As shown in Fig. 3, the method includes:
S200, initialization: An animal database is built according to the set animal species to be monitored. The animal database includes the species sub-databases, and each species sub-database includes a voiceprint library and a multimedia information library. An example structure of the animal database is shown in Table 1.
When the voiceprint library is established under a species sub-database, vocalization samples of the animal species are collected, the first voiceprint information of the samples is extracted to build voiceprint models, and these models form the voiceprint library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information such as MFCC or LPCC that can distinguish different animal species.
An example structure of the multimedia information library is shown in Table 3. Each entry in the table corresponds to the multimedia signals and related information of one individual of the animal species, collected at a given time. At initialization, the multimedia information library is empty.
Table 3
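The structure built during initialization in step S200 can be sketched with Python dataclasses. All class and field names are hypothetical, and the voiceprint models are left as opaque objects; only the nesting (database, species sub-database, voiceprint library plus multimedia information library) follows the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class MultimediaEntry:
    sound_signal: bytes
    image_signals: List[bytes] = field(default_factory=list)
    acquisition_time: Optional[str] = None
    # "Affiliated animal individual": filled in later by clustering.
    individual: Optional[str] = None

@dataclass
class SpeciesSubDatabase:
    voiceprint_library: List[object] = field(default_factory=list)
    multimedia_library: List[MultimediaEntry] = field(default_factory=list)

@dataclass
class AnimalDatabase:
    species: Dict[str, SpeciesSubDatabase] = field(default_factory=dict)

def build_animal_database(monitored_species):
    """Create one empty species sub-database per monitored species."""
    db = AnimalDatabase()
    for name in monitored_species:
        db.species[name] = SpeciesSubDatabase()
    return db

db = build_animal_database(["owl", "frog"])
```

Voiceprint models trained from vocalization samples would then be appended to each `voiceprint_library`, while every `multimedia_library` starts empty.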
S201: A single fixed sound pickup device continuously picks up and monitors the surrounding sound. When detection determines that an active sound has appeared, sound segmentation starts and continues until detection determines that the current sound is no longer active, yielding at least one source sound signal. An active sound is a sound whose signal energy reaches a certain value. Sound segmentation finds, within a continuous sound, the time points at which the vocalizer changes, so that signals belonging to different animal individuals are separated in time. Specifically, sound segmentation can be implemented with methods such as the model-based Bayesian algorithm or the cross likelihood ratio algorithm.
While sound segmentation is in progress, photographs are taken continuously at a set time interval, yielding at least one image signal.
When an active sound is detected, a microphone array may also be used to determine the direction of the sound source, so that the pickup direction and the photographing direction point toward the sound.
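The active-sound detection in step S201 can be sketched as a frame-energy threshold. The frame length, the mean-square energy measure, and the threshold value are assumptions; the sketch covers only activity detection, not the Bayesian or cross likelihood ratio segmentation that separates different vocalizers.

```python
def active_regions(samples, frame_len, energy_threshold):
    """Return (start, end) sample index pairs of regions judged to be active sound.

    A frame is active when its mean-square energy reaches energy_threshold.
    """
    regions, start = [], None
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= energy_threshold:
            if start is None:
                start = f  # active sound appears
        elif start is not None:
            regions.append((start * frame_len, f * frame_len))  # sound ends
            start = None
    if start is not None:
        regions.append((start * frame_len, n_frames * frame_len))
    return regions

# Silence, then a burst of sound, then silence again.
samples = [0.0] * 4 + [1.0] * 8 + [0.0] * 4
regions = active_regions(samples, frame_len=4, energy_threshold=0.5)
```

Each returned region is a candidate for segmentation into one or more source sound signals, and its start and end also bound the period during which photographs are taken.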
S202: The first voiceprint information is extracted from each source sound signal obtained by segmentation and matched against the voiceprint models in each voiceprint library of the animal database. If the match succeeds, a new entry is created in the multimedia information library of the corresponding animal species, the source sound signal is saved as the sound signal of that entry, and the corresponding image signals, acquisition time, and other multimedia information are saved with it. If the match fails, the source sound signal and the corresponding image signals, acquisition time, and other information are discarded; alternatively, a new species sub-database is created, and the unmatched source sound signal and the corresponding image signals, acquisition time, and other information are saved in the new species sub-database.
In step S202, if the match succeeds, the first voiceprint information of the source sound signal may also be used to update the voiceprint model in the voiceprint library of the corresponding animal species.
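The branching in step S202 can be sketched as follows. The scoring function, the acceptance score `min_score`, the dictionary layout, and the name given to a newly created sub-database are all hypothetical; the text does not specify how a match is scored.

```python
def store_source_signal(database, entry, voiceprint, score, min_score,
                        keep_unmatched=False):
    """Match one source sound signal and store it; return the species key or None.

    database: dict species -> {"models": [...], "entries": [...]}
    score: hypothetical function (voiceprint, model) -> float, higher is more similar.
    """
    best_species, best_score = None, None
    for species, sub in database.items():
        for model in sub["models"]:
            s = score(voiceprint, model)
            if best_score is None or s > best_score:
                best_species, best_score = species, s
    if best_species is not None and best_score >= min_score:
        # Match succeeded: new entry in the matched species' multimedia library.
        database[best_species]["entries"].append(entry)
        return best_species
    if keep_unmatched:
        # Match failed: keep the signal in a newly created species sub-database.
        new_species = "unmatched-%d" % (len(database) + 1)  # hypothetical naming
        database[new_species] = {"models": [], "entries": [entry]}
        return new_species
    return None  # match failed: discard the signal and its related information

def closeness(voiceprint, model):
    return -abs(voiceprint - model)

database = {"owl": {"models": [1.0], "entries": []},
            "frog": {"models": [5.0], "entries": []}}
matched = store_source_signal(database, {"time": "t0"}, 1.1, closeness, min_score=-0.5)
dropped = store_source_signal(database, {"time": "t1"}, 9.0, closeness, min_score=-0.5)
created = store_source_signal(database, {"time": "t2"}, 9.0, closeness,
                              min_score=-0.5, keep_unmatched=True)
```

The optional model update after a successful match would be one extra call at the point where the entry is appended.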
S203: For each multimedia information library in which a new entry has been created, all sound signals saved in it are clustered by analyzing their second voiceprint information, and the number of animal individuals in the multimedia information library is determined. Each identified individual is labeled. Based on the correspondence between sound signals and animal individuals obtained by clustering, which is also the correspondence between entries in the multimedia information library and animal individuals, the animal individual corresponding to each entry is recorded in the "affiliated animal individual" column of the multimedia information library shown in Table 3.
Clustering groups similar sound signals into one class according to the similarity between sound signals; different sound signal segments grouped into one class are considered to come from the same animal individual. Referring to Fig. 2, the specific clustering steps are as follows:
S104a: Estimate the Gaussian model (μ, Σ) of the feature sample set of each sound signal segment formed from the source sound signals. For the feature sample set {X_i, i = 1, ..., k} of a sound signal segment, the mean μ and covariance Σ of the sample set are calculated according to the following formula:
where k is the number of sample points in the sound signal segment, i is the sample point index, each feature sample X_i is a column vector, and (*)^T denotes transposition. The features in this embodiment include, but are not limited to, parameters such as MFCC or LPCC.
S104b: Calculate the generalized likelihood ratio distance between every pair of sound signal segments. The generalized likelihood ratio distance is the ratio of the product of the likelihoods of the two individual segments' sample sets to the likelihood of the sample set obtained by mixing the two segments, expressed in the log domain. The log-likelihood of a sample set is the sum of the log-likelihoods, under the Gaussian model, of all sample points in the segment's sample set:
where d is the dimension of the sample vectors and |*| denotes the matrix determinant.
S104c: Calculate the similarity value ΔBIC for the two sound signal segments with the minimum generalized likelihood ratio distance. ΔBIC is obtained by subtracting 0.5λ(d + 0.5d(d + 1)) log N from the generalized likelihood ratio distance, where λ is the penalty threshold, d is the dimension of the sample vectors, and N is the total number of samples in the two sound signal segments. If ΔBIC is less than 0, the two sound signal segments are merged, the Gaussian model of the merged segment is estimated, and execution returns to S104b; otherwise, clustering is complete.
Besides the generalized likelihood ratio distance, the similarity between sound signal segments can also be calculated with a symmetric divergence algorithm.
For a multimedia information library on which clustering has already been performed, fast clustering is carried out when new source sound signals are saved into the library: the set of all sound signals corresponding to each animal individual in the existing clustering result is used as one initial cluster input, each new source sound signal is used as one initial cluster input, and clustering is then performed.
For a multimedia information library on which clustering has already been performed, the following may also be done when new source sound signals are saved into the library: all existing source sound signals in the library, together with the new source sound signals, are used as the initial cluster input, and clustering is performed.
For a species multimedia information library on which clustering has already been performed, the following may also be done when a new source sound signal is saved into the library: animal individual identification is performed on the new source sound signal based on all existing sound signals in the library. If the new source sound signal is identified as belonging to an existing animal individual in the library, the "affiliated animal individual" information of the entry to which the new source sound signal belongs is set to the label of that animal individual. If the new source sound signal is identified as not belonging to any existing animal individual in the library, a new animal individual label is created, and the "affiliated animal individual" information of the entry to which the new source sound signal belongs is set to that label.
When the source sound signals saved in the multimedia information library of a species sub-database are clustered, a condition may also be set; for example, clustering is performed only when the total number of saved source sound signals reaches a preset threshold.
Embodiment three
This embodiment describes another method for maintaining an animal database. The flow of the method is shown in Fig. 4, and the method includes:
S300, initialization: An animal database is built according to the set animal species to be monitored. The animal database includes the species sub-databases, and each species sub-database includes a voiceprint library and a multimedia information library. An example structure of the animal database is shown in Table 1.
When the voiceprint library is established under a species sub-database, vocalization samples of the animal species are collected, the first voiceprint information of the samples is extracted to build voiceprint models, and these models form the voiceprint library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information such as MFCC or LPCC that can distinguish different animal species.
An example structure of the multimedia information library is shown in Table 4. Each entry in the table corresponds to the multimedia signals and related information of one individual of the animal species, collected at a given time. At initialization, the multimedia information library is empty.
Table 4
S301: A single fixed sound pickup device continuously picks up and monitors the surrounding sound. When detection determines that an active sound has appeared, sound segmentation starts and continues until detection determines that the current sound is no longer active, yielding at least one source sound signal. An active sound is a sound whose signal energy reaches a certain value. Sound segmentation finds, within a continuous sound, the time points at which the vocalizer changes, so that signals belonging to different animal individuals are separated in time. Specifically, sound segmentation can be implemented with methods such as the model-based Bayesian algorithm or the cross likelihood ratio algorithm.
While sound segmentation is in progress, video is recorded.
When an active sound is detected, a microphone array may also be used to determine the direction of the sound source, so that the pickup direction and the video recording direction point toward the sound.
Using the segmentation time points obtained from sound segmentation, the recorded video is split in time to obtain at least one corresponding video segment, and each video segment is associated with the source sound signal of the corresponding time.
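Splitting the recorded video at the sound segmentation time points can be sketched as follows. Times are in seconds, and the function names and the pairing of signals with intervals by position are assumptions made for the example.

```python
def split_by_cut_points(cut_points, total_duration):
    """Turn segmentation time points into consecutive (start, end) intervals."""
    bounds = [0.0] + sorted(cut_points) + [total_duration]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]

def associate(source_signals, intervals):
    """Pair each source sound signal with the video interval of the same time span."""
    return [{"sound": s, "video_interval": iv}
            for s, iv in zip(source_signals, intervals)]

intervals = split_by_cut_points([2.0, 5.5], total_duration=8.0)
entries = associate(["call-1", "call-2", "call-3"], intervals)
```

Each resulting pair carries what step S302 saves together in one entry: the source sound signal and the video segment recorded over the same interval.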
S302: The first voiceprint information is extracted from each source sound signal obtained by segmentation and matched against the voiceprint models in each voiceprint library of the animal database. If the match succeeds, a new entry is created in the multimedia information library of the corresponding animal species, the source sound signal is saved as the sound signal of that entry, and the corresponding video segment, acquisition time, and other multimedia information are saved with it. If the match fails, the source sound signal and the corresponding video segment, acquisition time, and other information are discarded; alternatively, a new species sub-database is created, and the unmatched source sound signal and the corresponding video segment, acquisition time, and other information are saved in the new species sub-database.
In step S302, if the match succeeds, the first voiceprint information of the source sound signal may also be used to update the voiceprint model in the voiceprint library of the corresponding animal species.
S303: For each multimedia information library in which a new entry has been created, all sound signals saved in it are clustered by analyzing their second voiceprint information, and the number of animal individuals in the multimedia information library is determined. Each identified individual is labeled. Based on the correspondence between sound signals and animal individuals obtained by clustering, which is also the correspondence between entries in the multimedia information library and animal individuals, the animal individual corresponding to each entry is recorded in the "affiliated animal individual" column of the multimedia information library shown in Table 4.
The clustering method is the same as in embodiments one and two and is not repeated here.
Embodiment four
This embodiment provides another method for maintaining an animal database. Referring to Fig. 5, the method includes:
S400, initialization: An animal database is built at the server according to the set animal species to be monitored. The animal database includes the species sub-databases, and each species sub-database includes a voiceprint library and a multimedia information library. An example structure of the animal database is shown in Table 1.
When the voiceprint library is established under a species sub-database, vocalization samples of the animal species are collected, the first voiceprint information of the samples is extracted to build voiceprint models, and these models form the voiceprint library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information such as MFCC or LPCC that can distinguish different animal species.
An example structure of the multimedia information library is shown in Table 5. Each entry in the table corresponds to the multimedia signals and related information of one individual of the animal species, collected at a given time and place. At initialization, the multimedia information library is empty.
Table 5
S401: Clients distributed at different locations each continuously pick up and monitor the surrounding sound. When detection determines that an active sound has appeared, sound segmentation starts and continues until detection determines that the current sound is no longer active, yielding at least one source sound signal. An active sound is a sound whose signal energy reaches a certain value. While sound segmentation is in progress, photographs are taken continuously at a set time interval, yielding at least one image signal. The source sound signals obtained by segmentation and the image signals associated with them in time are sent to the server together with the corresponding acquisition time and acquisition location. Sound segmentation finds, within a continuous sound, the time points at which the vocalizer changes, so that signals belonging to different animal individuals are separated in time. Specifically, sound segmentation can be implemented with methods such as the model-based Bayesian algorithm or the cross likelihood ratio algorithm.
When a client detects that an active sound has appeared, it may also skip sound segmentation at first. Instead, it saves the sound signal from the moment an active sound is detected until detection determines that the current sound is no longer active, and takes photographs continuously during that period, yielding at least one image signal. The sound signal and the image signals are sent to the server together with the corresponding acquisition time and acquisition location. The server then segments the sound signal to obtain at least one source sound signal and the image signals associated with it in time.
S402: At the server, the first voiceprint information is extracted from each source sound signal, whether received directly or obtained by segmentation after receipt, and matched against the voiceprint models in each voiceprint library of the animal database. If the match succeeds, a new entry is created in the multimedia information library of the corresponding animal species, the source sound signal is saved as the sound signal of that entry, and the corresponding image signals, acquisition time, acquisition location, and other multimedia information are saved with it. If the match fails, the source sound signal and the corresponding image signals, acquisition time, and other information are discarded; alternatively, a new species sub-database is created, and the unmatched source sound signal and the corresponding image signals, acquisition time, and other information are saved in the new species sub-database.
In step S402, if the match succeeds, the first voiceprint information of the source sound signal may also be used to update the voiceprint model in the voiceprint library of the corresponding animal species.
S403: At the server, for each multimedia information library in which a new entry has been created, all sound signals saved in it are clustered by analyzing their second voiceprint information, and the number of animal individuals in the multimedia information library is determined. Each identified individual is labeled. Based on the correspondence between sound signals and animal individuals obtained by clustering, which is also the correspondence between entries in the multimedia information library and animal individuals, the animal individual corresponding to each entry is recorded in the "affiliated animal individual" column of the multimedia information library shown in Table 5.
Embodiment five
This embodiment provides another method for maintaining an animal database. Referring to Fig. 6, the method includes:
S600, initialization: An animal database is built at the server according to the set animal species to be monitored. The animal database includes the animal species sub-databases, and each species sub-database includes a species voiceprint library and a species multimedia information library. An example structure of the animal database is shown in Table 1.
When the species voiceprint library is established under a species sub-database, vocalization samples of the animal species are collected, the first voiceprint information of the samples is extracted to build voiceprint models, and these models form the species voiceprint library. The first voiceprint information in this embodiment includes, but is not limited to, voiceprint feature information such as MFCC or LPCC that can distinguish different animal species.
An example structure of the species multimedia information library is shown in Table 5. Each entry in the table corresponds to the multimedia signals and related information of one individual of the animal species, collected at a given time and place. At initialization, the species multimedia information library is empty.
S601: A client user actively records an animal sound signal to obtain a source sound signal and sends it to the server together with the corresponding acquisition time and acquisition location. The user may also photograph the animal and send the images to the server together with the associated source sound signal. For example, an explorer in the field records a stretch of animal sound, takes several pictures of the animal, and sends them to the server.
S602: At the server, the first voiceprint information is extracted from each received source sound signal and matched against the voiceprint models in each species voiceprint library of the animal database. If the match succeeds, a new entry is created in the multimedia information library of the corresponding animal species, the source sound signal is saved as the sound signal of that entry, and the corresponding image signals, acquisition time, acquisition location, and other multimedia information are saved with it. If the match fails, a new species sub-database is created in the animal database, a voiceprint model is trained from the source sound signal and saved in the species voiceprint library of that sub-database, a new entry is created in the multimedia information library of that sub-database, the source sound signal is saved as the sound signal of that entry, and the corresponding image signals, acquisition time, acquisition location, and other multimedia information are saved with it.
In step S602, if the match succeeds, the first voiceprint information of the source sound signal may also be used to update the voiceprint model in the species voiceprint library of the corresponding animal species. A condition may be set for updating the voiceprint model; for example, the update is performed only when the corresponding animal species is not one of the animal species to be monitored that were set at initialization.
S603: At the server, for each species multimedia information library in which a new entry has been created, all sound signals saved in it are clustered by analyzing their second voiceprint information, and the number of animal individuals in the species multimedia information library is determined. Each identified individual is labeled. Based on the correspondence between sound signals and animal individuals obtained by clustering, which is also the correspondence between entries in the species multimedia information library and animal individuals, the label of the animal individual corresponding to each entry is recorded in the "affiliated animal individual" column of the multimedia information library shown in Table 5.
Embodiment six
This embodiment provides a device for maintaining an animal database. Referring to Fig. 7, the device includes:
Sound acquisition module 51, configured to collect the sounds emitted by different sound sources and obtain at least one source sound signal;
Extraction module 52, configured to extract the corresponding first voiceprint information from each source sound signal;
Matching module 53, configured to match the extracted first voiceprint information against the voiceprint models in the voiceprint library of each species sub-database in the animal database;
Classification module 54, configured to save, when the matching module reports a successful match, the source sound signal corresponding to the matched first voiceprint information in the multimedia information library of the corresponding species sub-database;
Clustering module 56, configured to analyze the second voiceprint information of each source sound signal in the same species sub-database, cluster the source sound signals in that species sub-database according to the analysis result, and determine the sounds of the different individuals in that species sub-database.
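How the listed modules hand data to one another can be sketched as a small pipeline. This is only an architectural illustration: each module is passed in as a plain callable, and the dictionary layout, the scalar "voiceprints", and the helper `nearest_species` are hypothetical.

```python
class MaintenanceDevice:
    """Wires the modules together; each module is supplied as a callable."""

    def __init__(self, acquire, extract, match, cluster):
        self.acquire = acquire    # sound acquisition module 51
        self.extract = extract    # extraction module 52
        self.match = match        # matching module 53
        self.cluster = cluster    # clustering module 56

    def run_once(self, database):
        touched = set()
        for signal in self.acquire():
            voiceprint = self.extract(signal)
            species = self.match(voiceprint, database)
            if species is not None:
                # Classification module 54: save under the matched species.
                database[species]["multimedia"].append(signal)
                touched.add(species)
        for species in sorted(touched):
            self.cluster(database[species]["multimedia"])
        return sorted(touched)

def nearest_species(voiceprint, database, tolerance=0.5):
    """Hypothetical matcher: closest model, accepted only within a tolerance."""
    best = min(database, key=lambda s: abs(database[s]["model"] - voiceprint))
    return best if abs(database[best]["model"] - voiceprint) <= tolerance else None

database = {"owl": {"model": 1.0, "multimedia": []},
            "frog": {"model": 5.0, "multimedia": []}}
clustered = []
device = MaintenanceDevice(
    acquire=lambda: [1.0, 5.1, 9.0],
    extract=lambda signal: signal,
    match=nearest_species,
    cluster=lambda entries: clustered.append(list(entries)),
)
updated = device.run_once(database)
```

Passing the modules in as callables keeps the sketch independent of any particular feature extractor or clustering method.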
The device also includes voiceprint model acquisition module 59. The animal database in this embodiment is built as needed, specifically according to the animal species that are to be classified and counted. The animal database includes at least one species sub-database, and each species sub-database corresponds to one animal species, so different species sub-databases correspond to different animal species. Each species sub-database includes a voiceprint library and a multimedia information library. Voiceprint model acquisition module 59 obtains the voiceprint models: it collects vocalization samples of the animal species corresponding to each species sub-database and extracts the first voiceprint information from the samples as voiceprint models, which together form the voiceprint library. The multimedia information library stores the collected source sound signals along with the corresponding image signals and video signals, so when the animal database has just been built, the multimedia information library is empty.
Detecting the surrounding sound means monitoring it periodically, mainly in order to capture animal calls; once an animal call is captured, it can be collected. Note that the vocalizer is not necessarily a single animal individual; there may well be several. Sound segmentation can be used to separate sounds that do not come from the same sound source. Specifically, it finds, within a continuous sound, the time points at which the vocalizer changes, so that signals belonging to different animal individuals are separated in time. Sound segmentation can be implemented with methods such as the model-based Bayesian algorithm or the cross likelihood ratio algorithm.
Sound acquisition module 51 includes judgment submodule 511 and collection submodule 512. Judgment submodule 511 determines whether there is an active sound in the surrounding sound; if so, collection submodule 512 collects that active sound. An active sound here is a sound whose signal intensity reaches a certain level. The general characteristics of the animal species in the surrounding environment are usually known, and the vocal intensity range of each species can also be determined, so only sounds that reach the minimum animal vocalization intensity need to be collected. This avoids collecting too many non-animal sounds, which would reduce matching accuracy.
Sound acquisition module 51 also includes orientation submodule 513, which locates the direction of the active sound so that the collection direction points toward where the sound is emitted. Specifically, orientation submodule 513 may be a microphone array or another device capable of sound source localization.
The device also includes image acquisition module 55, which takes photographs and/or records video while sound is being collected and acquires the corresponding image signals and/or video signals. The image signals and video signals may be acquired at timed intervals, that is, at a set acquisition interval. Similarly, when image signals and/or video signals are acquired, the acquisition direction can be kept consistent with that of the sound collection, pointing toward the sound source; the direction is determined in the same way as in sound source localization.
Classifying module 54 includes newly-built submodule 541;After the first voiceprint of each source acoustical signal is extracted, matching Module 53 is matched these first voiceprints with the sound-groove model in the voiceprint storehouse in each species word bank;Since It is matching, certainly with the possibility that the match is successful He it fails to match, the match is successful, refers to voiceprint and some voiceprint storehouse In any one sound-groove model the match is successful, now then assert that source acoustical signal corresponding to the voiceprint is and this voiceprint The consistent animal of animal species corresponding to species word bank corresponding to storehouse issues;And it fails to match, then it represents that the vocal print is believed Breath all can not find matching sound-groove model in any one voiceprint storehouse, and such situation typically has two kinds:Its One, this voiceprint is not the voiceprint of animal, it may be possible to the first voiceprint of other abiotic sound sent; Second, this first voiceprint is for we does not add corresponding sound-groove model when establishing animal data storehouse, in other words, It is considered that this animal without addition, is not the scope that we classify and counted;Newly-built submodule 541 is to be used for this A new species word bank is established based on the individual voiceprint that it fails to match, this species word bank equally includes voiceprint storehouse And multimedia information lib, it is stored in voiceprint storehouse, is classified as sound-groove model after first voiceprint is trained And statistics.
After the voiceprint information has been matched successfully, the classification module 54 saves the source sound signal corresponding to the first voiceprint information in the multimedia information library that corresponds to the voiceprint information library in which the match succeeded; within a multimedia information library, each source sound signal is saved separately. The image acquisition module 55 further includes a saving submodule 551, configured to, if image signals and/or video signals were also acquired when the source sound signal was collected, save those image signals and/or video signals in correspondence with the source sound signal collected at the same time. Further, the image signals and/or video signals acquired in the same direction as the source sound signal may also be saved in correspondence with it. In the most preferred mode of this embodiment, the image signals and/or video signals acquired at the same time and in the same direction as the source sound signal are saved in correspondence with it.
In addition, the device includes a recording module 57, configured to record the acquisition time when the source sound signal and the corresponding image signals and/or video signals are saved. Besides the acquisition time, the acquisition location may also be recorded. The acquisition location may be defined as the position of the vocalizing animal individual, and it may be recorded in various ways: it may be labeled, or a coordinate system may be established in advance and the location recorded as coordinates or a coordinate range. The specific acquisition location may be determined jointly from the information in the image signals and/or video signals and the acquisition direction. Alternatively, the acquisition location may be defined as the position of each acquisition device, such as the position of the sound pickup device or of the photographing/video device.
This embodiment further provides an update module 58, configured to, when the first voiceprint information of a source sound signal is successfully matched with a voiceprint model in a voiceprint information library, update that voiceprint information library with the successfully matched first voiceprint information; that is, the successfully matched voiceprint information is trained and then added to the voiceprint information library as a voiceprint model. In this way, the vocalizations of each animal species can be followed in real time, and the models become more specific to the species.
Besides classifying different animal species, this embodiment also yields, within each species sub-library, different source sound signals belonging to the same species. An animal individual does not vocalize only once; when one individual emits sound several times, and especially at different times, those sounds are saved as different source sound signals in the multimedia information library of the same species sub-library. As a result, multiple source sound signals of the same animal individual are saved in the same multimedia information library. Therefore, to facilitate counting the individuals of a species, and to establish a multimedia database for each individual so as to support behavioral research on each animal individual, the device for maintaining an animal database in this embodiment further includes a clustering module 56, configured to analyze the second voiceprint information of the source sound signals in the multimedia information library of each species sub-library and then cluster those source sound signals according to the analysis result.
Specifically, the clustering module 56 includes a comparison submodule 561, an analysis submodule 562, and a clustering submodule 563. The comparison submodule 561 is configured to compare, pairwise, the second voiceprint information of the sound signal segments formed by the source sound signals in each multimedia information library. The analysis submodule 562 is configured to select, from the comparison results, the two items of second voiceprint information with the smallest gap and to compare that gap value with a preset value. The comparison has two possible outcomes: first, the gap value is greater than the preset value, in which case the comparison stops and the process ends; second, the gap value is less than the preset value, in which case the clustering submodule 563 merges the sound signal segments corresponding to the two items of second voiceprint information into one sound signal segment. The analysis submodule 562 then compares all sound signal segments pairwise again, including the merged segment, and the above steps are repeated until no two sound signal segments in the multimedia information library have second voiceprint information whose gap value is less than the preset value.
This embodiment further includes a statistics module 50. When the clustering process ends, the sound signal segments in the multimedia information library of each species sub-library have each been emitted by a different animal individual. The statistics module 50 is configured to count, at that point, the sound signal segments in the multimedia information library of each species sub-library, and to determine from that count the number of animal individuals in the corresponding species sub-library.
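The compare, merge, and count loop performed by the clustering and statistics modules can be sketched as below. The gap measure here (Euclidean distance between the mean feature vectors of two segments) is only a stand-in chosen for brevity, and the function names are hypothetical; the loop structure is what follows the description.

```python
import math
from itertools import combinations


def mean_vector(segment):
    """Mean feature vector of a segment given as a list of feature vectors."""
    n = len(segment)
    return [sum(column) / n for column in zip(*segment)]


def gap(seg_a, seg_b):
    """Assumed gap measure between two segments (stand-in only)."""
    return math.dist(mean_vector(seg_a), mean_vector(seg_b))


def cluster_segments(segments, preset_value):
    """Merge the closest pair repeatedly until no gap is below preset_value."""
    segments = [list(s) for s in segments]
    while len(segments) > 1:
        # Pairwise comparison: pick the two segments with the smallest gap.
        i, j = min(combinations(range(len(segments)), 2),
                   key=lambda p: gap(segments[p[0]], segments[p[1]]))
        if gap(segments[i], segments[j]) >= preset_value:
            break  # every remaining segment is treated as a different individual
        merged = segments[i] + segments[j]
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)
    return segments


def count_individuals(segments, preset_value):
    """Number of segments left after clustering, read as the individual count."""
    return len(cluster_segments(segments, preset_value))
```

Each pass recomputes all pairwise gaps, which is quadratic in the number of segments; that is acceptable for a sketch, but a larger library would likely cache distances between unmerged segments.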
Further, referring to Fig. 2, the specific implementation steps of the clustering are as follows:
S104a: estimate the Gaussian model (μ, Σ) of the feature sample set of each sound signal segment. For the feature sample set {X_i, i = 1, …, k} of a sound signal segment, the mean μ and covariance Σ of the sample set are calculated according to the following formulas:

$$\mu = \frac{1}{k}\sum_{i=1}^{k} X_i, \qquad \Sigma = \frac{1}{k}\sum_{i=1}^{k} (X_i - \mu)(X_i - \mu)^{T}$$

where k is the number of sample points in the sound signal segment, i is the sample point index, each feature sample X_i is a column vector, and (*)^T denotes transposition. The features in this embodiment include, but are not limited to, parameters such as MFCC or LPCC.
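As an illustration of step S104a, the following minimal sketch estimates the mean and covariance of one segment with NumPy. It assumes the feature vectors (for example MFCC frames) are stacked as rows of a k × d array and that the covariance is normalized by k (the maximum-likelihood form); the function name `estimate_gaussian` is hypothetical.

```python
import numpy as np


def estimate_gaussian(features):
    """Return (mu, sigma) for a segment whose rows are feature vectors."""
    x = np.asarray(features, dtype=float)   # shape (k, d)
    mu = x.mean(axis=0)                     # sample mean, shape (d,)
    centered = x - mu
    sigma = centered.T @ centered / x.shape[0]  # covariance, shape (d, d)
    return mu, sigma
```

For instance, `estimate_gaussian([[1.0, 2.0], [3.0, 6.0]])` gives the mean `[2.0, 4.0]` and the covariance `[[1.0, 2.0], [2.0, 4.0]]`.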
S104b: calculate the generalized likelihood ratio (GLR) distance between every pair of sound signal segments. The GLR distance is the logarithm of the ratio between the product of the likelihoods of the two individual sound signal segment sample sets and the likelihood of the sample set obtained by merging the two segments; equivalently, it is the sum of the two individual log-likelihoods minus the log-likelihood of the merged sample set. The log-likelihood of a sample set is the sum of the log-likelihoods of all sample points in the sound signal segment sample set under its Gaussian model:

$$L(X;\mu,\Sigma) = \sum_{i=1}^{k}\log N(X_i;\mu,\Sigma) = -\frac{k}{2}\left(d\log 2\pi + \log|\Sigma| + d\right)$$

where d is the dimension of the sample vectors and |*| denotes the matrix determinant.
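Step S104b can be sketched as follows, assuming each segment is a k × d NumPy array and each Gaussian is fitted by maximum likelihood, in which case the log-likelihood of a segment under its own model reduces to a closed form in the covariance determinant. The function names are hypothetical, and each segment is assumed to hold enough samples for a non-singular covariance.

```python
import numpy as np


def log_likelihood(segment):
    """Log-likelihood of a segment under its own maximum-likelihood Gaussian."""
    x = np.asarray(segment, dtype=float)
    k, d = x.shape
    centered = x - x.mean(axis=0)
    sigma = centered.T @ centered / k
    # slogdet is numerically safer than log(det(sigma)).
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * k * (d * np.log(2.0 * np.pi) + logdet + d)


def glr_distance(seg_a, seg_b):
    """Sum of the individual log-likelihoods minus that of the merged set."""
    merged = np.vstack([seg_a, seg_b])
    return log_likelihood(seg_a) + log_likelihood(seg_b) - log_likelihood(merged)
```

The distance is small when the two segments are well described by a single Gaussian and grows as their distributions diverge.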
S104c: calculate the similarity value ΔBIC of the two sound signal segments corresponding to the minimum generalized likelihood ratio distance. ΔBIC is obtained by subtracting 0.5λ(d + 0.5d(d+1)) log N from the generalized likelihood ratio distance, where λ is the penalty factor, d is the sample vector dimension, and N is the total number of samples in the two sound signal segments. If ΔBIC is less than 0, the two sound signal segments are merged, the Gaussian model of the merged sound signal segment is estimated, and execution returns to S104b; otherwise, clustering is complete.
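Putting steps S104a to S104c together, a minimal end-to-end sketch might look like the following. It assumes segments are k × d NumPy arrays with enough samples for a non-singular covariance, uses the natural logarithm for log N, and defaults the penalty factor λ to 1.0; these choices and all function names are assumptions rather than details given in the source.

```python
import numpy as np
from itertools import combinations


def log_likelihood(segment):
    """Log-likelihood of a segment under its own maximum-likelihood Gaussian."""
    x = np.asarray(segment, dtype=float)
    k, d = x.shape
    centered = x - x.mean(axis=0)
    _, logdet = np.linalg.slogdet(centered.T @ centered / k)
    return -0.5 * k * (d * np.log(2.0 * np.pi) + logdet + d)


def glr_distance(seg_a, seg_b):
    """Generalized likelihood ratio distance between two segments."""
    merged = np.vstack([seg_a, seg_b])
    return log_likelihood(seg_a) + log_likelihood(seg_b) - log_likelihood(merged)


def delta_bic(seg_a, seg_b, lam=1.0):
    """GLR distance minus the penalty 0.5 * lam * (d + 0.5 d (d + 1)) * log N."""
    n = len(seg_a) + len(seg_b)
    d = np.asarray(seg_a).shape[1]
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return glr_distance(seg_a, seg_b) - penalty


def bic_cluster(segments, lam=1.0):
    """Merge the closest pair while its delta-BIC is negative."""
    segments = [np.asarray(s, dtype=float) for s in segments]
    while len(segments) > 1:
        # S104b: find the pair with the minimum GLR distance.
        i, j = min(combinations(range(len(segments)), 2),
                   key=lambda p: glr_distance(segments[p[0]], segments[p[1]]))
        # S104c: stop as soon as the closest pair should not be merged.
        if delta_bic(segments[i], segments[j], lam) >= 0:
            break
        merged = np.vstack([segments[i], segments[j]])
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)  # its Gaussian is re-estimated on the next pass
    return segments
```

Raising `lam` makes merging more likely and so yields fewer inferred individuals; lowering it has the opposite effect, so in practice the value would need tuning on recordings with known individuals.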
Besides the generalized likelihood ratio distance, the similarity between sound signal segments may also be calculated with a symmetric divergence algorithm.
When clustering the source sound signals saved in the multimedia information library of a species sub-library, a condition may also be set; for example, the clustering operation is performed only when the total amount of saved source sound signals reaches a preset threshold.
In addition, this embodiment further provides a system for maintaining an animal database, including an animal database and the above device for maintaining an animal database. The animal database includes at least one species sub-library; each species sub-library includes a voiceprint information library and a multimedia information library, and the voiceprint information library includes at least one voiceprint model. The species sub-libraries in the animal database each correspond to a different animal species.
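The database layout described above can be illustrated with a small data-structure sketch. The class and method names are hypothetical, and the two checks simply encode the stated constraints: each voiceprint information library holds at least one voiceprint model, and each species sub-library corresponds to a different species.

```python
from dataclasses import dataclass, field


@dataclass
class SpeciesSubLibrary:
    """Voiceprint information library plus multimedia information library."""
    voiceprint_models: list
    multimedia: list = field(default_factory=list)

    def __post_init__(self):
        if not self.voiceprint_models:
            raise ValueError("a voiceprint information library needs at least one model")


class AnimalDatabase:
    """Maps each animal species to exactly one species sub-library."""

    def __init__(self):
        self._sub_libraries = {}

    def add_species(self, species, voiceprint_models):
        if species in self._sub_libraries:
            raise ValueError(f"species {species!r} already has a sub-library")
        lib = SpeciesSubLibrary(list(voiceprint_models))
        self._sub_libraries[species] = lib
        return lib

    def sub_library(self, species):
        return self._sub_libraries[species]

    def species(self):
        return list(self._sub_libraries)
```

Keying the sub-libraries by species name makes the one-species-per-sub-library rule hold by construction.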
A person of ordinary skill in the art will appreciate that all or some of the steps of the above method may be carried out by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Alternatively, all or some of the steps of the above embodiments may be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in hardware or as a software functional module. The present invention is not limited to any particular combination of hardware and software.
The above is a further detailed description of the present invention in conjunction with specific embodiments, and the specific implementation of the present invention shall not be deemed limited to these descriptions. A person of ordinary skill in the technical field of the invention may make simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the present invention.

Claims (25)

  1. A method for maintaining an animal database, characterized by comprising:
    collecting sounds emitted by different sound sources to obtain at least one source sound signal;
    extracting first voiceprint information from the source sound signal;
    matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library in the animal database; when the match succeeds, storing the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-library;
    analyzing second voiceprint information of each source sound signal in the species sub-library, and performing a clustering operation on the source sound signals in the species sub-library according to the analysis result, to determine the sounds of different individuals in the species sub-library; wherein the animal database includes at least one species sub-library, and each species sub-library includes a voiceprint information library and a multimedia information library.
  2. The method for maintaining an animal database according to claim 1, characterized in that collecting sounds emitted by different sound sources comprises: detecting whether there is active sound in the surrounding sound; and if active sound is detected, collecting the active sound and performing sound segmentation.
  3. The method for maintaining an animal database according to claim 2, characterized in that collecting the active sound comprises: performing sound source orientation on the active sound, and pointing the collection direction toward the direction from which the active sound is emitted.
  4. The method for maintaining an animal database according to claim 1, characterized by further comprising, while the sounds emitted by the different sound sources are being collected: taking photographs and/or recording video at a set time interval to obtain at least one image signal and/or video signal.
  5. The method for maintaining an animal database according to claim 4, characterized by further comprising, when the match succeeds: storing the image signal and/or video signal acquired at the same time as the source sound signal, in correspondence with it, in the multimedia information library of the species sub-library.
  6. The method for maintaining an animal database according to claim 1, characterized by further comprising, after the source sound signal corresponding to the successfully matched first voiceprint information is stored in the multimedia information library of the corresponding species sub-library: recording the acquisition time of the source sound signal in the multimedia information library.
  7. The method for maintaining an animal database according to any one of claims 1 to 6, characterized in that collecting sounds emitted by different sound sources to obtain at least one source sound signal comprises: a user terminal actively detecting the surrounding sound and sending the obtained source sound signal to a server.
  8. The method for maintaining an animal database according to any one of claims 1 to 6, characterized by further comprising, after the source sound signal corresponding to the successfully matched first voiceprint information is stored in the multimedia information library of the corresponding species sub-library: updating the corresponding voiceprint information library with the successfully matched first voiceprint information.
  9. The method for maintaining an animal database according to any one of claims 1 to 6, characterized by further comprising, before the surrounding sound is detected, obtaining the voiceprint models, which comprises: collecting vocalization samples of the animal species corresponding to each species sub-library; and extracting the first voiceprint information from the vocalization samples and generating the corresponding voiceprint models.
  10. The method for maintaining an animal database according to any one of claims 1 to 6, characterized in that matching the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library in the animal database comprises: if the first voiceprint information does not match the voiceprint models in the voiceprint information library of any species sub-library, creating a new species sub-library and storing the first voiceprint information in the multimedia information library of the new species sub-library.
  11. The method for maintaining an animal database according to any one of claims 1 to 6, characterized in that analyzing the second voiceprint information of each source sound signal in the species sub-library and performing a clustering operation on the source sound signals in the species sub-library according to the analysis result comprises:
    comparing pairwise the second voiceprint information of the sound signal segments formed by all source sound signals in the multimedia information library of the species sub-library, and selecting the two sound signal segments whose second voiceprint information has the smallest gap;
    comparing the gap between the second voiceprint information of the two sound signal segments with a preset value;
    if the gap between the second voiceprint information of the two sound signal segments is less than the preset value, merging the two sound signal segments into one sound signal segment;
    repeating all of the above steps, starting from the pairwise comparison of the second voiceprint information of all sound signal segments in the multimedia information library of the species sub-library, until the gap between the second voiceprint information of any two sound signal segments is greater than the preset value, at which point the sound signal segments correspond to different animal individuals and are no longer merged.
  12. The method for maintaining an animal database according to claim 11, characterized by further comprising, after the clustering operation is performed on the source sound signals in the species sub-library: determining the number of animal individuals in the species sub-library, the information of each animal individual, and the correspondence between each animal individual and the multimedia information library.
  13. The method for maintaining an animal database according to claim 11, characterized by further comprising, after the clustering operation is performed on the source sound signals in the species sub-library, when a new source sound signal is stored in the multimedia information library of the species sub-library: taking the set of all source sound signals corresponding to each animal individual in the existing clustering result as one initial cluster input, taking each new source sound signal as one initial cluster input, and performing clustering;
    or, performing the clustering operation again on the new source sound signal together with all existing source sound signals in the species sub-library.
  14. The method for maintaining an animal database according to claim 11, characterized by further comprising, after the clustering operation is performed on the source sound signals in the species sub-library, when a new source sound signal is stored in the multimedia information library of the species sub-library: performing animal individual identification on the new source sound signal based on all existing sound signals in the species sub-library; if the new source sound signal belongs to an existing animal individual in the species sub-library, labeling the new source sound signal as belonging to that animal individual; and if the new source sound signal does not belong to any existing animal individual in the species sub-library, creating a new animal individual identifier.
  15. A device for maintaining an animal database, characterized by comprising:
    a sound collection module, configured to collect sounds emitted by different sound sources to obtain at least one source sound signal;
    an extraction module, configured to extract corresponding first voiceprint information from the source sound signal;
    a matching module, configured to match the extracted first voiceprint information against the voiceprint models in the voiceprint information library of each species sub-library in the animal database;
    a classification module, configured to, when the matching module succeeds in matching, store the source sound signal corresponding to the successfully matched first voiceprint information in the multimedia information library of the corresponding species sub-library;
    a clustering module, configured to analyze second voiceprint information of each source sound signal in the species sub-library, and to perform a clustering operation on the source sound signals in the species sub-library according to the analysis result, to determine the sounds of different individuals in the species sub-library.
  16. The device for maintaining an animal database according to claim 15, characterized in that the sound collection module includes a judging submodule and a collection submodule; the judging submodule is configured to judge whether there is active sound in the surrounding sound; and the collection submodule is configured to, when the judging submodule judges that there is active sound, collect the active sound and perform sound segmentation.
  17. The device for maintaining an animal database according to claim 16, characterized in that the sound collection module further includes an orientation submodule, configured to perform sound source orientation on the active sound and to point the collection direction of the collection submodule toward the direction from which the active sound is emitted.
  18. The device for maintaining an animal database according to claim 15, characterized by further comprising an image acquisition module, configured to, while the sound collection module collects the sounds emitted by the different sound sources, take photographs and/or record video at a set time interval to obtain at least one image signal and/or video signal.
  19. The device for maintaining an animal database according to claim 18, characterized in that the image acquisition module further includes a saving submodule, configured to store the image signal and/or video signal acquired at the same time as the source sound signal successfully matched by the matching submodule, in correspondence with it, in the multimedia information library of the species sub-library.
  20. The device for maintaining an animal database according to claim 15, characterized by further comprising a time recording module, configured to record the acquisition time of the source sound signal in the multimedia information library after the classification module stores the successfully matched source sound signal in the corresponding multimedia information library.
  21. The device for maintaining an animal database according to any one of claims 15 to 20, characterized by further comprising an update module, configured to update the corresponding voiceprint information library with the successfully matched first voiceprint information after the classification module stores the successfully matched source sound signal in the corresponding multimedia information library.
  22. The method for maintaining an animal database according to any one of claims 15 to 20, characterized by further comprising a voiceprint model acquisition module, configured to obtain the voiceprint models before the sound collection module detects the surrounding sound, which specifically comprises: collecting vocalization samples of the animal species corresponding to each species sub-library; and extracting the first voiceprint information from the vocalization samples and generating the corresponding voiceprint models.
  23. The method for maintaining an animal database according to any one of claims 15 to 20, characterized in that the classification module includes a creation submodule, configured to, when the first voiceprint information does not match the voiceprint models in the voiceprint information library of any species sub-library, create a new species sub-library and store the first voiceprint information in the multimedia information library of the new species sub-library.
  24. The device for maintaining an animal database according to any one of claims 15 to 20, characterized in that the clustering module includes a comparison submodule, an analysis submodule, and a clustering submodule; the comparison submodule is configured to compare pairwise the second voiceprint information of the sound signal segments formed by all source sound signals in the multimedia information library of the species sub-library; the analysis submodule is configured to compare with a preset value the gap between the second voiceprint information of the two sound signal segments found by the comparison submodule to have the smallest gap; and the clustering submodule is configured to merge the two sound signal segments into one sound signal segment when the analysis result of the analysis submodule is that the gap between their second voiceprint information is less than the preset value.
  25. A system for maintaining an animal database, characterized by comprising an animal database and the device for maintaining an animal database according to any one of claims 15 to 24, wherein the animal database includes at least one species sub-library, each species sub-library includes a voiceprint information library and a multimedia information library, and the voiceprint information library includes at least one voiceprint model; and the species sub-libraries in the animal database each correspond to a different animal species.
CN201610694221.2A 2016-08-19 2016-08-19 A kind of methods, devices and systems for safeguarding animal data storehouse Pending CN107766372A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610694221.2A CN107766372A (en) 2016-08-19 2016-08-19 A kind of methods, devices and systems for safeguarding animal data storehouse
PCT/CN2017/094405 WO2018032946A1 (en) 2016-08-19 2017-07-25 Method, device, and system for maintaining animal database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610694221.2A CN107766372A (en) 2016-08-19 2016-08-19 A kind of methods, devices and systems for safeguarding animal data storehouse

Publications (1)

Publication Number Publication Date
CN107766372A true CN107766372A (en) 2018-03-06

Family

ID=61196345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610694221.2A Pending CN107766372A (en) 2016-08-19 2016-08-19 A kind of methods, devices and systems for safeguarding animal data storehouse

Country Status (2)

Country Link
CN (1) CN107766372A (en)
WO (1) WO2018032946A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101976564A (en) * 2010-10-15 2011-02-16 中国林业科学研究院森林生态环境与保护研究所 Method for identifying insect voice
CN103117061B (en) * 2013-02-05 2016-01-20 广东欧珀移动通信有限公司 A kind of voice-based animals recognition method and device
CN105161093B (en) * 2015-10-14 2019-07-09 科大讯飞股份有限公司 A kind of method and system judging speaker's number

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112472355A (en) * 2021-01-06 2021-03-12 南京北数游电子玩具有限公司 Wild animal sperm collection equipment based on database
CN113448975A (en) * 2021-05-26 2021-09-28 科大讯飞股份有限公司 Method, device and system for updating character image library and storage medium
CN113448975B (en) * 2021-05-26 2023-01-17 科大讯飞股份有限公司 Method, device and system for updating character image library and storage medium

Also Published As

Publication number Publication date
WO2018032946A1 (en) 2018-02-22


Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180306