CN111933148A - Age identification method and device based on convolutional neural network and terminal - Google Patents


Info

Publication number
CN111933148A
CN111933148A
Authority
CN
China
Prior art keywords
age
audio
neural network
audio data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010601537.9A
Other languages
Chinese (zh)
Inventor
叶志坚
李稀敏
肖龙源
刘晓葳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Kuaishangtong Technology Co Ltd
Original Assignee
Xiamen Kuaishangtong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Kuaishangtong Technology Co Ltd
Priority to CN202010601537.9A
Publication of CN111933148A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G10L 17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G10L 17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G10L 17/06 Decision making techniques; Pattern matching strategies
    • G10L 17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G10L 17/18 Artificial neural networks; Connectionist approaches
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Abstract

The invention provides an age identification method based on a convolutional neural network, which comprises the following steps: collecting audio data of different age groups, and dividing the collected data into n categories according to the age group each recording belongs to; constructing a multi-classification convolutional neural network age classification model; constructing a network structure and performing model training, obtaining a trained model after a number of iterations; inputting a test audio, extracting its audio features, and feeding them into the trained network model for testing; and matching the result according to the information output by the trained network model, judging which age group the result belongs to, and outputting the age-group information. The invention further provides a corresponding age identification device and terminal based on the convolutional neural network. Because the method identifies age from collected audio data, the identification accuracy is high and the identification efficiency is greatly improved.

Description

Age identification method and device based on convolutional neural network and terminal
Technical Field
The invention relates to the technical field of information processing, in particular to an age identification method, device and terminal based on a convolutional neural network.
Background
In daily life, age identification is generally performed by facial recognition; in some specific cases, however, facial information cannot be acquired, so age cannot be identified this way.
In certain tasks and environments, voice information can instead be collected and recognized to obtain valuable information. For example, in a criminal investigation case, little information about a suspect may be available, so the suspect's age group can be identified from the captured audio, narrowing the range the police must examine. Patent application No. 201910076388.6 discloses an age identification device based on a preset neural network, which trains the network model iteratively until the prediction error is smaller than a set threshold.
Existing age identification methods usually need to collect a large amount of data, and variation among the collected samples can interfere with recognition, making the age identification inaccurate.
Disclosure of Invention
In view of the above, it is desirable to provide an age identification method, an age identification device and an age identification terminal based on a convolutional neural network, which have high identification efficiency and accurate identification result, so as to solve the above problems.
The invention provides an age identification method based on a convolutional neural network, which comprises the following steps:
collecting audio data of different age groups, and dividing the collected data into n categories according to the different age groups of the audio data;
constructing a multi-classification convolutional neural network age-group classification model, and classifying the audio data with a suitable classifier according to the differences in pronunciation habits and frequency characteristics across age groups;
training a classification model by using a convolutional neural network, dividing audio data into a training set and a test set, carrying out vad processing on the audio data, carrying out feature extraction, constructing a network structure, carrying out model training, and carrying out iteration for a plurality of times to obtain a trained model;
inputting a test audio, extracting audio features of the test audio, and inputting the audio features into a trained network model for testing;
and outputting age group information, matching the result according to the information output by the trained network model, judging which age group the result belongs to, and outputting the age group information.
Further, the training of the classification model by using the convolutional neural network comprises:
dividing the audio data, taking 80% of all collected audio data as the training set and 20% as the test set;
performing VAD processing on the audio data, removing the silent segments, and cutting the VAD-processed audio into 4 s segments;
extracting features, namely extracting STFT features from the VAD-processed audio data, where 257-dimensional STFT features are used as the low-level acoustic features;
constructing the network structure and performing model training, where the output layer is an n-node softmax layer and a one-hot code represents the age bracket;
and updating the network parameters, where the network uses cross-entropy loss, updates the parameters with the Adam algorithm, and yields a trained model after a number of iterations.
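The data split and the cross-entropy objective described above can be sketched as follows. This is an illustrative NumPy sketch, not the patent's implementation; the function names, the fixed random seed, and the batch-mean normalisation are assumptions:

```python
import numpy as np

def split_dataset(clips, train_frac=0.8, seed=0):
    """Shuffle the collected clips and split them 80%/20% into train/test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(clips))
    cut = int(len(clips) * train_frac)
    return [clips[i] for i in idx[:cut]], [clips[i] for i in idx[cut:]]

def cross_entropy(probs, one_hot_labels):
    """Mean cross-entropy between softmax outputs and one-hot age labels."""
    probs = np.asarray(probs, dtype=float)
    return float(-np.sum(one_hot_labels * np.log(probs + 1e-12)) / len(probs))
```

For a uniform softmax output over four brackets, the loss equals log 4, the usual sanity check for an untrained n-way classifier.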
Further, the network structure specifically includes:
a first layer: DNN layer; a second layer: DNN layer; a third layer: DNN layer; a fourth layer: CNN layer; fifth to seventh layers: CNN layers; an eighth layer: pooling layer; a ninth layer: fully connected layer.
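The nine-layer stack can be sketched as a NumPy forward pass. This is an illustration only: the patent specifies the layer types but not their sizes, so the layer widths (32 DNN units, 16 CNN channels), the kernel width of 3, the ReLU activations, and global average pooling are all assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv1d(x, k):
    """'Valid' 1-D convolution. x: (in_ch, T); k: (out_ch, in_ch, width)."""
    out_ch, in_ch, width = k.shape
    T = x.shape[1] - width + 1
    y = np.zeros((out_ch, T))
    for o in range(out_ch):
        for t in range(T):
            y[o, t] = np.sum(k[o] * x[:, t:t + width])
    return relu(y)

def init_params(n_in=257, n_dnn=32, n_ch=16, n_classes=4, seed=0):
    rng = np.random.default_rng(seed)
    dims = [n_in, n_dnn, n_dnn, n_dnn]
    dnn = [(rng.standard_normal((dims[i], dims[i + 1])) * 0.05,
            np.zeros(dims[i + 1])) for i in range(3)]            # layers 1-3
    chans = [n_dnn, n_ch, n_ch, n_ch, n_ch]
    cnn = [rng.standard_normal((chans[i + 1], chans[i], 3)) * 0.05
           for i in range(4)]                                     # layers 4-7
    fc = (rng.standard_normal((n_classes, n_ch)) * 0.05,
          np.zeros(n_classes))                                    # layer 9
    return {"dnn": dnn, "cnn": cnn, "fc": fc}

def forward(frames, params):
    """frames: (T, 257) STFT features -> softmax over the n age classes."""
    h = frames
    for w, b in params["dnn"]:
        h = relu(h @ w + b)                # layers 1-3: DNN
    h = h.T                                # (channels, time) for the conv stack
    for k in params["cnn"]:
        h = conv1d(h, k)                   # layers 4-7: CNN
    h = h.mean(axis=1)                     # layer 8: pooling (global average)
    logits = params["fc"][0] @ h + params["fc"][1]   # layer 9: fully connected
    e = np.exp(logits - logits.max())
    return e / e.sum()                     # n-node softmax output
```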
Furthermore, dropout operation is added in the process of constructing a network structure and training a model, so that overfitting of the model is prevented.
Further, the acquiring audio data of different age groups comprises:
inputting the audio information;
performing front-end preprocessing, including signal processing and feature extraction;
performing back-end processing on the audio information based on an acoustic model and a language model;
and outputting a voice recognition result.
Further, the feature extraction of the audio information includes:
preprocessing the audio information;
carrying out signal transformation on each frame of audio information to obtain an amplitude spectrum;
adding a Mel filter bank to the magnitude spectrum;
and performing a logarithm operation on the filter-bank output, followed by a discrete cosine transform, to obtain the MFCC features.
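The MFCC pipeline above (pre-emphasis, framing, magnitude spectrum, Mel filter bank, logarithm, DCT) might be sketched as below; the parameter values (16 kHz sampling rate, 512-point FFT, 160-sample hop, 26 Mel filters, 13 coefficients) are typical choices assumed for illustration, not taken from the patent:

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_mfcc=13):
    # 1) pre-emphasis boosts the high-frequency part of the signal
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2) framing + Hamming window, then FFT magnitude spectrum per frame
    frames = [sig[i:i + n_fft] * np.hamming(n_fft)
              for i in range(0, len(sig) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(frames, n_fft))          # (T, n_fft//2 + 1)
    # 3) triangular Mel filter bank applied to the magnitude spectrum
    def hz2mel(f): return 2595 * np.log10(1 + f / 700)
    def mel2hz(m): return 700 * (10 ** (m / 2595) - 1)
    pts = mel2hz(np.linspace(0, hz2mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # 4) log of the filter-bank output, then DCT-II -> MFCC features
    logE = np.log(mag @ fb.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2 * n_mels)))
    return logE @ dct.T                               # (T, n_mfcc)
```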
The application also provides an age identification device based on a convolutional neural network, including:
an audio acquisition module for collecting audio data of different age groups;
a classification model construction module for constructing a multi-classification convolutional neural network age classification model;
a classification model training module for training the classification model with a convolutional neural network;
a test audio input module for extracting the audio features of the test audio;
and an age-group information output module that determines which age group the result belongs to and outputs the age-group information.
Further, the audio acquisition module comprises:
the preprocessing module is used for processing signals and extracting features;
and the back-end processing module is used for carrying out back-end processing on the audio information.
The application also provides a terminal device comprising a memory and a processor, characterized in that the processor implements the steps of the above method when executing a computer program stored in the memory.
The present application also proposes a computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the method of the present application.
According to the age identification method based on the convolutional neural network, the audio data are collected and processed so that their features can be extracted, ensuring the consistency of the different recordings and avoiding interference from irrelevant factors; a multi-classification convolutional neural network age classification model is constructed and trained, and after a number of iterations a trained model is obtained, from which the result is judged and the age information output. Compared with the prior art, the method can identify the audio data accurately, and because a convolutional neural network model is adopted, age identification based on audio data is both accurate and efficient.
Drawings
Fig. 1 is a schematic flowchart of an age identification method based on a convolutional neural network according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of model training based on a convolutional neural network in an embodiment of the present invention.
Fig. 3 is a detailed flow chart of audio data acquisition according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of feature extraction of audio information in an embodiment of the present invention.
Fig. 5 is a schematic diagram of signal transformation of audio information in an embodiment of the invention.
Fig. 6 is a block diagram showing a specific configuration of the age identifying apparatus based on the convolutional neural network according to the present invention.
Fig. 7 is a block diagram of an audio capture module in an embodiment of the invention.
Fig. 8 is a block diagram of a detailed structure of a terminal according to an embodiment of the present invention.
Description of the main elements
Terminal 100
Audio acquisition module 110
Preprocessing module 111
Back-end processing module 112
Classification model constructing module 120
Classification model training module 130
Test audio input module 140
Age group information output module 150
Processor 210
Memory 220
RAM 221
Cache 222
Storage system 223
Program module 224
I/O interface 230
Network adapter 240
The following detailed description will further illustrate the invention in conjunction with the above-described figures.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It is to be further noted that, for the convenience of description, only some but not all of the matters related to the present invention are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts various operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of various operations may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Referring to fig. 1, the present invention provides an age identification method based on a convolutional neural network, and the method of the present embodiment may be implemented by an age identification apparatus based on a convolutional neural network, which may be implemented in a hardware or software manner and may be generally integrated in a device, such as a server. The method of the embodiment specifically includes:
and S100, collecting audio data of different ages.
In this embodiment, after audio data is collected, the data is divided into n categories according to the age group to which the audio data belongs.
S200, constructing a multi-classification convolutional neural network age classification model.
In this embodiment, the audio data is classified by using a suitable classifier according to the pronunciation habit difference and the frequency variation characteristics of different age groups.
And S300, training a classification model by using a convolutional neural network.
In this embodiment, audio data is first divided into a training set and a test set, vad processing is performed on the audio data, feature extraction is performed, a network structure is constructed, model training is performed, and a trained model is obtained after a plurality of iterations.
And S400, inputting a test audio.
After the test audio is input, its audio features are extracted and fed into the trained network model for testing.
And S500, outputting age group information.
And matching the result according to the information output by the trained network model, judging which age group the result belongs to, and outputting age group information.
According to the age identification method based on the convolutional neural network, provided by the embodiment of the invention, the convolutional neural network classification model is constructed by collecting the audio data, the age group is further identified, the identification result is accurate, and the identification efficiency is high.
Fig. 2 shows a model training diagram based on the convolutional neural network. As shown in fig. 2, the training of the classification model using the convolutional neural network includes:
and S310, dividing the audio data.
In this embodiment, 80% of all collected audio data is taken as the training set and 20% is taken as the test set.
And S320, carrying out vad processing on the audio data.
The silent segments of the audio data are cut off, and the VAD-processed audio is cut into 4 s segments.
It should be noted that VAD processing, i.e., voice endpoint detection, separates silence from actual speech, so that only the actual speech is retained.
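A minimal energy-based sketch of the silence removal and 4 s segmentation might look like this; the patent does not specify the VAD algorithm, so the frame size and the −40 dB threshold relative to the loudest frame are illustrative assumptions:

```python
import numpy as np

def energy_vad(signal, sr=16000, frame=400, thresh_db=-40.0):
    """Drop frames whose energy falls below a threshold relative to the peak frame."""
    n = len(signal) // frame
    frames = signal[:n * frame].reshape(n, frame)
    energy = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    keep = energy > energy.max() + thresh_db      # thresh_db is negative
    return frames[keep].reshape(-1)

def segment_4s(signal, sr=16000):
    """Cut the VAD-trimmed audio into non-overlapping 4 s pieces, dropping the tail."""
    step = 4 * sr
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]
```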
And S330, feature extraction.
STFT features are extracted from the VAD-processed audio data, where 257-dimensional STFT features are used as the low-level acoustic features.
In the present embodiment, the short-time Fourier transform (STFT), i.e., a series of windowed Fourier transforms, is used to extract the audio features.
And S340, constructing a network structure and carrying out model training.
In this embodiment, the network structure specifically includes:
first layer: DNN layer; second layer: DNN layer; third layer: DNN layer; fourth layer: CNN layer; fifth to seventh layers: CNN layers; eighth layer: pooling layer; ninth layer: fully connected layer.
Furthermore, a dropout operation is added to the network structure to prevent the model from overfitting.
The output layer is an n-node softmax layer, and one-hot codes represent the age brackets. Illustratively, if the age brackets are ordered 0-5, 5-10, 10-15, 15-20 years old, …, then 0-5 years old is encoded as 1000…, 5-10 years old as 0100…, and 10-15 years old as 0010….
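The one-hot encoding and the matching of the softmax output back to an age bracket can be sketched as follows; the four example brackets mirror the ones listed above, and the helper names are assumptions:

```python
import numpy as np

AGE_BRACKETS = ["0-5", "5-10", "10-15", "15-20"]  # illustrative case of n = 4

def one_hot(bracket, brackets=AGE_BRACKETS):
    """Encode an age bracket as a one-hot vector over the n output nodes."""
    v = np.zeros(len(brackets), dtype=int)
    v[brackets.index(bracket)] = 1
    return v

def decode(softmax_out, brackets=AGE_BRACKETS):
    """Match the network's softmax output to an age bracket via argmax."""
    return brackets[int(np.argmax(softmax_out))]
```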
And S350, updating the network parameters.
In this embodiment, the network adopts a loss function as cross entropy loss, updates network parameters by using an Adam algorithm, and obtains a trained model through a plurality of iterations.
Further, the Adam algorithm updates the network parameters through initialization, iterative processing, weighted-average calculation, bias correction, and weight updating, and a trained model is obtained after a number of iterations.
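The five Adam steps named above (initialization, iterative processing, weighted-average calculation, bias correction, weight updating) map directly onto a short NumPy routine; the hyperparameter values are the common defaults, assumed here for illustration:

```python
import numpy as np

def adam_minimize(grad_fn, w0, lr=1e-3, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimal Adam loop: initialise the moment estimates, then per iteration
    compute the exponentially weighted averages of the gradient and its square,
    correct their bias, and update the weights."""
    w = np.asarray(w0, dtype=float)
    m = np.zeros_like(w)                 # initialization: first-moment estimate
    v = np.zeros_like(w)                 # initialization: second-moment estimate
    for t in range(1, steps + 1):        # iterative processing
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g            # weighted average of gradient
        v = beta2 * v + (1 - beta2) * g * g        # weighted average of its square
        m_hat = m / (1 - beta1 ** t)               # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # weight updating
    return w
```

Driving it with the gradient of a simple quadratic shows the weights converging toward the minimum.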
In this embodiment, the collected audio data are VAD-processed, their data features extracted, a network structure built for model training, and the network parameters updated with the Adam algorithm, which ensures the distinguishability of the collected audio data and improves the accuracy of age identification.
Fig. 3 is a detailed flow chart of audio data acquisition. As shown in fig. 3, the acquiring audio data of different age groups includes:
inputting the audio information;
performing front-end preprocessing, including signal processing and feature extraction;
performing back-end processing on the audio information based on an acoustic model and a language model;
and outputting a voice recognition result.
In this embodiment, an acoustic model and a language model are combined, integrating acoustic and pronunciation information, and the collected audio data are taken as the initial input to obtain the audio recognition result.
Fig. 4 is a schematic diagram of feature extraction of audio information. Referring to fig. 4, the feature extraction of the audio information includes:
preprocessing the audio information;
carrying out signal transformation on each frame of audio information to obtain an amplitude spectrum;
adding a Mel filter bank to the magnitude spectrum;
and carrying out logarithm operation on the output of the filter, and then carrying out one-step discrete cosine transform to obtain the MFCC characteristics.
In this embodiment, the preprocessing of the audio information is framing, i.e., the speech stream is processed into segments. Pre-emphasis compensates the high-frequency components of the speech signal at the transmitting end, boosting the high-frequency part and reducing the influence of sharp noise.
After the preprocessing, a Fourier transform is applied to the audio information, as shown in fig. 5. In this embodiment, a Fourier transform of each audio frame yields a vector giving the magnitude at each frequency bin. Stacking the vectors of successive frames then yields the magnitude spectrogram.
Further, after the amplitude spectrogram is obtained, a filter bank is added to the amplitude spectrogram, logarithm operation is performed on the output of the filter bank, and dynamic features are further obtained through discrete cosine transform, so that feature vectors are output.
The audio information feature extraction provided by the embodiment can be used for processing the audio information rapidly and efficiently, and further outputting the feature vector, so that the age identification efficiency based on the convolutional neural network is effectively improved.
Fig. 6 is a block diagram showing a specific configuration of the age identifying apparatus based on the convolutional neural network according to the present invention. As shown in fig. 6, the apparatus includes:
the audio acquisition module 110 is used for collecting audio data of different age groups;
a classification model construction module 120, configured to construct a multi-class convolutional neural network age classification model;
a classification model training module 130 for training a classification model by using a convolutional neural network;
a test audio input module 140 for extracting audio features of the test audio;
the age group information output module 150 determines which age group the result belongs to, and outputs age group information.
Further, as shown in fig. 7, the audio capture module includes:
a preprocessing module 111 for performing signal processing and feature extraction;
and a back-end processing module 112, configured to perform back-end processing on the audio information.
The age identification device based on the convolutional neural network provided by the embodiment constructs the convolutional neural network classification model by acquiring audio data, so that the age bracket is further identified, the identification result is accurate, and the identification efficiency is high.
Fig. 8 is a block diagram of a terminal according to an embodiment of the present invention. The terminal 100 shown in fig. 8 is suitable for implementing embodiments of the present invention. The terminal 100 shown in fig. 8 is only an example, and should not bring any limitation to the functions and applicable scope of the embodiments of the present invention.
As shown in fig. 8, the components of terminal 100 may include, but are not limited to: one or more processors 210, and a system memory 220. In the present embodiment, the terminal 100 includes a variety of computer system readable media. Such media may be any available media that is accessible by terminal 100 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 220 may include computer system readable media in the form of volatile memory, such as random access memory (RAM221) and/or cache memory 222. Memory 220 may include at least one program product having a set (e.g., at least one) of program modules 224 that are configured to carry out the functions of embodiments of the invention.
The terminal 100 can communicate with one or more terminals that enable a user to interact with the terminal 100, such communication being via input/output (I/O) interfaces 230. The terminal 100 may also communicate with one or more networks (e.g., a local area network, a wide area network, the internet, etc.) through a network adapter 240.
The processor 210 executes programs stored in the memory 220 to perform various functional applications and data processing, such as the age identification method based on the convolutional neural network provided by the embodiment of the present invention.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, performs the convolutional-neural-network-based age identification method of an embodiment of the present invention. Computer storage media in accordance with embodiments of the present invention may employ any combination of one or more computer-readable media.
The computer readable storage medium of the present embodiments may be an electronic, magnetic, optical, or semiconductor system, apparatus, or device, or any combination thereof. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In embodiments of the present invention, computer program code for the operation of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages. The program code may execute entirely on the computer, partly on the computer, or remotely.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. The units or computer means recited in the computer means claims may also be implemented by the same unit or computer means, either in software or in hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An age identification method based on a convolutional neural network, characterized by comprising the following steps:
collecting audio data of different age groups, and dividing the collected data into n categories according to the different age groups of the audio data;
constructing a multi-classification convolutional neural network age-group classification model, and classifying the audio data with a suitable classifier according to the differences in pronunciation habits and frequency characteristics across age groups;
training a classification model by using a convolutional neural network, dividing audio data into a training set and a test set, carrying out vad processing on the audio data, carrying out feature extraction, constructing a network structure, carrying out model training, and carrying out iteration for a plurality of times to obtain a trained model;
inputting a test audio, extracting audio features of the test audio, and inputting the audio features into a trained network model for testing;
and outputting age group information, matching the result according to the information output by the trained network model, judging which age group the result belongs to, and outputting the age group information.
2. The method of claim 1, wherein the training of the classification model using the convolutional neural network comprises:
dividing the audio data, taking 80% of all collected audio data as the training set and 20% as the test set;
performing VAD processing on the audio data, removing the silent segments, and cutting the VAD-processed audio into 4 s segments;
extracting features, namely extracting STFT features from the VAD-processed audio data, where 257-dimensional STFT features are used as the low-level acoustic features;
constructing the network structure and performing model training, where the output layer is an n-node softmax layer and a one-hot code represents the age bracket;
and updating the network parameters, where the network uses cross-entropy loss, updates the parameters with the Adam algorithm, and yields a trained model after a number of iterations.
3. The age identification method based on the convolutional neural network as claimed in claim 2, wherein the network structure specifically comprises:
a first layer: DNN layer; a second layer: DNN layer; a third layer: DNN layer; a fourth layer: CNN layer; fifth to seventh layers: CNN layers; an eighth layer: pooling layer; a ninth layer: fully connected layer.
4. The age identification method based on the convolutional neural network as claimed in claim 2, wherein a dropout operation is added in the process of constructing the network structure and training the model to prevent the model from over-fitting.
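The dropout operation of claim 4 is commonly implemented as "inverted dropout", sketched below under the assumption of a 0.5 drop rate (the claim does not specify one): activations are randomly zeroed during training and the survivors rescaled so that no change is needed at test time.

```python
import numpy as np

def dropout(x, rate=0.5, training=True, rng=None):
    """Inverted dropout: randomly zero activations during training and
    rescale the survivors so the expected activation is unchanged at test time."""
    if not training or rate == 0.0:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= rate        # keep each unit with prob. 1 - rate
    return x * mask / (1.0 - rate)

x = np.ones((4, 8))
y = dropout(x, rate=0.5)
print(y.shape)  # surviving entries equal 2.0, dropped ones 0.0
```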
5. The method of claim 1, wherein collecting the audio data of different age groups comprises:
inputting the audio information;
performing front-end preprocessing, including signal processing and feature extraction;
performing back-end processing on the audio information based on an acoustic model and a language model;
and outputting a voice recognition result.
6. The age identification method based on the convolutional neural network as claimed in claim 5, wherein the feature extraction of the audio information comprises:
preprocessing the audio information;
carrying out signal transformation on each frame of audio information to obtain an amplitude spectrum;
applying a Mel filter bank to the magnitude spectrum;
and performing a logarithm operation on the outputs of the filter bank, and then performing a discrete cosine transform to obtain the MFCC features.
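The MFCC chain of claim 6 (magnitude spectrum, Mel filter bank, logarithm, DCT) can be sketched for a single frame as follows. The filter count (26), coefficient count (13), FFT size, and sampling rate are conventional defaults assumed here, not values fixed by the claim.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_filterbank(n_filters=26, n_fft=512, sr=16000):
    """Triangular Mel filters applied to the (n_fft // 2 + 1)-bin magnitude spectrum."""
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    hz_pts = 700.0 * (10 ** (mel_pts / 2595.0) - 1.0)
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fb[m - 1, k] = (k - l) / max(c - l, 1)   # rising slope
        for k in range(c, r):
            fb[m - 1, k] = (r - k) / max(r - c, 1)   # falling slope
    return fb

def mfcc(frame, n_coeffs=13, n_fft=512, sr=16000):
    """Magnitude spectrum -> Mel filter bank -> log -> DCT-II -> MFCCs."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n=n_fft))
    energies = mel_filterbank(n_fft=n_fft, sr=sr) @ spectrum
    log_e = np.log(energies + 1e-10)
    n = len(log_e)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), np.arange(n) + 0.5) / n)
    return dct @ log_e    # the first n_coeffs cepstral coefficients

frame = np.sin(2 * np.pi * 440 * np.arange(512) / 16000)
print(mfcc(frame).shape)  # -> (13,)
```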
7. An age identifying apparatus based on a convolutional neural network, comprising:
the audio acquisition module is used for acquiring audio data of different age groups;
the classification model building module is used for building a multi-class convolutional neural network age classification model;
the classification model training module is used for training the classification model by using a convolutional neural network;
the test audio input module is used for extracting the audio features of a test audio;
and the age group information output module is used for determining which age group the result belongs to and outputting the age group information.
8. The convolutional neural network-based age identification device of claim 7, wherein the audio acquisition module comprises:
the preprocessing module is used for processing signals and extracting features;
and the back-end processing module is used for carrying out back-end processing on the audio information.
9. A terminal device comprising a memory and a processor, characterized in that the memory stores a computer program and the processor implements the steps of the method of any one of claims 1-6 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202010601537.9A 2020-06-29 2020-06-29 Age identification method and device based on convolutional neural network and terminal Pending CN111933148A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010601537.9A CN111933148A (en) 2020-06-29 2020-06-29 Age identification method and device based on convolutional neural network and terminal

Publications (1)

Publication Number Publication Date
CN111933148A true CN111933148A (en) 2020-11-13

Family

ID=73316394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010601537.9A Pending CN111933148A (en) 2020-06-29 2020-06-29 Age identification method and device based on convolutional neural network and terminal

Country Status (1)

Country Link
CN (1) CN111933148A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108281138A (en) * 2017-12-18 2018-07-13 百度在线网络技术(北京)有限公司 Age discrimination model training and intelligent sound exchange method, equipment and storage medium
US20180261213A1 (en) * 2017-03-13 2018-09-13 Baidu Usa Llc Convolutional recurrent neural networks for small-footprint keyword spotting
CN110349588A (en) * 2019-07-16 2019-10-18 重庆理工大学 A kind of LSTM network method for recognizing sound-groove of word-based insertion
CN110534098A (en) * 2019-10-09 2019-12-03 国家电网有限公司客户服务中心 A kind of the speech recognition Enhancement Method and device of age enhancing
CN111179915A (en) * 2019-12-30 2020-05-19 苏州思必驰信息科技有限公司 Age identification method and device based on voice
CN111210840A (en) * 2020-01-02 2020-05-29 厦门快商通科技股份有限公司 Age prediction method, device and equipment
CN111261192A (en) * 2020-01-15 2020-06-09 厦门快商通科技股份有限公司 Audio detection method based on LSTM network, electronic equipment and storage medium
CN111261196A (en) * 2020-01-17 2020-06-09 厦门快商通科技股份有限公司 Age estimation method, device and equipment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581942A (en) * 2020-12-29 2021-03-30 云从科技集团股份有限公司 Method, system, device and medium for recognizing target object based on voice
CN112651372A (en) * 2020-12-31 2021-04-13 北京眼神智能科技有限公司 Age judgment method and device based on face image, electronic equipment and storage medium
CN113782032A (en) * 2021-09-24 2021-12-10 广东电网有限责任公司 Voiceprint recognition method and related device
CN113782032B (en) * 2021-09-24 2024-02-13 广东电网有限责任公司 Voiceprint recognition method and related device
CN114360148A (en) * 2021-12-06 2022-04-15 深圳市亚略特科技股份有限公司 Automatic selling method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110600017B (en) Training method of voice processing model, voice recognition method, system and device
CN111179975B (en) Voice endpoint detection method for emotion recognition, electronic device and storage medium
CN106683680B (en) Speaker recognition method and device, computer equipment and computer readable medium
CN111276131B (en) Multi-class acoustic feature integration method and system based on deep neural network
CN111933148A (en) Age identification method and device based on convolutional neural network and terminal
CN108198547B (en) Voice endpoint detection method and device, computer equipment and storage medium
CN110457432B (en) Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium
CN110444202B (en) Composite voice recognition method, device, equipment and computer readable storage medium
CN113327626A (en) Voice noise reduction method, device, equipment and storage medium
CN111081223A (en) Voice recognition method, device, equipment and storage medium
CN113628612A (en) Voice recognition method and device, electronic equipment and computer readable storage medium
CN115457938A (en) Method, device, storage medium and electronic device for identifying awakening words
Kamble et al. Emotion recognition for instantaneous Marathi spoken words
CN111932056A (en) Customer service quality scoring method and device, computer equipment and storage medium
Dhakal et al. Detection and identification of background sounds to improvise voice interface in critical environments
CN117312548A (en) Multi-source heterogeneous disaster situation data fusion understanding method
CN111785262A (en) Speaker age and gender classification method based on residual error network and fusion characteristics
Therese et al. A linear visual assessment tendency based clustering with power normalized cepstral coefficients for audio signal recognition system
CN113129926A (en) Voice emotion recognition model training method, voice emotion recognition method and device
CN116153337B (en) Synthetic voice tracing evidence obtaining method and device, electronic equipment and storage medium
CN117079673B (en) Intelligent emotion recognition method based on multi-mode artificial intelligence
CN112669881B (en) Voice detection method, device, terminal and storage medium
CN109378002B (en) Voiceprint verification method, voiceprint verification device, computer equipment and storage medium
KR102300599B1 (en) Method and Apparatus for Determining Stress in Speech Signal Using Weight
CN113921018A (en) Voiceprint recognition model training method and device and voiceprint recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201113