CN110263653A - Scene analysis system and method based on deep learning technology - Google Patents
Scene analysis system and method based on deep learning technology
- Publication number
- CN110263653A (application CN201910433837.8A)
- Authority
- CN
- China
- Prior art keywords
- deep learning
- recognition
- module
- image
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
The invention discloses a scene analysis system and method based on deep learning technology. The system includes a data acquisition subsystem and a cloud AI platform. The data acquisition subsystem acquires images and speech. In the cloud AI platform, a face recognition module performs face recognition on the image to be tested according to deep learning technology; a facial expression analysis module analyzes and judges facial expressions; a speech recognition module performs speech recognition on the audio to be tested according to deep learning technology; a speech analysis module analyzes and judges the semantics and intonation of the audio to be tested; and a comprehensive analysis module comprehensively analyzes the results obtained by the facial expression analysis module and the speech analysis module. The invention can recognize faces and speech simultaneously and, according to deep learning technology, obtains recognition results for facial expressions and for the semantics and intonation of speech, which not only makes the recognition results more accurate but also guarantees recognition speed, further enriching scene analysis technology.
Description
Technical field
The present invention relates to the field of deep learning technology, and in particular to a scene analysis system and method based on deep learning technology.
Background technique
With the continuous progress of modern science and technology, the age of intelligent systems has arrived, and natural language processing and facial expression recognition have long since become important research topics for those skilled in the art.
However, on the one hand, because of the limitations of traditional shallow models, traditional natural language processing models require a large amount of linguistic knowledge to construct features manually. These features are generally guided by the concrete application and therefore lack wide applicability: if the specific task changes, new features must be constructed by hand all over again.
On the other hand, current face recognition technology is still mainly implemented with hand-designed feature extraction algorithms. In real, complex environments, face data are affected by many factors, such as illumination, occlusion, and pose changes. In such cases, existing face recognition methods based on hand-designed feature extraction have poor robustness and poor resistance to these interfering factors; these uncontrollable factors make the recognition performance of existing methods decline sharply, so recognition quality is difficult to guarantee and recognition accuracy is low.
Moreover, although people in different fields have explored applications of image recognition, speech recognition, and semantic analysis, applications that combine natural language processing, face recognition, and facial expression recognition for scene analysis are still rare, remain at an early stage of development, and cannot yet recognize accurately.
Therefore, developing a scene analysis system and method based on deep learning that combines natural language processing and facial expression recognition with accurate recognition is a problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of this, the present invention provides a scene analysis system and method based on deep learning technology, which recognizes faces and speech through deep learning technology and further analyzes facial expressions and the semantics and intonation of speech, effectively ensuring the accuracy of recognition and analysis.
To achieve the goals above, the present invention adopts the following technical scheme:
A scene analysis system based on deep learning technology, comprising: a data acquisition subsystem, a database, and a cloud AI platform; wherein,
the data acquisition subsystem is used for acquiring images and speech;
the database is used for storing data;
the cloud AI platform includes a data preprocessing module, a face recognition module, a facial expression analysis module, a speech recognition module, a speech analysis module, and a comprehensive analysis module;
the data preprocessing module is used for preprocessing the images and speech acquired by the data acquisition subsystem;
the face recognition module is used for performing face recognition on the image to be tested according to deep learning technology, judging from the data in the database whether the face in the image already exists, and continuously carrying out face recognition deep learning;
the facial expression analysis module is used for analyzing and judging the facial expression of the person in the image to be tested, and continuously carrying out facial expression analysis deep learning;
the speech recognition module is used for performing speech recognition on the audio to be tested, converting speech content into text content, performing semantic analysis on the speech content, and continuously carrying out speech recognition deep learning;
the speech analysis module is used for analyzing and judging the semantics and intonation of the audio to be tested;
the comprehensive analysis module is used for comprehensively analyzing the results obtained by the facial expression analysis module and the speech analysis module.
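Since the patent specifies module responsibilities but no concrete interfaces, the overall flow can be illustrated with a minimal Python sketch. All function names, inputs, and result labels below are hypothetical stand-ins for the deep learning modules, not part of the disclosed invention:

```python
# Hypothetical sketch of the module pipeline: preprocessing feeds the face
# branch and the speech branch, and comprehensive analysis fuses both results.

def preprocess(image, audio):
    # Stand-in for the data preprocessing module (dimension/noise reduction)
    return image, audio

def analyze_expression(image):
    # Stand-in for the face recognition + facial expression analysis modules
    return "smiling"

def analyze_speech(audio):
    # Stand-in for the speech recognition + speech analysis modules
    return {"semantics": "positive", "intonation": "calm"}

def comprehensive_analysis(expression, speech):
    # Fuse the two judgments into a single scene-level mood estimate
    positive = expression == "smiling" and speech["semantics"] == "positive"
    return "satisfied" if positive else "needs attention"

image, audio = preprocess("frame.jpg", "clip.wav")
result = comprehensive_analysis(analyze_expression(image), analyze_speech(audio))
print(result)  # -> satisfied
```

The fusion rule here is purely illustrative; the patent leaves the comprehensive-analysis logic unspecified.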
Preferably, the preprocessing includes: performing dimension reduction on images, and performing noise reduction and text output on audio.
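The patent names the preprocessing operations (image dimension reduction, audio noise reduction) but not the algorithms. As an illustrative assumption only, 2x2 average pooling and a moving-average filter are simple stand-ins for those two steps:

```python
# Hedged sketch of the preprocessing step. Both algorithm choices are
# assumptions; the patent does not specify how the reduction is performed.

def downsample_2x2(img):
    """Average-pool a 2D grayscale image (list of lists) by a factor of 2."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]

def denoise(samples, k=3):
    """Smooth an audio sample list with a length-k moving average."""
    half = k // 2
    return [sum(samples[max(0, i - half):i + half + 1]) /
            len(samples[max(0, i - half):i + half + 1])
            for i in range(len(samples))]

print(downsample_2x2([[0, 2], [4, 6]]))  # -> [[3.0]]
print(denoise([0, 10, 0], 3))            # edges use shorter windows
```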
Preferably, the data acquisition subsystem includes an image acquisition module and an audio acquisition module, which are respectively used for acquiring images and audio and for sending the acquired images and audio to the data preprocessing module.
Preferably, the face recognition module includes a first feature extraction unit, a first deep learning model, and a first matching and recognition unit;
the first feature extraction unit is used for extracting a face image feature vector from the preprocessed image according to the first deep learning model;
the first matching and recognition unit is used for matching the extracted face image feature vector with the face images in the database to obtain a first recognition result, and for sending the first recognition result to the database for storage; the first deep learning model is continuously updated as the database is updated.
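The "extract a feature vector, match it against the database" step can be sketched with cosine similarity over embeddings. This is a plausible illustration only: the patent does not specify the matching metric, the decision threshold, or any of the names used here, and the first deep learning model itself is replaced by precomputed vectors.

```python
# Hypothetical matching-and-recognition sketch: compare a query embedding
# against stored embeddings; unknown faces are added to the database,
# mirroring the "store the recognition result" behaviour described above.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_face(vector, database, threshold=0.9):
    """Return (best_id, known?) for a query embedding against stored ones."""
    best_id, best_sim = None, -1.0
    for face_id, stored in database.items():
        sim = cosine(vector, stored)
        if sim > best_sim:
            best_id, best_sim = face_id, sim
    if best_sim >= threshold:
        return best_id, True          # face already exists in the database
    new_id = f"person_{len(database) + 1}"
    database[new_id] = vector          # store the new recognition result
    return new_id, False

db = {"person_1": [1.0, 0.0, 0.0]}
print(match_face([0.99, 0.01, 0.0], db))  # matches person_1
print(match_face([0.0, 1.0, 0.0], db))    # new face, added to db
```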
Preferably, the speech recognition module includes a second feature extraction unit, a second deep learning model, and a second matching and recognition unit;
the second feature extraction unit is used for extracting an audio feature vector from the preprocessed audio according to the second deep learning model;
the second matching and recognition unit is used for matching the extracted audio feature vector with the audio data in the database to obtain a second recognition result, and for sending the second recognition result to the database for storage; the second deep learning model is continuously updated as the database is updated.
Preferably, the speech analysis module includes a semantic analysis unit and an intonation analysis unit, which respectively analyze the semantics and the intonation of the speech recognized by the speech recognition module.
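The patent specifies that intonation is judged but not which acoustic features are used. As an assumption-laden stand-in, a clip can be classified from its average signal energy; both the feature and the threshold below are illustrative:

```python
# Hypothetical intonation-analysis unit: classify a clip as "excited" or
# "calm" from mean energy. Real systems would use pitch contours and more.

def intonation(samples):
    energy = sum(s * s for s in samples) / len(samples)
    return "excited" if energy > 0.25 else "calm"

print(intonation([0.1, -0.1, 0.05]))  # low energy  -> calm
print(intonation([0.9, -0.8, 0.7]))   # high energy -> excited
```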
A scene analysis method based on deep learning technology, comprising the following steps:
(1) acquiring images and speech;
(2) preprocessing the acquired images and speech;
(3) performing face recognition on the image to be tested according to deep learning technology, judging whether the face in the image exists in the database, and analyzing and judging the recognized facial expression;
(4) performing speech recognition on the audio to be tested according to deep learning technology, converting the speech into text content, and analyzing and judging the semantics and intonation of the recognized speech;
(5) comprehensively analyzing the analysis and judgment results of steps (3) and (4).
Preferably, the detailed process of face recognition is as follows:
extracting a face image feature vector from the preprocessed image according to the first deep learning model;
matching the extracted face image feature vector with the face images in the database to obtain a first recognition result, and sending the first recognition result to the database for storage; the first deep learning model is continuously updated as the database is updated.
Preferably, the detailed process of speech recognition is as follows:
extracting an audio feature vector from the preprocessed audio according to the second deep learning model;
matching the extracted audio feature vector with the audio data in the database to obtain a second recognition result, and sending the second recognition result to the database for storage; the second deep learning model is continuously updated as the database is updated.
It can be seen from the above technical scheme that, compared with the prior art, the present disclosure provides a scene analysis system and method based on deep learning technology. First, the system can recognize faces and speech simultaneously and, according to deep learning technology, obtain recognition results for facial expressions and for the semantics and intonation of speech, which not only makes the recognition results more accurate but also guarantees recognition speed, further enriching scene analysis technology. Second, the deep learning models can be continuously and iteratively updated during use, further ensuring the accuracy of the recognition results. The present invention can be used in fields such as the service industry and smart cities, and has advantages such as gaining timely insight into customer mood so as to better meet customer needs.
Detailed description of the invention
In order to explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the invention; for those of ordinary skill in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a structural schematic diagram provided by the invention;
Fig. 2 is a schematic diagram of the internal structure of the cloud AI platform provided by the invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a scene analysis system based on deep learning technology, as shown in Fig. 1, comprising: a data acquisition subsystem, a database, and a cloud AI platform; wherein,
the data acquisition subsystem is used for acquiring images and speech;
the database is used for storing data;
as shown in Fig. 2, the cloud AI platform includes a data preprocessing module, a face recognition module, a facial expression analysis module, a speech recognition module, a speech analysis module, and a comprehensive analysis module;
the data preprocessing module is used for preprocessing the images and speech acquired by the data acquisition subsystem;
the face recognition module is used for performing face recognition on the image to be tested according to deep learning technology, judging from the data in the database whether the face in the image already exists, and continuously carrying out face recognition deep learning;
the facial expression analysis module is used for analyzing and judging the facial expression of the person in the image to be tested, and continuously carrying out facial expression analysis deep learning;
the speech recognition module is used for performing speech recognition on the audio to be tested, converting speech content into text content, performing semantic analysis on the speech content, and continuously carrying out speech recognition deep learning;
the speech analysis module is used for analyzing and judging the semantics and intonation of the audio to be tested;
the comprehensive analysis module is used for comprehensively analyzing the results obtained by the facial expression analysis module and the speech analysis module.
Preferably, the preprocessing includes: performing dimension reduction on images, and performing noise reduction and text output on audio.
Further, the data acquisition subsystem includes an image acquisition module and an audio acquisition module, which are respectively used for acquiring images and audio and for sending the acquired images and audio to the data preprocessing module.
Further, the face recognition module includes a first feature extraction unit, a first deep learning model, and a first matching and recognition unit;
the first feature extraction unit is used for extracting a face image feature vector from the preprocessed image according to the first deep learning model;
the first matching and recognition unit is used for matching the extracted face image feature vector with the face images in the database to obtain a first recognition result, and for sending the first recognition result to the database for storage; the first deep learning model is continuously updated as the database is updated.
Further, the speech recognition module includes a second feature extraction unit, a second deep learning model, and a second matching and recognition unit;
the second feature extraction unit is used for extracting an audio feature vector from the preprocessed audio according to the second deep learning model;
the second matching and recognition unit is used for matching the extracted audio feature vector with the audio data in the database to obtain a second recognition result, and for sending the second recognition result to the database for storage; the second deep learning model is continuously updated as the database is updated.
Further, the speech analysis module includes a semantic analysis unit and an intonation analysis unit, which respectively analyze the semantics and the intonation of the speech recognized by the speech recognition module.
The working principle of the present invention is as follows:
The image acquisition module and the audio acquisition module send the acquired images and speech to the data preprocessing module; the data preprocessing module performs processing such as dimension reduction on the images and processing such as noise reduction and text output on the speech, and then sends the preprocessed image data and speech data to the face recognition module and the speech recognition module respectively. The face recognition module uses the first matching and recognition unit to match the extracted face image feature vector with the data in the database, judges whether the face already exists, and obtains a face recognition result; facial expression analysis is then carried out further according to the face recognition result. The speech recognition module uses the second matching and recognition unit to match the extracted audio feature vector with the data in the database, and then carries out semantic and intonation analysis.
The comprehensive analysis module synthesizes the results of the facial expression analysis and the semantic and intonation analysis to obtain results such as the mood of the recognized person in the current scene, completing the scene analysis. According to the scene analysis results, customer mood can be obtained in real time and customer satisfaction can be learned; early warning can be given for emergencies; and, in smart-city services, dynamic early warning can help prevent social incidents.
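The fusion and early-warning behaviour described above can be sketched with a simple rule table. The rules, labels, and function name below are purely illustrative assumptions; the patent does not disclose the actual fusion logic:

```python
# Hypothetical comprehensive-analysis sketch: combine the expression label
# with the semantic/intonation labels and raise a warning flag for urgent
# scenes (e.g. emergencies or dissatisfied customers).

def scene_mood(expression, semantics, intonation):
    if expression == "angry" or (semantics == "negative" and intonation == "excited"):
        return {"mood": "dissatisfied", "warn": True}
    if expression == "smiling" and semantics == "positive":
        return {"mood": "satisfied", "warn": False}
    return {"mood": "neutral", "warn": False}

print(scene_mood("angry", "negative", "excited"))
# -> {'mood': 'dissatisfied', 'warn': True}
```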
A scene analysis method based on deep learning technology comprises the following steps:
(1) acquiring images and speech;
(2) preprocessing the acquired images and speech;
(3) performing face recognition on the image to be tested according to deep learning technology, judging whether the face in the image exists in the database, and analyzing and judging the recognized facial expression;
(4) performing speech recognition on the audio to be tested according to deep learning technology, converting the speech into text content, and analyzing and judging the semantics and intonation of the recognized speech;
(5) comprehensively analyzing the analysis and judgment results of steps (3) and (4).
It should be understood that steps (3) and (4) need not occur in a fixed order: they can be carried out simultaneously, step (3) can be carried out before step (4) or vice versa, or only one of the two steps can be carried out, as determined by need.
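The note above says the face branch and the speech branch may run in either order or in parallel. A minimal sketch with threads (all names and result values here are illustrative stand-ins, not part of the disclosure):

```python
# Hypothetical parallel execution of steps (3) and (4): each branch writes
# its result independently, and the comprehensive analysis reads both.
import threading

results = {}

def run_face_branch():
    results["face"] = "smiling"      # stand-in for recognition + expression

def run_speech_branch():
    results["speech"] = "positive"   # stand-in for recognition + analysis

threads = [threading.Thread(target=run_face_branch),
           threading.Thread(target=run_speech_branch)]
for t in threads:
    t.start()
for t in threads:
    t.join()                         # both branches finish before fusion
print(results)
```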
Further, the detailed process of face recognition is as follows:
extracting a face image feature vector from the preprocessed image according to the first deep learning model;
matching the extracted face image feature vector with the face images in the database to obtain a first recognition result, and sending the first recognition result to the database for storage; the first deep learning model is continuously updated as the database is updated.
Further, the detailed process of speech recognition is as follows:
extracting an audio feature vector from the preprocessed audio according to the second deep learning model;
matching the extracted audio feature vector with the audio data in the database to obtain a second recognition result, and sending the second recognition result to the database for storage; the second deep learning model is continuously updated as the database is updated.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively simple, and the relevant points can be found in the description of the method.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (9)
1. A scene analysis system based on deep learning technology, characterized by comprising: a data acquisition subsystem, a database, and a cloud AI platform; wherein,
the data acquisition subsystem is used for acquiring images and speech;
the database is used for storing data;
the cloud AI platform includes a data preprocessing module, a face recognition module, a facial expression analysis module, a speech recognition module, a speech analysis module, and a comprehensive analysis module;
the data preprocessing module is used for preprocessing the images and speech acquired by the data acquisition subsystem;
the face recognition module is used for performing face recognition on the image to be tested according to deep learning technology, judging from the data in the database whether the face in the image already exists, and continuously carrying out face recognition deep learning;
the facial expression analysis module is used for analyzing and judging the facial expression of the person in the image to be tested, and continuously carrying out facial expression analysis deep learning;
the speech recognition module is used for performing speech recognition on the audio to be tested, converting speech content into text content, performing semantic analysis on the speech content, and continuously carrying out speech recognition deep learning;
the speech analysis module is used for analyzing and judging the semantics and intonation of the audio to be tested;
the comprehensive analysis module is used for comprehensively analyzing the results obtained by the facial expression analysis module and the speech analysis module.
2. The scene analysis system based on deep learning technology according to claim 1, characterized in that the preprocessing includes: performing dimension reduction on images, and performing noise reduction and text output on audio.
3. The scene analysis system based on deep learning technology according to claim 1, characterized in that the data acquisition subsystem includes an image acquisition module and an audio acquisition module, which are respectively used for acquiring images and audio and for sending the acquired images and audio to the data preprocessing module.
4. The scene analysis system based on deep learning technology according to claim 1, characterized in that the face recognition module includes a first feature extraction unit, a first deep learning model, and a first matching and recognition unit;
the first feature extraction unit is used for extracting a face image feature vector from the preprocessed image according to the first deep learning model;
the first matching and recognition unit is used for matching the extracted face image feature vector with the face images in the database to obtain a first recognition result, and for sending the first recognition result to the database for storage; the first deep learning model is continuously updated as the database is updated.
5. The scene analysis system based on deep learning technology according to claim 1, characterized in that the speech recognition module includes a second feature extraction unit, a second deep learning model, and a second matching and recognition unit;
the second feature extraction unit is used for extracting an audio feature vector from the preprocessed audio according to the second deep learning model;
the second matching and recognition unit is used for matching the extracted audio feature vector with the audio data in the database to obtain a second recognition result, and for sending the second recognition result to the database for storage; the second deep learning model is continuously updated as the database is updated.
6. The scene analysis system based on deep learning technology according to claim 1, characterized in that the speech analysis module includes a semantic analysis unit and an intonation analysis unit, which respectively analyze the semantics and the intonation of the speech recognized by the speech recognition module.
7. A scene analysis method based on deep learning technology, characterized by comprising the following steps:
(1) acquiring images and speech;
(2) preprocessing the acquired images and speech;
(3) performing face recognition on the image to be tested according to deep learning technology, judging whether the face in the image exists in the database, and analyzing and judging the recognized facial expression;
(4) performing speech recognition on the audio to be tested according to deep learning technology, converting the speech into text content, and analyzing and judging the semantics and intonation of the recognized speech;
(5) comprehensively analyzing the analysis and judgment results of steps (3) and (4).
8. The scene analysis method based on deep learning technology according to claim 7, characterized in that the detailed process of face recognition is as follows:
extracting a face image feature vector from the preprocessed image according to the first deep learning model;
matching the extracted face image feature vector with the face images in the database to obtain a first recognition result, and sending the first recognition result to the database for storage; the first deep learning model is continuously updated as the database is updated.
9. The scene analysis method based on deep learning technology according to claim 7, characterized in that the detailed process of speech recognition is as follows:
extracting an audio feature vector from the preprocessed audio according to the second deep learning model;
matching the extracted audio feature vector with the audio data in the database to obtain a second recognition result, and sending the second recognition result to the database for storage; the second deep learning model is continuously updated as the database is updated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910433837.8A CN110263653A (en) | 2019-05-23 | 2019-05-23 | Scene analysis system and method based on deep learning technology
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910433837.8A CN110263653A (en) | 2019-05-23 | 2019-05-23 | Scene analysis system and method based on deep learning technology
Publications (1)
Publication Number | Publication Date |
---|---|
CN110263653A true CN110263653A (en) | 2019-09-20 |
Family
ID=67915131
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910433837.8A Pending CN110263653A (en) | 2019-05-23 | 2019-05-23 | Scene analysis system and method based on deep learning technology
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263653A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN110991329A (en) * | 2019-11-29 | 2020-04-10 | 上海商汤智能科技有限公司 | Semantic analysis method and apparatus, electronic device, and storage medium
CN112001275A (en) * | 2020-08-09 | 2020-11-27 | 成都未至科技有限公司 | Robot for collecting student information
WO2021134459A1 (en) * | 2019-12-31 | 2021-07-08 | Asiainfo Technologies (China), Inc. | AI intelligentialization based on signaling interaction
CN115440000A (en) * | 2021-06-01 | 2022-12-06 | 广东艾檬电子科技有限公司 | Campus early-warning protection method and device
CN115328661A (en) * | 2022-09-09 | 2022-11-11 | 中诚华隆计算机技术有限公司 | Computing power balance execution method and chip based on voice and image characteristics
CN115328661B (en) | 2022-09-09 | 2023-07-18 | 中诚华隆计算机技术有限公司 | Computing power balance execution method and chip based on voice and image characteristics
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN104902345A (en) * | 2015-05-26 | 2015-09-09 | 多维新创(北京)技术有限公司 | Method and system for realizing interactive advertising and marketing of products
CN106095903A (en) * | 2016-06-08 | 2016-11-09 | 成都三零凯天通信实业有限公司 | Radio and television public opinion analysis method and system based on deep learning technology
CN106709804A (en) * | 2015-11-16 | 2017-05-24 | 优化科技(苏州)有限公司 | Interactive wealth-planning consulting robot system
WO2018052561A1 (en) * | 2016-09-13 | 2018-03-22 | Intel Corporation | Speaker segmentation and clustering for video summarization
CN109558935A (en) * | 2018-11-28 | 2019-04-02 | 黄欢 | Emotion recognition and interaction method and system based on deep learning
- 2019-05-23: CN application CN201910433837.8A filed; published as CN110263653A (status: Pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110263653A (en) | Scene analysis system and method based on deep learning technology | |
CN107945805B (en) | Intelligent cross-language speech recognition and conversion method | |
CN111461176B (en) | Multi-modal fusion method, device, medium and equipment based on normalized mutual information | |
CN108197115A (en) | Intelligent interaction method and device, computer equipment and computer-readable storage medium | |
CN102509547B (en) | Method and system for voiceprint recognition based on vector quantization | |
CN103366618B (en) | Scene device for Chinese learning and training based on artificial intelligence and virtual reality | |
CN107731233A (en) | Voiceprint recognition method based on RNN | |
CN110289003A (en) | Voiceprint recognition method, model training method and server | |
CN108428446A (en) | Speech recognition method and device | |
CN107972028B (en) | Human-machine interaction method and device, and electronic equipment | |
CN107492382A (en) | Voiceprint extraction method and device based on neural network | |
CN104700843A (en) | Method and device for identifying age | |
CN107622797A (en) | Sound-based health determination system and method | |
CN109036437A (en) | Accent recognition method and apparatus, computer device and computer-readable storage medium | |
CN106776832B (en) | Processing method, apparatus and system for question-and-answer interaction logs | |
CN109036395A (en) | Personalized speaker control method and system, smart speaker and storage medium | |
CN110085220A (en) | Intelligent interaction device | |
CN107871499A (en) | Speech recognition method and system, computer equipment and computer-readable storage medium | |
CN106512393A (en) | Application voice control method and system suitable for virtual reality environments | |
CN112863529B (en) | Speaker voice conversion method based on adversarial learning, and related equipment | |
CN109872713A (en) | Voice wake-up method and device | |
CN110428853A (en) | Voice activity detection method, voice activity detection device and electronic equipment | |
CN109887510A (en) | Voiceprint recognition method and device based on empirical mode decomposition and MFCC | |
CN110136726A (en) | Voice gender estimation method, device, system and storage medium | |
Zhang et al. | Voice biometric identity authentication system based on Android smartphones |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
Address after: 519000, Office 618, No. 2202 Xinxiangjiang Road, Hengqin New Area, Zhuhai, Guangdong
Applicant after: Guangdong Dingyi Interconnection Technology Co.,Ltd.
Address before: 519000, Units 1 and 3, 10th Floor, Convention and Exhibition Center, No. 1 Software Park Road, Zhuhai, Guangdong
Applicant before: Guangdong Dingyi Interconnection Technology Co.,Ltd.