CN108882033B - Person recognition method, apparatus, device, and medium based on video speech - Google Patents
- Publication number
- CN108882033B (application number CN201810798832.0A)
- Authority
- CN
- China
- Prior art keywords
- identity information
- person
- video
- character
- name
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/441—Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
- H04N21/4415—Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Biomedical Technology (AREA)
- Artificial Intelligence (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810798832.0A CN108882033B (zh) | 2018-07-19 | 2018-07-19 | Person recognition method, apparatus, device, and medium based on video speech |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810798832.0A CN108882033B (zh) | 2018-07-19 | 2018-07-19 | Person recognition method, apparatus, device, and medium based on video speech |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108882033A (zh) | 2018-11-23 |
CN108882033B (zh) | 2021-12-14 |
Family
ID=64303477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810798832.0A Active CN108882033B (zh) | Person recognition method, apparatus, device, and medium based on video speech | 2018-07-19 | 2018-07-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108882033B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111061887A (zh) * | 2019-12-18 | 2020-04-24 | 广东智媒云图科技股份有限公司 | Method, apparatus, device, and storage medium for extracting photos of news figures |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
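CN1703694A (zh) * | 2001-12-11 | 2005-11-30 | 皇家飞利浦电子股份有限公司 | System and method for retrieving information related to persons in video programs |
CN102598055A (zh) * | 2009-10-23 | 2012-07-18 | 微软公司 | Automatic labeling of a video session |
CN104217008A (zh) * | 2014-09-17 | 2014-12-17 | 中国科学院自动化研究所 | Interactive annotation method and system for Internet person videos |
CN105354543A (zh) * | 2015-10-29 | 2016-02-24 | 小米科技有限责任公司 | Video processing method and apparatus |
CN106980640A (zh) * | 2017-02-08 | 2017-07-25 | 网易(杭州)网络有限公司 | Interaction method and device for photos, and computer-readable storage medium |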
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7787697B2 (en) * | 2006-06-09 | 2010-08-31 | Sony Ericsson Mobile Communications Ab | Identification of an object in media and of related media objects |
EP2676222B1 (en) * | 2011-02-18 | 2018-09-19 | Google LLC | Facial recognition |
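CN104281842A (zh) * | 2014-10-13 | 2015-01-14 | 北京奇虎科技有限公司 | Method and apparatus for recognizing person names in face pictures |
CN105740760B (zh) * | 2016-01-21 | 2017-03-15 | 成都索贝数码科技股份有限公司 | Automatic correction method for OCR recognition of video subtitles |
CN105868271B (zh) * | 2016-03-16 | 2019-12-06 | 东软集团股份有限公司 | Name statistics method and apparatus |
CN107016361A (zh) * | 2017-03-29 | 2017-08-04 | 成都三零凯天通信实业有限公司 | Recognition method and apparatus based on video analysis |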
Also Published As
Publication number | Publication date |
---|---|
CN108882033A (zh) | 2018-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109902223B (zh) | | Objectionable content filtering method based on multimodal information features |
CN107229627B (zh) | | Text processing method, apparatus, and computing device |
Wazalwar et al. | | Interpretation of sign language into English using NLP techniques |
US10810467B2 (en) | | Flexible integrating recognition and semantic processing |
CN108229481B (zh) | | Screen content analysis method, apparatus, computing device, and storage medium |
CN111444349A (zh) | | Information extraction method, apparatus, computer device, and storage medium |
CN110858217A (zh) | | Method and apparatus for detecting sensitive microblog topics, and readable storage medium |
Elagouni et al. | | A comprehensive neural-based approach for text recognition in videos using natural language processing |
CN112381038B (zh) | | Image-based text recognition method, system, and medium |
CN114416979A (zh) | | Text query method, device, and storage medium |
CN114357206A (zh) | | Method and system for generating colored subtitles for educational videos based on semantic analysis |
US20220028391A1 (en) | | Method for processing a video file comprising audio content and visual content comprising text content |
CN110717407A (zh) | | Face recognition method and apparatus based on lip-reading passwords, and storage medium |
Karappa et al. | | Detection of sign-language content in video through polar motion profiles |
CN108882033B (zh) | | Person recognition method, apparatus, device, and medium based on video speech |
CN112241470A (zh) | | Video classification method and system |
KR101800975B1 (ko) | | Method and apparatus for sharing an electronic document generated by handwriting recognition |
CN113177479B (zh) | | Image classification method, apparatus, electronic device, and storage medium |
US11574629B1 (en) | | Systems and methods for parsing and correlating solicitation video content |
CN109034040B (zh) | | Person recognition method, apparatus, device, and medium based on a cast list |
Zajíc et al. | | Towards processing of the oral history interviews and related printed documents |
CN113011301A (zh) | | Liveness recognition method, apparatus, and electronic device |
Saudagar et al. | | Efficient Arabic text extraction and recognition using thinning and dataset comparison technique |
Wang et al. | | Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition |
Preethi et al. | | Video Captioning using Pre-Trained CNN and LSTM |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 2020-09-28. Address after: Room 108, No. 318, Shuixiu Road, Jinze town (Xichen), Qingpu District, Shanghai 201700. Applicant after: Shanghai Yingpu Technology Co.,Ltd. Address before: 100000 521, 5 level 521, Chao Wai Street, Chaoyang District, Beijing. Applicant before: BEIJING MOVIEBOOK SCIENCE AND TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: A method, apparatus, device, and medium for person recognition based on video speech. Effective date of registration: 2023-04-25. Granted publication date: 2021-12-14. Pledgee: Bank of Communications Co.,Ltd. Beijing Tongzhou Branch. Pledgor: Shanghai Yingpu Technology Co.,Ltd. Registration number: Y2023990000234 |
PP01 | Preservation of patent right | Effective date of registration: 2023-11-28. Granted publication date: 2021-12-14 |