CN102074235B - Method of video speech recognition and search - Google Patents

Method of video speech recognition and search

Info

Publication number
CN102074235B
Authority
CN
China
Prior art keywords
video
speech recognition
search
text
videos
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010600817
Other languages
Chinese (zh)
Other versions
CN102074235A (en)
Inventor
刘伟奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqin Technology Co Ltd
Original Assignee
Huaqin Telecom Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-12-20
Filing date
2010-12-20
Publication date
2013-04-03
Application filed by Huaqin Telecom Technology Co Ltd
Priority to CN 201010600817
Publication of CN102074235A
Application granted
Publication of CN102074235B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a method of video speech recognition and search, comprising the following steps: 1) converting the audio of every video into text by speech recognition; 2) storing each text separately or attaching it to its video; 3) selecting the several words that occur most frequently in the text as the word tag of the video, the word tag being appended after the file name of the video; and 4) searching the word tags of all videos. The method supports both broad and targeted video search, and can also be used for quick positioning in public security and in finding personal items.

Description

Method of video speech recognition and retrieval
Technical field
The present invention relates to the field of video production, and in particular to a method of video speech recognition and retrieval.
Background art
Cloud and search technologies are now widely used across industries, but video search is still at an exploratory stage. Because video data volumes are large and video content is hard to describe, searching by picture content or by video segment has not yet reached a fine level of detail; the video search in wide use today relies on file names and manually added labels as keywords. Speech recognition is likewise widely applied, but mostly as standalone recognition of short speech segments, without deeper study or use. Video can also be clipped to play an intermediate segment, or captured as a screenshot at a given moment, but these capabilities have not yet been applied to search.
In view of this, a method of video speech recognition and retrieval is provided to address the above problems.
Summary of the invention
The invention provides a method of video speech recognition and retrieval that overcomes the difficulties of the prior art. It enables both broad and targeted search of videos, and the same technique can be used for quick positioning in public security and in finding personal items.
The present invention adopts the following technical solution:
The invention provides a method of video speech recognition and retrieval, comprising the following steps:
(1) converting the audio portion of every video into text by speech recognition;
(2) storing each text separately or attaching it to its video;
(3) selecting the several words that occur most frequently in the text as the word tag of that video, the word tag being appended after the file name of the video;
(4) searching the word tags of all videos.
Preferably, the text in step (2) is saved as a Word file.
Preferably, the text in step (2) is saved as a TXT file.
Preferably, the number of words in the word tag in step (3) is limited to 3.
Preferably, the number of words in the word tag in step (3) is limited to 5.
Preferably, the number of words in the word tag in step (3) is limited to 10.
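Step (3) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `top_words` and `tagged_filename`, the underscore separator, and the whitespace-based word splitting are assumptions made for illustration. A Chinese transcript would first need word segmentation, and a practical system would probably filter out common function words, neither of which the patent text specifies.

```python
import re
from collections import Counter
from pathlib import PurePath


def top_words(transcript: str, n: int = 3) -> list[str]:
    """Return the n most frequent words in a transcript.

    Words with equal counts keep the order in which they first appear.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    return [word for word, _ in Counter(words).most_common(n)]


def tagged_filename(video_name: str, tags: list[str]) -> str:
    """Append the word tag after the file name, keeping the extension.

    The underscore separator is an assumption; the patent only says the
    tag is added after the file name.
    """
    path = PurePath(video_name)
    return f"{path.stem}_{'_'.join(tags)}{path.suffix}"


transcript = "shirt here shirt there coat here shirt"
tags = top_words(transcript, n=3)        # n may be 3, 5 or 10 per the text
print(tagged_filename("cleanup.mp4", tags))
```

Keeping the tag in the file name means an ordinary file-name search can reach it without a separate index, which appears to be the motivation for step (3).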
By adopting the above technique, the present invention, compared with the prior art, enables both broad and targeted search of videos, and can also be used for quick positioning in public security and in finding personal items.
The invention is further described below with reference to the drawings and embodiments.
Description of drawings
Fig. 1 is a flow chart of the method of video speech recognition and retrieval of the present invention;
Fig. 2 is a flow chart of the method of video speech recognition and retrieval of embodiment 1;
Fig. 3 is a flow chart of the method of video speech recognition and retrieval of embodiment 2;
Fig. 4 is a flow chart of the method of video speech recognition and retrieval of embodiment 3.
Embodiments
Three specific embodiments of the invention are described below with reference to Figs. 1 to 4.
As shown in Fig. 1, the method of video speech recognition and retrieval of the present invention comprises the following steps:
(1) converting the audio portion of every video into text by speech recognition;
(2) storing each text separately or attaching it to its video;
(3) selecting the several words that occur most frequently in the text as the word tag of that video, the word tag being appended after the file name of the video;
(4) searching the word tags of all videos.
The text in step (2) is saved as a Word file or as a TXT file.
Preferably, the number of words in the word tag in step (3) is limited to 3, 5, or 10.
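Step (4), searching the word tags of all videos, could be sketched as below. This is an illustration under stated assumptions rather than the method as claimed: it assumes tags were appended to the file name with an underscore separator and that the original file name itself contains no underscore. The helper names `tags_of` and `search` are hypothetical.

```python
from pathlib import PurePath


def tags_of(filename: str, sep: str = "_") -> list[str]:
    """Extract the word tag from a tagged file name.

    Assumes the form <original name><sep><tag1><sep><tag2>...<ext>,
    where the original name contains no separator character.
    """
    stem = PurePath(filename).stem
    return stem.split(sep)[1:]


def search(filenames: list[str], query: str) -> list[str]:
    """Return the file names whose word tag contains the query word."""
    wanted = query.lower()
    return [
        name for name in filenames
        if wanted in (tag.lower() for tag in tags_of(name))
    ]


videos = [
    "cleanup_shirt_here_there.mp4",
    "news_weather_rain_city.mp4",
    "holiday.mp4",
]
print(search(videos, "shirt"))
```

Only the tag words are matched here, not the original name, so a file without tags such as `holiday.mp4` is never returned by a tag search.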
The invention is used in practice as follows:
Embodiment 1
As shown in Fig. 2, in public security applications, the audio is extracted from the video recorded by a camera and processed with speech recognition. The results are stored in the cloud; alternatively, only the text is kept in the cloud while the actual audio-video files are held in other storage suited to large data volumes. In either case, retrieval can return the text alone, or the text together with a screenshot, the video segment, and the corresponding time slice.
Embodiment 2
As shown in Fig. 3, in personal applications, video files can be searched in the same way as online video media. A particular application is tidying: while putting items away, the user records each storage location on video and says the name of the item. Entering the item name later finds where it is stored, which avoids the difficulty of finding things that have been forgotten or that were put away by someone else. For example, while cleaning a room the user films and says: "The summer clothes go here, Dad's shirt goes here, Mum's overcoat goes here, my younger brother's pencils go here, my elder sister's cosmetics all go here." Searching for "shirt" then returns several shirt results; the user identifies the time slice of the target shirt from the screenshots, or goes straight to the location. This household application can greatly reduce conflicts caused by failing to find items or by family members remembering the same thing differently, and is especially convenient for elderly people with poor memory.
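The item lookup in this embodiment can be sketched as a search over time-stamped transcript segments. The `Segment` record, its field names, and the sample data are hypothetical; the patent does not describe how recognized speech is aligned with time, so the timestamps are assumed to come from the speech recognizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One recognized utterance and where it occurs (hypothetical layout)."""
    video: str
    start_s: float
    text: str


def find_item(segments: list[Segment], item: str) -> list[Segment]:
    """Return every segment whose recognized text mentions the item."""
    needle = item.lower()
    return [seg for seg in segments if needle in seg.text.lower()]


segments = [
    Segment("bedroom.mp4", 0.0, "The summer clothes go here"),
    Segment("bedroom.mp4", 4.5, "Dad's shirt goes here"),
    Segment("bedroom.mp4", 9.0, "Mum's overcoat goes here"),
    Segment("hallway.mp4", 2.0, "My old work shirt goes here"),
]

for hit in find_item(segments, "shirt"):
    # Each hit gives the video and time slice from which a screenshot
    # could be taken to show the storage location.
    print(hit.video, hit.start_s, hit.text)
```

A query for "shirt" yields more than one hit, matching the embodiment's note that the user then picks the right one from the screenshots.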
Embodiment 3
As shown in Fig. 4, for searching online video media, the cloud analyses and converts the audio of each video and marks the result against a timeline in a subtitle-like manner. To search, the user only needs to enter the text or speak the content to be found (speech is likewise converted to text by speech recognition); the system then lists the matching subtitle text, a screenshot of the video segment, and the corresponding position on the timeline. For example, a user who remembers only some of the lines of a film can use this technique to search for the video by those lines.
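A minimal sketch of searching subtitle-like text by a partially remembered line follows. The caption tuples `(start_seconds, end_seconds, text)`, the helper names, and the sample line are assumptions for illustration; screenshot extraction and the spoken-query path are outside the sketch.

```python
import re


def normalize(text: str) -> str:
    """Lower-case the text and drop punctuation so partial lines can match."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def fmt(seconds: float) -> str:
    """Format a position on the timeline as HH:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_captions(captions, query):
    """Return (start, end, text) for each caption containing the query."""
    wanted = normalize(query)
    return [
        (fmt(start), fmt(end), text)
        for start, end, text in captions
        if wanted and wanted in normalize(text)
    ]


captions = [
    (0.0, 3.2, "Good evening, everyone."),
    (3725.0, 3729.5, "We'll always have the harbor."),
]
print(search_captions(captions, "always have the harbor"))
```

Normalizing both the query and the captions lets a remembered fragment match even when punctuation or capitalization differs from the recognized text.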
In summary, by adopting the above technique, the present invention enables both broad and targeted search of videos, and can also be used for quick positioning in public security and in finding personal items.
The above embodiments serve only to illustrate the technical idea and features of the invention, so that those skilled in the art can understand and implement it; they do not limit the scope of the claims. All equivalent changes or modifications made in the spirit disclosed by the invention still fall within the scope of the claims.

Claims (6)

1. A method of video speech recognition and retrieval, characterized by comprising the following steps:
(1) converting the audio portion of every video into text by speech recognition;
(2) storing each text separately or attaching it to its video;
(3) selecting the several words that occur most frequently in the text as the word tag of that video, the word tag being appended after the file name of the video;
(4) searching the word tags of all videos.
2. The method of video speech recognition and retrieval according to claim 1, characterized in that the text in step (2) is saved as a Word file.
3. The method of video speech recognition and retrieval according to claim 1, characterized in that the text in step (2) is saved as a TXT file.
4. The method of video speech recognition and retrieval according to claim 1, characterized in that the number of words in the word tag in step (3) is limited to 3.
5. The method of video speech recognition and retrieval according to claim 1, characterized in that the number of words in the word tag in step (3) is limited to 5.
6. The method of video speech recognition and retrieval according to claim 1, characterized in that the number of words in the word tag in step (3) is limited to 10.
CN 201010600817 2010-12-20 2010-12-20 Method of video speech recognition and search Active CN102074235B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010600817 CN102074235B (en) 2010-12-20 2010-12-20 Method of video speech recognition and search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010600817 CN102074235B (en) 2010-12-20 2010-12-20 Method of video speech recognition and search

Publications (2)

Publication Number Publication Date
CN102074235A (en) 2011-05-25
CN102074235B (en) 2013-04-03

Family

ID=44032753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010600817 Active CN102074235B (en) 2010-12-20 2010-12-20 Method of video speech recognition and search

Country Status (1)

Country Link
CN (1) CN102074235B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186557A (en) * 2011-12-28 2013-07-03 宇龙计算机通信科技(深圳)有限公司 Method and device for automatically naming sound record or video files
CN103631780B (en) * 2012-08-21 2016-11-23 重庆文润科技有限公司 Multimedia recording systems and method
CN103778131B (en) * 2012-10-18 2017-02-22 腾讯科技(深圳)有限公司 Caption query method and device, video player and caption query server
CN103186663B (en) * 2012-12-28 2016-07-06 中联竞成(北京)科技有限公司 A kind of network public-opinion monitoring method based on video and system
CN104375997A (en) * 2013-08-13 2015-02-25 腾讯科技(深圳)有限公司 Method and device for adding note information to instant messaging audio information
CN104023176B (en) * 2014-06-03 2017-07-14 华为技术有限公司 Handle method, device and the terminal device of audio and image information
CN104090955A (en) * 2014-07-07 2014-10-08 科大讯飞股份有限公司 Automatic audio/video label labeling method and system
CN104469544A (en) * 2014-11-07 2015-03-25 重庆晋才富熙科技有限公司 Video marking method based on voice technology
CN105898204A (en) * 2014-12-25 2016-08-24 支录奎 Intelligent video recorder enabling video structuralization
CN104994404A (en) * 2015-07-06 2015-10-21 无锡天脉聚源传媒科技有限公司 Method and device for obtaining keywords for video
CN105138670B (en) * 2015-09-06 2018-12-14 天翼爱音乐文化科技有限公司 Audio file label generating method and system
CN106649807A (en) * 2016-12-29 2017-05-10 维沃移动通信有限公司 Audio file processing method and mobile terminal
CN107391679A (en) * 2017-07-23 2017-11-24 肇庆高新区长光智能技术开发有限公司 Aid in methods of review, device and equipment
CN107422858A (en) * 2017-07-23 2017-12-01 肇庆高新区长光智能技术开发有限公司 Assisted learning method, device and terminal
CN109547847B (en) * 2018-11-22 2021-10-22 广州酷狗计算机科技有限公司 Method and device for adding video information and computer readable storage medium
CN109523990B (en) * 2019-01-21 2021-11-05 未来电视有限公司 Voice detection method and device
CN112784062A (en) * 2019-03-15 2021-05-11 北京金山数字娱乐科技有限公司 Idiom knowledge graph construction method and device
CN110059207A (en) * 2019-04-04 2019-07-26 Oppo广东移动通信有限公司 Processing method, device, storage medium and the electronic equipment of image information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1977264A (en) * 2004-06-28 2007-06-06 松下电器产业株式会社 Video/audio stream processing device and video/audio stream processing method
CN101281534A (en) * 2008-05-28 2008-10-08 叶睿智 Method for searching multimedia resource based on audio content retrieval
CN101382937A (en) * 2008-07-01 2009-03-11 深圳先进技术研究院 Multimedia resource processing method based on speech recognition and on-line teaching system thereof
CN101539929A (en) * 2009-04-17 2009-09-23 无锡天脉聚源传媒科技有限公司 Method for indexing TV news by utilizing computer system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001043215A (en) * 1999-08-02 2001-02-16 Sony Corp Device and method for processing document and recording medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JP特开2001-43215A 2001.02.16

Also Published As

Publication number Publication date
CN102074235A (en) 2011-05-25

Similar Documents

Publication Publication Date Title
CN102074235B (en) Method of video speech recognition and search
CN109844708B (en) Recommending media content through chat robots
US8914363B2 (en) Disambiguating tags in network based multiple user tagging systems
CN104885081B (en) Search system and corresponding method
US8914368B2 (en) Augmented and cross-service tagging
US9348886B2 (en) Formation and description of user subgroups
KR102140177B1 (en) Answering questions using environmental context
US20140074466A1 (en) Answering questions using environmental context
US20130179426A1 (en) Search and Retrieval Methods and Systems of Short Messages Utilizing Messaging Context and Keyword Frequency
US20120124029A1 (en) Cross media knowledge storage, management and information discovery and retrieval
CN112364624B (en) Keyword extraction method based on deep learning language model fusion semantic features
EP1969481A1 (en) Browsing items related to email
Ayache et al. Evaluation of active learning strategies for video indexing
US9235634B2 (en) Method and server for media classification
CN106126605B (en) Short text classification method based on user portrait
Martín et al. Using semi-structured data for assessing research paper similarity
CN103942328A (en) Video retrieval method and video device
CN112883248B (en) Information pushing method and device and electronic equipment
CN112836008B (en) Index establishing method based on decentralized storage data
EP3144825A1 (en) Enhanced digital media indexing and retrieval
US11328218B1 (en) Identifying subjective attributes by analysis of curation signals
Ding et al. A web service discovery method based on tag
US20210342393A1 (en) Artificial intelligence for content discovery
Gupta et al. Considering manual annotations in dynamic segmentation of multimodal lifelog data
de Jesus Oliveira et al. Taylor–impersonation of AI for audiovisual content documentation and search

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 201203 Shanghai City, Pudong New Area Zhangjiang hi tech Park Keyuan Road No. 399 Building No. 1

Patentee after: HUAQIN TELECOM TECHNOLOGY Co.,Ltd.

Address before: 201203 Shanghai City, Pudong New Area Zhangjiang hi tech Park Keyuan Road No. 399 Building No. 1

Patentee before: SHANGHAI HUAQIN TELECOM TECHNOLOGY Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: Building 1, No. 399 Keyuan Road, Zhangjiang hi tech park, Pudong New Area, Shanghai, 201203

Patentee after: Huaqin Technology Co.,Ltd.

Address before: Building 1, No. 399 Keyuan Road, Zhangjiang hi tech park, Pudong New Area, Shanghai, 201203

Patentee before: Huaqin Technology Co.,Ltd.

CP03 Change of name, title or address

Address after: Building 1, No. 399 Keyuan Road, Zhangjiang hi tech park, Pudong New Area, Shanghai, 201203

Patentee after: Huaqin Technology Co.,Ltd.

Address before: 201203 Shanghai City, Pudong New Area Zhangjiang hi tech Park Keyuan Road No. 399 Building No. 1

Patentee before: HUAQIN TELECOM TECHNOLOGY Co.,Ltd.