KR101832050B1 - Tagging method for mutimedia contents base on sound data and system using the smae - Google Patents
- Publication number
- KR101832050B1 (Application No. KR1020160036059A)
- Authority
- KR
- South Korea
- Prior art keywords
- voice
- tag
- server
- multimedia
- keyword information
- Prior art date
Links
Images
Classifications
-
- G06F17/30038—
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G06F17/30026—
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
Abstract
Disclosed herein is a voice data-based multimedia content tagging method that generates a voice tag from the voice data of multimedia content and tags the generated voice tag to that content. The method comprises the steps of: a server extracting voice keyword information from the multimedia content; the server generating a voice tag based on the extracted voice keyword information; and the server tagging the generated voice tag to the multimedia content. Accordingly, a user of a mobile terminal can be provided with a search service for finding desired multimedia content. In addition, because a search for a specific search word looks up, among the voice tags generated from voice data, those associated with the search word, a reliable search result can be obtained.
Description
BACKGROUND OF THE INVENTION 1. Field of the Invention [0001] The present invention relates to a method of tagging multimedia content based on voice data and a system using the same, and more particularly, to a method of generating a voice tag from the voice data of multimedia content and tagging the generated voice tag to that content, and a system using the same.
Generally, multimedia content refers to the content of information services used in systems and services that create, transmit, and process various types of information such as text, voice, and images.
Because such multimedia content can convey far more information in the same amount of time than content composed only of images, sound, or text, demand for it is relatively high compared to such content.
However, conventional methods of searching for multimedia content require users either to play the actual content or to search based on descriptive images or text attached to it. They therefore have the disadvantage that finding the content a user wants takes a great deal of time.
To address this drawback, Korean Patent No. 10-1403317 includes tag information that is visible as an image in the multimedia and provides that tag information to the user. However, because the tag information is itself an image, the user must still inspect images at search time to find the desired content, and because the search covers only the images stored in the tag information among those included in the content, the search results cannot be trusted.
Accordingly, there is a need for a way to provide a service that lets users search for desired multimedia content by means of tag information, without having to check the content of the multimedia itself.
SUMMARY OF THE INVENTION The present invention has been made to solve the above-mentioned problems, and it is an object of the present invention to provide a voice data-based multimedia content tagging method and system that generate a voice tag from the voice data of multimedia content and tag the generated voice tag to that content.
Another object of the present invention is to provide a voice data-based multimedia content tagging method and system capable of searching for multimedia content associated with a specific search word based on a voice tag.
According to an aspect of the present invention, there is provided a voice data-based multimedia content tagging method comprising the steps of: a server extracting voice keyword information from multimedia content; the server generating a voice tag based on the extracted voice keyword information; and the server tagging the generated voice tag to the multimedia content.
In the extracting step, the server may separate the voice data included in the multimedia content into morpheme units, select the voice data corresponding to vocabulary morphemes from among the separated voice data, and extract the selected voice data as the voice keyword information.
In the generating step, the server may convert the extracted voice keyword information into text and match the textualized voice keyword information with the synchronization time information of the voice data synchronized with the timeline of the multimedia content, thereby generating the voice tag.
In addition, in the tagging step, the server may add the generated voice tag to the multimedia content and encode the tagged content in a predetermined format.
Alternatively, the server may generate the voice tag by matching the extracted voice keyword information with the synchronization time information of the voice data synchronized with the timeline of the multimedia content.
The server may also generate the voice tag by matching the extracted voice keyword information with both the synchronization time information of the voice data synchronized with the timeline of the multimedia content and a URL address linked to the multimedia content.
In the generating step, the server may convert the extracted voice keyword information into text, set at least one piece of the textualized voice keyword information as a keyword, set the remaining voice keyword information as stop words, filter out the voice data set as stop words, and select only the voice data set as keywords to generate the voice tag.
The method may further comprise the steps of: a mobile terminal requesting the server to perform a search based on a specific search word; and the server performing the requested search.
Here, in the performing step, the server may carry out the search by comparing the tagged voice tags with the search word to detect, among them, the voice tags associated with the search word.
In the performing step, when the server detects voice tags associated with the search word, it may provide them to the mobile terminal as the search result. A voice tag containing voice data identical to the search word may be provided in preference to a voice tag containing merely similar voice data, and among voice tags containing identical voice data, the voice tags of multimedia content with relatively many download requests and real-time playback requests from mobile terminals may be provided in preference to those with relatively few.
According to another aspect of the present invention, there is provided a voice data-based multimedia content tagging system comprising: a server that extracts voice keyword information from multimedia content, generates a voice tag based on the extracted voice keyword information, and tags the generated voice tag to the multimedia content; and a mobile terminal to which the tagged multimedia content is provided by the server.
According to still another aspect of the present invention, there is provided a voice data-based multimedia content tagging method comprising the steps of: a mobile terminal extracting voice keyword information from multimedia content; the mobile terminal generating a voice tag based on the extracted voice keyword information; and the mobile terminal tagging the generated voice tag to the multimedia content, wherein, when the voice keyword information is extracted, the mobile terminal generates the voice tag by matching the extracted voice keyword information with the path information of the storage path of the multimedia content.
Accordingly, a user of the mobile terminal can be provided with a search service for finding desired multimedia content.
In addition, because a search for a specific search word looks up, among the voice tags generated from voice data, those associated with the search word, a reliable search result can be obtained.
FIG. 1 is a diagram illustrating a voice data-based multimedia content tagging system according to an exemplary embodiment of the present invention.
2 is a diagram illustrating a configuration of a voice data-based multimedia content tagging system according to an embodiment of the present invention.
3 is a flowchart illustrating a voice data-based multimedia content tagging method according to an embodiment of the present invention.
4 is a diagram for explaining a data structure of a multimedia content tagged by a voice data-based multimedia content tagging method according to an embodiment of the present invention.
5 is a flowchart illustrating a voice data-based multimedia content tagging method according to an exemplary embodiment of the present invention.
FIG. 6 is a diagram for explaining a process of extracting voice keyword information in a voice data-based multimedia content tagging method according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a process of generating a voice tag using a voice data-based multimedia content tagging method according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a voice tag generated by a voice data-based multimedia content tagging method according to an embodiment of the present invention.
FIG. 9 is a flowchart illustrating a voice data-based multimedia content tagging method according to an exemplary embodiment of the present invention.
Hereinafter, the present invention will be described in detail with reference to the drawings. The embodiments described below are provided by way of example so that those skilled in the art will be able to fully understand the spirit of the present invention. The present invention is not limited to the embodiments described below and may be embodied in other forms.
FIG. 1 is a diagram illustrating a voice data-based multimedia content tagging system according to an embodiment of the present invention, and FIG. 2 is a block diagram illustrating the configuration of the system.
Hereinafter, a voice data-based multimedia content tagging system according to an embodiment of the present invention will be described with reference to FIGS. 1 and 2.
The voice data-based multimedia content tagging system according to the present embodiment generates a voice tag based on the voice data of multimedia content, tags the generated voice tag to that content, and is provided for performing searches of multimedia content.
To this end, the present voice data-based multimedia content tagging system includes a server 100 and a mobile terminal 200.
The
Specifically, the
In addition, when a search based on a specific search term is requested from the
To this end, the
The
The
The
The
In addition, the
The
The
The
The
The
In addition, the
The
FIG. 3 is a flowchart illustrating a voice data-based multimedia content tagging method according to an exemplary embodiment of the present invention, and FIG. 4 is a diagram for explaining the data structure of multimedia content tagged by the method.
Hereinafter, a voice data-based multimedia content tagging method according to the present embodiment will be described with reference to FIGS. 3 and 4.
First, the server 100 extracts voice keyword information based on the multimedia content.
Here, a morpheme is the minimal meaningful unit at the morphological level of a language, and a vocabulary (lexical) morpheme is a morpheme that expresses a specific object, action, or state.
On the other hand, when the voice keyword information is extracted, the
In another example, the
At this time, the
When the
When the voice tag is provided, the mobile terminal can receive the multimedia content linked to the URL address by decoding the URL address information area of the voice tag.
When the voice tag is generated, the
At this time, the
At this time, the encoded and tagged multimedia contents can be stored in the
As a result, the multimedia content tagged with the voice tag can be composed of a data area of the voice tag and a data area of the multimedia content, as shown in FIG. 4.
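The two-area layout described above can be sketched as a simple length-prefixed framing. This is an illustrative assumption only: the JSON tag payload, the 4-byte length header, and the function names are not the patent's actual predetermined encoding format.

```python
import json
import struct

def tag_multimedia(content: bytes, voice_tag: dict) -> bytes:
    """Prepend a length-prefixed voice-tag data area to the content data area."""
    tag_bytes = json.dumps(voice_tag).encode("utf-8")
    # 4-byte big-endian length header, then the tag area, then the content area
    return struct.pack(">I", len(tag_bytes)) + tag_bytes + content

def split_tagged(blob: bytes):
    """Recover the voice-tag area and the original content area from a tagged blob."""
    (tag_len,) = struct.unpack(">I", blob[:4])
    tag = json.loads(blob[4:4 + tag_len].decode("utf-8"))
    return tag, blob[4 + tag_len:]

blob = tag_multimedia(b"\x00fake-video-bytes",
                      {"keyword": "swing", "start": 12.0, "end": 13.5})
tag, content = split_tagged(blob)
print(tag["keyword"], content == b"\x00fake-video-bytes")
```

Framing the tag ahead of the content keeps the original media bytes intact, so a player that skips the tag area can still decode the content.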
When the multimedia contents are encoded, the
In another example, the
Here, the storage path means the path under which the multimedia content is stored as a file in the storage unit.
When the
Meanwhile, according to another embodiment of the present invention, a
When the voice keyword information is extracted, the
More specifically, when the voice keyword information is extracted, the
Here, if the voice tag is generated by matching the extracted voice keyword information and the route information, the
As a concrete example of tagging the generated voice tag to the multimedia content, the
Accordingly, when the
FIG. 5 is a flowchart illustrating a voice data-based multimedia content tagging method according to an embodiment of the present invention, FIG. 6 is a diagram for explaining the process of extracting voice keyword information in the method, and FIGS. 7 and 8 are diagrams illustrating the process of generating a voice tag by the method.
Hereinafter, the voice data-based multimedia content tagging method according to the present embodiment will be described in more detail with reference to FIGS. 5 to 8.
First, as described above, the
As shown in FIG. 6, for example, if the specific multimedia content is assumed to include the voice data "Mongryong fell in love at first sight with Chunhyang on a swing", the server can separate this voice data into morpheme units: nouns such as "Mongryong", "swing", and "Chunhyang", together with particles, verb stems, and endings.
The
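The separation-and-selection step above can be sketched as follows. A real system would use a Korean morphological analyzer; here a hand-written list of (morpheme, part-of-speech) pairs stands in for the analyzer's output, and the tag names and sentence are illustrative assumptions.

```python
# Vocabulary (lexical) morphemes express objects, actions, and states;
# particles and endings are grammatical and are not kept as keywords.
LEXICAL_TAGS = {"noun", "verb"}

# (morpheme, part-of-speech) pairs, as a morphological analyzer might return
# for the example sentence (hypothetical analysis, English glosses).
ANALYZED = [
    ("Mongryong", "noun"), ("swing", "noun"), ("ride", "verb"),
    ("-ing", "suffix"), ("Chunhyang", "noun"), ("at", "particle"),
    ("fall-in-love", "verb"), ("-ed", "suffix"),
]

def extract_voice_keywords(morphemes):
    """Keep only the morphemes whose tag marks a vocabulary morpheme."""
    return [m for m, tag in morphemes if tag in LEXICAL_TAGS]

print(extract_voice_keywords(ANALYZED))
# → ['Mongryong', 'swing', 'ride', 'Chunhyang', 'fall-in-love']
```

The grammatical morphemes ("-ing", "at", "-ed") are dropped, leaving only candidate voice keyword information.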
Meanwhile, when the
Here, FIG. 7 is a diagram schematically illustrating a time line of multimedia contents, and FIG. 8 is a view illustrating a voice tag generated by matching voice keyword information with synchronization time information. More specifically, the extracted voice keyword information is extracted voice data, and the synchronization time information of voice data is information including a synchronization start time and a synchronization end time of voice data synchronized with the timeline of the multimedia contents.
As shown in FIGS. 7 and 8, for example, assuming that the voice keyword information containing the voice data "swing" is synchronized during the period Ta, the server can generate the voice tag by matching the keyword "swing" with the synchronization time information corresponding to Ta.
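A voice tag built this way pairs each textualized keyword with the start and end of its synchronization period on the content timeline. The class name, field names, and the example times are illustrative assumptions, not a format defined by the patent.

```python
from dataclasses import dataclass

@dataclass
class VoiceTag:
    keyword: str     # textualized voice keyword information
    start_s: float   # synchronization start time on the content timeline (seconds)
    end_s: float     # synchronization end time (seconds)

def make_voice_tags(keywords_with_spans):
    """Match each keyword with its synchronization time span to form voice tags."""
    return [VoiceTag(k, s, e) for k, s, e in keywords_with_spans]

# "swing" synchronized during the period Ta = [12.0, 13.5] (illustrative times)
tags = make_voice_tags([("swing", 12.0, 13.5), ("love", 47.2, 48.0)])
print(tags[0].keyword, tags[0].start_s, tags[0].end_s)
# → swing 12.0 13.5
```

Keeping both start and end times lets a player jump directly to the moment in the content where the keyword was spoken.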
In addition, the
In another example, the
As another example, the
This means that the
Here, a keyword corresponds to an index term (headword), and a stop word is a word that is excluded from indexing and search.
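The keyword/stop-word selection described above amounts to a simple filter: anything not designated as a keyword is treated as a stop word and dropped. The keyword set and candidate list here are illustrative assumptions.

```python
# Voice keyword information designated as keywords (hypothetical example)
KEYWORDS = {"swing", "love"}

def filter_stop_words(candidates, keywords):
    """Keep only candidates set as keywords; the rest are stop words and are filtered out."""
    return [w for w in candidates if w in keywords]

print(filter_stop_words(["swing", "on", "love", "once"], KEYWORDS))
# → ['swing', 'love']
```

Only the voice data set as keywords survives to become part of the voice tag; "on" and "once" are excluded as stop words.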
In another example, the
When the specific multimedia contents are searched through the search service by matching the voice keyword information with the URL address in the voice tag, the
When the voice tag is generated, the
At this time, the
When the multimedia contents are encoded, the
When the
FIG. 9 is a flowchart illustrating a voice data-based multimedia content tagging method according to an exemplary embodiment of the present invention.
Hereinafter, with reference to FIG. 9, a voice data-based multimedia content tagging method according to the present embodiment will be described in detail.
First, the
On the other hand, when the voice keyword information is extracted, the
In addition, when the voice tag is generated, the
After tagging the voice tag to the multimedia content, if the
If there is a voice tag associated with the search word among the tagged voice tags (S450-Y), the
Specifically, the
In addition, if there are a plurality of voice tags including the same voice data as the search words, the
Here, the number of download requests and the number of real-time playback requests for multimedia content refer to the number of times downloads and real-time playbacks of that content have been requested by other mobile terminals.
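The ranking rule described in this embodiment (tags with voice data identical to the search word before merely similar ones, and among identical matches, more-requested content first) can be sketched as follows. The tuple representation and the popularity measure (downloads plus real-time plays) are illustrative assumptions.

```python
def rank_results(search_word, tagged_items):
    """Rank voice-tag results: exact keyword matches first, then by request counts.

    tagged_items: list of (keyword, download_requests, realtime_play_requests).
    """
    def key(item):
        keyword, downloads, plays = item
        exact = (keyword == search_word)   # identical voice data ranks first
        popularity = downloads + plays     # then more-requested content first
        return (not exact, -popularity)
    return sorted(tagged_items, key=key)

items = [("swinging", 900, 50), ("swing", 10, 5), ("swing", 300, 40)]
print(rank_results("swing", items))
# → [('swing', 300, 40), ('swing', 10, 5), ('swinging', 900, 50)]
```

Note that the similar match "swinging" is ranked last despite its high request counts, because identity with the search word dominates popularity.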
Accordingly, when a search associated with a specific search word is performed, the voice tags associated with that search word can be found among the voice tags generated from voice data, and a reliable search result can be obtained.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that these embodiments are illustrative only and do not limit the scope of the invention as defined by the appended claims. It will be understood by those skilled in the art that various changes in form and detail may be made without departing from the spirit and scope of the invention.
100: server 110:
120: control unit 130:
200: mobile terminal 210:
220: control unit 230:
240:
Claims (12)
The server generating voice tags based on the extracted voice keyword information;
The server tagging the generated voice tag to the multimedia content;
The mobile terminal requests the server to search based on a specific search word; And
And the server performing the requested search,
Wherein the extracting step comprises:
The server extracts voice data included in the multimedia contents in morpheme units, selects voice data corresponding to vocabulary morpheme among the separated voice data, extracts the selected voice data as the voice keyword information,
Wherein the generating comprises:
The server converts the extracted voice keyword information into text, matches the textualized voice keyword information with the synchronization time information of the voice data synchronized with the timeline of the multimedia content and with the URL address linked to the multimedia content, sets at least one piece of the textualized voice keyword information as a keyword, sets the remaining voice keyword information not set as the keyword as stop words, filters out the voice data set as stop words, and selects only the voice data set as keywords to generate the voice tag,
The generated voice tag is not tagged to the multimedia content but stored in the server in a file format separate from the multimedia content when the voice tag is generated by matching with the URL address,
Wherein the performing step comprises:
The server compares the voice tags stored in the file format with the search word to detect, among them, the voice tags associated with the search word, and provides the voice tags associated with the search word to the mobile terminal as the search result, wherein a voice tag containing voice data identical to the search word is provided in preference to a voice tag containing voice data similar to the search word,
And wherein, when there are a plurality of voice tags containing voice data identical to the search word, the voice tags of multimedia content having relatively many download requests and real-time playback requests are provided in preference to the voice tags of multimedia content having relatively few download requests and real-time playback requests.
The tagging step includes:
Wherein the server adds the generated voice tag to the multimedia content, and encodes and tags the voice tag in a predetermined format.
A mobile terminal to which the tagged multimedia content is provided;
The mobile terminal comprises:
A search request can be made to the server based on a specific search word,
The server comprises:
When a search is requested through the mobile terminal based on the specific search word, performing a requested search,
The server comprises:
Extracting voice data included in the multimedia content by morpheme units, selecting voice data corresponding to vocabulary morpheme among the separated voice data, extracting the selected voice data as the voice keyword information,
The server comprises:
The server converts the extracted voice keyword information into text, matches the textualized voice keyword information with the synchronization time information of the voice data synchronized with the timeline of the multimedia content and with the URL address linked to the multimedia content to generate the voice tag, wherein at least one piece of the textualized voice keyword information is set as a keyword, the remaining voice keyword information not set as the keyword is set as stop words, the voice data set as stop words is filtered out, and only the voice data set as keywords is selected to generate the voice tag,
The generated voice tag is not tagged to the multimedia content but stored in the server in a file format separate from the multimedia content when the voice tag is generated by matching with the URL address,
The server comprises:
Wherein the server compares the voice tags stored in the file format with the search word to detect, among them, the voice tags associated with the search word, and provides the voice tags associated with the search word to the mobile terminal as the search result, so that a voice tag containing voice data identical to the search word is provided in preference to a voice tag containing voice data similar to the search word,
And wherein, when there are a plurality of voice tags containing voice data identical to the search word, the voice tag of multimedia content having relatively many download requests and real-time playback requests is provided in preference to the voice tag of multimedia content having relatively few download requests and real-time playback requests.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160036059A KR101832050B1 (en) | 2016-03-25 | 2016-03-25 | Tagging method for mutimedia contents base on sound data and system using the smae |
PCT/KR2017/001103 WO2017164510A2 (en) | 2016-03-25 | 2017-02-02 | Voice data-based multimedia content tagging method, and system using same |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160036059A KR101832050B1 (en) | 2016-03-25 | 2016-03-25 | Tagging method for mutimedia contents base on sound data and system using the smae |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170111161A KR20170111161A (en) | 2017-10-12 |
KR101832050B1 true KR101832050B1 (en) | 2018-02-23 |
Family
ID=59900594
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020160036059A KR101832050B1 (en) | 2016-03-25 | 2016-03-25 | Tagging method for mutimedia contents base on sound data and system using the smae |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR101832050B1 (en) |
WO (1) | WO2017164510A2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023233421A1 (en) * | 2022-05-31 | 2023-12-07 | Humanify Technologies Pvt Ltd | System and method for tagging multimedia content |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102523135B1 (en) * | 2018-01-09 | 2023-04-21 | 삼성전자주식회사 | Electronic Device and the Method for Editing Caption by the Device |
CN109215657A (en) * | 2018-11-23 | 2019-01-15 | 四川工大创兴大数据有限公司 | A kind of grain depot monitoring voice robot and its application |
KR20220138512A (en) | 2021-04-05 | 2022-10-13 | 이피엘코딩 주식회사 | Image Recognition Method with Voice Tagging for Mobile Device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007156286A (en) * | 2005-12-08 | 2007-06-21 | Hitachi Ltd | Information recognition device and information recognizing program |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20090062371A (en) * | 2007-12-13 | 2009-06-17 | 주식회사 그래텍 | System and method for providing additional information |
CN103119621B (en) * | 2010-04-30 | 2016-12-07 | 当今技术(Ip)有限公司 | Content management device |
KR101356006B1 (en) * | 2012-02-06 | 2014-02-12 | 한국과학기술원 | Method and apparatus for tagging multimedia contents based upon voice enable of range setting |
KR20130141094A (en) * | 2012-06-15 | 2013-12-26 | 휴텍 주식회사 | Method for managing searches of web-contents using voice tags, and computer-readable recording medium with management program for the same |
- 2016-03-25: KR KR1020160036059A patent/KR101832050B1/en active IP Right Grant
- 2017-02-02: WO PCT/KR2017/001103 patent/WO2017164510A2/en active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007156286A (en) * | 2005-12-08 | 2007-06-21 | Hitachi Ltd | Information recognition device and information recognizing program |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023233421A1 (en) * | 2022-05-31 | 2023-12-07 | Humanify Technologies Pvt Ltd | System and method for tagging multimedia content |
Also Published As
Publication number | Publication date |
---|---|
WO2017164510A2 (en) | 2017-09-28 |
KR20170111161A (en) | 2017-10-12 |
WO2017164510A3 (en) | 2018-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11798528B2 (en) | Systems and methods for providing notifications within a media asset without breaking immersion | |
US11197036B2 (en) | Multimedia stream analysis and retrieval | |
KR101777981B1 (en) | Real-time natural language processing of datastreams | |
KR101832050B1 (en) | Tagging method for mutimedia contents base on sound data and system using the smae | |
US8374845B2 (en) | Retrieving apparatus, retrieving method, and computer program product | |
CN101778233B (en) | Data processing apparatus, data processing method | |
JP5894149B2 (en) | Enhancement of meaning using TOP-K processing | |
US9426411B2 (en) | Method and apparatus for generating summarized information, and server for the same | |
JP6337183B1 (en) | Text extraction device, comment posting device, comment posting support device, playback terminal, and context vector calculation device | |
KR20120029861A (en) | Method for providing media-content relation information, device, server, and storage medium thereof | |
US20150178387A1 (en) | Method and system of audio retrieval and source separation | |
CN107193922B (en) | A kind of method and device of information processing | |
JP2019008779A (en) | Text extraction apparatus, comment posting apparatus, comment posting support apparatus, reproduction terminal, and context vector calculation apparatus | |
JP5474591B2 (en) | Image selection apparatus, image selection method, and image selection program | |
KR101902784B1 (en) | Metohd and apparatus for managing audio data using tag data | |
US20220318283A1 (en) | Query correction based on reattempts learning | |
KR102435243B1 (en) | A method for providing a producing service of transformed multimedia contents using matching of video resources | |
US20210134290A1 (en) | Voice-driven navigation of dynamic audio files | |
JP2010283707A (en) | Onboard electronic device and image update system for the same | |
KR20220130861A (en) | Method of providing production service that converts audio into multimedia content based on video resource matching | |
KR20220130859A (en) | A method of providing a service that converts voice information into multimedia video contents | |
KR20220130862A (en) | A an apparatus for providing a producing service of transformed multimedia contents | |
KR20220130860A (en) | A method of providing a service that converts voice information into multimedia video contents | |
CN116483946A (en) | Data processing method, device, equipment and computer program product | |
Venkataraman et al. | A Natural Language Interface for Search and Recommendations of Digital Entertainment Media |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
AMND | Amendment | ||
E90F | Notification of reason for final refusal | ||
AMND | Amendment | ||
E601 | Decision to refuse application | ||
E801 | Decision on dismissal of amendment | ||
AMND | Amendment | ||
X701 | Decision to grant (after re-examination) | ||
GRNT | Written decision to grant |