WO2024096276A1

WO2024096276A1 - Method for analyzing and recommending content

Info

Publication number: WO2024096276A1
Application number: PCT/KR2023/012090
Authority: WO
Inventors: 이종근
Original assignee: 캡슐미디어 주식회사
Priority date: 2022-11-03
Filing date: 2023-08-16
Publication date: 2024-05-10
Also published as: KR20240065343A

Abstract

A method for recommending content, according to the present invention, may comprise the steps of: obtaining a first semantic descriptor related to a story from video data of sample content; obtaining a second semantic descriptor related to content from text data of the sample content; obtaining a third semantic descriptor related to mood from audio data of the sample content; generating a first semantic document corresponding to the sample content on the basis of the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor; calculating a similarity between the first semantic document and a plurality of pre-stored semantic documents corresponding to a plurality of content; and extracting at least one resultant content from among the plurality of content on the basis of the similarity.

Description

Content analysis and content recommendation methods

The present invention relates to content analysis and content recommendation methods, and more specifically, to a method of recommending similar content based on multi-modal latent semantic analysis techniques.

Recently, the content industry has been developing actively. The most fundamental elements of the content industry may be the content itself and the creators who produce it. A conventional content recommendation system is a user-centered recommendation system: it finds users similar to a given user and recommends content that those similar users have watched or liked. In essence, however, what the user wants is to find content similar to the content he or she likes, not users similar to himself or herself. Furthermore, the user wants to find creators who produce content similar to the content the user likes. Therefore, there is a need for a content recommendation system that focuses on content rather than on the user, analyzing content based on deep learning and recommending the most similar content and the creator who produced it.

One object of the present invention relates to a method for analyzing content and recommending content based on deep learning.

According to an embodiment of the present invention, a method for analyzing content and recommending content based on deep learning can be provided.

Figure 1 is an environmental diagram of a content recommendation system according to an embodiment.

Figure 2 is a block diagram of a content recommendation system according to an embodiment of the present invention.

Figure 3 is a diagram for explaining a method of obtaining a semantic descriptor according to an embodiment.

Figure 4 is a flowchart of a content recommendation method according to an embodiment of the present invention.

Figure 5 is a flowchart of a content extraction method in a content recommendation method according to an embodiment of the present invention.

A content recommendation method according to an embodiment is performed by at least one processor, the method comprising: obtaining a first semantic descriptor related to a story from video data of sample content; obtaining a second semantic descriptor related to content from text data of the sample content; obtaining a third semantic descriptor related to mood from audio data of the sample content; generating a first semantic document corresponding to the sample content based on the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor; calculating a similarity between the first semantic document and a plurality of pre-stored semantic documents corresponding to each of a plurality of contents; and extracting at least one result content from among the plurality of contents based on the similarity.

Here, calculating the similarity includes extracting a first feature vector of the first semantic document; extracting a plurality of feature vectors from each of the plurality of semantic documents; and calculating a plurality of similarities, which are the similarities of the first feature vector to each of the plurality of feature vectors, based on a cosine function.

Here, extracting the at least one result content may include: sorting the plurality of similarities in descending order; extracting the top N similarities (N is a natural number) among the plurality of sorted similarities; and extracting N contents corresponding to the N similarities among the plurality of contents.

Here, the step of extracting the at least one result content may include: extracting N similarities having a value equal to or higher than a reference value among the plurality of similarities; and extracting N contents corresponding to the N similarities among the plurality of contents.

Here, the step of generating the first semantic document may be a step based on a latent semantic analysis (LSA) technique.

Here, generating the first semantic document includes setting a first weight for the first semantic descriptor, a second weight for the second semantic descriptor, and a third weight for the third semantic descriptor; and generating the first semantic document using the first weight, the second weight, and the third weight.

Here, the step of obtaining the first semantic descriptor includes generating word data by extracting a word related to an object in each scene from a plurality of scenes included in the video data; generating color data related to color, brightness, and saturation of each scene from the plurality of scenes; and generating the first semantic descriptor based on the word data and the color data.

Here, the step of obtaining the second semantic descriptor may include removing duplicate text, correcting erroneous text, and converting foreign-language text in the text data.

Here, the step of obtaining the third semantic descriptor includes converting the audio data into a frequency band; extracting a plurality of parameters from the converted audio data, the plurality of parameters including at least one of harmony, melody, timbre, pitch, rhythm, beat, tempo, genre, and instrumentation; and generating the third semantic descriptor based on the plurality of parameters.

Here, in the step of calculating the plurality of similarities, the similarities can be calculated based on the equation below.

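$$\text{similarity} = \cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

($A_i$: i-th component of the first feature vector, $B_i$: i-th component of one of the feature vectors, $n$: dimension of the feature vector)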

Here, the method may further include: setting a first weight for the first semantic descriptor, a second weight for the second semantic descriptor, and a third weight for the third semantic descriptor; generating a first semantic supplementary document corresponding to the sample content using the first weight, the second weight, and the third weight; calculating a similarity between the first semantic supplementary document and N semantic documents corresponding to each of the N contents; and extracting, based on the similarity, the content with the greatest similarity among the N contents.

Here, a computer-readable non-transitory recording medium on which a program for executing the content recommendation method described above is recorded may be provided.

A content recommendation system according to an embodiment may include: a video descriptor extraction unit that extracts a first semantic descriptor related to a story from video data of sample content; a subtitle descriptor extraction unit that extracts a second semantic descriptor related to content from text data of the sample content; an audio descriptor extraction unit that extracts a third semantic descriptor related to mood from the audio data of the sample content; a latent semantic analysis unit that generates a first semantic document corresponding to the sample content based on the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor; and a similarity determination unit that calculates a similarity between the first semantic document and a plurality of pre-stored semantic documents corresponding to each of a plurality of contents, and extracts at least one result content among the plurality of contents based on the similarity.

The embodiments described in this specification are intended to clearly explain the idea of the present invention to those skilled in the art to which the present invention pertains. The present invention is not limited to the embodiments described in this specification, and its scope should be construed to include modifications or variations that do not depart from the spirit of the present invention.

The terms used in this specification are, as far as possible, general terms that are currently in wide use, selected in consideration of their function in the present invention, but they may vary depending on the intention of those skilled in the art, precedents, or the emergence of new technology in the technical field to which the present invention pertains. However, if a specific term is defined and used in an arbitrary sense, the meaning of the term will be described separately. Therefore, the terms used in this specification should be interpreted based on the actual meaning of the term and the overall content of this specification, not just the name of the term.

The drawings attached to this specification are intended to easily explain the present invention, and the shapes shown in the drawings may be exaggerated as necessary to aid understanding of the present invention, so the present invention is not limited by the drawings.

In this specification, if it is determined that a detailed description of a known configuration or function related to the present invention may obscure the gist of the present invention, the detailed description thereof will be omitted as necessary.

Referring to FIG. 1, the content recommendation system 1000 can communicate with the user terminal 2000 to exchange data with each other.

The content recommendation system 1000 may receive data related to sample content from the user terminal 2000. Sample content may be content requested by the user of the user terminal 2000. Specifically, the sample content may be a sample for a user to request that the content recommendation system 1000 recommend similar content. At this time, content similar to the sample content requested by the user from the content recommendation system 1000 may be content having a similar mood, similar story, similar audio, etc. as the sample content.

The user terminal 2000 may request, from the content recommendation system 1000, a service that recommends videos similar to the sample content. In addition, the user of the user terminal 2000 may request a service that recommends creators who produce content similar to the sample content.

The conventional content recommendation system was a system that recommended content that users with similar tendencies had viewed. In other words, the conventional content recommendation system provided a user-centered content recommendation service.

However, the content recommendation system 1000 of the present invention can provide a content-centered recommendation service rather than a user-centered recommendation service. Specifically, the content recommendation system 1000 of the present invention can analyze sample content received from a user, calculate the similarity between the sample content and other content, and recommend content to the user based on the similarity.

Furthermore, the content recommendation system 1000 of the present invention can provide a service for recommending creators who create content similar to sample content. In other words, the content recommendation system 1000 of the present invention can connect the user with a creator who produces content of a similar style to the sample content so that the user can continuously access content of the desired style.

Referring to FIG. 2, the content recommendation system 1000 according to an embodiment of the present invention may include a control unit 100, a video descriptor extraction unit 200, a subtitle descriptor extraction unit 300, an audio descriptor extraction unit 400, a latent semantic analysis unit 500, and a similarity determination unit 600. FIG. 2 illustrates that the content recommendation system 1000 includes six components, but the content recommendation system 1000 is not limited to this and may have more or fewer components. Additionally, components may be merged with one another.

The control unit 100 may oversee the operation of the content recommendation system 1000. The control unit 100 may be a control processor of the content recommendation system 1000. Specifically, the control unit 100 may send control commands to the video descriptor extraction unit 200, the subtitle descriptor extraction unit 300, the audio descriptor extraction unit 400, the latent semantic analysis unit 500, and the similarity determination unit 600 so that each unit executes its operation. Additionally, the control unit 100 may include a communication module and can transmit and receive data with the user terminal 2000.

Unless otherwise specified below, the operation of the content recommendation system 1000 may be interpreted as being performed under the control of the control unit 100.

The video descriptor extraction unit 200 may extract and/or obtain a first semantic descriptor related to the story of the sample content based on video data of the sample content acquired from the user terminal 2000. At this time, the video data may be data obtained by extracting only the image data of the content from the sample content by the control unit 100 or the video descriptor extractor 200. Also, at this time, the story may be a video flow related to the contents of the sample content. The video descriptor extractor 200 may be a deep learning-based processor of the content recommendation system 1000.

The video descriptor extractor 200 may extract a first semantic descriptor related to the story from the video data. The first semantic descriptor is a parameter for latent semantic analysis and may include words related to each scene from a plurality of scenes included in the video data.

The video descriptor extractor 200 may generate word data by extracting a word related to an object in each scene from a plurality of scenes included in the video data.

Referring to FIG. 3, the video descriptor extractor 200 may detect an object in one scene 3100 among a plurality of scenes. Specifically, the video descriptor extractor 200 is a deep learning-based processor and can extract the first object 3110, the second object 3120, and the third object 3130 from the scene 3100.

The video descriptor extractor 200 may perform labeling on the extracted objects. For example, the video descriptor extractor 200 may label the first object 3110 as a person, the second object 3120 as a person, and the third object 3130 as a TV monitor.

The video descriptor extractor 200 may generate word data based on the labeled words. For example, word data generated from the example of FIG. 3 may include words for people and TV monitors. Word data may be composed of a matrix, and each word may be a component of the matrix, but the data is not limited to this.
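As a minimal sketch of how labeled objects could be turned into word data, the following assumes the labels for each scene are already available as lists of strings. The object detector itself is not shown, and the function name `build_word_data` and the count-based representation are illustrative assumptions rather than the method specified here.

```python
from collections import Counter


def build_word_data(scene_labels):
    """Count how often each object label occurs across all scenes.

    scene_labels: list of scenes, each a list of label strings
    (hypothetical detector output; the detector is not shown here).
    """
    counts = Counter()
    for labels in scene_labels:
        counts.update(labels)
    return dict(counts)


# Labels from the FIG. 3 example: two people and one TV monitor.
word_data = build_word_data([["person", "person", "TV monitor"]])
print(word_data)  # {'person': 2, 'TV monitor': 1}
```

A term-to-count mapping like this can later be laid out as one row or column of a word matrix.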

The video descriptor extractor 200 may generate color data related to the color, brightness, and saturation of each scene from a plurality of scenes. The video descriptor extractor 200 may detect the color profile of one scene 3100 among a plurality of scenes. For example, the video descriptor extractor 200 may extract the color spectrum, brightness spectrum, and saturation spectrum of the scene 3100.

The video descriptor extractor 200 may generate color data based on the extracted spectral data. For example, color data generated from the example of FIG. 3 may include the words beige, brown, and turquoise. Color data may be composed of a matrix, and words related to color, brightness, and saturation may be components of the matrix, but the data is not limited to this.
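One simple way to picture the mapping from a scene's pixels to color words is sketched below. It is a simplification under stated assumptions: it uses the mean pixel color instead of a full spectrum, the palette and its RGB values are hypothetical, and the function names are introduced only for illustration.

```python
import colorsys

# Hypothetical palette; the RGB values are illustrative assumptions.
PALETTE = {
    "beige": (245, 245, 220),
    "brown": (150, 75, 0),
    "turquoise": (64, 224, 208),
}


def nearest_color_word(rgb):
    """Return the palette word closest to rgb in squared RGB distance."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(PALETTE[name], rgb)),
    )


def describe_pixels(pixels):
    """Summarize a scene as (color word, brightness, saturation).

    pixels: non-empty list of (r, g, b) tuples with values in 0..255.
    Brightness and saturation are the V and S channels of the mean color.
    """
    n = len(pixels)
    mean = tuple(sum(p[i] for p in pixels) / n for i in range(3))
    _, saturation, value = colorsys.rgb_to_hsv(*(c / 255 for c in mean))
    return nearest_color_word(mean), round(value, 2), round(saturation, 2)


print(describe_pixels([(240, 240, 215), (250, 250, 225)]))
```

A real system would more likely work with per-scene histograms, but the mean-color version keeps the idea of "spectrum in, color words out" visible.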

The video descriptor extractor 200 may generate a first semantic descriptor based on the generated word data and color data. Specifically, the first semantic descriptor may be data including both word data and color data. For example, the first semantic descriptor may be a matrix whose components include all words included in word data and color data, but is not limited to this.

The video descriptor extraction unit 200 may generate a first semantic descriptor by reflecting the weight. Specifically, the user may transmit information about factors to be considered as a priority to the content recommendation system 1000 through the user terminal 2000. For example, the user may request recommendation of content with particularly similar colors through the user terminal 2000.

At this time, the video descriptor extractor 200 may generate the first semantic descriptor by giving more weight to the color data than to the word data. For example, when the video descriptor extractor 200 generates a matrix that is the sum of word data and color data, the first weight for words included in the word data and the second weight for words included in the color data can be set. At this time, the video descriptor extractor 200 may generate the first semantic descriptor by making the second weight have a larger value than the first weight. However, it is not limited to this method, and the video descriptor extractor 200 may generate the first semantic descriptor by reflecting the weight in other ways, such as increasing the number of words.
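The weighting described above can be sketched as follows, assuming word data and color data are both term-to-count mappings. The function name `weighted_descriptor` and the specific weight values are hypothetical and chosen only to illustrate color being prioritized over object words.

```python
def weighted_descriptor(word_data, color_data, word_weight, color_weight):
    """Merge two term->count maps, scaling each source by its weight."""
    merged = {}
    for term, count in word_data.items():
        merged[term] = merged.get(term, 0.0) + word_weight * count
    for term, count in color_data.items():
        merged[term] = merged.get(term, 0.0) + color_weight * count
    return merged


descriptor = weighted_descriptor(
    {"person": 2, "TV monitor": 1},
    {"beige": 1, "brown": 1, "turquoise": 1},
    word_weight=1.0,
    color_weight=2.0,  # color prioritized, as in the user's request
)
print(descriptor)
```

Scaling the counts has a similar effect to the alternative mentioned above of repeating the prioritized words more often.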

The subtitle descriptor extraction unit 300 may extract and/or obtain a second semantic descriptor related to the content of the sample content based on text data of the sample content obtained from the user terminal 2000. At this time, the text data may be data obtained by extracting only the subtitles of the content from the sample content by the control unit 100 or the subtitle descriptor extraction unit 300. Also, at this time, the content may be commentary and/or summary included in the subtitles of the sample content. The subtitle descriptor extraction unit 300 may be a deep learning-based processor of the content recommendation system 1000.

The subtitle descriptor extraction unit 300 may extract a second semantic descriptor related to content from text data. The second semantic descriptor is a parameter for latent semantic analysis and may include words that are the result of processing such as noise removal and text normalization from words included in text data.

The subtitle descriptor extractor 300 may perform noise removal on text data. Specifically, the subtitle descriptor extractor 300 can remove noise from text data by removing duplicate text and correcting misspelled text (erroneous text) in text data.

Additionally, the subtitle descriptor extractor 300 may normalize text data. Specifically, the subtitle descriptor extraction unit 300 may normalize the text data by performing Korean conversion on the foreign language of the text data. At this time, the subtitle descriptor extractor 300 can perform foreign language conversion by using a translator.

Additionally, the subtitle descriptor extractor 300 can normalize text data by organizing similar words. For example, text data may contain three different words that all mean 'nature'. Since the three words carry the same meaning, the subtitle descriptor extraction unit 300 may organize them by converting all of them into the single representative word 'nature'. At this time, the subtitle descriptor extractor 300 may organize the similar words by replacing them with representative words stored in advance for those similar words.
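The duplicate removal and similar-word organizing steps can be sketched as below. The synonym table is a hypothetical English stand-in for the pre-stored representative words, and spelling correction and translation are deliberately left out because they would depend on external tools not described here.

```python
# Hypothetical synonym table; the pre-stored representative words
# are not listed in this document.
REPRESENTATIVE = {"outdoors": "nature", "wilderness": "nature"}


def clean_subtitles(lines):
    """Drop repeated subtitle lines, then map words to representatives."""
    seen = set()
    words = []
    for line in lines:
        key = line.strip().lower()
        if not key or key in seen:  # skip blanks and duplicate text
            continue
        seen.add(key)
        for word in key.split():
            words.append(REPRESENTATIVE.get(word, word))
    return words


print(clean_subtitles(["Into the wilderness", "into the wilderness", "Nature calls"]))
# ['into', 'the', 'nature', 'nature', 'calls']
```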

The audio descriptor extraction unit 400 may extract and/or obtain a third semantic descriptor related to the mood of the sample content based on the audio data of the sample content acquired from the user terminal 2000. At this time, the audio data may be data obtained by extracting only the voice data of the content from the sample content by the control unit 100 or the audio descriptor extractor 400. Also, at this time, the mood may be a musical element that determines the atmosphere or auditory concept of the sample content. The audio descriptor extractor 400 may be a deep learning-based processor of the content recommendation system 1000.

The audio descriptor extractor 400 may extract a third semantic descriptor related to mood from audio data. The third semantic descriptor is a parameter for latent semantic analysis and can be generated based on audio data converted to a frequency band. For example, the audio descriptor extractor 400 may be a processor based on Convolutional Neural Networks (CNN) that can extract a third semantic descriptor from audio data.

The audio descriptor extractor 400 may extract a plurality of parameters from audio data converted to a frequency band. At this time, the plurality of parameters may include at least one of harmony, melody, timbre, pitch, rhythm, beat, tempo, genre, and instrumentation. The audio descriptor extractor 400 may generate a third semantic descriptor based on the plurality of extracted parameters. For example, if the genre of the audio data is rock, the instrumentation is electric guitar, and the tempo is 100 to 120, the third semantic descriptor may be a matrix whose components include the words rock, electric guitar, and tempo 100 to 120, but is not limited to this.
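To make the rock / electric guitar / tempo example concrete, the sketch below turns a dictionary of already-extracted audio parameters into descriptor words. The audio model that produces the parameters is not shown, and the function name and the 20-BPM tempo buckets are assumptions made for illustration.

```python
def audio_descriptor(params):
    """Turn extracted audio parameters into descriptor words.

    params: dict such as {"genre": "rock", "tempo": 112}
    (hypothetical output of the audio model, which is not shown).
    """
    words = []
    for name, value in params.items():
        if name == "tempo":
            low = int(value // 20) * 20  # 20-BPM buckets (assumption)
            words.append(f"tempo {low} to {low + 20}")
        else:
            words.append(str(value))
    return words


print(audio_descriptor(
    {"genre": "rock", "instrumentation": "electric guitar", "tempo": 112}
))
# ['rock', 'electric guitar', 'tempo 100 to 120']
```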

The latent semantic analysis unit 500 may obtain a first semantic descriptor from the video descriptor extraction unit 200, a second semantic descriptor from the subtitle descriptor extraction unit 300, and a third semantic descriptor from the audio descriptor extraction unit 400. The latent semantic analysis unit 500 may generate a first semantic document corresponding to the sample content based on the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor.

The latent semantic analysis unit 500 is a processor capable of performing deep learning and can generate a first semantic document based on a latent semantic analysis (LSA) technique.

The latent semantic analysis unit 500 may perform singular value decomposition (SVD) on each of the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor, which are matrices of words. Specifically, the latent semantic analysis unit 500 may decompose each of the first to third semantic descriptor matrices into two orthogonal matrices (one with the row dimension of the matrix and one with the column dimension of the matrix) and one rectangular diagonal matrix. A total of nine matrices can thus be created by singular value decomposition.

The latent semantic analysis unit 500 may perform truncated singular value decomposition (TSVD) on the nine matrices generated by singular value decomposition. Specifically, the latent semantic analysis unit 500 can generate matrices consisting of the parameters that most affect the result by retaining only the top few components of each of the nine matrices and cutting off the rest.
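For reference, the standard form of the decomposition and its truncation can be written as follows. The symbols $X$, $U$, $\Sigma$, $V$, $m$, $n$, and $k$ are introduced here only for illustration and do not appear in the original text; $X$ stands for any one of the three semantic descriptor matrices.

```latex
% Full SVD of an m x n descriptor matrix X:
% U (m x m) and V (n x n) are orthogonal, Sigma (m x n) is rectangular diagonal.
X = U \Sigma V^{\top}, \qquad
U \in \mathbb{R}^{m \times m},\quad
\Sigma \in \mathbb{R}^{m \times n},\quad
V \in \mathbb{R}^{n \times n}

% Truncated SVD keeping only the k largest singular values:
% U_k holds the first k columns of U, V_k the first k columns of V,
% and Sigma_k is the top-left k x k block of Sigma.
X \approx X_k = U_k \Sigma_k V_k^{\top}, \qquad
U_k \in \mathbb{R}^{m \times k},\quad
\Sigma_k \in \mathbb{R}^{k \times k},\quad
V_k \in \mathbb{R}^{n \times k}
```

Applying the first line to each of the three descriptor matrices yields three factors per descriptor, which accounts for the nine matrices mentioned above. The choice of $k$ is a tuning parameter that this description does not fix.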

The latent semantic analysis unit 500 may input the matrices truncated by truncated singular value decomposition into a latent semantic analysis model to generate a first semantic document corresponding to the sample content. At this time, the first semantic document may be in the form of a matrix including a plurality of components. Also, at this time, the first semantic document may include feature values extracted for the video, subtitles, and audio of the sample content.

When generating the first semantic document, the latent semantic analysis unit 500 may reflect the weight for each semantic descriptor. Specifically, the latent semantic analysis unit 500 may set a first weight for the first semantic descriptor, a second weight for the second semantic descriptor, and a third weight for the third semantic descriptor. At this time, the first to third weights may be set based on the user's selection or user information obtained from the user terminal 2000. At this time, the sum of the first to third weights may be 1.

For example, if a user wants to find content with a similar color to the sample content, the first weight of the first semantic descriptor based on the color can be set higher than the second weight and the third weight. Also, for example, if the user has a job related to music, the third weight of the third semantic descriptor based on auditory elements may be set higher than the first weight and the second weight.

The latent semantic analysis unit 500 may generate a first semantic document using the first weight, second weight, and third weight. Specifically, the latent semantic analysis unit 500 may generate a first semantic document in which the weights are reflected using a matrix in which the first to third weights are components.
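One possible way to apply three weights that sum to 1 is sketched below: the raw preferences are normalized, then each modality's feature values are scaled by its weight and concatenated. The function names and the concatenation layout are assumptions for illustration; the text does not specify how the weight matrix is applied.

```python
def normalize_weights(w1, w2, w3):
    """Rescale three non-negative weights so that they sum to 1."""
    total = w1 + w2 + w3
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return w1 / total, w2 / total, w3 / total


def combine(video_vec, text_vec, audio_vec, weights):
    """Scale each modality's features by its weight and concatenate."""
    w1, w2, w3 = weights
    return (
        [w1 * x for x in video_vec]
        + [w2 * x for x in text_vec]
        + [w3 * x for x in audio_vec]
    )


weights = normalize_weights(2, 1, 1)  # video prioritized
print(weights)  # (0.5, 0.25, 0.25)
print(combine([1.0, 2.0], [4.0], [8.0], weights))  # [0.5, 1.0, 1.0, 2.0]
```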

The latent semantic analysis unit 500 may generate a semantic document not only for sample content acquired from the user terminal 2000 but also for content acquired from an external server connected to the content recommendation system 1000.

Specifically, the content recommendation system 1000 may obtain content stored in a library of an external server from an external server. The video descriptor extraction unit 200 may extract semantic descriptors related to the story from content acquired from an external server. The subtitle descriptor extraction unit 300 may extract semantic descriptors related to content from content acquired from an external server. The audio descriptor extractor 400 may extract a third semantic descriptor related to mood from content acquired from an external server.

Based on the respective semantic descriptors extracted by the video descriptor extraction unit 200, subtitle descriptor extraction unit 300, and audio descriptor extraction unit 400 from content acquired from an external server, the latent semantic analysis unit 500 may generate a semantic document corresponding to the content acquired from the external server. The latent semantic analysis unit 500 may store a plurality of semantic documents corresponding to each of the plurality of contents acquired from an external server in the storage of the content recommendation system 1000.

The similarity determination unit 600 may calculate the similarity between the first semantic document corresponding to the sample content and a plurality of pre-stored semantic documents corresponding to each of the plurality of contents stored in the storage. Specifically, the similarity determination unit 600 may calculate the similarity between content obtained from an external server and the sample content in order to find content similar to the sample content requested by the user. The similarity determination unit 600 may be a deep learning-based processor of the content recommendation system 1000.

The similarity determination unit 600 may extract a first feature vector, which is a feature vector of the first semantic document. Additionally, the similarity determination unit 600 may extract a plurality of feature vectors from each of a plurality of semantic documents stored in the storage. The similarity determination unit 600 may calculate similarity based on the extracted feature vector. Specifically, the similarity determination unit 600 can calculate the similarity based on [Equation 1] below.

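[Equation 1]

$$\text{similarity} = \cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

($A_i$: i-th component of the first feature vector, $B_i$: i-th component of one of the feature vectors, $n$: dimension of the feature vector)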

For example, the similarity determination unit 600 may calculate a similarity using a first feature vector extracted from the first semantic document corresponding to the sample content and a second feature vector extracted from a second semantic document corresponding to first content among the plurality of pre-stored contents. At this time, the first feature vector is the A vector of [Equation 1], the second feature vector is the B vector of [Equation 1], and the similarity between the first feature vector and the second feature vector can be calculated based on [Equation 1]. In this way, the similarity determination unit 600 can calculate the similarity between a plurality of feature vectors extracted from a plurality of contents and the first feature vector using [Equation 1].
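The cosine-based similarity between a first feature vector A and another feature vector B can be sketched in a few lines. The function name is hypothetical, and returning 0.0 for an all-zero vector is a convention assumed here, since the cosine is undefined in that case.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_a * norm_b)


first = [3.0, 4.0]            # feature vector of the sample content
stored = {"c1": [4.0, 3.0], "c2": [-4.0, 3.0]}
similarities = {cid: cosine_similarity(first, vec) for cid, vec in stored.items()}
print(similarities)
```

Here `c1` scores 24/25 and `c2` scores 0, since its vector is orthogonal to the sample's.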

The similarity determination unit 600 may extract at least one result content based on the calculated similarity. According to one embodiment, the similarity determination unit 600 may sort the calculated plurality of similarities in descending order. The similarity determination unit 600 may extract the top N similarities (N is a natural number) among the plurality of sorted similarities. That is, the similarity determination unit 600 can extract N similarities in order, starting from the one with the highest similarity value.

According to another embodiment, the similarity determination unit 600 may extract N similarities having a value greater than or equal to a reference value among a plurality of similarities. At this time, the reference value is a value that serves as a standard for determining similarity, and can be set by the user's selection or by the efficiency of the system (speed, capacity, etc.).

The similarity determination unit 600 may extract N pieces of content corresponding to the N extracted similarities. Specifically, the similarity determination unit 600 can identify the N feature vectors from which the N extracted similarities were calculated, identify the N semantic documents from which those N feature vectors were extracted, and extract the N pieces of content corresponding to those N semantic documents.
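Both selection strategies, top N after a descending sort and filtering by a reference value, can be sketched as follows. The sketch assumes the similarities are held in a mapping from a content identifier to its similarity value; the identifiers and function names are hypothetical.

```python
def top_n(similarities, n):
    """Return the n content IDs with the highest similarity, best first."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return [content_id for content_id, _ in ranked[:n]]


def above_reference(similarities, reference):
    """Return content IDs whose similarity is at least the reference value."""
    return [cid for cid, sim in similarities.items() if sim >= reference]


sims = {"a": 0.9, "b": 0.4, "c": 0.7}
print(top_n(sims, 2))              # ['a', 'c']
print(above_reference(sims, 0.7))  # ['a', 'c']
```

Keeping the content identifier next to each similarity avoids the separate lookups from similarity back to feature vector, semantic document, and content.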

According to one embodiment, the control unit 100 may check information on the N pieces of content extracted by the similarity determination unit 600. Specifically, content information may include information such as content production/distribution date, production/distribution company, production creator, number of views, and number of likes. The control unit 100 may transmit content information about the N pieces of content to the user terminal 2000. Based on the content information about the N pieces of content acquired through the user terminal 2000, the user can be recommended content similar to the sample content and the creator who produced it.

According to another embodiment, the similarity determination unit 600 may apply weights to the N extracted pieces of content to finally extract the result content, which is the most similar content. That is, whereas the N pieces of content are extracted by comparison against all content stored in the storage, the most similar content is extracted by comparison against only those N pieces of content.

The latent semantic analysis unit 500 may set first to third weights for the first to third semantic descriptors, respectively. The latent semantic analysis unit 500 may generate a first semantic supplementary document using the first to third weights. Unlike the first semantic document, the first semantic supplementary document may be a document that reflects the weight of the semantic descriptor.

The latent semantic analysis unit 500 may generate N semantic supplementary documents using the first to third weights for each of the N extracted contents. The similarity determination unit 600 may calculate the similarity between the first semantic supplementary document and the N semantic supplementary documents. The similarity determination unit 600 may extract the similarity with the largest value among the calculated similarities, identify the feature vector from which the greatest similarity was derived, identify the semantic supplementary document from which that feature vector was extracted, and, as a result, extract the result content corresponding to that semantic supplementary document.
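The two-stage selection described above, an unweighted shortlist of N items followed by a weighted re-ranking, can be sketched as below. The sketch assumes each semantic document is available as three per-modality feature vectors (video, text, audio); that layout, the function names, and the sample data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted(doc, weights):
    """Scale each modality vector of doc by its weight and concatenate."""
    return [w * x for w, vec in zip(weights, doc) for x in vec]


def recommend(sample, library, n, weights):
    """Shortlist N items without weights, then pick the best with weights."""
    unweighted = (1.0, 1.0, 1.0)
    base = weighted(sample, unweighted)
    shortlist = sorted(
        library,
        key=lambda cid: cosine(base, weighted(library[cid], unweighted)),
        reverse=True,
    )[:n]
    target = weighted(sample, weights)
    return max(
        shortlist,
        key=lambda cid: cosine(target, weighted(library[cid], weights)),
    )


sample = ([1.0, 0.0], [1.0], [1.0])
library = {
    "x": ([1.0, 0.0], [1.0], [0.0]),
    "y": ([0.0, 1.0], [1.0], [1.0]),
    "z": ([0.0, 1.0], [0.0], [0.0]),
}
print(recommend(sample, library, n=2, weights=(0.1, 0.1, 0.8)))  # y
print(recommend(sample, library, n=2, weights=(0.8, 0.1, 0.1)))  # x
```

With audio weighted most heavily the result is `y`, which shares the sample's audio feature; with video weighted most heavily it is `x`. Restricting the weighted comparison to the shortlist keeps the second pass cheap regardless of library size.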

The control unit 100 may transmit content information of the result content to the user terminal 1000. Based on the data acquired through the user terminal 1000, the user can be recommended the result content, which is selected with the weights applied as the content most similar to the sample content, together with the creator who produced it.

Referring to FIG. 4, the content recommendation method according to an embodiment of the present invention may include a step of acquiring sample content (S110), a step of acquiring a first semantic descriptor, a second semantic descriptor, and a third semantic descriptor (S120), a step of generating a first semantic document (S130), a step of calculating similarity (S140), and a step of extracting content (S150). The order of the steps may change depending on the situation, and each step may be omitted or performed repeatedly depending on the situation.

The step of acquiring sample content (S110) may be a step in which the content recommendation system 1000 communicates with the user terminal 2000 to obtain data related to the sample content from the user terminal 2000. At this time, the content recommendation system 1000 may also obtain from the user terminal 2000 information about the user's taste, user information, and the elements that the user considers important (which relate to the weights).

The step of acquiring the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor (S120) may be a step in which the content recommendation system 1000 acquires the respective semantic descriptors from the video data, text data, and audio data of the sample content. Step S120 may include extracting the video data, text data, and audio data from the sample content.

Specifically, video data may be extracted from sample content by the control unit 100 or the video descriptor extraction unit 200. Additionally, text data may be extracted from sample content by the control unit 100 or the subtitle descriptor extraction unit 300. Additionally, audio data may be extracted from sample content by the control unit 100 or the audio descriptor extractor 400. A detailed description of this is omitted as it overlaps with the content of FIG. 2.

Specifically, the video descriptor extractor 200 of the content recommendation system 1000 may extract a first semantic descriptor related to the story from video data of sample content. Additionally, the subtitle descriptor extraction unit 300 of the content recommendation system 1000 may extract a second semantic descriptor related to content from text data of sample content. Additionally, the audio descriptor extractor 400 of the content recommendation system 1000 may extract a third semantic descriptor related to mood from audio data of sample content. A detailed description of this is omitted as it overlaps with the content of FIG. 2.

The step of generating a first semantic document (S130) may be a step in which the content recommendation system 1000 generates a first semantic document corresponding to the sample content based on the acquired first semantic descriptor, second semantic descriptor, and third semantic descriptor. The latent semantic analysis unit 500 of the content recommendation system 1000 may generate the first semantic document from the first to third semantic descriptors based on a latent semantic analysis technique. Specifically, the latent semantic analysis unit 500 may generate the first semantic document by performing singular value decomposition and truncated singular value decomposition on the first to third semantic descriptors. A detailed description of this is omitted as it overlaps with the content of FIG. 2.
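As a non-limiting illustration, the truncated singular value decomposition used in latent semantic analysis can be sketched as follows. The sketch assumes that the descriptors have already been arranged into a matrix whose rows are descriptor terms and whose columns are pieces of content; that arrangement, the function names, and the use of power iteration with deflation (chosen here only to keep the example free of external libraries; a numerical library would normally be used) are all assumptions and are not taken from the embodiment.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def top_singular_triplet(matrix, iterations=200, seed=0):
    """Estimate the largest singular value and its vectors by power iteration on A^T A."""
    transposed = [list(col) for col in zip(*matrix)]
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in transposed]
    scale = norm(v)
    v = [x / scale for x in v]
    for _ in range(iterations):
        w = matvec(transposed, matvec(matrix, v))
        scale = norm(w)
        if scale == 0.0:  # the matrix is zero, nothing left to extract
            break
        v = [x / scale for x in w]
    av = matvec(matrix, v)
    sigma = norm(av)
    u = [x / sigma for x in av] if sigma > 0.0 else av
    return sigma, u, v


def truncated_svd(matrix, k):
    """Return the k leading (sigma, u, v) triplets, deflating the matrix after each one."""
    residual = [list(row) for row in matrix]
    triplets = []
    for _ in range(k):
        sigma, u, v = top_singular_triplet(residual)
        triplets.append((sigma, u, v))
        # Remove the rank-one component that was just found.
        residual = [[a - sigma * ui * vj for a, vj in zip(row, v)]
                    for row, ui in zip(residual, u)]
    return triplets


def document_vectors(matrix, k):
    """Map each column (one piece of content) to a k-dimensional latent feature vector."""
    triplets = truncated_svd(matrix, k)
    n_docs = len(matrix[0])
    return [[sigma * v[j] for sigma, _, v in triplets] for j in range(n_docs)]
```

Keeping only the k largest singular values is what makes the decomposition "truncated": each piece of content is then represented by k latent coordinates rather than by every descriptor term.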

The step of calculating the similarity (S140) may be a step in which the content recommendation system 1000 calculates, based on a cosine function, the similarity between the first semantic document generated in step S130 and other semantic documents. Specifically, the latent semantic analysis unit 500 of the content recommendation system 1000 may extract a first feature vector from the first semantic document. Additionally, the latent semantic analysis unit 500 may extract a plurality of feature vectors from a plurality of semantic documents corresponding to the respective contents stored in the storage of the content recommendation system 1000. The latent semantic analysis unit 500 may calculate, based on the cosine function, a plurality of similarities, each being the similarity between the first feature vector and one of the plurality of feature vectors.
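By way of example only, the cosine-based similarity between the first feature vector and each stored feature vector can be sketched as follows. The function names and the choice to return 0.0 for a zero vector are assumptions made for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def similarities_to_stored(first_vector, stored_vectors):
    """Compute one similarity per stored content, keyed by a content identifier."""
    return {content_id: cosine_similarity(first_vector, vector)
            for content_id, vector in stored_vectors.items()}
```

Because the cosine depends only on direction, two feature vectors that differ only in magnitude are treated as maximally similar.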

The content extraction step (S150) may be a step in which the content recommendation system 1000 extracts the top N content based on the similarity calculated in step S140 or extracts content with the greatest similarity. Thereafter, the content recommendation system 1000 may transmit content information about the extracted content to the user terminal 2000. Users can obtain information about the content and the creator who produced the content based on the content information.

Figure 5 is a flowchart of a content extraction method in a content recommendation method according to an embodiment of the present invention. Specifically, Figure 5 is a flowchart of a method for calculating similarity by reflecting weights according to one embodiment.

Referring to FIG. 5, the content extraction method according to an embodiment of the present invention may include a step of sorting the similarities in descending order (S210), a step of extracting the top N contents (S220), a step of setting weights (S230), a step of generating a semantic supplementary document (S240), and a step of calculating similarity (S250). The order of the steps may change depending on the situation, and each step may be omitted or performed repeatedly depending on the situation.

The step of sorting the similarities in descending order (S210) may be a step of sorting the plurality of similarities calculated by the similarity determination unit 600 in descending order.

The step of extracting the top N content (S220) may be a step of extracting the top N content from similarities sorted in descending order.
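Steps S210 and S220 can be illustrated, under stated assumptions, by the following sketch. It assumes the similarities are held in a dictionary keyed by a content identifier; the function names are hypothetical. The second function shows the variant recited in the claims, in which contents whose similarity is greater than or equal to a reference value are extracted.

```python
def top_n_contents(similarities, n):
    """Sort similarities in descending order (S210) and keep the top N contents (S220)."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [content_id for content_id, _ in ranked[:n]]


def contents_at_or_above(similarities, reference_value):
    """Keep every content whose similarity is greater than or equal to the reference value."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [content_id for content_id, score in ranked if score >= reference_value]
```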

The weight setting step (S230) may be a step in which the latent semantic analysis unit 500 sets a first weight for the first semantic descriptor, a second weight for the second semantic descriptor, and a third weight for the third semantic descriptor. Step S230 may be performed at any point, as long as it precedes step S240.

The step of generating a semantic supplementary document (S240) may be a step in which the latent semantic analysis unit 500 generates a first semantic supplementary document corresponding to the sample content by reflecting the first to third weights. In addition to the one for the sample content, the latent semantic analysis unit 500 may use the first to third weights to generate N semantic supplementary documents corresponding to the top N pieces of content.

The similarity calculation step (S250) may be a step in which the similarity determination unit 600 calculates, based on a cosine function, the similarity between the first semantic supplementary document and each of the N semantic supplementary documents. The control unit 100 may extract the semantic supplementary document with the highest similarity value among the calculated similarities. The control unit 100 may use the content corresponding to that semantic supplementary document as the result content and transmit content information about the result content to the user terminal 2000.
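Steps S230 to S250 can be sketched as follows, for illustration only. The embodiment does not state how the weights are reflected in a semantic supplementary document; this sketch assumes that each descriptor is a numeric vector, that each vector is scaled by its weight, and that the three scaled vectors are concatenated. The function names and that weighting scheme are assumptions.

```python
import math


def weighted_vector(descriptors, weights):
    """Scale the (story, content, mood) vectors by their weights and concatenate them."""
    vector = []
    for values, weight in zip(descriptors, weights):
        vector.extend(weight * value for value in values)
    return vector


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def pick_result_content(sample, candidates, weights):
    """Return the candidate most similar to the sample after weighting, with its score."""
    sample_vector = weighted_vector(sample, weights)
    scores = {content_id: cosine(sample_vector, weighted_vector(descriptors, weights))
              for content_id, descriptors in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

With this scheme, raising the first weight favours candidates whose story descriptor matches the sample, while raising the third weight favours candidates whose mood descriptor matches, so the same N candidates can yield a different result content depending on what the user considers important.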

The method according to the embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, etc., singly or in combination. Program instructions recorded on the medium may be specially designed and configured for the embodiment or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter, etc. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

As described above, although the embodiments have been described with reference to limited examples and drawings, those skilled in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from that described, and/or components of the described system, structure, device, circuit, etc. are coupled or combined in a form different from that described, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims described below.

Claims

1. A content recommendation method performed by at least one processor, the method comprising:

obtaining a first semantic descriptor related to a story from video data of sample content;

obtaining a second semantic descriptor related to content from text data of the sample content;

obtaining a third semantic descriptor related to mood from audio data of the sample content;

generating a first semantic document corresponding to the sample content based on the first semantic descriptor, the second semantic descriptor, and the third semantic descriptor;

calculating a similarity between the first semantic document and each of a plurality of pre-stored semantic documents corresponding to a plurality of contents; and

extracting at least one result content from among the plurality of contents based on the similarity.
2. The content recommendation method of claim 1, wherein the calculating of the similarity comprises:

extracting a first feature vector from the first semantic document;

extracting a plurality of feature vectors from the respective plurality of semantic documents; and

calculating, based on a cosine function, a plurality of similarities, each being the similarity between the first feature vector and a respective one of the plurality of feature vectors.
3. The content recommendation method of claim 2, wherein the extracting of the at least one result content comprises:

sorting the plurality of similarities in descending order;

extracting the top N similarities (N is a natural number) among the plurality of sorted similarities; and

extracting, from among the plurality of contents, N contents corresponding to the N similarities.
4. The content recommendation method of claim 2, wherein the extracting of the at least one result content comprises:

extracting, from among the plurality of similarities, N similarities each having a value greater than or equal to a reference value; and

extracting, from among the plurality of contents, N contents corresponding to the N similarities.
5. The content recommendation method of claim 1, wherein the generating of the first semantic document is based on a latent semantic analysis (LSA) technique.
6. The content recommendation method of claim 5, wherein the generating of the first semantic document comprises:

setting a first weight for the first semantic descriptor, a second weight for the second semantic descriptor, and a third weight for the third semantic descriptor; and

generating the first semantic document using the first weight, the second weight, and the third weight.
7. The content recommendation method of claim 1, wherein the obtaining of the first semantic descriptor comprises:

generating word data by extracting words related to objects in each scene from a plurality of scenes included in the video data;

generating color data related to color, brightness, and saturation of each scene from the plurality of scenes; and

generating the first semantic descriptor based on the word data and the color data.
8. The content recommendation method of claim 1, wherein the obtaining of the second semantic descriptor comprises performing, on the text data, processing of duplicate text, processing of erroneous text, and foreign-language conversion.
9. The content recommendation method of claim 1, wherein the obtaining of the third semantic descriptor comprises:

converting the audio data into a frequency band;

extracting a plurality of parameters from the converted audio data, the plurality of parameters including at least one of harmony, melody, timbre, pitch, rhythm, beat, tempo, genre, and instrumentation; and

generating the third semantic descriptor based on the plurality of parameters.
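10. The content recommendation method of claim 2, wherein the calculating of the plurality of similarities comprises calculating each similarity based on the formula below:

$$\text{similarity} = \cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

($A_i$: i-th component of the first feature vector, $B_i$: i-th component of one of the plurality of feature vectors, $n$: dimension of the feature vector)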
11. The content recommendation method of claim 3, further comprising:

setting a first weight for the first semantic descriptor, a second weight for the second semantic descriptor, and a third weight for the third semantic descriptor;

generating a first semantic supplementary document corresponding to the sample content using the first weight, the second weight, and the third weight;

calculating a similarity between the first semantic supplementary document and each of N semantic documents corresponding to the respective N contents; and

extracting, based on the similarity, the content with the greatest similarity among the N contents.
12. A non-transitory computer-readable recording medium on which a program for executing the content recommendation method of claim 1 is recorded.