WO2022085814A1 - Artificial-intelligence-recommended educational music mixing streaming service

Info

Publication number: WO2022085814A1
Application number: PCT/KR2020/014427
Authority: WO
Inventors: 양재훈; 박종길
Original assignee: 송계순
Priority date: 2020-10-21
Filing date: 2020-10-21
Publication date: 2022-04-28

Abstract

The present invention provides an artificial-intelligence-recommended educational music mixing streaming service. In detail, the present invention is a service that compares music video analysis information with user information by using an artificial intelligence algorithm, mixes an educational music video corresponding to a user's learning type or learning ability, and provides the most suitable music video for the user's educational improvement.

Description

Music video mixing streaming service for education recommended by artificial intelligence

The present invention relates to a music video mixing streaming service for education recommended by artificial intelligence, and more particularly, to a service for mixing and providing educational music videos suitable for a user's learning purpose and learning level in a medley song format.

In general, children often learn songs on smartphones or PCs when studying, but the conventional technology imposes a learning burden and is inefficient because the child has to listen to an entire simple, nursery-rhyme-level song. The present invention divides the content of a song into units such as verses and measures, and an artificial intelligence algorithm mixes those units into a medley according to the learning purpose and level. This maximizes learning efficiency by allowing children to listen intensively to only the core parts.

The purpose of the present invention is to analyze a child's learning level from the continuous viewing time, video clicks, and problem-solving level tests recorded while the child watches educational music videos through a streaming service on the web or in an app, and to have artificial intelligence extract only the essential parts and provide a music video that mixes multiple music videos into one.

The music video mixing streaming service for education recommended by artificial intelligence according to the present invention may be composed of a user terminal that generates and transmits user information and receives and outputs a recommended or selected music video, and a data server. The data server includes a pre-process that extracts audio information from a music video image to generate music video analysis information, determines the similarity between music video images based on the music video analysis information, and groups users with similar learning types based on the user information; and a main process that extracts and lists user-customized music video images based on the information analyzed in the pre-process, and generates the images and music that are inserted into the connection parts during image mixing.

The pre-process may include an image DB that stores music video image information and music video analysis information; a user DB that stores user basic information including the user's age, gender, regional information, occupation, and grade, user learning information, and user analysis information including information on the skipping, repeating, favorites, star marks, and number of plays of arbitrary images; an audio analysis unit that extracts audio information from the music video image information to generate text information and melody pattern information; a similarity measurement unit that analyzes the melody information using a deep learning algorithm to generate similarity information, and quantifies and stores the similarity between music videos based on the similarity information; and a user analysis unit that groups and stores users having a similar learning type based on the user basic information and the user analysis information.

The music video analysis information includes music video difficulty information. The initial difficulty information for any music video is set by the administrator, and the difficulty can then be adjusted according to the users' correct-answer rate on a test with a topic similar to that music video.

The audio analysis unit can extract lyric information from the audio information, convert the lyric information into text, and generate text information by applying natural language processing to it. It can also extract melody information from the audio information and analyze the melody information to store melody pattern information.

The main process may include a list generator that generates a music video list by randomly extracting music video images corresponding to the user information; a melody generator that, based on the music video analysis information, analyzes the patterns of the melody information at the end of the music video being output and at the beginning of the music video to be output next, and generates a section melody similar to the progression of the two pieces of melody information; a problem DB that stores problem information for tests that can determine the user's learning level; and an image setting unit that outputs various functions on the screen of the user terminal based on the image setting information received from the user terminal.

It may further include a quiz output unit for extracting a quiz question similar to the learning topic of the music video output to the user terminal from the question DB, and outputting the quiz question on the screen of the user terminal together with the section melody.

The list generating unit may list the randomly extracted music video images in ascending order of difficulty, from the images with lower difficulty to the images with higher difficulty.

The list generator additionally extracts candidate music video images. When the user bookmarks the music video currently being output, it determines that the user's preference is high and outputs next a music video with high similarity to that music video; when the user skips the music video currently being output, it determines that the preference is low and can output next a music video with low similarity to that music video.

The image setting unit may generate a caption based on the text information, highlight the keyword caption corresponding to the learning topic, and output it on the screen of the user terminal.

The image setting unit may include a bookmark function capable of moving to the start point of each music video on the play bar formed at the bottom of the music video image.

The method may be composed of an audio extraction step of extracting audio information from an arbitrary music video stored in the image DB; a text information extraction step of extracting lyrics from the audio information, converting them into text, and extracting keywords in noun form by applying natural language processing to the text; a melody pattern information generation step of generating melody information by extracting a melody from the audio information and generating melody pattern information by analyzing the melody information; a similarity information generation step of receiving the melody pattern information, applying a deep learning algorithm to extract melody features, and quantifying and comparing the melody features of each music video to generate similarity information; and a group information generation step of generating user learning type information by analyzing arbitrary user information and generating user group information by grouping users with the same or similar learning type information.

The method may also be composed of a music video extraction step of extracting arbitrary music videos based on the user information; a list generation step of receiving the difficulty information of the extracted music videos and generating music video list information sorted from low difficulty to high difficulty; a section melody generation step of receiving the melody pattern information of the music videos in the list information and analyzing the melody pattern information at the end of the music video output first and at the beginning of the music video output next to generate section melody information; a problem extraction step of extracting problem information corresponding to the music video output first from the problem information DB; and a section image generation step of creating a section image to be inserted into the connection part between the music videos based on the section melody information and the problem information.

The method may further include a behavior information generation step of generating user behavior information corresponding to the presented music video; a preference information generation step of analyzing the user behavior information to generate preference information for the output music video; and a list regeneration step of regenerating the music video list by selecting, when the preference information indicates a high preference, an arbitrary music video with high similarity information to that music video.

The method may further include a subtitle generation step of generating subtitles translated into Korean and foreign languages using artificial intelligence, and a bookmark creation step of receiving the music video image information included in the music video list output to the user terminal, creating a play bar on the user terminal screen, and creating bookmarks that display the start point of each music video on the play bar.

In the present invention, a single educational song is divided into several detailed content segments, and artificial intelligence provides the child, through a streaming service, with a single medley song assembled from those segments according to the child's learning attitude and level. The child can therefore use learning time efficiently and, by repeatedly learning the medley song, can memorize the required learning content easily and quickly.

FIG. 1 is a conceptual diagram of a music video system for artificial intelligence education according to an embodiment of the present invention.

FIG. 2 is a block diagram of a pre-process of a data server according to an embodiment of the present invention.

FIG. 3 is a block diagram of a main process of a data server according to an embodiment of the present invention.

FIG. 4 is a flowchart of a music video analysis method of a pre-process according to an embodiment of the present invention.

FIG. 5 is a flowchart of a music video similarity analysis method of a pre-process according to an embodiment of the present invention.

FIG. 6 is a flowchart of a user analysis method of a pre-process according to an embodiment of the present invention.

FIG. 7 is a flowchart of a method for generating a music video list in the main process according to an embodiment of the present invention.

FIG. 8 is a flowchart of a method for regenerating a music video list in the main process according to an embodiment of the present invention.

FIG. 9 is an example of a technology platform of a music video providing program for artificial intelligence education according to an embodiment of the present invention.

FIG. 10 is an example of a music video image according to an embodiment of the present invention.

Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

Advantages and features of the present invention, and a method for achieving the same, will become apparent with reference to the embodiments described below in detail in conjunction with the accompanying drawings.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully inform those with common knowledge in the art to which the present invention pertains of the scope of the invention, and the present invention is defined only by the scope of the claims.

In addition, in the description of the present invention, when it is determined that related known techniques may obscure the gist of the present invention, a detailed description thereof will be omitted.

FIG. 1 is a conceptual diagram schematically illustrating a network structure of a music video system for artificial intelligence education according to an embodiment of the present invention.

As shown in FIG. 1, the present invention consists of a user terminal 100 and a data server 200.

The user terminal 100 is a device in which a user installs and uses a music video mixing system for artificial intelligence education.

The music video mixing system for artificial intelligence education may include educational music video mixing, user-customized music video recommendation, and a test function that can identify the user's learning ability, all implemented through deep learning and big data analysis using artificial intelligence.

The user terminal 100 may generate basic user information directly input by the user, such as the user's age, gender, region, occupation, and educational background, and transmit it to the data server 200 .

In addition, the music video image and list information may be received from the data server 200 and output to the screen.

The user terminal 100 is a computing terminal device having one or more processors and one or more memories that can connect to the data server 200 and the member management server in a wired or wireless manner. Examples of such terminal devices include a smartphone, a tablet personal computer (PC), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook PC, a personal digital assistant (PDA), a portable multimedia player (PMP), an MP3 player, a mobile medical device, a camera, and a wearable device such as a head-mounted device (HMD), electronic clothing, an electronic bracelet, an electronic necklace, an electronic accessory, an electronic tattoo, or a smart watch.

This is merely an example, and the terminal in the present invention should be interpreted as a concept including all devices capable of transmitting data or signals that have been developed and commercialized or will be developed in the future in addition to the above-described examples.

The data server 200 analyzes and mixes educational music videos to generate a music video list. It may extract user-customized music videos corresponding to the user analysis information, which is derived from the user basic information received from the user terminal 100 and from music video viewing and test information, and transmit the resulting music video list to the user terminal 100.

The data server 200 may include a pre-process 210 that extracts audio information from a music video image to generate music video analysis information, determines the similarity between music videos based on the music video analysis information, and groups and stores users with a similar learning type based on user information; and a main process 220 that extracts and lists user-customized music video images based on the information analyzed in the pre-process 210 and generates the images and music inserted into the connection parts during image mixing.

FIG. 2 is a block diagram of the pre-process 210 of the data server 200 according to the present invention.

As shown in FIG. 2, the pre-process 210 may include an image DB 211, a user DB 212, an audio analysis unit 213, a similarity measurement unit 214, and a user analysis unit 215.

The image DB 211 may store music video image information and music video analysis information generated by analyzing the music video.

The music video image information is an image obtained by extracting a playback section having an arbitrary learning topic from any educational music video, and the playback section can be set by an administrator directly or by using deep learning artificial intelligence analysis.

For example, suppose the content of the text is "Return, return, return to the original position; the force by which a deformed object tries to return to its original state is the elastic force; a stretched, bouncing rubber band is an elastic object. Block, block, block the motion; the heavier the object and the bumpier the surface, the greater the frictional force; the frictional force acts in the direction opposite to the force." The music video image information can then be created by dividing the video into the section corresponding to the sentences about the elastic force and the section corresponding to the sentences about the frictional force.

The music video analysis information may include difficulty information set based on the subject and content of the music video.

The difficulty information is initially set by the administrator directly or by artificial intelligence, and the difficulty level may be determined by the user's correct rate for a test or quiz similar to the learning content of the music video.
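The text does not specify how the correct-answer rate changes the difficulty. The following is a minimal sketch under assumed values: the 1 to 10 scale, the target rate of 0.7, the step size, and the function name `adjust_difficulty` are all hypothetical and are not taken from the invention.

```python
def adjust_difficulty(current: float, correct: int, attempts: int,
                      target_rate: float = 0.7, step: float = 1.0,
                      lo: float = 1.0, hi: float = 10.0) -> float:
    """Nudge a difficulty score using the observed correct-answer rate.

    Hypothetical rule: a rate below the target suggests the video is harder
    than rated, so the score rises; a rate above the target lowers it.
    """
    if attempts == 0:
        return current  # no test data yet: keep the administrator's value
    rate = correct / attempts
    adjusted = current + step * (target_rate - rate)
    return max(lo, min(hi, adjusted))  # clamp to the assumed scale


# Administrator set difficulty 5; only 2 of 10 related answers were correct.
raised = adjust_difficulty(5.0, correct=2, attempts=10)
```

With these assumed parameters a low correct-answer rate raises the stored difficulty and a high rate lowers it, while a music video with no test data keeps its initial value.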

The user DB 212 may store basic user information received from the user terminal 100 and user analysis information obtained by analyzing the user learning type and test result information using the music video mixing system for artificial intelligence education.

The user basic information may include information that can be directly input by the user, such as the user's age, gender, region, occupation, and educational background.

The user learning type may be generated by analyzing behavioral information generated while using the music video mixing system for artificial intelligence education, such as the number of music video views, the number of repetitions, whether to skip, and registration of favorites.

The audio analyzer 213 may extract audio information from the music video image information to generate text information and melody pattern information.

The text information may include keywords obtained by extracting lyrics from the audio information using speech-to-text, converting the lyrics into text, and applying natural language processing to the converted text.

The natural language processing may correspond to commonly used methods such as tokenization, part-of-speech tagging, stopword processing, and text extraction.

In detail, word information corresponding to a noun may be extracted from the converted text, and the word information may be determined as a word corresponding to a learning topic.

For example, if the text content is "The Bronze Age began around 2000 to 1500 B.C., but bronze ware was not widely used because there were not enough materials for making it", the word information "2000", "1500", "Bronze Age", "bronze ware", and "materials" can be extracted from the text, and "Bronze Age" and "bronze ware", which correspond to the learning subject of history, can be set as keywords.
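The keyword step can be sketched as follows, using a paraphrase of the Bronze Age sentence. This is only an illustration: the stopword list, the `TOPIC_TERMS` vocabulary, and the function names are assumptions, and part-of-speech tagging is replaced here by a simple lookup against a per-subject term list.

```python
import re

# Hypothetical stopword list and topic vocabulary, for illustration only.
STOPWORDS = {"the", "a", "an", "to", "but", "for", "was", "were", "not",
             "because", "it", "there", "around", "began", "making",
             "widely", "used", "enough"}
TOPIC_TERMS = {"history": ["bronze age", "bronze ware"]}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word and number tokens."""
    return re.findall(r"[a-z]+|\d+", text.lower())


def remove_stopwords(tokens: list[str]) -> list[str]:
    """Drop tokens that carry no learning content."""
    return [t for t in tokens if t not in STOPWORDS]


def extract_keywords(text: str, subject: str) -> list[str]:
    """Return the topic terms of `subject` that occur in the text."""
    normalized = " ".join(tokenize(text))
    return [term for term in TOPIC_TERMS.get(subject, []) if term in normalized]


lyrics = ("The Bronze Age began around 2000 to 1500 B.C., but bronze ware "
          "was not widely used because the materials for making it were scarce.")
content_words = remove_stopwords(tokenize(lyrics))
keywords = extract_keywords(lyrics, "history")
```

Here `content_words` keeps candidates such as "2000", "1500", and "materials", while `keywords` keeps only the terms tied to the assumed history vocabulary.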

The melody pattern information of the music video can be generated by extracting, from the audio information, the melody information that constitutes a melody, such as pitch and time signature, and analyzing the frequency-domain data obtained by applying a Fourier transform to the melody information.

The similarity measurement unit 214 can receive the melody pattern information from the audio analysis unit 213, extract pattern features through deep learning with a specific algorithm, and compare the pattern features of each music video to generate similarity information.

More specifically, melody features are extracted from the melody pattern information through deep learning, each music video is vectorized based on its melody features so that the features are converted into numerical values, and similarity information can be generated by comparing the numerical values of the music videos.
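As an illustration of the frequency-domain analysis, the sketch below applies a naive discrete Fourier transform to a synthetic tone and reads off the strongest frequency. The sample rate, the 440 Hz test tone, and the function names are assumptions chosen for the example; a real system would more likely use an FFT library on actual audio.

```python
import cmath
import math


def dft_magnitudes(samples: list[float]) -> list[float]:
    """Naive O(n^2) discrete Fourier transform; returns one magnitude per bin."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2)  # bins below the Nyquist frequency
    ]


def dominant_frequency(samples: list[float], sample_rate: int) -> float:
    """Frequency in Hz of the strongest non-DC bin."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(samples)


# Synthetic input: 0.1 s of a 440 Hz sine sampled at 2000 Hz (assumed values).
RATE = 2000
tone = [math.sin(2 * math.pi * 440 * t / RATE) for t in range(200)]
peak_hz = dominant_frequency(tone, RATE)
```

With 200 samples at 2000 Hz each bin spans 10 Hz, so the 440 Hz tone falls exactly on one bin and is recovered as the dominant pitch.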
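The comparison of vectorized melody features can be sketched as below. The text does not name a similarity measure, so cosine similarity is used here as an assumption, and the feature vectors are made-up values standing in for the output of a deep learning model such as an auto-encoder.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_table(features: dict[str, list[float]]) -> dict[tuple[str, str], float]:
    """Quantify the similarity of every unordered pair of music videos."""
    ids = sorted(features)
    return {(p, q): cosine_similarity(features[p], features[q])
            for p in ids for q in ids if p < q}


# Made-up melody feature vectors, for illustration only.
features = {
    "MV-02": [0.9, 0.1, 0.3],
    "MV-07": [0.8, 0.2, 0.4],
    "MV-44": [0.1, 0.9, 0.2],
}
table = similarity_table(features)
```

Storing one value per pair mirrors the idea of quantifying and saving the similarity between music videos so that it can be looked up later when the list is built.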

Although the deep learning in the present invention has been described as using an auto-encoder, it is not limited thereto, and a commonly used deep learning algorithm can be used to generate a melody feature.

The user analysis unit 215 may group and store users having similar user information by analyzing the basic user information and user analysis information received from the user DB 212 .

The user analysis unit 215 may group users using only the basic user information in the case of a first-time user of the music video mixing system for artificial intelligence education.

FIG. 3 is a block diagram of the main process 220 of the data server 200 according to the present invention.

As shown in FIG. 3, the main process 220 may include a list generator 221, a melody generator 222, a problem DB 223, an image setting unit 224, and a quiz output unit 225.

The list generating unit 221 may receive data of a group to which the user belongs from the user analysis unit 215 and randomly extract it from among music videos with high preference to generate a user-customized music video list.

In detail, the playback time of the music video list is set based on the average viewing time and the number of music videos viewed by users belonging to the same group as the user, music videos suitable for the user are selected based on the group's favorite and skip information, and the difficulty of the music videos can be set based on the user's test result information.

The playback time of the music video list of the present invention is preferably set to 2 minutes and 30 seconds to 3 minutes 30 seconds, but is not limited thereto, and the music video playback time may be changed by an administrator or user setting.

In addition, a list may be generated by arranging the randomly extracted music videos in ascending order of difficulty, from the music videos with low difficulty to those with high difficulty.

The list generator 221 may further extract preliminary music videos having a difficulty similar to that of each extracted music video, and may set the music video to be output next according to the user behavior information on the music video output to the user terminal 100.

The user behavior information may include information about a user's favorite music video, skipping, and repeating play.

In more detail, if the user registers the music video currently being viewed as a favorite, it is determined that the preference is high, and the music video played next is set to one with high similarity to that music video; if the user skips the music video currently being viewed, it is determined that the preference is low, and the music video played next may be set to one with low similarity to that music video.

For example, as shown in Table 1 below, the list generator 221 extracts and lists MV-01, MV-02, MV-07, MV-22, and MV-12 in column A, and creates columns B and C by extracting music videos with a difficulty similar to that of the music video at each position in column A. If the user registers MV-02 as a favorite, the music video with the highest similarity to MV-02 among MV-07, MV-44, and MV-67 in the next position is selected and played; if the user skips MV-02, the music video with the lowest similarity to MV-02 among MV-07, MV-44, and MV-67 is selected and played.

Order	A	B	C
1	MV-01
2	MV-02	MV-10	MV-17
3	MV-07	MV-44	MV-67
4	MV-22	MV-23	MV-04
5	MV-12	MV-46	MV-42
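The selection among the candidates at the next position can be sketched as follows. The similarity values are invented for the example, and the function name `pick_next` and the action labels are hypothetical; only the favorite and skip behavior comes from the description.

```python
def pick_next(current_id: str, candidates: list[str],
              similarity: dict[frozenset, float], action: str) -> str:
    """Choose the next music video from the candidates for the next position.

    favorite -> most similar to the current video,
    skip     -> least similar to the current video,
    other    -> the default entry (first candidate, i.e. column A).
    """
    def sim(other: str) -> float:
        return similarity[frozenset((current_id, other))]

    if action == "favorite":
        return max(candidates, key=sim)
    if action == "skip":
        return min(candidates, key=sim)
    return candidates[0]


# Invented similarity scores between MV-02 and the position-3 candidates.
similarity = {
    frozenset(("MV-02", "MV-07")): 0.55,
    frozenset(("MV-02", "MV-44")): 0.20,
    frozenset(("MV-02", "MV-67")): 0.90,
}
row3 = ["MV-07", "MV-44", "MV-67"]
after_favorite = pick_next("MV-02", row3, similarity, "favorite")
after_skip = pick_next("MV-02", row3, similarity, "skip")
```

Under these invented scores a favorite on MV-02 leads to MV-67 and a skip leads to MV-44, while playback without either action keeps the column A entry MV-07.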

The melody generation unit 222 receives the melody pattern information from the audio analysis unit 213, analyzes the patterns of the melody information at the end of the music video being output and at the beginning of the music video to be output next, and can create a section melody similar to the progression of the two pieces of melody information.

More specifically, the melody information is information stored by digitizing components of a melody, such as a pitch and a time signature, and a section melody can be generated through sequence probability analysis using a Markov model for the melody information.

The melody generating unit 222 may further add chords or pitches based on the section melody.

Although the melody generator 222 in the present invention has been described as generating a section melody using a Markov model, it is not limited thereto and can generate a section melody using any commonly used melody analysis algorithm.

The question DB 223 may store question information provided to tests and quizzes that can determine the user's learning level.

The image setting unit may output various additional functions to the screen of the user terminal 100 based on the image setting information and music video analysis information received from the user terminal 100 .

In more detail, the image setting unit may additionally output subtitles, bookmarks, illustrations, and character images corresponding to the music video output on the screen of the user terminal 100 to the user terminal.

The image setting unit may receive text information corresponding to an arbitrary music video, generate subtitles, highlight subtitles corresponding to keywords, and output them on the screen of the user terminal 100 .

In addition, by receiving translation information for the text information, produced by the administrator or by artificial intelligence, foreign-language subtitles can be generated and output on the screen of the user terminal 100 in the same way as the subtitles.

Alternatively, the caption may be highlighted in accordance with audio information of an arbitrary music video output to the user terminal 100 .

For example, if the text information is "The Bronze Age began around 2000 to 1500 B.C., but bronze ware was not widely used because there were not enough materials for making it", the text information is converted into subtitles, and the subtitles are divided to fit the screen size of the user terminal 100 and output on the screen. The background color of "bronze ware", which is a keyword, can be highlighted in yellow or a fluorescent color, or the background color of the subtitle can be highlighted in yellow or a fluorescent color in sync with the audio information output to the user terminal 100.

The image setting unit 224 may output a bookmark indicating the starting point of each music video included in the music video list on the screen of the user terminal 100 .

In more detail, a play bar is formed at the lower end of the user terminal 100 to check the progress of the currently output music video list, and a bookmark indicating the start point of the music video can be displayed on the play bar.

For example, when a first playback section with a duration of 1 minute, a second playback section with a duration of 2 minutes, a third playback section with a duration of 3 minutes, and a fourth playback section with a duration of 4 minutes are mixed in that order, the play bar displays a total duration of 10 minutes, and bookmarks using cut lines and arrows can be displayed at the 0, 1, 3, and 6 minute points, which are the start points of each music video.

FIG. 4 is a flowchart of a music video analysis method of the pre-process 210 according to an embodiment of the present invention.

As shown in FIG. 4, the music video analysis method may include an audio extraction step (S110), a text information extraction step (S120), and a melody pattern information generation step (S130).
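The bookmark positions in this example are running totals of the preceding section durations, which can be sketched as follows. The function name is hypothetical and durations are given in minutes to match the example.

```python
from itertools import accumulate


def bookmark_positions(durations_min: list[int]) -> list[int]:
    """Start offset of each section: the sum of all durations before it."""
    if not durations_min:
        return []
    return [0] + list(accumulate(durations_min))[:-1]


durations = [1, 2, 3, 4]
bookmarks = bookmark_positions(durations)  # [0, 1, 3, 6]
total = sum(durations)                     # 10
```

The last running total is dropped from the bookmark list because it marks the end of the mixed video rather than the start of a section; it equals the total duration shown on the play bar.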

In the audio extraction step (S110), the audio analysis unit 213 may extract audio information from any music video image information stored in the image DB 211.

In the text information extraction step (S120), lyrics may be extracted from the audio information using automatic speech recognition (Speech to Text) (S121) and converted into text (S122).

Then, natural language processing such as tokenization, part-of-speech tagging, stopword processing, and text extraction is applied to the converted text (S123) to extract noun words, and a noun word that corresponds to the learning topic can be set as a keyword (S124).

The melody pattern information generation step (S130) extracts melody information from the audio information (S131), analyzes frequency-domain data generated through a Fourier transform, and can generate the melody pattern information of the music video (S133).

As shown in FIG. 5 , the music video similarity analysis method may include a similarity information generation step ( S140 ) of generating similarity information between music videos based on the melody pattern information.

The similarity information generation step (S140) can generate similarity information by receiving melody pattern information from the audio analysis unit 213, extracting pattern features through deep learning with a specific algorithm (S142), and comparing the pattern features of each music video.

As shown in FIG. 6, the user analysis method may include a group information generation step consisting of user information analysis (S151), learning type information generation (S152), user group information generation (S153), and user group information storage steps.

The user information analysis ( S151 ) may generate user learning type information ( S152 ) by receiving basic user information and user analysis information from the user DB 212 , and analyzing user result information for a provided test.

The user group information generation ( S153 ) may generate one group by extracting users having similar learning type information, and may generate user group information by analyzing user information of a user corresponding to the group.

As shown in FIG. 7, the music video list generation method may include a music video extraction step (S201), a list generation step (S202), a section melody generation step (S203), a problem extraction step (S204), a section video generation step (S205), and a music video output step.

In the music video extraction step (S201), a music video suitable for a user may be extracted from the image DB by analyzing the user information and user group information.

The list generation step (S202) may generate a music video list arranged in ascending order of difficulty, from music videos with low difficulty to music videos with high difficulty, based on the difficulty information of the extracted music videos.

In the section melody generation step (S203), the melody pattern information of the music videos included in the music video list is analyzed to create a section melody that naturally connects the melody at the end of the music video being output with the melody at the beginning of the music video output next.

The question extraction step (S204) may extract a quiz question similar to the learning topic of the currently output music video.

In the section video generation step (S205), a section video is created by combining the section melody with the corresponding quiz question, and the section video is inserted at the connection part between two music videos that are played back sequentially so that they can be mixed into one music video.

As shown in FIG. 8 , the method for regenerating a music video list of the main process may include a behavior information generation step (S211), a preference information generation step (S212), and a list regenerating step (S213).

The behavior information generating step (S211) may generate user behavior information by detecting a user behavior with respect to the currently output music video.

The user action may include registering a favorite, skipping, and repeating the music video being output.

The preference information generation step (S212) may set a high preference when the user registers the music video currently being output as a favorite or repeats it, and a low preference when the user skips the music video currently being output.

In the list regenerating step (S213), based on the preference information, a music video with high similarity to the corresponding music video can be selected and output if the preference is high, and a music video with low similarity to the corresponding music video can be selected and output if the preference is low.

In the caption generating step, the image setting unit 224 may receive text information and translation information to generate local-language and foreign-language subtitles, and output the subtitles corresponding to the user screen setting information on the screen of the user terminal 100.

In the highlight display step, the caption corresponding to the keyword information extracted from the text information may be highlighted and output on the screen of the user terminal 100 .
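One way to picture the highlight display step is a function that marks keyword occurrences inside a caption string. This is only a sketch: the bracket markers stand in for whatever visual emphasis the terminal actually applies, matching is assumed to be case-insensitive, and `highlight_caption` is a hypothetical name.

```python
import re
from typing import Iterable


def highlight_caption(caption: str, keywords: Iterable[str],
                      open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap every keyword occurrence in the caption with highlight markers."""
    # Drop empty keywords and try longer keywords first so that overlapping
    # keywords do not split each other.
    words = sorted((k for k in keywords if k), key=len, reverse=True)
    if not words:
        return caption
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    return pattern.sub(lambda m: f"{open_mark}{m.group(0)}{close_mark}", caption)
```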

In the bookmark output step, a play bar indicating the progress rate of the music video list may be output at the bottom of the screen of the user terminal 100 and a bookmark indicating the start point of each music video may be output on the play bar.
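The bookmark positions on the play bar follow directly from the durations of the items in the list. The sketch below assumes the durations are known in seconds and already include any section videos; `bookmark_positions` and `progress_rate` are hypothetical helper names, and positions are expressed as fractions of the total length of the play bar.

```python
from itertools import accumulate
from typing import List, Sequence


def bookmark_positions(durations: Sequence[float]) -> List[float]:
    """Return the start point of each video as a fraction of the total length."""
    total = sum(durations)
    if total <= 0:
        return []
    # Start offsets are the running totals shifted by one video.
    starts = [0.0] + list(accumulate(durations))[:-1]
    return [start / total for start in starts]


def progress_rate(elapsed: float, durations: Sequence[float]) -> float:
    """Return the overall playback progress, clamped to the range 0..1."""
    total = sum(durations)
    if total <= 0:
        return 0.0
    return min(max(elapsed / total, 0.0), 1.0)
```

For durations of 60, 30, and 30 seconds, the bookmarks fall at 0%, 50%, and 75% of the play bar.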

The functional operations and the embodiments of the subject matter described herein, including the structures disclosed herein and their structural equivalents, may be implemented in digital electronic circuitry, in computer software, firmware, or hardware, or in a combination of one or more thereof.

Embodiments of the subject matter described herein may be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible program medium for execution by, or for controlling the operation of, a data processing device. A tangible program medium may be a radio wave signal or a computer-readable medium. A radio wave signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to an appropriate receiver device for execution by a computer. A computer-readable medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, a combination of materials that affects a machine-readable radio wave signal, or a combination of one or more thereof.

A computer program (also known as a program, software, software application, script, or code) may be written in any form of programming language, including compiled or interpreted languages and declarative or procedural languages, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.

A computer program does not necessarily correspond to a file in a file system. A program may be stored in a single file dedicated to the program in question, in multiple coordinated files, or in a portion of a file that holds other programs or data (e.g., one or more scripts stored within a markup language document).

The computer program may be deployed to be executed on a single computer or multiple computers located at one site or distributed over a plurality of sites and interconnected by a communication network.

Additionally, the logic flows and structural block diagrams described in this patent document, which describe specific methods and/or corresponding acts supported by the disclosed steps, and corresponding functions supported by the disclosed structural means, may also be used to build corresponding software structures and algorithms and their equivalents.

The processes and logic flows described herein may be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.

Processors suitable for the execution of computer programs include, for example, both general and special purpose microprocessors and any one or more processors of any kind of digital computer. Typically, the processor will receive instructions and data from read-only memory, random access memory, or both.

The key elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. In addition, a computer will generally include, or be operably coupled to receive data from, transmit data to, or both, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks. However, the computer need not have such devices.

The present description sets forth the best mode of the invention, and provides examples to illustrate the invention, and to enable any person skilled in the art to make or use the invention. This written specification does not limit the present invention to the specific terms presented.

Accordingly, although the present invention has been described in detail with reference to the above-described examples, those skilled in the art can make alterations, changes, and modifications to the examples without departing from the scope of the present invention. In short, to achieve the intended effect of the present invention, it is not necessary to separately include all the functional blocks shown in the drawings or to follow every order shown in the drawings; note that such variations may still fall within the scope of the present invention.

Although the present invention has been described with reference to the embodiment(s) shown in the drawings, this is only exemplary, and it will be understood by those skilled in the art that various modifications may be made thereto and that all or part of the above-described embodiment(s) may be selectively combined. Accordingly, the true technical protection scope of the present invention should be determined by the technical spirit of the appended claims.

[Explanation of reference numerals]

100: user terminal 200: data server

210: pre-process 211: image DB

212: user DB 213: audio analysis unit

214: similarity measurement unit 215: user analysis unit

220: main process 221: list generator

222: melody generator 223: problem DB

224: video setting unit 225: quiz output unit

Claims

A music video mixing system for artificial intelligence education, comprising:

a user terminal that generates and transmits user information, and receives and outputs a recommended or selected music video; and

a data server comprising: a pre-process that extracts audio information of a music video to generate music video analysis information, determines the similarity between music videos based on the music video analysis information, and groups users with similar learning types based on the user information; and a main process that extracts and lists user-customized music videos based on the information analyzed in the pre-process, and generates the video and music to be inserted into the connection part when the videos are mixed.
The system of claim 1,

wherein the pre-process comprises:

an image DB for storing music video image information and music video analysis information;

a user DB for storing user basic information including user age, gender, regional information, occupation, and grade, user learning information, and user analysis information including skip, repeat playback, favorites, star ratings, and count information for arbitrary videos;

an audio analysis unit for extracting audio information from the music video image information to generate text information and melody pattern information;

a similarity measurement unit for generating similarity information by analyzing the melody information using a deep learning algorithm, and quantifying and storing the similarity between each music video based on the similarity information; and

a user analysis unit for grouping and storing users having a similar learning type based on the user basic information and the user analysis information.
The system of claim 2,

wherein the music video analysis information includes music video difficulty information, the initial difficulty information for an arbitrary music video is set by an administrator, and the difficulty is adjusted according to the users' correct answer rate on a test having a topic similar to that of the arbitrary music video.
The system of claim 2,

wherein the audio analysis unit:

extracts lyric information from the audio information, converts the lyric information into text, and performs natural language processing on the text to generate the text information; and

extracts melody information from the audio information, and analyzes the melody information to generate and store the melody pattern information.
The system of claim 1,

wherein the main process comprises:

a list generator for generating a music video list by randomly extracting a music video image corresponding to the user information;

a melody generator for generating a section melody similar to the progress of the two pieces of melody information by analyzing the patterns of the melody information at the end of the output music video and the melody information at the beginning of the music video to be output next based on the analysis information of the music video;

a problem DB for storing problem information to be output in a test for determining the user's learning level; and

a video setting unit for outputting various functions on the screen of the user terminal based on the image setting information received from the user terminal.
The system of claim 5,

further comprising a quiz output unit that extracts, from the problem DB, a quiz question similar to the learning topic of the music video output to the user terminal, and outputs the quiz question on the screen of the user terminal together with the section melody.
The system of claim 5,

wherein the list generator lists the randomly extracted music videos by sorting them from low difficulty to high difficulty.
The system of claim 5,

wherein the list generator additionally extracts candidate music videos, and:

when the user registers the currently output music video as a favorite, determines that the preference is high and outputs next a music video with high similarity to the output music video; and

when the user skips the currently output music video, determines that the preference is low and outputs next a music video with low similarity to the output music video.
The system of claim 5,

wherein the video setting unit generates subtitles based on the text information, highlights the keyword subtitles corresponding to the learning topic, and outputs them on the screen of the user terminal.
The system of claim 5,

wherein the video setting unit includes a bookmark function for moving to the start point of each music video on a play bar formed at the bottom of the music video.
A music video mixing method for artificial intelligence education, comprising:

an audio extraction step of extracting audio information for an arbitrary music video stored in the image DB;

a text information extraction step of extracting lyrics from the audio information, converting them into text, and performing natural language processing on the text to extract keywords in noun form;

a melody pattern information generation step of extracting a melody from the audio information to generate melody information, and generating melody pattern information through analysis of the melody information;

a similarity information generation step of receiving the melody pattern information, extracting melody features through analysis using a deep learning algorithm, quantifying the melody features, and comparing the melody features of each music video to generate similarity information; and

a group information generation step of generating user learning type information by analyzing arbitrary user information, and grouping users having the same or similar learning type information to generate user group information.
A music video mixing method for artificial intelligence education, comprising:

a music video extraction step of extracting arbitrary music videos based on the user information;

a list generation step of receiving difficulty information of the extracted music videos and arranging the music videos in order from low difficulty to high difficulty to generate music video list information;

a section melody generation step of receiving the melody pattern information of the music videos corresponding to the list information, and generating section melody information by analyzing the melody pattern information at the end of the music video output first and the melody pattern information at the beginning of the music video output next;

a problem extraction step of extracting problem information corresponding to the music video output first from the problem information DB; and

a section video generation step of generating a section video to be inserted into the connection part between the music videos based on the section melody information and the problem information.
The method of claim 12, further comprising:

a behavior information generation step of generating user behavior information corresponding to the output music video;

a preference information generation step of analyzing the user behavior information to generate preference information for the output music video; and

a list regeneration step of regenerating the music video list by selecting, based on the preference information and the similarity information, an arbitrary music video similar to a music video with a high preference.
The method of claim 12, further comprising:

a subtitle generation step of generating subtitles translated into a native language and a foreign language by using artificial intelligence; and

a bookmark generating step of receiving the music video image information included in the music video list output to the user terminal, creating a play bar on the screen of the user terminal, and creating a bookmark for displaying the start point of each music video on the play bar.