WO2019232959A1 - Artificial intelligence-based composing method and system, computer device and storage medium - Google Patents

Artificial intelligence-based composing method and system, computer device and storage medium

Info

Publication number
WO2019232959A1
WO2019232959A1 (PCT/CN2018/104715; CN 2018104715 W)
Authority
WO
WIPO (PCT)
Prior art keywords
layer
lstm
music
time series
network
Prior art date
Application number
PCT/CN2018/104715
Other languages
French (fr)
Chinese (zh)
Inventor
王义文
刘奡智
王健宗
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019232959A1

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 — Details of electrophonic musical instruments
    • G10H1/0008 — Associated control or indicating means
    • G10H1/0025 — Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 — Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/101 — Music composition or musical creation; tools or processes therefor
    • G10H2210/111 — Automatic composing, i.e. using predefined musical rules

Definitions

  • the present application relates to the field of information technology, and in particular, to a composition method, system, computer equipment, and storage medium based on artificial intelligence.
  • the existing automatic composition methods based on artificial intelligence technology mainly include automatic composition based on heuristic search and automatic composition based on genetic algorithm.
  • however, the existing automatic composition based on heuristic search is suitable only for short pieces, and its search efficiency decreases exponentially as the length of the piece increases; the method is therefore impractical for long pieces of music.
  • the automatic composition method based on genetic algorithms inherits some typical disadvantages of genetic algorithms, such as heavy dependence on the initial population and the difficulty of selecting genetic operators precisely.
  • a composition method based on artificial intelligence including:
  • the note information includes a playback start time, a playback duration, and a pitch value of each note
  • a composition system based on artificial intelligence including:
  • An acquisition module, configured to acquire note information, where the note information includes a playback start time, a playback duration, and a pitch value of each note;
  • the creation module is set to create a music time series and perform autoregressive integrated moving average ARIMA model prediction to obtain the time series prediction results;
  • a construction module configured to construct a music prediction model according to the note information and a time series prediction result
  • the composition module is configured to determine a topology structure of the music prediction model, combine the music prediction model and the determined topology structure according to the note information and the time series prediction result, obtain the predicted music, and implement automatic composition.
  • a computer device includes a memory and a processor.
  • the memory stores computer-readable instructions.
  • when the computer-readable instructions are executed by the processor, the processor is caused to perform the following steps:
  • the note information includes a playback start time, a playback duration, and a pitch value of each note
  • a storage medium storing computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to perform the following steps:
  • the note information includes a playback start time, a playback duration, and a pitch value of each note
  • the aforementioned artificial intelligence-based composition method, system, computer device, and storage medium acquire note information, where the note information includes the playback start time, playback duration, and pitch value of each note; create a music time series and perform autoregressive integrated moving average ARIMA model prediction to obtain a time series prediction result; and establish a single-layer long short-term memory network LSTM.
  • the single-layer LSTM contains LSTM blocks, and each LSTM block contains a gate that determines whether the input is important, whether it can be remembered, and whether it can be output.
  • the initial state is randomly initialized; each step corresponds to one input value, and the input value is the word vector corresponding to each word.
  • the Dropout mechanism is used to randomly delete some units of the hidden layer while keeping the input and output neurons unchanged, and multiple basic LSTM units are aggregated into one using a multilayer RNN unit.
  • the state of the RNN unit is controlled through the gate structure, and information is deleted from or added to it; each time an RNN unit is added, a basic LSTM unit is constructed anew.
  • an initial value is set for the initial state of each layer of the network, the weight parameters are initialized, and the back-propagation mechanism adjusts the weight parameters in the two-layer network in reverse, layer by layer; iteration improves the training accuracy of the network, optimizes the loss function, and constructs a music prediction model.
  • the topological structure of the music prediction model is determined, and the music prediction model and the determined topological structure are combined with the note information and the time series prediction result to obtain the predicted music and realize automatic composition.
  • the production process is greatly simplified: users can participate in creation and generate different songs through different inputs, with no need for midway intervention; the approach is flexible and rich.
  • FIG. 1 is a flowchart of a composition method based on artificial intelligence in an embodiment
  • FIG. 2 is a flowchart of constructing a music prediction model based on note information and time series prediction results in an embodiment
  • FIG. 3 is a flowchart of establishing a single-layer long-term and short-term memory network LSTM using a Dropout mechanism to establish a hidden layer in an embodiment
  • FIG. 4 is a flowchart of copying a single-layer network into a double-layer network according to a single-layer LSTM according to an embodiment
  • FIG. 5 is a structural block diagram of a composition system based on artificial intelligence in an embodiment;
  • FIG. 6 is a structural block diagram of a building module in an embodiment
  • FIG. 7 is a structural block diagram of an establishing unit in an embodiment
  • FIG. 8 is a structural block diagram of a duplication unit in an embodiment.
  • an artificial intelligence-based composition method includes the following steps:
  • Step S101 Acquire note information, where the note information includes a play start time, a play duration, and a pitch value of each note;
  • LSTM: Long Short-Term Memory
  • Intelligent composition expresses musical characteristics in two dimensions, pitch and duration, and requires only that the user input notes into the system. The machine can then learn from the characteristics of the arrangement of the notes and write a more complete and richer tune that simulates a musician's performance, so that the resulting piece combines agility and firmness.
  • Step S102 Create a music time series, perform autoregressive integrated moving average ARIMA model prediction, and obtain a time series prediction result;
  • the stationarity of the music time series is generally checked with a statistical test; some operations, such as taking logarithms or differencing, can also be used to make the time series stationary. ARIMA model prediction is then performed to obtain a stable time series prediction result.
  • ARIMA is short for the Autoregressive Integrated Moving Average model.
  • the data sequence formed by the predicted object over time is regarded as a random sequence, and a certain mathematical model is used to approximate this sequence. Once this model is identified, future values can be predicted from past and present values of the time series.
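As an illustration of this idea, the following sketch (an assumption for exposition, not code from the patent) differences a short made-up pitch sequence to make it stationary, fits a simple AR(1) model to the differences by least squares, and integrates the forecast back to the original scale — the "AR", "I", and trivial "MA(0)" parts of an ARIMA(1,1,0) model:

```python
# Minimal ARIMA(1,1,0) sketch: difference ("I"), fit AR(1) by least squares,
# forecast the next difference, then integrate back to the original scale.
# The pitch sequence below is a made-up example, not data from the patent.

def arima_110_forecast(series):
    # First-order differencing makes a trending series more stationary.
    diffs = [series[i + 1] - series[i] for i in range(len(series) - 1)]
    # Least-squares AR(1) fit on the differences: d_t ~ phi * d_{t-1}.
    num = sum(diffs[i] * diffs[i - 1] for i in range(1, len(diffs)))
    den = sum(d * d for d in diffs[:-1])
    phi = num / den if den else 0.0
    next_diff = phi * diffs[-1]          # predicted next difference
    return series[-1] + next_diff        # integrate back

pitches = [60, 62, 64, 65, 67, 69, 71]   # an ascending scale in MIDI pitch
print(round(arima_110_forecast(pitches), 2))
```

A real system would use a full ARIMA(p, d, q) fit with model selection; the sketch only shows how past and present values of the series determine the predicted future value.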
  • Step S103 construct a music prediction model according to the note information and the time series prediction result
  • the advantage of the dual-channel long short-term memory network (LSTM) used in this technical solution is that it can "smartly" forget long-term memories that are no longer needed, learn new and useful information, and store the useful information back into long-term memory. For example, if previously played movements have no effect on the present, the movement information is removed, and the melody of the latest movement is recorded in "long-term memory".
  • in the LSTM, the input is x_t, the output is y_t, the weight matrix from the input layer to the hidden layer is W, and the weight matrix from the hidden layer to the hidden layer is U; the state is summarized and carried forward as an aid for the next input.
  • Step S104 Determine the topological structure of the music prediction model, combine the music prediction model and the determined topological structure according to the note information and the time series prediction result, obtain the predicted music, and implement automatic composition.
  • the topology is a predetermined neural network structure.
  • a recurrent neural network (RNN) is used as an example.
  • the topology includes two independent RNNs and a connection unit; the two independent RNNs are named LF_RNN and HF_RNN and are used for low-frequency and high-frequency multi-frequency feature combination, respectively.
  • Frame-level audio features of the music corresponding to the music file are extracted; frame-level audio features carrying frequency-band information are then obtained from a combination model of these frame-level audio features and pre-built music frequency-band characteristics; and predicted music is obtained from the band-aware frame-level audio features and the pre-built music prediction model, so that automatic composition can be achieved.
  • performing the autoregressive integrated moving average ARIMA model prediction includes: examining the audio, the sound frames, and their laws of variation according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series; smoothing any non-stationary time series; selecting an ARIMA model according to the identification rules of the time series and estimating its parameters; performing hypothesis testing to diagnose whether the residual sequence is white noise; and performing prediction analysis with the model that passed the test.
  • the ARIMA model prediction procedure is: test the audio, the sound frames, and their variation according to the autocorrelation function and partial autocorrelation function to identify the stationarity of the time series; smooth the non-stationary time series; select an ARIMA model according to the identification rules of the time series and estimate its parameters; perform a hypothesis test to diagnose whether the residual sequence is white noise; and perform predictive analysis with the tested model.
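The identification step can be illustrated with a hand-rolled sample autocorrelation function (a generic sketch on stand-in data, not the patent's implementation): a slowly decaying ACF at successive lags suggests a non-stationary series, while a quickly decaying or alternating one suggests stationarity.

```python
# Sample autocorrelation function (ACF), used in the identification step to
# judge stationarity. The two series below are illustrative stand-ins for a
# music time series, not data from the patent.

def acf(series, lag):
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var

trend = list(range(20))                    # non-stationary: strong trend
noise = [(-1) ** t for t in range(20)]     # stationary: alternating values

print(round(acf(trend, 1), 2), round(acf(noise, 1), 2))
```

The trending series shows a lag-1 autocorrelation near 1, while the stationary alternating series shows a strongly negative one; the partial autocorrelation function would be computed analogously on the residual structure.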
  • constructing a music prediction model according to the note information and the time series prediction result includes:
  • Step S201 Establish a single-layer long-term and short-term memory network LSTM, and use the Dropout mechanism to establish a hidden layer;
  • LSTM (long short-term memory) is a kind of time-recurrent neural network containing LSTM blocks. An LSTM block may be described as an intelligent network unit because it can memorize values for an indefinite length of time; a gate in the block determines whether an input is important enough to be remembered and whether it can be output.
  • the initial state is initialized randomly, with one input per step; the actual input is the word vector corresponding to each word.
  • the hidden state has a configurable number of hidden nodes; the state of the last step can be taken as the output, or the states of all steps can be weighted or directly averaged as the output, adjusted flexibly according to the specific task.
  • Step S202 copy the single-layer network into a double-layer network according to the single-layer LSTM;
  • LSTM has the form of a chain of repeating neural network modules. Unlike the single layer of a standard recurrent module, each repeating module has four neural network layers that interact in a special way.
  • the horizontal line represents the state of the unit and has a linear interaction, which can ensure that the information is passed down.
  • Information is selectively passed through a structure consisting of a sigmoid neural network layer and a pointwise multiplication operation.
  • the sigmoid layer maps each variable to a value between 0 and 1, describing how much of each component should be let through.
  • a sigmoid layer is a neural network layer that uses the sigmoid function as its activation function, applying it to each input; the hyperbolic tangent function (tanh) can also be chosen.
  • LSTM has three such gates: the "forget gate" sigmoid layer, which decides which information to discard from the cell state; the "input gate", a sigmoid layer together with a tanh layer that adds candidate values to the state, which decides what new information to store in the cell state; and the "output gate", a sigmoid layer and tanh function, which decides which parts of the cell state to output.
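The three gates can be written out directly. The following numpy sketch is a generic LSTM cell with made-up sizes and random weights, an illustration of the standard equations rather than anything specified in the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: input weights (4h x d), U: recurrent weights (4h x h)."""
    z = W @ x + U @ h_prev + b
    hsz = h_prev.shape[0]
    f = sigmoid(z[0*hsz:1*hsz])          # forget gate: what to discard from c
    i = sigmoid(z[1*hsz:2*hsz])          # input gate: what new info to store
    g = np.tanh(z[2*hsz:3*hsz])          # candidate values to add to the state
    o = sigmoid(z[3*hsz:4*hsz])          # output gate: what parts of c to emit
    c = f * c_prev + i * g               # new cell state
    h = o * np.tanh(c)                   # new hidden state / output
    return h, c

rng = np.random.default_rng(0)
d, hsz = 3, 4                            # illustrative sizes
W = rng.normal(scale=0.1, size=(4*hsz, d))
U = rng.normal(scale=0.1, size=(4*hsz, hsz))
b = np.zeros(4*hsz)
h, c = lstm_step(rng.normal(size=d), np.zeros(hsz), np.zeros(hsz), W, U, b)
print(h.shape, c.shape)
```

Because the output is `o * tanh(c)`, each component of `h` is bounded in magnitude by 1, which keeps the recurrence numerically stable across long sequences.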
  • the dual-channel long short-term memory network, that is, two LSTM networks used together, can enhance the "learning" ability of the neural network, so that the intelligent system can fully "learn" the characteristics of the music and the style of musical performance.
  • multiple layers of networks are required.
  • the LSTM controls the state of the cell through a structure called a gate, and deletes information from or adds information to it. It is worth noting that each time a unit is added, a base LSTM unit needs to be constructed anew: the function declares its internal variables once per call, and without a fresh call those variables would be reused, resulting in an error. Set an initial value for the initial state of each layer; the zero_state method can also generate the initial value, but then the intermediate states cannot be displayed, so choose according to the actual application. Multi-layer LSTMs outperform single layers, a trend in line with the general law of multi-layer neural network modeling as the data dimension grows and nonlinear factors increase.
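The layer-stacking step can be sketched as follows (a generic numpy illustration, not the patent's code): each layer gets a freshly constructed base cell with its own parameter arrays — reusing one cell's variables across layers is exactly the error the passage warns about — and the hidden state of one layer becomes the input of the next. The initial state of every layer is set explicitly rather than with a zero_state-style helper, so intermediate states remain inspectable.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_cell(rng, d_in, hsz):
    """Freshly construct a base LSTM cell: new parameter arrays on each call,
    mirroring the need to re-create the base unit for every added layer."""
    W = rng.normal(scale=0.1, size=(4*hsz, d_in))
    U = rng.normal(scale=0.1, size=(4*hsz, hsz))
    b = np.zeros(4*hsz)
    def step(x, h, c):
        z = W @ x + U @ h + b
        f, i = sigmoid(z[:hsz]), sigmoid(z[hsz:2*hsz])
        g, o = np.tanh(z[2*hsz:3*hsz]), sigmoid(z[3*hsz:])
        c = f * c + i * g
        return o * np.tanh(c), c
    return step

rng = np.random.default_rng(1)
d, hsz, layers = 3, 4, 2
cells = [make_cell(rng, d if k == 0 else hsz, hsz) for k in range(layers)]
# Explicit initial values for every layer's state, so they can be inspected.
states = [(np.zeros(hsz), np.zeros(hsz)) for _ in range(layers)]

x = rng.normal(size=d)
for k, cell in enumerate(cells):          # layer k's output feeds layer k+1
    h, c = cell(x, *states[k])
    states[k] = (h, c)
    x = h
print(x.shape)                            # output of the top layer
```

Each call to `make_cell` closes over its own `W`, `U`, `b`, so the two layers cannot accidentally share variables.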
  • Step S203: the weight parameters are initialized, and the back-propagation mechanism is used to adjust the weight parameters in the double-layer network in reverse, layer by layer; iteration improves the training accuracy of the network, optimizing the loss function and constructing a music prediction model.
  • the network model uses the gradient descent method to minimize the loss function, adjusting the weight parameters in the network layer by layer, and iterative training improves the network's accuracy.
  • when the parameters are updated, identical initial parameters would receive identical gradients in back-propagation and thus be updated identically. From the perspective of the cost function, the initial parameters also cannot be too large, so the initial weights should be very close to 0 but not equal to 0.
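The symmetry problem can be seen in a tiny numerical example (illustrative, not from the patent): when two hidden weights start at the same value, gradient descent updates them identically at every step, so they never become distinct; small random values near 0 break the tie.

```python
import random

# Two weights, one input, squared loss on a single sample (x=1, y=1).
# Prediction: y_hat = w1*x + w2*x. Both weights receive the same gradient,
# so a symmetric start stays symmetric forever.
def train(w1, w2, lr=0.1, steps=50):
    for _ in range(steps):
        y_hat = w1 + w2                    # x = 1
        grad = 2 * (y_hat - 1)             # d(loss)/d(w1) == d(loss)/d(w2)
        w1, w2 = w1 - lr * grad, w2 - lr * grad
    return w1, w2

a1, a2 = train(0.3, 0.3)                   # symmetric start: stays symmetric
random.seed(0)
b1, b2 = train(random.gauss(0, 0.01), random.gauss(0, 0.01))  # near 0, not 0
print(a1 == a2, b1 == b2)
```

In both runs the loss is driven to zero, but only the randomly initialized pair ends with two distinct weights, which is why near-zero random initialization is preferred over exact zeros or equal values.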
  • establishing a single-layer long-term and short-term memory network LSTM, and using the Dropout mechanism to establish a hidden layer include:
  • Step S301 Establish a single-layer long-term and short-term memory network LSTM.
  • the single-layer LSTM contains LSTM blocks.
  • the LSTM block contains a gate. The gate determines whether the input is important, whether it can be remembered, and whether it can be output.
  • the training process of a neural network is to forward the input through the network and then back-propagate the error.
  • the Dropout mechanism targets this process: some units of the hidden layer are randomly deleted and the process above is carried out.
  • this may include randomly deleting some hidden neurons in the network while keeping the input and output neurons unchanged; propagating the input forward through the modified network and then back-propagating the error through the same modified network; and repeating the operation for another batch of training samples, so as to achieve the effect of a voting mechanism.
  • for example, training five different neural networks on the same data may yield several different results; a voting mechanism can then determine the majority answer, which relatively improves network accuracy and robustness.
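A minimal sketch of the mechanism (generic, not the patent's implementation): a random binary mask zeroes a fraction of the hidden units on each pass while the input and output keep their sizes; re-drawing the mask for every batch effectively trains many "thinned" subnetworks whose combination acts like a vote.

```python
import random

def dropout(hidden, p_drop, rng):
    """Randomly delete (zero) hidden units with probability p_drop; surviving
    units are scaled by 1/(1-p_drop) (inverted dropout) so the expected
    activation is unchanged. Input and output dimensions stay the same."""
    keep = 1.0 - p_drop
    return [h * (1.0 / keep) if rng.random() < keep else 0.0 for h in hidden]

rng = random.Random(42)
hidden = [0.3, -1.2, 0.8, 0.5, -0.7, 1.1]
for batch in range(2):                     # a new mask for every training batch
    thinned = dropout(hidden, p_drop=0.5, rng=rng)
    print(thinned)
```

At test time no units are dropped (`p_drop=0`), and thanks to the inverted scaling no extra correction of the weights is needed.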
  • robustness means that a control system maintains certain performance characteristics under some (structural or size) parameter perturbation. According to the definition of the performance concerned, it can be divided into stability robustness and performance robustness.
  • the fixed controller designed with the robustness of the closed-loop system as the target is called a robust controller.
  • Step S302 the initial state is initialized randomly, each step corresponds to an input value, and the input value is a word vector corresponding to each word;
  • the initial state is initialized randomly, one input per step, and the actual input is the word vector corresponding to each word.
  • the hidden state has a configurable number of hidden nodes; the state of the last step can be taken as the output, or the states of all steps can be weighted or directly averaged as the output, adjusted flexibly according to the specific task.
  • step S303 a Dropout mechanism is used to randomly delete some units in the hidden layer, keeping the input and output neurons unchanged.
  • the Dropout mechanism targets this process: some units of the hidden layer are randomly deleted and the process above is carried out.
  • this may include randomly deleting some hidden neurons in the network while keeping the input and output neurons unchanged; propagating the input forward through the modified network and then back-propagating the error through the same modified network; and repeating the operation for another batch of training samples, so as to achieve the effect of a voting mechanism.
  • replicating a single-layer network into a dual-layer network according to a single-layer LSTM includes:
  • step S401 a plurality of basic LSTM units are aggregated into one by using a multilayer RNN unit;
  • step S402 the state of the RNN unit is controlled through the gate structure, and information is deleted or added to it, and a basic LSTM unit is called again each time an RNN unit is added;
  • step S403 an initial value is set for an initial state of each layer of the network.
  • an artificial intelligence-based composition system includes: an acquisition module, configured to acquire note information, where the note information includes a playback start time, a playback duration, and a pitch value of each note; a creation module, configured to create a music time series and perform autoregressive integrated moving average ARIMA model prediction to obtain a time series prediction result; a construction module, configured to construct a music prediction model based on the note information and the time series prediction result; and a composition module, configured to determine the topology of the music prediction model and combine the music prediction model and the determined topology with the note information and time series prediction result to obtain the predicted music and realize automatic composition.
  • the creation module further includes: a recognition unit, configured to test the audio, the sound frames, and their variation according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series; a smoothing processing unit, configured to smooth the non-stationary time series; a parameter estimation unit, configured to select an ARIMA model according to the identification rules of the time series and estimate its parameters; a diagnostic unit, configured to perform hypothesis testing and diagnose whether the residual sequence is white noise; and a prediction unit, configured to perform prediction analysis using the model that passed the test.
  • a recognition unit, configured to test the audio, the sound frames, and their variation according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series
  • a smoothing processing unit, configured to smooth the non-stationary time series
  • a parameter estimation unit, configured to select an ARIMA model according to the identification rules of the time series and estimate its parameters
  • a diagnostic unit, configured to perform hypothesis testing and diagnose whether the residual sequence is white noise
  • the construction module further includes: an establishing unit, configured to establish a single-layer long short-term memory network LSTM and use the Dropout mechanism to establish a hidden layer; a replication unit, configured to replicate the single-layer network into a double-layer network based on the single-layer LSTM; and an optimization unit, configured to initialize the weight parameters and use the back-propagation mechanism to adjust the weight parameters in the double-layer network in reverse, layer by layer, improving the training accuracy of the network and optimizing the loss function to build a music prediction model.
  • an establishing unit, configured to establish a single-layer long short-term memory network LSTM and use the Dropout mechanism to establish a hidden layer
  • a replication unit, configured to replicate the single-layer network into a double-layer network based on the single-layer LSTM
  • an optimization unit, configured to initialize the weight parameters and use the back-propagation mechanism to adjust the weight parameters in the double-layer network in reverse, layer by layer, to improve the training accuracy of the network and optimize the loss function
  • the establishing unit further includes: an establishing subunit, configured to establish a single-layer long short-term memory network LSTM, where the single-layer LSTM includes an LSTM block and the LSTM block includes a gate that determines whether the input is important, whether it can be remembered, and whether it can be output; a corresponding subunit, configured to randomly initialize the initial state, where each step corresponds to one input value and the input value is the word vector corresponding to each word; and a deletion subunit, configured to use the Dropout mechanism to randomly delete parts of the hidden layer while keeping the input and output neurons unchanged.
  • the duplication unit further includes: a summary subunit, configured to use a multilayer RNN unit to aggregate multiple basic LSTM units into one; a deletion subunit, configured to control the state of the RNN unit through the gate structure and delete information from or add information to it, constructing a basic LSTM unit anew each time an RNN unit is added; and a setting subunit, configured to set initial values for the initial state of each layer of the network.
  • a computer device in one embodiment, includes a memory and a processor.
  • the memory stores computer-readable instructions.
  • when the computer-readable instructions are executed by the processor, the processor is caused to perform the steps of the artificial intelligence-based composition method in the foregoing embodiments.
  • a storage medium storing computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to perform the steps of the artificial intelligence-based composition method in the foregoing embodiments.
  • the storage medium may be a non-volatile storage medium.
  • the program may be stored in a computer-readable storage medium.
  • the storage medium may include: Read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), magnetic disks or optical disks, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

Disclosed in the present application are an artificial intelligence-based composing method and system, a computer device and a storage medium. The method comprises: acquiring musical note information, the musical note information comprising a playback start time, a playback duration and a pitch value of each musical note; creating a music time series and performing autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result; constructing a music prediction model according to the musical note information and the time series prediction result; and determining a topological structure of the music prediction model and combining the music prediction model and the determined topological structure with the musical note information and the time series prediction result to obtain predicted music, thereby implementing automatic composition. The method greatly simplifies the production process; a user can participate in creation and generate different compositions through different inputs, without the need for midway intervention. The invention is flexible and rich.

Description

Artificial Intelligence-based Composition Method, System, Computer Device, and Storage Medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 4, 2018, with application number 201810561621.5 and entitled "Artificial Intelligence-based Composition Method, System, Computer Equipment, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of information technology, and in particular, to a composition method, system, computer equipment, and storage medium based on artificial intelligence.
Background
With the application of computer technology to music processing, computer music came into being. As a new-generation art form, computer music has gradually penetrated many aspects of music, including creation, instrument performance, education, and entertainment; the use of artificial intelligence for automatic composition is a relatively new research direction in computer music and has been highly valued by researchers in related fields in recent years.
The existing automatic composition methods based on artificial intelligence mainly include automatic composition based on heuristic search and automatic composition based on genetic algorithms. However, heuristic-search-based composition is suitable only for short pieces, and its search efficiency decreases exponentially as the length of the piece increases, making the method impractical for long pieces of music. The genetic-algorithm-based method inherits some typical disadvantages of genetic algorithms, such as heavy dependence on the initial population and the difficulty of selecting genetic operators precisely.
Summary of the Invention
In view of the disadvantages of current automatic composition methods, it is necessary to provide an artificial intelligence-based composition method, system, computer device, and storage medium.
An artificial intelligence-based composition method includes:
acquiring note information, where the note information includes a playback start time, a playback duration, and a pitch value of each note;
creating a music time series and performing autoregressive integrated moving average ARIMA model prediction to obtain a time series prediction result;
constructing a music prediction model according to the note information and the time series prediction result; and
determining the topological structure of the music prediction model, and combining the music prediction model and the determined topological structure with the note information and the time series prediction result to obtain the predicted music and realize automatic composition.
An artificial intelligence-based composition system includes:
an acquisition module, configured to acquire note information, where the note information includes a playback start time, a playback duration, and a pitch value of each note;
a creation module, configured to create a music time series and perform autoregressive integrated moving average ARIMA model prediction to obtain a time series prediction result;
a construction module, configured to construct a music prediction model according to the note information and the time series prediction result; and
a composition module, configured to determine the topological structure of the music prediction model and combine the music prediction model and the determined topological structure with the note information and the time series prediction result to obtain the predicted music and realize automatic composition.
A computer device includes a memory and a processor. The memory stores computer-readable instructions that, when executed by the processor, cause the processor to perform the following steps:
acquiring note information, where the note information includes a playback start time, a playback duration, and a pitch value of each note;
creating a music time series and performing autoregressive integrated moving average ARIMA model prediction to obtain a time series prediction result;
constructing a music prediction model according to the note information and the time series prediction result; and
determining the topological structure of the music prediction model, and combining the music prediction model and the determined topological structure with the note information and the time series prediction result to obtain the predicted music and realize automatic composition.
A storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring note information, the note information including a playback start time, a playback duration, and a pitch value of each note;
creating a music time series and performing autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result;
constructing a music prediction model according to the note information and the time series prediction result;
determining a topological structure of the music prediction model, and obtaining predicted music according to the note information, the time series prediction result, the music prediction model, and the determined topological structure, thereby implementing automatic composition.
In the above artificial intelligence-based composition method, system, computer device, and storage medium, note information is acquired, the note information including a playback start time, a playback duration, and a pitch value of each note; a music time series is created and autoregressive integrated moving average (ARIMA) model prediction is performed to obtain a time series prediction result; a single-layer long short-term memory network (LSTM) is established, the single-layer LSTM containing LSTM blocks, each LSTM block containing a gate that determines whether an input is important, whether it can be remembered, and whether it can be output; the initial state is randomly initialized, each step corresponds to one input value, and the input value is the word vector corresponding to each word; a Dropout mechanism is used to randomly remove some units of the hidden layer while keeping the input and output neurons unchanged; a multilayer RNN cell is used to aggregate multiple basic LSTM cells into one, the state of the RNN cell is controlled through the gate structure, and information is removed from or added to it, with a new basic LSTM cell instantiated each time an RNN cell is added; an initial value is set for the initial state of each network layer; the weight parameters are initialized, a back-propagation mechanism is used to adjust the weight parameters of the two-layer network in reverse, layer by layer, and the training accuracy of the network is improved through iteration to optimize the loss function and construct the music prediction model; the topological structure of the music prediction model is determined, and the predicted music is obtained according to the note information, the time series prediction result, the music prediction model, and the determined topological structure, thereby implementing automatic composition. The production process is greatly simplified; users can participate in the creation and generate different pieces from different inputs without intervening midway, making the system flexible and versatile.
BRIEF DESCRIPTION OF THE DRAWINGS
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are for the purpose of illustrating the preferred embodiments only and are not to be considered limiting of the present application.
FIG. 1 is a flowchart of an artificial intelligence-based composition method according to an embodiment;
FIG. 2 is a flowchart of constructing a music prediction model according to note information and a time series prediction result according to an embodiment;
FIG. 3 is a flowchart of establishing a single-layer long short-term memory network (LSTM) and establishing a hidden layer using a Dropout mechanism according to an embodiment;
FIG. 4 is a flowchart of replicating a single-layer network into a two-layer network based on a single-layer LSTM according to an embodiment;
FIG. 5 is a structural block diagram of an artificial intelligence-based composition system according to an embodiment;
FIG. 6 is a structural block diagram of a construction module according to an embodiment;
FIG. 7 is a structural block diagram of an establishing unit according to an embodiment;
FIG. 8 is a structural block diagram of a replication unit according to an embodiment.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present application and are not intended to limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural forms. It should be further understood that the term "comprising" used in the specification of the present application indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As a preferred embodiment, as shown in FIG. 1, an artificial intelligence-based composition method includes the following steps:
Step S101: acquire note information, the note information including a playback start time, a playback duration, and a pitch value of each note.
A long short-term memory network (LSTM) is a type of recurrent neural network (RNN) that solves the problem that long-term "memory" cannot be exploited in a traditional recurrent neural network, and thus has the ability to learn long-range dependencies. The present technical solution provides an artificial intelligence composition system based on a two-channel long short-term memory network (LSTM), which composes intelligently on the basis of expressing musical features in two dimensions, pitch and duration. After simple notes are input into the system, the machine can learn from the arrangement characteristics of the notes and write a more complete and richer piece, simulating a musician's performance, so that the resulting piece sounds lively rather than mechanical. Note information is acquired and the corresponding frame-level audio features are extracted; then, according to the frame-level audio features and a pre-built music frequency band feature combination model, frame-level audio features carrying frequency band information are obtained, and according to the frame-level audio features carrying frequency band information and a pre-built music prediction model, the predicted music is obtained, thereby implementing automatic composition.
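The note information described above can be sketched as a simple data structure. This is an illustrative assumption, not part of the patent: the `Note` class, its field names, and the MIDI-style pitch numbering are all hypothetical choices for the example.

```python
# Hypothetical sketch of the note information: each note carries a playback
# start time, a playback duration, and a pitch value, as described above.
from dataclasses import dataclass

@dataclass
class Note:
    start: float     # playback start time, in seconds (assumed unit)
    duration: float  # playback duration, in seconds (assumed unit)
    pitch: int       # pitch value, e.g. a MIDI note number (60 = middle C)

def to_feature_rows(notes):
    """Flatten notes, sorted by onset, into per-note feature rows a model could consume."""
    return [(n.start, n.duration, n.pitch) for n in sorted(notes, key=lambda n: n.start)]

melody = [Note(0.0, 0.5, 60), Note(0.5, 0.5, 62), Note(1.0, 1.0, 64)]
rows = to_feature_rows(melody)
print(rows)  # [(0.0, 0.5, 60), (0.5, 0.5, 62), (1.0, 1.0, 64)]
```

A real system would read such tuples from a MIDI or score file rather than constructing them by hand.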
Step S102: create a music time series, and perform autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result.
The stationarity of a music time series model is generally checked with statistical tests. If the time series is not stationary, it can be made stationary through operations such as taking logarithms or differencing, after which ARIMA model prediction is performed to obtain a prediction result for the stationary time series. ARIMA stands for the Autoregressive Integrated Moving Average model; it treats the data sequence formed by the forecast target over time as a random sequence and uses a mathematical model to approximately describe this sequence. Once the model has been identified, future values can be predicted from the past and present values of the time series.
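The stationarization step mentioned above (differencing, the "I" in ARIMA) can be sketched in a few lines. This is a minimal pure-Python illustration; a real system would use a statistics library, and the example series is invented:

```python
# Differencing removes a trend (making the series stationary), and the
# inverse operation restores the original scale after prediction.

def difference(series, d=1):
    """Apply d-th order differencing: y'_t = y_t - y_{t-1}."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def undifference(diffed, first_value):
    """Invert first-order differencing given the original first value."""
    out = [first_value]
    for delta in diffed:
        out.append(out[-1] + delta)
    return out

trend = [2.0, 4.0, 6.0, 8.0, 10.0]   # non-stationary: linear trend
stationary = difference(trend)       # [2.0, 2.0, 2.0, 2.0], a constant series
restored = undifference(stationary, trend[0])
print(stationary, restored)
```

Forecasts are made on the differenced (stationary) series and then integrated back with `undifference` to obtain predictions on the original scale.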
Step S103: construct a music prediction model according to the note information and the time series prediction result.
The advantage of the two-channel long short-term memory network (LSTM) adopted in this technical solution is that it can "intelligently" forget long-term memories that are no longer useful, learn new useful information, and store the useful information back into long-term memory. For example, if a previously played movement no longer affects the present, its information is discarded, and the melody of the latest movement is recorded into "long-term memory". Suppose the input of the LSTM is x_t and the output is y_t, the weight from the input layer to the hidden layer is W, and the weight from the hidden layer to the hidden layer is U. The LSTM's memory capability comes from summarizing the past input states through the hidden-to-hidden weights, which then serve as an aid to the next input.
Step S104: determine the topological structure of the music prediction model, and obtain the predicted music according to the note information, the time series prediction result, the music prediction model, and the determined topological structure, thereby implementing automatic composition.
The topological structure of the music prediction model is determined, and the predicted music is obtained according to the note information, the time series prediction result, the music prediction model, and the determined topological structure, thereby implementing automatic composition. The topological structure is a paired ("hedged") neural network structure. This embodiment takes paired recurrent neural networks (RNNs) as an example: the topology includes two independent RNNs and a connection unit. The two independent RNNs, named LF_RNN and HF_RNN, are used for low-frequency-band multi-frequency feature combination and high-frequency-band multi-frequency feature combination, respectively. Frame-level audio features of the music corresponding to the music file are extracted; then, according to the frame-level audio features and the pre-built music frequency band feature combination model, frame-level audio features carrying frequency band information are obtained, and according to the frame-level audio features carrying frequency band information and the pre-built music prediction model, the predicted music is obtained, thereby implementing automatic composition.
In one embodiment, performing the autoregressive integrated moving average (ARIMA) model prediction includes: checking the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function, to identify the stationarity of the time series; applying stationarization processing to a non-stationary time series; selecting an ARIMA model according to the identification rules of the time series, and estimating the parameters of the ARIMA model; performing hypothesis testing to diagnose whether the residual sequence is white noise; and performing predictive analysis using the model that has passed the tests.
The ARIMA model prediction procedure is as follows: check the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function, to identify the stationarity of the time series; apply stationarization processing to a non-stationary time series; select an ARIMA model according to the identification rules of the time series, and estimate the parameters of the ARIMA model; perform hypothesis testing to diagnose whether the residual sequence is white noise; and perform predictive analysis using the model that has passed the tests.
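The identification step above relies on the autocorrelation function (ACF). A minimal sketch of the standard sample ACF, used here only to illustrate the stationarity check (a real system would use a library routine and proper significance bounds):

```python
# Sample autocorrelation at lags 0..max_lag. A stationary white-noise series
# has near-zero ACF at nonzero lags; a trending (non-stationary) series has a
# slowly decaying ACF. That contrast is what the identification step inspects.

def acf(series, max_lag):
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    out = []
    for k in range(max_lag + 1):
        cov = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k))
        out.append(cov / var)
    return out

trend = [float(t) for t in range(20)]  # an invented, clearly non-stationary series
r = acf(trend, 3)
print([round(v, 2) for v in r])  # r[0] is always 1.0; r[1..3] decay slowly
```

The partial autocorrelation function (PACF) would be computed analogously to choose the AR order.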
As shown in FIG. 2, in one embodiment, constructing a music prediction model according to the note information and the time series prediction result includes:
Step S201: establish a single-layer long short-term memory network (LSTM), and establish a hidden layer using a Dropout mechanism.
LSTM (long short-term memory) is a type of time-recurrent neural network containing LSTM blocks or other neural-network-like structures. An LSTM block may be described as an intelligent network unit because it can memorize values over indefinite lengths of time; the block contains a gate that determines whether an input is important enough to be remembered and whether it can be output. The initial state is randomly initialized, with one input per step; the actual input is the word vector corresponding to each word. As for the hidden state and the number of hidden nodes, the state of the last step may be taken as the output, or the states of all steps may be weighted or simply averaged as the output, adjusted flexibly according to the specific task.
Step S202: replicate the single-layer network into a two-layer network based on the single-layer LSTM.
An LSTM has the form of a chain of repeating neural network modules, where the repeating module has a distinctive structure: four neural network layers that interact in a special way. The horizontal line represents the cell state; it undergoes only linear interactions, which ensures that information is passed along. Information is let through selectively by structures consisting of a sigmoid neural network layer and a pointwise multiplication operation. The sigmoid layer maps variables to values between 0 and 1, describing how much of each component should pass the gate. Here, sigmoid refers to a neural network using the sigmoid function as its activation function; for each input, the sigmoid function is applied, and the hyperbolic tangent function (tanh) may also be chosen. By convention, 0 means "let nothing through", while 1 means "let everything through". An LSTM has three such gates: a "forget gate" sigmoid layer, which decides which information should be discarded from the cell state; an "input gate" layer, which decides which new information should be stored in the cell state, together with a tanh layer that adds candidate values to the state; and a sigmoid layer and tanh function that decide which parts of the cell state should be output. A two-channel long short-term memory network (LSTM) uses two LSTM networks together, which enhances the "learning" ability of the neural network, allowing the intelligent system to thoroughly "learn" the characteristics of the music and its performance style. For some complex sequences, a multilayer network is needed. To build the model, a multilayer RNN cell is first used to aggregate multiple basic LSTM cells into one; the LSTM controls the state of the cell through a structure called a gate, and removes information from or adds information to it. It is worth noting that each time a cell is added, a new basic LSTM cell must be instantiated, because the constructor declares its internal variables each time it is called; otherwise those variables would be reused, producing an error. An initial value is set for the initial state of each layer. The zero_state method can also be used to generate the initial values, but then the intermediate states cannot be explicitly controlled; the choice depends on the actual application. A multilayer LSTM outperforms a single layer, a trend consistent with the general rule that as the dimensionality of the data grows, nonlinear factors increase and multilayer neural network modeling is required.
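The gating scheme described above can be sketched as one forward step of a single LSTM cell. This is a minimal illustration with scalar weights for readability (real implementations use weight matrices); the particular weight values are invented:

```python
# One LSTM cell step: forget gate f, input gate i, candidate tanh layer g,
# output gate o, matching the three-gate description above.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of an LSTM cell; w holds (w_x, w_h, bias) per gate."""
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])    # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])    # input gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate values
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])    # output gate
    c = f * c_prev + i * g   # cell state: keep part of the old, add the new
    h = o * math.tanh(c)     # hidden state / output
    return h, c

w = {k: (0.5, 0.5, 0.0) for k in ("f", "i", "g", "o")}  # invented weights
h, c = 0.0, 0.0
for x in (1.0, -1.0, 0.5):   # a tiny input sequence
    h, c = lstm_step(x, h, c, w)
print(round(h, 4), round(c, 4))
```

With the forget gate saturated at 1 and the input gate at 0, the cell state passes through unchanged, which is exactly the long-term "memory" behaviour described above.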
Step S203: initialize the weight parameters, use a back-propagation mechanism to adjust the weight parameters of the two-layer network in reverse, layer by layer, and iteratively improve the training accuracy of the network and optimize the loss function, so as to construct the music prediction model.
Before training begins, all weights are initialized with small, distinct random numbers. The network model minimizes the loss function by gradient descent, adjusting the weight parameters of the network in reverse, layer by layer, and improving the accuracy of the network through frequent training iterations. This involves data preprocessing, parameter initialization, batch normalization (BN) regularization, random Dropout, and the loss function; classification problems, regression problems, and gradient checking; and model checks before and during training, with parameter updates. If weights were identical, back-propagation would compute the same gradients and thus perform the same parameter updates; from the perspective of the cost function, the initial parameters must also not be too large, so the initial weights should be very close to 0 but not equal to 0.
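The initialization rule above can be sketched directly. This is an illustrative assumption: the Gaussian draw with a small scale and the plain SGD update are common conventions, not details taken from the patent:

```python
# Weights start as small, distinct random numbers near (but not equal to) 0,
# so neurons begin with different values and back-propagation does not
# compute identical updates for all of them (symmetry breaking).
import random

def init_weights(n, scale=0.01, seed=0):
    rng = random.Random(seed)
    return [rng.gauss(0.0, scale) for _ in range(n)]

def sgd_step(w, grad, lr=0.1):
    """One gradient-descent update: w <- w - lr * dL/dw."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]

w = init_weights(4)
print(w)                      # small, distinct values near 0
assert len(set(w)) == len(w)  # no two weights identical: symmetry is broken
w = sgd_step(w, grad=[0.5, -0.5, 0.0, 1.0])
```

Repeated `sgd_step` calls with gradients from back-propagation are what "iteratively improve the training accuracy" amounts to in this sketch.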
As shown in FIG. 3, in one embodiment, establishing a single-layer long short-term memory network (LSTM) and establishing a hidden layer using a Dropout mechanism include:
Step S301: establish a single-layer long short-term memory network (LSTM), the single-layer LSTM containing LSTM blocks, each LSTM block containing a gate that determines whether an input is important, whether it can be remembered, and whether it can be output.
The training procedure of a neural network is to propagate the input forward through the network and then back-propagate the error. The Dropout mechanism randomly removes some units of the hidden layer during this process and then carries out the procedure described above. In summary, the process may include: randomly removing some hidden neurons in the network while keeping the input and output neurons unchanged; propagating the input forward through the modified network and then back-propagating the error through the modified network; and repeating these operations for further batches of training samples, achieving the effect of a voting mechanism. For a fully connected neural network, training five different networks on the same data may yield several different results; a voting mechanism can then let the majority decide, which relatively improves the accuracy and robustness of the network. For a single neural network, if training is split into batches in this way, the different sub-networks may overfit to different degrees, but sharing a single loss function amounts to optimizing them simultaneously and averaging them, which can prevent overfitting fairly effectively. Dropout also reduces complex co-adaptation between neurons: when hidden-layer neurons are randomly removed, the fully connected network acquires a certain sparseness, which effectively weakens the synergistic effects of different features. That is, some features may depend on the joint action of hidden nodes in fixed relationships; the Dropout mechanism effectively suppresses situations in which certain features are effective only in the presence of other features, increasing the robustness of the neural network. Robustness means strength and resilience; it is the key to a system's survival under abnormal and dangerous conditions. For example, whether computer software avoids freezing or crashing under erroneous input, disk failure, network overload, or deliberate attack is a measure of that software's robustness. So-called "robustness" refers to a control system maintaining certain other performance characteristics under parameter perturbations of a given structure and magnitude. Depending on how performance is defined, it can be divided into stability robustness and performance robustness. A fixed controller designed with the robustness of the closed-loop system as the objective is called a robust controller.
Step S302: randomly initialize the initial state, each step corresponding to one input value, the input value being the word vector corresponding to each word.
The initial state is randomly initialized, with one input per step; the actual input is the word vector corresponding to each word. As for the hidden state and the number of hidden nodes, the state of the last step may be taken as the output, or the states of all steps may be weighted or simply averaged as the output, adjusted flexibly according to the specific task.
Step S303: use a Dropout mechanism to randomly remove some units of the hidden layer, keeping the input and output neurons unchanged.
The Dropout mechanism randomly removes some units of the hidden layer during this process and then carries out the procedure described above. In summary, the process may include: randomly removing some hidden neurons in the network while keeping the input and output neurons unchanged; propagating the input forward through the modified network and then back-propagating the error through the modified network; and repeating these operations for further batches of training samples, achieving the effect of a voting mechanism.
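Step S303 can be sketched as a mask over the hidden activations. One illustrative assumption here: the "inverted dropout" rescaling (dividing survivors by the keep probability) is a common convention added for the example, not something the patent specifies:

```python
# During training, each hidden unit is zeroed with probability p; the input
# and output neurons are untouched. Surviving units are rescaled so the
# expected activation is unchanged.
import random

def dropout(hidden, p, rng):
    keep = 1.0 - p
    return [h / keep if rng.random() < keep else 0.0 for h in hidden]

hidden = [0.8, -0.3, 1.2, 0.5, -0.7, 0.1]   # invented hidden-layer activations
rng = random.Random(42)
dropped = dropout(hidden, p=0.5, rng=rng)
print(dropped)  # [0.0, -0.6, 2.4, 1.0, 0.0, 0.0]
```

At inference time the mask is simply not applied; because of the rescaling, no further correction is needed.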
As shown in FIG. 4, in one embodiment, replicating the single-layer network into a two-layer network based on the single-layer LSTM includes:
Step S401: use a multilayer RNN cell to aggregate multiple basic LSTM cells into one;
Step S402: control the state of the RNN cell through the gate structure, and remove information from or add information to it, instantiating a new basic LSTM cell each time an RNN cell is added;
Step S403: set an initial value for the initial state of each network layer.
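Steps S401 to S403 can be sketched as a layer stack. A trivial recurrence stands in for the full LSTM gates so the stacking structure stays visible; the cell function and the toy input are assumptions of the example:

```python
# Each layer gets a freshly created cell (mirroring the note above that a new
# basic LSTM cell must be instantiated per layer, S402) and its own explicitly
# set initial state (S403). At each time step, layer k feeds layer k+1 (S401).
import math

def make_cell():
    """Create a fresh cell; a real system would build a new LSTM cell here."""
    def cell(x, state):
        new_state = math.tanh(state + x)   # toy recurrence in place of LSTM gates
        return new_state, new_state        # (output, new state)
    return cell

def stacked_step(x, cells, states):
    """Run one time step through the stack: each layer's output is the next layer's input."""
    new_states = []
    for cell, s in zip(cells, states):
        x, s = cell(x, s)
        new_states.append(s)
    return x, new_states

num_layers = 2
cells = [make_cell() for _ in range(num_layers)]   # one fresh cell per layer (S402)
states = [0.0] * num_layers                        # explicit initial values (S403)
out, states = stacked_step(1.0, cells, states)
print(round(out, 4))
```

Reusing a single cell object for every layer, instead of calling `make_cell()` per layer, would share the per-layer state and internal variables, which is exactly the reuse error the description above warns against.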
As shown in FIG. 5, in one embodiment, an artificial intelligence-based composition system is provided, including: an acquisition module, configured to acquire note information, the note information including a playback start time, a playback duration, and a pitch value of each note; a creation module, configured to create a music time series and perform autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result; a construction module, configured to construct a music prediction model according to the note information and the time series prediction result; and a composition module, configured to determine a topological structure of the music prediction model, and to obtain the predicted music according to the note information, the time series prediction result, the music prediction model, and the determined topological structure, thereby implementing automatic composition.
In one embodiment, the creation module further includes: a recognition unit, configured to check the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function, to identify the stationarity of the time series; a stationarization processing unit, configured to apply stationarization processing to a non-stationary time series; a parameter estimation unit, configured to select an ARIMA model according to the identification rules of the time series and to estimate the parameters of the ARIMA model; a diagnosis unit, configured to perform hypothesis testing and to diagnose whether the residual sequence is white noise; and a prediction unit, configured to perform predictive analysis using the model that has passed the tests.
As shown in FIG. 6, in one embodiment, the construction module further includes: an establishing unit, configured to establish a single-layer long short-term memory network (LSTM) and to establish a hidden layer using a Dropout mechanism; a replication unit, configured to replicate the single-layer network into a two-layer network based on the single-layer LSTM; and an optimization unit, configured to initialize the weight parameters, to use a back-propagation mechanism to adjust the weight parameters of the two-layer network in reverse, layer by layer, and to iteratively improve the training accuracy of the network and optimize the loss function, so as to construct the music prediction model.
As shown in FIG. 7, in one embodiment, the establishing unit further includes: an establishing subunit, configured to establish a single-layer long short-term memory network (LSTM), the single-layer LSTM containing LSTM blocks, each LSTM block containing a gate that determines whether an input is important, whether it can be remembered, and whether it can be output; a correspondence subunit, configured to randomly initialize the initial state, each step corresponding to one input value, the input value being the word vector corresponding to each word; and a removal subunit, configured to use a Dropout mechanism to randomly remove some units of the hidden layer, keeping the input and output neurons unchanged.
As shown in FIG. 8, the replication unit further includes: an aggregation subunit, configured to use a multilayer RNN cell to aggregate multiple basic LSTM cells into one; a pruning subunit, configured to control the state of the RNN cell through the gate structure and to remove information from or add information to it, instantiating a new basic LSTM cell each time an RNN cell is added; and a setting subunit, configured to set an initial value for the initial state of each network layer.
In one embodiment, a computer device is provided. The computer device includes a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to implement the steps of the artificial intelligence-based composition method in the above embodiments.
In one embodiment, a storage medium storing computer-readable instructions is provided. When the computer-readable instructions are executed by one or more processors, the one or more processors are caused to perform the steps of the artificial intelligence-based composition method in the above embodiments. The storage medium may be a non-volatile storage medium.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the foregoing embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The technical features of the above embodiments may be combined in any manner. For brevity, not every possible combination of these technical features has been described; nevertheless, as long as a combination of these technical features involves no contradiction, it should be regarded as falling within the scope of this specification.
The above embodiments merely express some exemplary embodiments of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent application. It should be noted that a person of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. An artificial-intelligence-based composing method, comprising:
    acquiring note information, the note information including a playback start time, a playback duration, and a pitch value of each note;
    creating a music time series and performing autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result;
    constructing a music prediction model according to the note information and the time series prediction result; and
    determining a topological structure of the music prediction model, and obtaining predicted music according to the note information and the time series prediction result in combination with the music prediction model and the determined topological structure, thereby implementing automatic composition.
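For illustration, the note information recited in claim 1 might be represented as follows; the `Note` class and `notes_to_series` helper are hypothetical names introduced for this sketch, not part of the application:

```python
from dataclasses import dataclass

@dataclass
class Note:
    start: float     # playback start time (e.g. in beats)
    duration: float  # playback duration
    pitch: int       # pitch value (e.g. a MIDI note number, 0-127)

def notes_to_series(notes):
    """Order the notes by playback start time and emit the pitch sequence,
    giving a music time series suitable for time-series prediction."""
    return [n.pitch for n in sorted(notes, key=lambda n: n.start)]

melody = [Note(0.0, 1.0, 60), Note(2.0, 0.5, 67), Note(1.0, 1.0, 64)]
print(notes_to_series(melody))  # -> [60, 64, 67]
```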
  2. The artificial-intelligence-based composing method according to claim 1, wherein performing the autoregressive integrated moving average (ARIMA) model prediction comprises:
    examining the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series;
    stationarizing any non-stationary time series;
    selecting an ARIMA model according to the identification rules of the time series and estimating the parameters of the ARIMA model;
    performing a hypothesis test to diagnose whether the residual sequence is white noise; and
    performing predictive analysis using the model that has passed the test.
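For illustration, the ARIMA workflow recited in claim 2 can be approximated with a stdlib-only sketch: differencing stands in for the stationarization step, a least-squares AR(1) fit stands in for full ARIMA parameter estimation, and a lag-1 autocorrelation check on the residuals stands in for the white-noise hypothesis test. All names are assumptions; a real implementation would use a statistics library.

```python
import random

def difference(series):
    """First-order differencing: turns a linear trend into a constant."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + e_t."""
    num = sum(a * b for a, b in zip(series, series[1:]))
    den = sum(a * a for a in series[:-1])
    return num / den

def lag1_autocorr(xs):
    """Lag-1 autocorrelation; near zero suggests a white-noise sequence."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

print(difference([1, 3, 5, 7]))  # -> [2, 2, 2]: the trend is removed

random.seed(0)
x = [0.0]
for _ in range(400):                      # simulate a stationary AR(1) process
    x.append(0.7 * x[-1] + random.gauss(0, 1))

phi = fit_ar1(x)                          # parameter estimate, close to 0.7
residuals = [b - phi * a for a, b in zip(x, x[1:])]
print(abs(lag1_autocorr(residuals)) < 0.2)  # residuals behave like white noise
```

Once the residual diagnosis passes, the fitted model would be used for the predictive-analysis step.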
  3. The artificial-intelligence-based composing method according to claim 1, wherein constructing the music prediction model according to the note information and the time series prediction result comprises:
    building a single-layer long short-term memory (LSTM) network and establishing a hidden layer using the Dropout mechanism;
    duplicating the single-layer network into a double-layer network according to the single-layer LSTM; and
    initializing weight parameters, adjusting the weight parameters in the double-layer network backwards layer by layer using a back-propagation mechanism, and iteratively improving the training accuracy of the network to optimize the loss function, thereby constructing the music prediction model.
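For illustration, the initialize-backpropagate-iterate loop recited in claim 3 can be reduced to a toy example: a single linear unit stands in for the double-layer LSTM, and the function names, targets, and learning rate are assumptions for the sketch.

```python
def train(samples, lr=0.1, epochs=50):
    """Initialise weight parameters, push the prediction error backwards to
    adjust them, and iterate so that the loss shrinks over training."""
    w, b = 0.5, 0.0                      # initialised weight parameters
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            pred = w * x + b             # forward pass
            err = pred - target
            total += err * err
            w -= lr * err * x            # backward pass: the loss gradient
            b -= lr * err                #   adjusts each parameter in turn
        losses.append(total / len(samples))
    return w, b, losses

samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # target relation: y = 2x + 1
w, b, losses = train(samples)
print(round(w, 2), round(b, 2))  # parameters approach 2 and 1 as the loss falls
```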
  4. The artificial-intelligence-based composing method according to claim 3, wherein building the single-layer long short-term memory (LSTM) network and establishing the hidden layer using the Dropout mechanism comprises:
    building a single-layer LSTM network, the single-layer LSTM containing an LSTM block, the LSTM block including a gate that decides whether an input is important, whether it can be remembered, and whether it can be output;
    randomly initializing the initial state, each step corresponding to one input value, the input value being the word vector corresponding to each word; and
    using the Dropout mechanism to randomly drop some units of the hidden layer while keeping the input and output neurons unchanged.
  5. The artificial-intelligence-based composing method according to claim 3, wherein duplicating the single-layer network into a double-layer network according to the single-layer LSTM comprises:
    aggregating multiple basic LSTM units into one using a multi-layer RNN unit;
    controlling the state of the RNN unit through a gate structure, removing information from or adding information to the state, and re-invoking a basic LSTM unit each time an RNN unit is added; and
    setting an initial value for the initial state of each layer of the network.
  6. An artificial-intelligence-based composing system, comprising:
    an acquisition module configured to acquire note information, the note information including a playback start time, a playback duration, and a pitch value of each note;
    a creation module configured to create a music time series and perform autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result;
    a construction module configured to construct a music prediction model according to the note information and the time series prediction result; and
    a composing module configured to determine a topological structure of the music prediction model, and to obtain predicted music according to the note information and the time series prediction result in combination with the music prediction model and the determined topological structure, thereby implementing automatic composition.
  7. The artificial-intelligence-based composing system according to claim 6, wherein the creation module further includes:
    an identification unit configured to examine the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series;
    a stationarization unit configured to stationarize any non-stationary time series;
    a parameter estimation unit configured to select an ARIMA model according to the identification rules of the time series and to estimate the parameters of the ARIMA model;
    a diagnosis unit configured to perform a hypothesis test to diagnose whether the residual sequence is white noise; and
    a prediction unit configured to perform predictive analysis using the model that has passed the test.
  8. The artificial-intelligence-based composing system according to claim 6, wherein the construction module further includes:
    an establishing unit configured to build a single-layer long short-term memory (LSTM) network and to establish a hidden layer using the Dropout mechanism;
    a duplication unit configured to duplicate the single-layer network into a double-layer network according to the single-layer LSTM; and
    an optimization unit configured to initialize weight parameters, to adjust the weight parameters in the double-layer network backwards layer by layer using a back-propagation mechanism, and to iteratively improve the training accuracy of the network to optimize the loss function, thereby constructing the music prediction model.
  9. The artificial-intelligence-based composing system according to claim 8, wherein the establishing unit further includes:
    an establishing subunit configured to build a single-layer long short-term memory (LSTM) network, the single-layer LSTM containing an LSTM block, the LSTM block including a gate that decides whether an input is important, whether it can be remembered, and whether it can be output;
    a correspondence subunit configured to randomly initialize the initial state, each step corresponding to one input value, the input value being the word vector corresponding to each word; and
    a deletion subunit configured to use the Dropout mechanism to randomly drop some units of the hidden layer while keeping the input and output neurons unchanged.
  10. The artificial-intelligence-based composing system according to claim 8, wherein the duplication unit further includes:
    an aggregation subunit configured to aggregate multiple basic LSTM units into one using a multi-layer RNN unit;
    a pruning subunit configured to control the state of the RNN unit through a gate structure, to remove information from or add information to the state, and to re-invoke a basic LSTM unit each time an RNN unit is added; and
    a setting subunit configured to set an initial value for the initial state of each layer of the network.
  11. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions that, when executed by the processor, cause the processor to perform the following steps:
    acquiring note information, the note information including a playback start time, a playback duration, and a pitch value of each note;
    creating a music time series and performing autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result;
    constructing a music prediction model according to the note information and the time series prediction result; and
    determining a topological structure of the music prediction model, and obtaining predicted music according to the note information and the time series prediction result in combination with the music prediction model and the determined topological structure, thereby implementing automatic composition.
  12. The computer device according to claim 11, wherein, when performing the autoregressive integrated moving average (ARIMA) model prediction, the processor is caused to perform the following steps:
    examining the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series;
    stationarizing any non-stationary time series;
    selecting an ARIMA model according to the identification rules of the time series and estimating the parameters of the ARIMA model;
    performing a hypothesis test to diagnose whether the residual sequence is white noise; and
    performing predictive analysis using the model that has passed the test.
  13. The computer device according to claim 11, wherein, when constructing the music prediction model according to the note information and the time series prediction result, the processor is caused to perform the following steps:
    building a single-layer long short-term memory (LSTM) network and establishing a hidden layer using the Dropout mechanism;
    duplicating the single-layer network into a double-layer network according to the single-layer LSTM; and
    initializing weight parameters, adjusting the weight parameters in the double-layer network backwards layer by layer using a back-propagation mechanism, and iteratively improving the training accuracy of the network to optimize the loss function, thereby constructing the music prediction model.
  14. The computer device according to claim 13, wherein, when building the single-layer long short-term memory (LSTM) network and establishing the hidden layer using the Dropout mechanism, the processor is caused to perform the following steps:
    building a single-layer LSTM network, the single-layer LSTM containing an LSTM block, the LSTM block including a gate that decides whether an input is important, whether it can be remembered, and whether it can be output;
    randomly initializing the initial state, each step corresponding to one input value, the input value being the word vector corresponding to each word; and
    using the Dropout mechanism to randomly drop some units of the hidden layer while keeping the input and output neurons unchanged.
  15. The computer device according to claim 13, wherein, when duplicating the single-layer network into a double-layer network according to the single-layer LSTM, the processor is caused to perform the following steps:
    aggregating multiple basic LSTM units into one using a multi-layer RNN unit;
    controlling the state of the RNN unit through a gate structure, removing information from or adding information to the state, and re-invoking a basic LSTM unit each time an RNN unit is added; and
    setting an initial value for the initial state of each layer of the network.
  16. A storage medium storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring note information, the note information including a playback start time, a playback duration, and a pitch value of each note;
    creating a music time series and performing autoregressive integrated moving average (ARIMA) model prediction to obtain a time series prediction result;
    constructing a music prediction model according to the note information and the time series prediction result; and
    determining a topological structure of the music prediction model, and obtaining predicted music according to the note information and the time series prediction result in combination with the music prediction model and the determined topological structure, thereby implementing automatic composition.
  17. The storage medium according to claim 16, wherein, when performing the autoregressive integrated moving average (ARIMA) model prediction, the one or more processors are caused to perform the following steps:
    examining the audio, the audio frames, and their variation patterns according to the autocorrelation function and the partial autocorrelation function to identify the stationarity of the time series;
    stationarizing any non-stationary time series;
    selecting an ARIMA model according to the identification rules of the time series and estimating the parameters of the ARIMA model;
    performing a hypothesis test to diagnose whether the residual sequence is white noise; and
    performing predictive analysis using the model that has passed the test.
  18. The storage medium according to claim 16, wherein, when constructing the music prediction model according to the note information and the time series prediction result, the one or more processors are caused to perform the following steps:
    building a single-layer long short-term memory (LSTM) network and establishing a hidden layer using the Dropout mechanism;
    duplicating the single-layer network into a double-layer network according to the single-layer LSTM; and
    initializing weight parameters, adjusting the weight parameters in the double-layer network backwards layer by layer using a back-propagation mechanism, and iteratively improving the training accuracy of the network to optimize the loss function, thereby constructing the music prediction model.
  19. The storage medium according to claim 18, wherein, when building the single-layer long short-term memory (LSTM) network and establishing the hidden layer using the Dropout mechanism, the one or more processors are caused to perform the following steps:
    building a single-layer LSTM network, the single-layer LSTM containing an LSTM block, the LSTM block including a gate that decides whether an input is important, whether it can be remembered, and whether it can be output;
    randomly initializing the initial state, each step corresponding to one input value, the input value being the word vector corresponding to each word; and
    using the Dropout mechanism to randomly drop some units of the hidden layer while keeping the input and output neurons unchanged.
  20. The storage medium according to claim 18, wherein, when duplicating the single-layer network into a double-layer network according to the single-layer LSTM, the one or more processors are caused to perform the following steps:
    aggregating multiple basic LSTM units into one using a multi-layer RNN unit;
    controlling the state of the RNN unit through a gate structure, removing information from or adding information to the state, and re-invoking a basic LSTM unit each time an RNN unit is added; and
    setting an initial value for the initial state of each layer of the network.
PCT/CN2018/104715 2018-06-04 2018-09-08 Artificial intelligence-based composing method and system, computer device and storage medium WO2019232959A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810561621.5A CN109192187A (en) 2018-06-04 2018-06-04 Composing method, system, computer equipment and storage medium based on artificial intelligence
CN201810561621.5 2018-06-04

Publications (1)

Publication Number Publication Date
WO2019232959A1 true WO2019232959A1 (en) 2019-12-12

Family

ID=64948568

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/104715 WO2019232959A1 (en) 2018-06-04 2018-09-08 Artificial intelligence-based composing method and system, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN109192187A (en)
WO (1) WO2019232959A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110120212B (en) * 2019-04-08 2023-05-23 华南理工大学 Piano auxiliary composition system and method based on user demonstration audio frequency style
CN110288965B (en) * 2019-05-21 2021-06-18 北京达佳互联信息技术有限公司 Music synthesis method and device, electronic equipment and storage medium
CN111583891B (en) * 2020-04-21 2023-02-14 华南理工大学 Automatic musical note vector composing system and method based on context information
CN114282937A (en) * 2021-11-18 2022-04-05 青岛亿联信息科技股份有限公司 Building economy prediction method and system based on Internet of things

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160379105A1 (en) * 2015-06-24 2016-12-29 Microsoft Technology Licensing, Llc Behavior recognition and automation using a mobile device
CN107481048A (en) * 2017-08-08 2017-12-15 哈尔滨工业大学深圳研究生院 A kind of financial kind price expectation method and system based on mixed model
CN107644630A (en) * 2017-09-28 2018-01-30 清华大学 Melody generation method and device based on neutral net
CN107993636A (en) * 2017-11-01 2018-05-04 天津大学 Music score modeling and generation method based on recurrent neural network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104952448A (en) * 2015-05-04 2015-09-30 张爱英 Method and system for enhancing features by aid of bidirectional long-term and short-term memory recurrent neural networks
CN107045867B (en) * 2017-03-22 2020-06-02 科大讯飞股份有限公司 Automatic composition method and device and terminal equipment
CN107102969A (en) * 2017-04-28 2017-08-29 湘潭大学 The Forecasting Methodology and system of a kind of time series data
CN107123415B (en) * 2017-05-04 2020-12-18 吴振国 Automatic song editing method and system
CN107121679A (en) * 2017-06-08 2017-09-01 湖南师范大学 Recognition with Recurrent Neural Network predicted method and memory unit structure for Radar Echo Extrapolation
CN107622329A (en) * 2017-09-22 2018-01-23 深圳市景程信息科技有限公司 The Methods of electric load forecasting of Memory Neural Networks in short-term is grown based on Multiple Time Scales
CN107769972B (en) * 2017-10-25 2019-12-10 武汉大学 Power communication network equipment fault prediction method based on improved LSTM


Also Published As

Publication number Publication date
CN109192187A (en) 2019-01-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18921391

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as the address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23.03.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18921391

Country of ref document: EP

Kind code of ref document: A1