CN113192472A - Information processing method, information processing device, electronic equipment and storage medium - Google Patents

Information processing method, information processing device, electronic equipment and storage medium

Info

Publication number
CN113192472A
CN113192472A (application number CN202110475573.XA)
Authority
CN
China
Prior art keywords
information
rhythm
melody
pitch
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110475573.XA
Other languages
Chinese (zh)
Inventor
吴健
孙炜岳
史学佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Smart Sound Technology Co ltd
Original Assignee
Beijing Smart Sound Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Smart Sound Technology Co ltd filed Critical Beijing Smart Sound Technology Co ltd
Priority to CN202110475573.XA priority Critical patent/CN113192472A/en
Publication of CN113192472A publication Critical patent/CN113192472A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/0025: Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
    • G10H1/40: Rhythm (under G10H1/36: Accompaniment arrangements)
    • G10H2210/101: Music composition or musical creation; tools or processes therefor
    • G10H2210/105: Composing aid, e.g. for supporting creation, edition or modification of a piece of music
    • G10H2210/111: Automatic composing, i.e. using predefined musical rules
    • G10H2210/341: Rhythm pattern selection, synthesis or composition
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/03: Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L25/30: Speech or voice analysis techniques using neural networks
    • G10L25/90: Pitch determination of speech signals
    • G10L2025/906: Pitch tracking

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The application discloses an information processing method, an information processing device, electronic equipment and a storage medium. The specific implementation scheme is as follows: acquiring rhythm information; obtaining pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model; and synthesizing the rhythm information and the pitch information to obtain melody information. With the method and the device, a melody can be automatically generated from a given rhythm.

Description

Information processing method, information processing device, electronic equipment and storage medium
Technical Field
The present application relates to the field of digital music, and in particular, to an information processing method and apparatus, an electronic device, and a storage medium.
Background
Since the information revolution, the ways in which music and multimedia spread have changed rapidly. This has driven a dramatic increase in market demand for music of all kinds: a great deal of original music is required, whether for singles, albums, music videos and karaoke, where music is the main element of popular or artistic creation; for short videos, advertisements, animations, trailers and film works, where music plays a supporting role; or for radio stations, broadcasters and public spaces, where music serves as background content. Melody is a core component of high-quality original music, and how to automatically generate a melody from a given rhythm is a technical problem in urgent need of a solution.
Disclosure of Invention
The application provides an information processing method, an information processing device, electronic equipment and a storage medium.
According to an aspect of the present application, there is provided an information processing method including:
acquiring rhythm information;
obtaining pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model;
and synthesizing the rhythm information and the pitch information to obtain melody information.
According to another aspect of the present application, there is provided an information processing apparatus including:
the rhythm acquisition module is used for acquiring rhythm information;
the pitch generation module is used for obtaining pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model;
and the synthesis module is used for synthesizing the rhythm information and the pitch information to obtain melody information.
According to another aspect of the present application, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as provided by any one of the embodiments of the present application.
According to another aspect of the present application, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method provided by any one of the embodiments of the present application.
By adopting the method and the device, rhythm information can be acquired, pitch information corresponding to the rhythm information can be obtained according to the rhythm information and a pre-trained melody generation model, and melody information can be obtained by synthesizing the rhythm information and the pitch information, so that a melody can be automatically generated from a given rhythm.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic flow chart diagram of an information processing method according to an embodiment of the present application;
FIG. 2 is a system flow diagram illustrating an example application of an information processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a seq2seq model of an application example of the information processing method according to the embodiment of the present application;
fig. 4 is a schematic diagram of a pitch sequence generated by a given rhythm sequence of an application example of the information processing method according to the embodiment of the present application;
FIG. 5 is a schematic diagram of a configuration of an information processing apparatus according to an embodiment of the present application;
fig. 6 is a block diagram of an electronic device for implementing the information processing method according to the embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding; these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and constructions are omitted in the following for clarity and conciseness.
The term "and/or" herein merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The term "at least one" herein means any one of several items, or any combination of at least two of them; for example, "including at least one of A, B and C" may mean including any one or more elements selected from the set consisting of A, B and C. The terms "first" and "second" are used to refer to and distinguish similar objects; they do not necessarily imply a sequence or order, nor do they limit the number of objects, so a "first" or "second" item may be one or more items.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present application.
According to an embodiment of the present application, an information processing method is provided. Fig. 1 is a flowchart of the information processing method according to the embodiment of the present application. The method can be applied to an information processing apparatus; for example, the apparatus can be deployed in a terminal, a server or another processing device to perform rhythm information acquisition, pitch information generation, melody synthesis, and the like. The terminal may be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on. In some possible implementations, the method may also be implemented by a processor calling computer-readable instructions stored in a memory. As shown in fig. 1, the method includes:
s101, acquiring rhythm information.
And S102, obtaining pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model.
In an example, the pre-trained melody generation model may be a neural network (e.g., a recurrent neural network based on an encoder-decoder mechanism). The pre-trained melody generation model may perform feature extraction on the rhythm information, generate corresponding pitches from the obtained rhythm features, and finally synthesize corresponding melody information.
S103, synthesizing the rhythm information and the pitch information to obtain melody information.
In one example, the melody information is a collection of notes distributed in time, each note comprising two elements: pitch and duration. The pitch determines the frequency at which the note sounds, and the duration determines how long the note is played. The arrangement in time of the duration attributes of the notes is the rhythm of the melody; in other words, the rhythm of a melody is the duration sequence of the notes that make up the melody.
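This note representation can be sketched in a few lines of Python (the pitch names and beat values below are illustrative, not taken from the patent):

```python
# A melody as a time-ordered list of notes; each note is a (pitch, duration) pair.
# Pitch names and beat values here are invented for illustration.
melody = [("C4", 1.0), ("E4", 0.5), ("G4", 0.5), ("E4", 1.0)]

def rhythm_of(melody):
    """The rhythm of a melody is the duration sequence of its notes."""
    return [duration for _pitch, duration in melody]

def pitches_of(melody):
    """The pitch sequence of the same melody."""
    return [pitch for pitch, _duration in melody]

print(rhythm_of(melody))
print(pitches_of(melody))
```

The same melody thus decomposes losslessly into a rhythm sequence and a pitch sequence, which is the decomposition the rest of the method relies on.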
With the present application, rhythm information can be acquired, pitch information corresponding to the rhythm information can be obtained according to the rhythm information and the pre-trained melody generation model, and melody information can be obtained by synthesizing the rhythm information and the pitch information. That is, a rhythm entered by the user can be fed into the trained melody generation model to generate the pitches in the melody; after the rhythm and the pitches are combined, pitches have been assigned to the given rhythm and the final melody is generated. Automatic melody generation from a given rhythm is thereby achieved.
In one embodiment, the pre-trained melody generation model includes: a first submodel for encoding, and a second submodel for decoding. The obtaining pitch information corresponding to the rhythm information according to the rhythm information and the pre-trained melody generation model includes: inputting the rhythm information into the first submodel (the first submodel can be an encoder), and extracting rhythm characteristic information corresponding to the rhythm information through the first submodel; and inputting the rhythm characteristic information into the second submodel (the second submodel can be a decoder), and decoding the rhythm characteristic information through the second submodel to obtain the pitch information.
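The two-submodel data flow can be illustrated with trivial stand-ins for the encoder and decoder. The real submodels are trained neural networks; the bodies below are placeholders chosen only to make the encode-then-decode flow concrete:

```python
def encoder(rhythm):
    # Stand-in for the first submodel: condense the rhythm sequence into a
    # fixed-length feature (here, trivially: note count and total beats).
    return (len(rhythm), sum(rhythm))

def decoder(feature):
    # Stand-in for the second submodel: emit one pitch per input note
    # (here, simply cycling through a scale; a real decoder is learned).
    n, _total_beats = feature
    scale = ["C", "D", "E", "F", "G", "A", "B"]
    return [scale[i % len(scale)] for i in range(n)]

def generate_pitches(rhythm):
    """Pitch information = decode(encode(rhythm information))."""
    return decoder(encoder(rhythm))

print(generate_pitches([1.0, 0.5, 0.5]))
```

The design point is the interface, not the placeholder bodies: the encoder reduces a variable-length rhythm to a fixed-size feature, and the decoder expands that feature back into one pitch per note.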
In one embodiment, the acquiring rhythm information includes: acquiring a first user operation, extracting a retrieval keyword from the first user operation, and performing query processing according to the retrieval keyword to obtain the rhythm information; or, in response to a first user operation, directly extracting the rhythm information from the first user operation.
In one example, the retrieval keyword is a song name, and the song name is looked up in a song library to obtain the rhythm information; in another example, the user manually inputs the rhythm information, and the rhythm information can be obtained directly without any additional processing such as retrieval. That is, the user can input a piece of rhythm information in various ways: for example, input the song name of an existing song to query the rhythm information corresponding to that song name, or directly input the rhythm information.
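The two acquisition paths can be sketched as follows (the song library, its contents, and the function names are hypothetical, invented for illustration):

```python
# Hypothetical song library mapping a song name to its duration sequence (in beats).
SONG_LIBRARY = {
    "Ode to Joy": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.5, 0.5],
}

def acquire_rhythm(user_input):
    """Return rhythm information from either acquisition path:
    a song-name query against the library, or a directly entered duration list."""
    if isinstance(user_input, str):       # retrieval keyword: a song name
        return SONG_LIBRARY[user_input]   # query processing against the library
    return list(user_input)               # rhythm information entered directly

print(acquire_rhythm("Ode to Joy"))
print(acquire_rhythm([0.25, 0.25, 0.5]))
```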
In one embodiment, the method further comprises: collecting a plurality of pieces of melody sample information in advance; performing information separation processing on the plurality of pieces of melody sample information to obtain the rhythm sequence and pitch sequence that form each piece of melody sample information; and training a melody generation model according to the rhythm sequences and the pitch sequences to obtain the pre-trained melody generation model.
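The separation step can be sketched as follows, assuming each melody sample is stored as a list of (pitch, duration) pairs (an assumption for illustration, not the patent's storage format):

```python
def separate(melody):
    """Split one melody sample into its rhythm sequence and pitch sequence."""
    pitches = [pitch for pitch, _duration in melody]
    durations = [duration for _pitch, duration in melody]
    return durations, pitches

# Two tiny hypothetical melody samples.
samples = [[("C4", 1.0), ("E4", 0.5)], [("G4", 0.5), ("E4", 1.0)]]

# Each sample yields one (rhythm sequence, pitch sequence) training pair.
training_pairs = [separate(m) for m in samples]
print(training_pairs)
```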
In one embodiment, the synthesizing of the rhythm information and the pitch information to obtain the melody information includes: synthesizing the rhythm information and the pitch information to obtain a note duration sequence in which the duration attributes of the notes are arranged in time; and taking the note duration sequence as the melody information.
Application example:
deep learning methods have found application in a number of artistic creation areas, such as pictorial style migration, novice creation, and the like. Music is an important form of art, and in this field of art, some methods based on deep learning have also been proposed for music creation. Melodies are important components of music. Generally, a melody is a collection of time distributions of a series of notes. Each note includes two elements, pitch and duration. Pitch determines how often the note is played and duration determines how long the note will be played. The arrangement of the time length attribute of the notes in time is the rhythm of the melody.
In composing, i.e. generating a melody, there are many possible approaches: 1) first create the rhythm and then assign pitches to the notes; 2) first create a melody line (i.e., a rough pitch contour) and then match it with a rhythm; 3) create the rhythm and the pitch contour simultaneously.
Composition, however, is a highly specialized skill. Generally, a composer needs a solid grounding in music theory and rich composing experience to complete a piece, but most users are non-professionals. How, then, can non-professionals be helped to compose easily?
In view of the above problems, the processing flow of this application example makes it possible to automatically assign pitches to a given rhythm for a non-professional user and finally generate a melody, realizing automatic composition by converting a rhythm into a pleasing melody. The flow includes the following:
A melody is a collection of notes distributed in time. Each note includes two elements, pitch and duration. The pitch determines the frequency at which the note sounds, and the duration determines how long the note is played. The arrangement in time of the duration attributes of the notes is the rhythm of the melody. The melody is automatically generated from the rhythm, as shown in fig. 2, by: acquiring the rhythm input by the user; inputting the rhythm into the trained melody generation model to generate the pitches in the melody; and combining the rhythm and the pitches to obtain the final melody.
First, acquiring the rhythm input by the user
The rhythm input by the user must be acquired first. The rhythm of a melody is the duration sequence of the notes that make up the melody. The user may input a rhythm in a number of ways, such as entering the title of an existing song, or entering the rhythm directly. When the user enters a song title, the song is found in the database by its title and its rhythm is extracted as input. When the user enters a rhythm directly, the duration sequence entered by the user is used as the rhythm:
[quarter note, eighth note, quarter note, …]
Second, inputting the rhythm into the trained melody generation model to generate the pitches in the melody
The rhythm is processed using a neural-network-based melody generation model: features are extracted from the rhythm, and the melody is generated using those features. This melody generation model is based on a sequence-to-sequence (seq2seq) model, a neural network model built on the encoder-decoder concept. As shown in fig. 3, one neural network serves as the encoder, encoding the input rhythm sequence and condensing the input information into a fixed-length vector; another neural network then serves as the decoder, decoding that vector to finally produce the complete output sequence. The task addressed by this seq2seq model can be abstracted as: given a sequence x = (x1, x2, x3, …, xn1), obtain a corresponding output sequence y = (y1, y2, y3, …, yn2), i.e., model the probability P(y1, y2, …, yn2 | x1, x2, …, xn1).
In the task of generating a melody from a given rhythm in particular, as shown in fig. 4, the rhythm sequence [quarter note, eighth note, quarter note, …] is taken as the sequence x and the pitch sequence [C, E, G, E, …] as the sequence y. A recurrent neural network (RNN) or other neural network is used as the encoder, and an RNN or other neural network is used as the decoder. The elements of the rhythm sequence x = (x1, x2, x3, …, xn1) are fed into the encoder one by one to obtain a feature representing the rhythm sequence information:
h=Encoder(x)
h is the feature representing the rhythm sequence information. The decoder uses h to produce, for each note, a distribution over pitches:
P(y|x)=Decoder(h)
the final pitch of each note can then be sampled from the probability distribution.
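Sampling one pitch per note from the decoder's per-note distributions might look like this (the pitch vocabulary and the example distributions are invented for illustration, and a fixed seed is used so the sketch is deterministic):

```python
import random

PITCHES = ["C", "D", "E", "F", "G", "A", "B"]  # hypothetical pitch vocabulary

def sample_pitches(distributions, rng=random.Random(0)):
    """Sample the final pitch of each note from its probability distribution.
    `distributions` holds one probability vector over PITCHES per note."""
    return [rng.choices(PITCHES, weights=dist, k=1)[0] for dist in distributions]

# Hypothetical decoder output P(y|x): one probability vector per note.
dists = [
    [0.90, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01],
    [0.02, 0.02, 0.90, 0.02, 0.02, 0.01, 0.01],
]
print(sample_pitches(dists))
```

Sampling (rather than always taking the argmax) keeps some variety in the generated melodies while still favouring high-probability pitches.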
The forms of the encoder and decoder networks here are not limited: each may be the RNN described above, a convolutional neural network (CNN), or an attention-based neural network.
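As a minimal sketch of how a recurrent encoder folds a variable-length rhythm sequence into a fixed-length feature h, here is a toy one-unit RNN with hand-picked, untrained weights; a real encoder would use learned weight matrices and a vector-valued hidden state:

```python
import math

def rnn_encode(xs, w_in=0.5, w_rec=0.8, h0=0.0):
    """Toy one-unit recurrent encoder over a duration sequence xs:
        h_t = tanh(w_in * x_t + w_rec * h_{t-1})
    The weights are arbitrary placeholders, not trained values.
    Whatever the length of xs, the result is a single fixed-size state h."""
    h = h0
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
    return h

# Duration sequences of different lengths map to the same-shaped feature.
print(rnn_encode([1.0, 0.5, 0.5]))
print(rnn_encode([0.25, 0.25, 0.5, 1.0, 1.0]))
```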
In order to obtain a usable melody generation model, the model needs to be trained. First, a data set containing many melodies must be collected. For each melody in the data set, its rhythm sequence and pitch sequence are separated out. The model is then optimized with a stochastic gradient descent algorithm according to the following optimization criterion:
L = arg max P(y | x)

(In practice, the log-likelihood log P(y | x) is maximized over the training pairs, i.e. the negative log-likelihood is minimized by stochastic gradient descent.)
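The quantity driven down by stochastic gradient descent is the negative log-likelihood of the target pitches under the decoder's distributions; the loss for one (rhythm, pitch) training pair can be computed as below (the vocabulary and distributions are illustrative):

```python
import math

def neg_log_likelihood(distributions, target_pitches, pitch_vocab):
    """Loss for one training pair: -sum_t log P(y_t | x).
    Minimizing this is equivalent to maximizing P(y | x)."""
    nll = 0.0
    for dist, pitch in zip(distributions, target_pitches):
        nll -= math.log(dist[pitch_vocab.index(pitch)])
    return nll

vocab = ["C", "E", "G"]                          # hypothetical pitch vocabulary
dists = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]       # decoder output, one row per note
print(neg_log_likelihood(dists, ["C", "E"], vocab))
```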
Third, combining the rhythm and the pitch to obtain the final melody
After the rhythm sequence is fed into the melody generation model to obtain the pitch sequence, the rhythm and the pitches are combined into a note sequence; this sequence is the final melody.
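This final combination step can be sketched directly (a hypothetical helper, assuming the two sequences have equal length, which holds by construction since the model emits one pitch per input duration):

```python
def combine(rhythm, pitches):
    """Zip the rhythm (duration sequence) with the generated pitch sequence
    into the final note sequence, i.e. the melody."""
    if len(rhythm) != len(pitches):
        raise ValueError("rhythm and pitch sequences must have equal length")
    return list(zip(pitches, rhythm))

print(combine([1.0, 0.5, 0.5], ["C", "E", "G"]))
```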
With the method and the device, a melody can be automatically generated from a given rhythm, effectively lowering the barrier to composing and helping non-professionals complete the creation of a melody from their own rhythmic ideas.
According to an embodiment of the present application, an information processing apparatus is provided. Fig. 5 is a schematic diagram of the configuration of the information processing apparatus according to the embodiment of the present application. As shown in fig. 5, the information processing apparatus includes: a rhythm acquisition module 51, configured to acquire rhythm information; a pitch generation module 52, configured to obtain pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model; and a synthesis module 53, configured to synthesize the rhythm information and the pitch information to obtain melody information.
In one embodiment, the pre-trained melody generation model includes: a first submodel for encoding and a second submodel for decoding; the pitch generation module is configured to: inputting the rhythm information into the first submodel, and extracting rhythm characteristic information corresponding to the rhythm information through the first submodel; and inputting the rhythm characteristic information into the second submodel, and decoding the rhythm characteristic information through the second submodel to obtain the pitch information.
In one embodiment, the rhythm acquisition module is configured to: acquire a first user operation, extract a retrieval keyword from the first user operation, and perform query processing according to the retrieval keyword to obtain the rhythm information; or, in response to a first user operation, directly extract the rhythm information from the first user operation.
In one embodiment, the apparatus further comprises: a sample acquisition module, configured to collect a plurality of pieces of melody sample information in advance; an information separation module, configured to perform information separation processing on the plurality of pieces of melody sample information to obtain the rhythm sequence and pitch sequence that form each piece of melody sample information; and a model training module, configured to train the melody generation model according to the rhythm sequences and the pitch sequences to obtain the pre-trained melody generation model.
In one embodiment, the synthesis module is configured to: synthesize the rhythm information and the pitch information to obtain a note duration sequence in which the duration attributes of the notes are arranged in time; and take the note duration sequence as the melody information.
The functions of each module in each apparatus in the embodiment of the present application may refer to corresponding descriptions in the above method, and are not described herein again.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
As shown in fig. 6, the electronic device is a block diagram for implementing the information processing method according to the embodiment of the present application. The electronic device may be the aforementioned deployment device or proxy device. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 6, the electronic apparatus includes: one or more processors 801, a memory 802, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used, as desired, along with multiple memories. Also, multiple electronic devices may be connected, with each device providing a portion of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 6, one processor 801 is taken as an example.
The memory 802 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor, so that the at least one processor executes the information processing method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the information processing method provided by the present application.
The memory 802, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the information processing method in the embodiments of the present application. The processor 801 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 802, that is, implements the information processing method in the above-described method embodiments.
The memory 802 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 802 may include high speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 802 optionally includes memory located remotely from the processor 801, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the information processing method may further include: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected by a bus or other means, as exemplified by the bus connection in fig. 6.
The input device 803 may receive input numeric or character information and generate key signal inputs related to user settings and function controls of the electronic device, such as a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output devices 804 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present application is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. An information processing method, characterized in that the method comprises:
acquiring rhythm information;
obtaining pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model;
and synthesizing according to the rhythm information and the pitch information to obtain melody information.
2. The method of claim 1, wherein the pre-trained melody generation model comprises: a first submodel for encoding and a second submodel for decoding;
the obtaining pitch information corresponding to the rhythm information according to the rhythm information and the pre-trained melody generation model comprises:
inputting the rhythm information into the first submodel, and extracting rhythm characteristic information corresponding to the rhythm information through the first submodel;
and inputting the rhythm characteristic information into the second submodel, and decoding the rhythm characteristic information through the second submodel to obtain the pitch information.
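By way of illustration only, the encoding-decoding flow of claim 2 can be sketched as follows. The patent does not disclose the model internals, so the first and second submodels below are deterministic stand-ins (an actual implementation would use trained neural submodels), and every name, scale, and value here is a hypothetical assumption:

```python
def encode_rhythm(rhythm):
    # First submodel (encoding): map each note duration to a rhythm
    # characteristic; here simply the duration normalized by the total,
    # standing in for a learned encoder such as an RNN.
    total = sum(rhythm)
    return [d / total for d in rhythm]

def decode_pitch(features, scale=(60, 62, 64, 65, 67, 69, 71)):
    # Second submodel (decoding): map each rhythm characteristic to a
    # MIDI pitch from a C-major scale, standing in for a learned decoder.
    return [scale[int(f * len(scale)) % len(scale)] for f in features]

rhythm = [1.0, 0.5, 0.5, 2.0]       # rhythm information: durations in beats
features = encode_rhythm(rhythm)    # rhythm characteristic information
pitches = decode_pitch(features)    # pitch information
print(pitches)                      # [62, 60, 60, 65]
```

Here the rhythm characteristic information is just a normalized duration per note; in a trained model it would be a learned feature vector produced by the encoder.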
3. The method according to claim 1 or 2, wherein the obtaining rhythm information comprises:
acquiring a first user operation, extracting retrieval keywords from the first user operation, and performing query processing according to the retrieval keywords to obtain the rhythm information; or,
in response to a first user operation, the rhythm information is directly extracted from the first user operation.
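The retrieval branch of claim 3 might look like the following sketch, where the rhythm library, the keyword-extraction rule, and all entries are illustrative assumptions rather than anything specified by the claim:

```python
RHYTHM_LIBRARY = {
    # Hypothetical rhythm library keyed by retrieval keyword; neither the
    # keys nor the duration values come from the patent.
    "waltz": [1.0, 0.5, 0.5],
    "march": [0.5, 0.5, 0.5, 0.5],
}

def get_rhythm(user_operation):
    # Extract a retrieval keyword from the user operation (here, naively
    # the last word) and query the library for rhythm information.
    keyword = user_operation.strip().lower().split()[-1]
    return RHYTHM_LIBRARY.get(keyword)

rhythm_info = get_rhythm("play a waltz")
print(rhythm_info)  # [1.0, 0.5, 0.5]
```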
4. The method of claim 1 or 2, further comprising:
collecting a plurality of melody sample information in advance;
carrying out information separation processing on the plurality of melody sample information to obtain a rhythm sequence and a pitch sequence which form each melody sample information;
and training a melody generation model according to the rhythm sequence and the pitch sequence to obtain the pre-trained melody generation model.
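The information separation step of claim 4, which splits each melody sample into the rhythm sequence and pitch sequence used for training, can be sketched as follows, assuming (hypothetically) that a melody sample is a list of (MIDI pitch, duration) pairs:

```python
def separate(melody_sample):
    # Information separation: the melody sample is assumed to be a list
    # of (midi_pitch, duration) pairs; split it into the two sequences
    # that form the melody sample information.
    pitch_seq = [pitch for pitch, _ in melody_sample]
    rhythm_seq = [duration for _, duration in melody_sample]
    return rhythm_seq, pitch_seq

sample = [(60, 1.0), (64, 0.5), (67, 0.5), (72, 2.0)]
rhythm_seq, pitch_seq = separate(sample)
print(rhythm_seq)  # [1.0, 0.5, 0.5, 2.0]
print(pitch_seq)   # [60, 64, 67, 72]
```

The rhythm sequences would then serve as model inputs and the pitch sequences as training targets.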
5. The method according to claim 1 or 2, wherein the synthesizing from the rhythm information and the pitch information to obtain melody information comprises:
synthesizing according to the rhythm information and the pitch information to obtain a note duration sequence generated by arranging the duration attributes of notes in time;
and taking the note duration sequence as the melody information.
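The synthesis of claim 5 pairs the rhythm information with the generated pitch information so that the duration attributes of notes are arranged in time. A minimal sketch, assuming beats-based durations and MIDI pitch numbers (both assumptions, not specified by the claim):

```python
def synthesize(rhythm, pitches):
    # Pair each pitch with its duration and accumulate onsets so that
    # the notes' duration attributes are arranged in time; each note is
    # recorded as an (onset, pitch, duration) triple.
    notes, onset = [], 0.0
    for pitch, duration in zip(pitches, rhythm):
        notes.append((onset, pitch, duration))
        onset += duration
    return notes

melody_info = synthesize([1.0, 0.5, 0.5], [60, 64, 67])
print(melody_info)  # [(0.0, 60, 1.0), (1.0, 64, 0.5), (1.5, 67, 0.5)]
```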
6. An information processing apparatus characterized in that the apparatus comprises:
the rhythm acquisition module is used for acquiring rhythm information;
the pitch generation module is used for obtaining pitch information corresponding to the rhythm information according to the rhythm information and a pre-trained melody generation model;
and the synthesis module is used for carrying out synthesis processing according to the rhythm information and the pitch information to obtain melody information.
7. The apparatus of claim 6, wherein the pre-trained melody generation model comprises: a first submodel for encoding and a second submodel for decoding;
the pitch generation module is configured to:
inputting the rhythm information into the first submodel, and extracting rhythm characteristic information corresponding to the rhythm information through the first submodel;
and inputting the rhythm characteristic information into the second submodel, and decoding the rhythm characteristic information through the second submodel to obtain the pitch information.
8. The apparatus of claim 6 or 7, wherein the tempo obtaining module is configured to:
acquiring a first user operation, extracting retrieval keywords from the first user operation, and performing query processing according to the retrieval keywords to obtain the rhythm information; or,
in response to a first user operation, the rhythm information is directly extracted from the first user operation.
9. The apparatus of claim 6 or 7, further comprising:
the sample acquisition module is used for acquiring a plurality of melody sample information in advance;
the information separation module is used for carrying out information separation processing on the plurality of melody sample information to obtain a rhythm sequence and a pitch sequence which form each melody sample information;
and the model training module is used for training the melody generation model according to the rhythm sequence and the pitch sequence so as to obtain the pre-trained melody generation model.
10. The apparatus of claim 6 or 7, wherein the synthesis module is configured to:
synthesizing according to the rhythm information and the pitch information to obtain a note duration sequence generated by arranging the duration attributes of notes in time;
and taking the note duration sequence as the melody information.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
CN202110475573.XA 2021-04-29 2021-04-29 Information processing method, information processing device, electronic equipment and storage medium Pending CN113192472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110475573.XA CN113192472A (en) 2021-04-29 2021-04-29 Information processing method, information processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113192472A (en) 2021-07-30

Family

ID=76980757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110475573.XA Pending CN113192472A (en) 2021-04-29 2021-04-29 Information processing method, information processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113192472A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920968A (en) * 2021-10-09 2022-01-11 北京灵动音科技有限公司 Information processing method, information processing device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000231381A (en) * 1999-02-08 2000-08-22 Yamaha Corp Melody generating device, rhythm generating device and recording medium
CN109584846A (en) * 2018-12-21 2019-04-05 成都嗨翻屋科技有限公司 A kind of melody generation method based on generation confrontation network
CN109671416A (en) * 2018-12-24 2019-04-23 成都嗨翻屋科技有限公司 Music rhythm generation method, device and user terminal based on enhancing study
CN110853604A (en) * 2019-10-30 2020-02-28 西安交通大学 Automatic generation method of Chinese folk songs with specific region style based on variational self-encoder
CN112420003A (en) * 2019-08-22 2021-02-26 北京峰趣互联网信息服务有限公司 Method and device for generating accompaniment, electronic equipment and computer-readable storage medium
CN112420002A (en) * 2019-08-21 2021-02-26 北京峰趣互联网信息服务有限公司 Music generation method, device, electronic equipment and computer readable storage medium


Similar Documents

Publication Publication Date Title
CN112365882B (en) Speech synthesis method, model training method, device, equipment and storage medium
US20200342646A1 (en) Music driven human dancing video synthesis
CN111935537A (en) Music video generation method and device, electronic equipment and storage medium
CN110955764B (en) Scene knowledge graph generation method, man-machine conversation method and related equipment
CN112270920A (en) Voice synthesis method and device, electronic equipment and readable storage medium
US9064484B1 (en) Method of providing feedback on performance of karaoke song
CN112259072A (en) Voice conversion method and device and electronic equipment
CN110599985B (en) Audio content generation method, server device and client device
CN104866275B (en) Method and device for acquiring image information
CN110674241B (en) Map broadcasting management method and device, electronic equipment and storage medium
JP7240505B2 (en) Voice packet recommendation method, device, electronic device and program
CN112614478B (en) Audio training data processing method, device, equipment and storage medium
US20210407479A1 (en) Method for song multimedia synthesis, electronic device and storage medium
CN111225236A (en) Method and device for generating video cover, electronic equipment and computer-readable storage medium
CN112382287A (en) Voice interaction method and device, electronic equipment and storage medium
CN112446727A (en) Advertisement triggering method, device, equipment and computer readable storage medium
JP2023502815A (en) Method, Apparatus, Apparatus, and Computer Storage Medium for Producing Broadcast Audio
CN111177462B (en) Video distribution timeliness determination method and device
CN113192472A (en) Information processing method, information processing device, electronic equipment and storage medium
CN113158642A (en) Information processing method, information processing device, electronic equipment and storage medium
JP2022022080A (en) Video segment extraction method, video segment extraction apparatus, electronic device, computer-readable storage medium, and computer program
JP2021144742A (en) Similarity processing method, apparatus, electronic equipment, storage medium, and program
CN111147940B (en) Video playing method and device, computer equipment and medium
CN111353070A (en) Video title processing method and device, electronic equipment and readable storage medium
CN113920968A (en) Information processing method, information processing device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination