CN108960250A - Method and device for converting an image into a melody, and computer-readable storage medium - Google Patents

Method and device for converting an image into a melody, and computer-readable storage medium

Info

Publication number
CN108960250A
Authority
CN
China
Prior art keywords
image
point
color
articulation
grid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810427683.7A
Other languages
Chinese (zh)
Other versions
CN108960250B (en)
Inventor
邓立邦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Phase Intelligent Technology Co Ltd
Original Assignee
Guangdong Phase Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Phase Intelligent Technology Co Ltd filed Critical Guangdong Phase Intelligent Technology Co Ltd
Priority to CN201810427683.7A priority Critical patent/CN108960250B/en
Publication of CN108960250A publication Critical patent/CN108960250A/en
Application granted granted Critical
Publication of CN108960250B publication Critical patent/CN108960250B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0033 Recording/reproducing or transmission of music for electrophonic musical instruments
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/101 Music Composition or musical creation; Tools or processes therefor
    • G10H2210/111 Automatic composing, i.e. using predefined musical rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/101 Music Composition or musical creation; Tools or processes therefor
    • G10H2210/145 Composing rules, e.g. harmonic or musical rules, for use in automatic composition; Rule generation algorithms therefor
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/541 Details of musical waveform synthesis, i.e. audio waveshape processing from individual wavetable samples, independently of their origin or of the sound they represent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Auxiliary Devices For Music (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present invention provides a method and a device for converting an image into a melody, and a computer-readable storage medium. The method comprises: obtaining the HSB value of each pixel in a target image and performing color clustering to obtain a color-clustered image corresponding to the target image; normalizing each color block in the color-clustered image to obtain a sounding-point image corresponding to the target image, mapping it onto a pre-established grid, and establishing a mapping between each sounding point in the sounding-point image and each scale degree in the grid; and, according to the mapping, extracting the scale degrees corresponding to the sounding points along a set direction of the grid and converting those scale degrees into audio using the virtual instrument that corresponds to the instrument type determined from the color-clustered image, thereby generating the melody corresponding to the target image. The method converts a target image into a specific melody, greatly reduces the time and cost of producing a melody, and meets people's demand for customized melodies.

Description

Method and device for converting an image into a melody, and computer-readable storage medium
Technical field
The present invention relates to the technical field of image and music processing, and in particular to a method and a device for converting an image into a melody, and a computer-readable storage medium.
Background
Music is a form of expression of human emotion, and melody is the most basic element of music; musicians complete a musical composition by creating a melody. With the continuing development of digital music and computer technology, more and more people want to use computers to compose music automatically to meet individual needs, for example giving a recorded video a unique piece of background music, giving a set of photos a melody to accompany browsing, or setting a one-of-a-kind personalized ringtone for a mobile phone. However, creating one's own melody and music is very difficult for ordinary people. Moreover, music composition currently requires dedicated computer equipment and systems, which are costly, time-consuming, and complicated to operate; for ordinary users in particular, the learning cost is very high and the tools are hard to pick up.
Summary of the invention
The object of the present invention is to provide a method and a device for converting an image into a melody, and a computer-readable storage medium, which can convert a target image into a specific melody, greatly reduce the time and cost of producing a melody, and meet people's demand for customized melodies.
An embodiment of the present invention provides a method for converting an image into a melody, comprising:
obtaining the HSB value of each pixel in a target image, and performing color clustering on the pixels of the target image according to the HSB values to obtain a color-clustered image corresponding to the target image;
normalizing each color block in the color-clustered image to obtain a sounding-point image corresponding to the target image;
mapping the sounding-point image onto a pre-established grid, and establishing a mapping between each sounding point in the sounding-point image and each scale degree in the grid;
extracting the dominant hue of the target image from the color-clustered image;
determining the type of instrument to play according to the dominant hue of the target image and a preset hue-to-instrument lookup table;
according to the mapping, extracting the scale degrees corresponding to the sounding points in the sounding-point image along a set direction of the grid, and converting those scale degrees into audio using a virtual instrument corresponding to the determined instrument type, thereby generating the melody corresponding to the target image.
Preferably, the method for converting an image into a melody further comprises:
collecting cover images of a plurality of instrumental musical works;
extracting the HSB value of each pixel in each cover image, and performing color clustering on the pixels of that cover image according to the HSB values to obtain a template color-clustered image for that cover image, N template color-clustered images being obtained in total;
calculating the area share of each color block in each template color-clustered image to obtain the dominant hue and the dominant-hue area share of that image, which serve as its color distribution;
performing statistical analysis on the color distributions of the N template color-clustered images and the instruments played in the corresponding works, establishing a mapping between the color distribution of each template color-clustered image and the corresponding instrument, and generating the hue-to-instrument lookup table.
Preferably, determining the type of instrument to play according to the dominant hue of the target image and the preset hue-to-instrument lookup table specifically comprises:
calculating the area share of each color block in the color-clustered image of the target image to obtain the dominant-hue area share corresponding to the dominant hue of the target image;
comparing the dominant hue and dominant-hue area share of the target image with the color distributions in the hue-to-instrument lookup table, and determining the instrument whose color distribution differs least from the dominant hue and dominant-hue area share of the target image as the type of instrument to play for that dominant hue in the color-clustered image;
determining, according to the dominant hue of the target image and the dominant-hue area share, the volume share of the instrument corresponding to each color block in the color-clustered image.
Preferably, obtaining the HSB value of each pixel in the target image and performing color clustering on the pixels of the target image according to the HSB values to obtain the color-clustered image specifically comprises:
obtaining the HSB value of each pixel in the target image;
according to the HSB values of the pixels in the target image, finding the pixels whose hue distance exceeds a first threshold, thereby obtaining a plurality of color-change regions;
calculating the average hue of adjacent pixels in the color-change regions whose HSB values differ by less than a second threshold, and merging those adjacent pixels into a color block with that average hue;
when the hue distance between adjacent pixels in the color-change regions is zero, generating the color-clustered image from the merged color blocks.
Preferably, normalizing each color block in the color-clustered image to obtain the sounding-point image corresponding to the target image specifically comprises:
finding the color block with the smallest area in the color-clustered image and setting it as one sounding point;
adjusting the other color blocks in the color-clustered image to integer multiples of the sounding point;
generating the sounding-point image from the sounding points corresponding to the color blocks in the color-clustered image.
Preferably, mapping the sounding-point image onto the pre-established grid and establishing the mapping between each sounding point and each scale degree in the grid specifically comprises:
setting the cell area according to the area of a sounding point and a preset ratio, and establishing the grid, wherein each row of the grid corresponds to one scale degree and each column of the grid corresponds to one time point;
mapping each sounding point in the sounding-point image onto the grid;
when a sounding point lies on a grid line, calculating the share of its area that falls in each of the cells adjoining that grid line, and assigning the sounding point to the adjoining cell that contains the larger share;
establishing the mapping between each sounding point in the sounding-point image and each scale degree in the grid according to the position of each sounding point in the grid and the scale degree corresponding to each row.
Preferably, extracting, according to the mapping, the scale degrees corresponding to the sounding points along the set direction of the grid, and converting them into audio using the virtual instrument corresponding to the instrument type to generate the melody corresponding to the target image, specifically comprises:
the set direction being the direction of the time axis formed by the time points corresponding to the columns of the grid;
according to the mapping, extracting the scale degrees corresponding to the sounding points in the sounding-point image along the time axis of the grid;
when a plurality of sounding points lie in adjacent cells of the same row of the grid, merging them into a single long note at the scale degree of that row;
extracting, along the time axis, the time point corresponding to each sounding point in the sounding-point image;
converting the scale degrees of the sounding points into audio, according to the scale degree and time point of each sounding point, using the virtual instrument corresponding to the instrument type, thereby generating the melody corresponding to the target image.
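The time-axis extraction described above can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the boolean grid layout, the `ROW_TO_DEGREE` table, and the function name `extract_notes` are all assumptions introduced here, and audio synthesis by a virtual instrument is left out.

```python
from typing import List, Tuple

# Hypothetical row-to-scale-degree table (an assumption, not from the patent).
ROW_TO_DEGREE = ["do", "re", "mi", "fa", "sol", "la", "ti"]


def extract_notes(grid: List[List[bool]]) -> List[Tuple[str, int, int]]:
    """Return (scale_degree, start_column, length) events ordered by start time.

    grid[r][c] is True when a sounding point occupies row r, column c.
    Sounding points in adjacent cells of the same row become one long note.
    """
    events = []
    for r, row in enumerate(grid):
        c = 0
        while c < len(row):
            if row[c]:
                start = c
                # Extend the note while the next cell in this row is occupied.
                while c < len(row) and row[c]:
                    c += 1
                events.append((ROW_TO_DEGREE[r], start, c - start))
            else:
                c += 1
    # Read the events along the time axis (stable sort keeps row order on ties).
    events.sort(key=lambda e: e[1])
    return events
```

Each event could then be handed to whatever synthesizer plays the chosen virtual instrument; the column index would be scaled by the duration assigned to one time point.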
Preferably, after extracting, according to the mapping, the scale degrees corresponding to the sounding points along the set direction of the grid and converting them into audio using the virtual instrument corresponding to the instrument type to generate the melody corresponding to the target image, the method further comprises:
adjusting the scale degree corresponding to each row of the grid, re-establishing the mapping between each sounding point in the sounding-point image and each scale degree in the grid, and regenerating the melody corresponding to the target image with the virtual instrument, N melodies corresponding to the target image being obtained in total;
converting the N melodies into waveform diagrams, N waveform diagrams being obtained in total;
for each waveform diagram, calculating its similarity to each of a plurality of template waveform diagrams pre-stored in a waveform template database, and taking the maximum of those similarities as the reference value of that waveform diagram;
extracting, from the N waveform diagrams, the waveform diagram with the largest reference value;
taking the melody corresponding to the waveform diagram with the largest reference value as the target melody of the target image.
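The selection among the N candidate melodies can be sketched as below. The patent does not say how similarity between waveform diagrams is computed, so cosine similarity over equal-length sample sequences is used here purely as an assumed stand-in; the function names are likewise hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure between two equal-length waveforms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reference_value(waveform: Sequence[float],
                    templates: List[Sequence[float]]) -> float:
    """Maximum similarity of one waveform against all template waveforms."""
    return max(cosine_similarity(waveform, t) for t in templates)


def pick_target_melody(waveforms: List[Sequence[float]],
                       templates: List[Sequence[float]]) -> int:
    """Index of the candidate whose reference value is largest."""
    refs = [reference_value(w, templates) for w in waveforms]
    return max(range(len(refs)), key=refs.__getitem__)
```

The returned index identifies which of the N regenerated melodies would be kept as the target melody.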
An embodiment of the present invention further provides a device for converting an image into a melody, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the above method for converting an image into a melody when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium comprising a stored computer program, wherein, when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute the above method for converting an image into a melody.
Compared with the prior art, the method for converting an image into a melody provided by the embodiments of the present invention has the following beneficial effects. The method comprises: obtaining the HSB value of each pixel in a target image, and performing color clustering on the pixels of the target image according to the HSB values to obtain a color-clustered image corresponding to the target image; normalizing each color block in the color-clustered image to obtain a sounding-point image corresponding to the target image; mapping the sounding-point image onto a pre-established grid, and establishing a mapping between each sounding point in the sounding-point image and each scale degree in the grid; extracting the dominant hue of the target image from the color-clustered image; determining the type of instrument to play according to the dominant hue of the target image and a preset hue-to-instrument lookup table; and, according to the mapping, extracting the scale degrees corresponding to the sounding points along a set direction of the grid and converting them into audio using a virtual instrument corresponding to the instrument type, thereby generating the melody corresponding to the target image. The method converts a target image into a specific melody, greatly reduces the time and cost of producing a melody, and meets people's demand for customized melodies. Embodiments of the present invention further provide a device for converting an image into a melody and a computer-readable storage medium.
Brief description of the drawings
Fig. 1 is a flowchart of a method for converting an image into a melody provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a device for converting an image into a melody provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which is a flowchart of a method for converting an image into a melody provided by an embodiment of the present invention, the method comprises:
S100: obtaining the HSB value of each pixel in a target image, and performing color clustering on the pixels of the target image according to the HSB values to obtain a color-clustered image corresponding to the target image;
S200: normalizing each color block in the color-clustered image to obtain a sounding-point image corresponding to the target image;
S300: mapping the sounding-point image onto a pre-established grid, and establishing a mapping between each sounding point in the sounding-point image and each scale degree in the grid;
S400: extracting the dominant hue of the target image from the color-clustered image;
S500: determining the type of instrument to play according to the dominant hue of the target image and a preset hue-to-instrument lookup table;
S600: according to the mapping, extracting the scale degrees corresponding to the sounding points in the sounding-point image along a set direction of the grid, and converting those scale degrees into audio using a virtual instrument corresponding to the instrument type, thereby generating the melody corresponding to the target image.
In this embodiment, color clustering and normalization are applied to the target image to obtain a sounding-point image, which is mapped onto a preset grid to establish the mapping between sounding points and scale degrees; the type of instrument to play is then determined from the dominant hue of the color-clustered image. Using this mapping and the virtual instrument corresponding to the instrument type, the target image is converted into a specific melody along the time axis of the grid. This greatly reduces the time and cost of producing a melody, lowers the difficulty of music production, and meets people's demand for customized melodies, so the method has broad application prospects in personalized mobile-phone ringtones, background music for electronic photo albums and screen savers, scoring of film and television works, and the like.
Step S400, extracting the dominant hue of the target image from the color-clustered image, proceeds as follows. A clustering algorithm partitions the image into blocks by dominant hue: adjacent points whose HSB values are close are repeatedly averaged and merged into the same color block, so that the target image becomes a combination of color blocks of various dominant hues, for example a combination of shapes such as triangles, circles, and rectangles. The color value of each color block is extracted as a dominant hue of the target image, and the area share of each color block in the target image is calculated. Further, the color blocks in the color-clustered image whose area share exceeds a set threshold are determined to be the dominant hues of the target image.
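The area-share calculation and threshold test in step S400 can be illustrated with a short sketch. It assumes the color-clustered image is available as a 2D list holding one block hue per pixel; the function names and the default threshold of 0.1 are assumptions made for this example, since the patent does not give the threshold value.

```python
from collections import Counter
from typing import Dict, List, Tuple


def area_shares(block_map: List[List[int]]) -> Dict[int, float]:
    """Share of the image area covered by each color block hue."""
    counts = Counter(hue for row in block_map for hue in row)
    total = sum(counts.values())
    return {hue: n / total for hue, n in counts.items()}


def dominant_hues(block_map: List[List[int]],
                  min_share: float = 0.1) -> List[Tuple[int, float]]:
    """Hues whose area share exceeds min_share, largest share first.

    min_share is a hypothetical threshold value, not one given in the patent.
    """
    shares = area_shares(block_map)
    kept = [(hue, share) for hue, share in shares.items() if share > min_share]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

For a 3 x 4 image in which eight pixels belong to a hue-210 block, three to a hue-30 block, and one to a hue-120 block, only the first two blocks pass a 0.1 threshold.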
In an optional embodiment, the method for converting an image into a melody further comprises:
collecting cover images of a plurality of instrumental musical works;
extracting the HSB value of each pixel in each cover image, and performing color clustering on the pixels of that cover image according to the HSB values to obtain a template color-clustered image for that cover image, N template color-clustered images being obtained in total;
calculating the area share of each color block in each template color-clustered image to obtain the dominant hue and the dominant-hue area share of that image, which serve as its color distribution;
performing statistical analysis on the color distributions of the N template color-clustered images and the instruments played in the corresponding works, establishing a mapping between the color distribution of each template color-clustered image and the corresponding instrument, and generating the hue-to-instrument lookup table.
In this embodiment, cover images of a large number of instrumental musical works (for example works on CD, DVD, or digital audio sources) are collected and their HSB values are extracted. Adjacent points in each cover image whose HSB values are close are repeatedly averaged and merged, so that each collected cover image is color-clustered into a combination of color-block regions. The color value of each color block is extracted and its area share in the cover image is calculated, giving the dominant hues and area shares of the cover image. Statistical analysis then yields the regularities relating different instruments to the dominant hues and area shares of the cover images, i.e. statistics on the dominant hues and dominant-hue area shares of the cover images associated with a large number of instruments, which constitute the color distributions of the cover images.
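One possible way to turn such cover-image statistics into a lookup table is sketched below. The patent does not specify the statistical analysis, so this example simply assumes one averaged entry per instrument: a circular mean of the dominant hues and an arithmetic mean of the area shares. The sample format and the function names are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def circular_mean_deg(hues: List[float]) -> float:
    """Mean of hue angles in degrees, respecting wrap-around at 360."""
    s = sum(math.sin(math.radians(h)) for h in hues)
    c = sum(math.cos(math.radians(h)) for h in hues)
    return math.degrees(math.atan2(s, c)) % 360.0


def build_table(
    samples: Iterable[Tuple[str, float, float]]
) -> Dict[str, Tuple[float, float]]:
    """Build instrument -> (mean dominant hue, mean dominant-hue area share).

    Each sample is (instrument, dominant_hue_deg, dominant_share), taken from
    one cover image. Averaging per instrument is an assumption of this sketch.
    """
    grouped = defaultdict(list)
    for instrument, hue, share in samples:
        grouped[instrument].append((hue, share))
    table = {}
    for instrument, rows in grouped.items():
        hues = [h for h, _ in rows]
        shares = [s for _, s in rows]
        table[instrument] = (circular_mean_deg(hues), sum(shares) / len(shares))
    return table
```

A richer table could keep several color distributions per instrument instead of a single average; the averaged form is used here only to keep the example short.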
In an optional embodiment, S500, determining the type of instrument to play according to the dominant hue of the target image and the preset hue-to-instrument lookup table, specifically comprises:
calculating the area share of each color block in the color-clustered image of the target image to obtain the dominant-hue area share corresponding to the dominant hue of the target image;
comparing the dominant hue and dominant-hue area share of the target image with the color distributions in the hue-to-instrument lookup table, and determining the instrument whose color distribution differs least from the dominant hue and dominant-hue area share of the target image as the type of instrument to play for that dominant hue in the color-clustered image;
determining, according to the dominant hue of the target image and the dominant-hue area share, the volume share of the instrument corresponding to each color block in the color-clustered image.
In this embodiment, specifically, the dominant hues of the target image from which the melody is to be generated are extracted, and the type of instrument and its playing volume are determined from the dominant hues and their area shares in the target image. Because the color distribution differs from picture to picture (some pictures express more content and have a richer color distribution, while others express less and have a simpler one), a threshold is set to determine the number of dominant hues of the target image, and the area share of each dominant hue determines whether instruments are used singly or in combination. The dominant hues and dominant-hue area shares of the target image are compared with the correspondence data between instruments and color distributions that were compiled and stored on a server in advance, i.e. the hue-to-instrument lookup table, to find the instruments whose color distributions are closest to the dominant-hue combination of the target image; this gives the instrument combination for the melody to be generated. For example, if the target image contains M dominant hues, M corresponding instruments can be determined from these M dominant hues and their area shares, and the M instruments are combined to generate the melody.
For example, when the area share of one color block in the target image reaches 80% or more, a single instrument is used. As another example, suppose the dominant hues of the target image and their shares are: blue 40% (sky), purplish grey 30% (snow-capped mountains), orange 20% (withered grass slope), and blackish green 10% (rocks). The instrument type for each dominant hue is obtained from the mappings between instruments and color distributions in the hue-to-instrument lookup table. The instruments corresponding to the 40%, 30%, 20%, and 10% dominant hues then play the scale degrees marked in the grid together as an ensemble, and the volume is distributed among them according to the dominant-hue area shares.
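The lookup and volume allocation can be sketched as follows. Everything specific in this example is an assumption made for illustration: the contents of `TABLE`, the instrument names, and the difference measure (circular hue distance scaled to 0..1 plus the absolute difference in area share). The patent only requires choosing the entry with the smallest difference.

```python
from typing import Dict, List, Tuple


def hue_distance(a: float, b: float) -> float:
    """Shortest angular distance between two hues, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


# Hypothetical lookup table: instrument -> (dominant hue, dominant-hue share).
TABLE: Dict[str, Tuple[float, float]] = {
    "piano": (210.0, 0.6),
    "cello": (30.0, 0.3),
    "flute": (120.0, 0.2),
}


def closest_instrument(hue: float, share: float,
                       table: Dict[str, Tuple[float, float]] = TABLE) -> str:
    """Instrument whose stored color distribution differs least (assumed metric)."""
    def difference(name: str) -> float:
        t_hue, t_share = table[name]
        return hue_distance(hue, t_hue) / 180.0 + abs(share - t_share)
    return min(table, key=difference)


def plan_ensemble(dominants: List[Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Map each (hue, area share) to (instrument, volume share)."""
    total = sum(share for _, share in dominants)
    return [(closest_instrument(h, s), s / total) for h, s in dominants]
```

With four dominant hues at shares of 40%, 30%, 20%, and 10%, `plan_ensemble` returns four instrument entries whose volume shares follow the same proportions.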
In an optional embodiment, S100, obtaining the HSB value of each pixel in the target image and performing color clustering on the pixels of the target image according to the HSB values to obtain the color-clustered image corresponding to the target image, specifically comprises:
obtaining the HSB value of each pixel in the target image;
according to the HSB values of the pixels in the target image, finding the pixels whose hue distance exceeds a first threshold, thereby obtaining a plurality of color-change regions;
calculating the average hue of adjacent pixels in the color-change regions whose HSB values differ by less than a second threshold, and merging those adjacent pixels into a color block with that average hue;
when the hue distance between adjacent pixels in the color-change regions is zero, generating the color-clustered image from the merged color blocks.
In this embodiment, the first threshold ranges from 60 degrees to 130 degrees and is preferably 60 degrees; the second threshold is 15 degrees. For example, when the hue distance between two pixels in the target image exceeds 60 degrees, a color-change region is identified. After the color-change regions are found, adjacent pixels in the target image are analyzed repeatedly, and adjacent pixels with close HSB values are averaged and merged into one color block. For example, let adjacent pixels A and B have HSB values H 42°, S 43%, B 21% and H 38°, S 42%, B 25% respectively. Their hues are 42° and 38°, so the hue distance is within 15 degrees; averaging the two hues merges the pixels into a color block with hue 40°. Different adjacent pixels are then selected and the analysis and hue averaging are repeated until all adjacent pixels with close HSB values have been merged by their average hue. The target image is thus processed into a number of distinct color blocks, which form the color-clustered image.
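The hue-averaging rule in the worked example can be reproduced with a small sketch. It is a simplification under stated assumptions: only hue is compared (as in the example with pixels A and B), pixels are processed as a single row from left to right, and the running-mean merge in `merge_row` is a hypothetical strategy, not the patent's algorithm.

```python
from typing import List, Tuple

FIRST_THRESHOLD = 60.0   # degrees; preferred value stated in the embodiment
SECOND_THRESHOLD = 15.0  # degrees; value stated in the embodiment


def hue_distance(a: float, b: float) -> float:
    """Shortest angular distance between two hues, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def is_color_change(a: float, b: float) -> bool:
    """True when two pixels differ enough to mark a color-change region."""
    return hue_distance(a, b) > FIRST_THRESHOLD


def merge_row(hues: List[float]) -> List[Tuple[float, int]]:
    """Merge neighboring pixel hues into (mean_hue, pixel_count) blocks."""
    blocks: List[List[float]] = []
    for h in hues:
        if blocks and hue_distance(blocks[-1][0], h) < SECOND_THRESHOLD:
            mean, n = blocks[-1]
            # Signed shortest offset from the current mean to the new hue.
            offset = ((h - mean + 180) % 360) - 180
            blocks[-1] = [(mean + offset / (n + 1)) % 360, n + 1]
        else:
            blocks.append([float(h), 1])
    return [(mean, int(n)) for mean, n in blocks]
```

Applied to the hues 42°, 38°, and 120°, the first two pixels merge into a block with hue 40°, matching the example, while the third starts a new block.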
In an alternative embodiment, S200 (normalizing each color block in the color cluster image to obtain the pronunciation point image corresponding to the target image) specifically includes:
Obtain the color block with the smallest area in the color cluster image, and set that smallest color block as one point of articulation;
Adjust the other color blocks in the color cluster image to integer multiples of the point of articulation;
Generate the pronunciation point image according to the points of articulation corresponding to the color blocks in the color cluster image.
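As a rough illustration of this normalization, the sketch below expresses every color block as a whole number of points of articulation, using the smallest block as the unit. The function name and the round-to-nearest rule are assumptions; the text only requires an integer multiple.

```python
from typing import Dict


def normalize_blocks(block_areas: Dict[str, float]) -> Dict[str, int]:
    """Express each color block as an integer count of points of articulation.

    The smallest block is the unit (one point of articulation). Rounding to the
    nearest integer is an assumed rule; the source does not say how to adjust.
    """
    unit = min(block_areas.values())
    return {name: max(1, round(area / unit)) for name, area in block_areas.items()}
```

For instance, blocks with areas 4.0, 9.0, and 14.5 would become 1, 2, and 4 points of articulation respectively.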
In an alternative embodiment, S300 (mapping the pronunciation point image into a pre-established grid, and establishing the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid) specifically includes:
According to the area of the point of articulation and a preset ratio, set the grid area and establish the grid; wherein each row of the grid corresponds to one scale, and each column of the grid corresponds to one time point;
Map each point of articulation in the pronunciation point image into the grid;
When a point of articulation lies on a grid line of the grid, calculate the area share of the point of articulation in each of the adjacent cells that meet at that grid line, and assign the point of articulation to the adjacent cell in which its area share is larger;
According to the position of each point of articulation of the pronunciation point image in the grid and the scale corresponding to each row of the grid, establish the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid.
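The rule for points of articulation that straddle a grid line can be sketched as below. The sketch assumes, purely for illustration, that a point of articulation is an axis-aligned rectangle and that grid cells are squares of side `cell`; neither detail is stated in the text, and `assign_cell` is a hypothetical name.

```python
from typing import Optional, Tuple


def assign_cell(x0: float, y0: float, x1: float, y1: float,
                cell: float) -> Optional[Tuple[int, int]]:
    """Return the (row, column) of the grid cell holding the largest share of the rectangle.

    The row selects the scale and the column selects the time point.
    """
    best, best_area = None, 0.0
    for row in range(int(y0 // cell), int(y1 // cell) + 1):
        for col in range(int(x0 // cell), int(x1 // cell) + 1):
            # Overlap of the rectangle with this cell along each axis.
            w = min(x1, (col + 1) * cell) - max(x0, col * cell)
            h = min(y1, (row + 1) * cell) - max(y0, row * cell)
            area = max(w, 0.0) * max(h, 0.0)
            if area > best_area:
                best, best_area = (row, col), area
    return best
```

A rectangle spanning x from 8 to 16 in a grid with 10-unit cells overlaps column 0 by 2 units and column 1 by 6 units, so it is assigned to column 1.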
In an alternative embodiment, S400 (according to the mapping relations, extracting the scales corresponding to the points of articulation in the pronunciation point image along the set direction of the grid, converting those scales to audio using the virtual musical instrument corresponding to the type of performing instrument, and generating the melody corresponding to the target image) specifically includes:
The set direction is the time-axis direction formed by the time points corresponding to the columns of the grid;
According to the mapping relations, extract the scales corresponding to the points of articulation in the pronunciation point image along the time-axis direction of the grid;
When multiple points of articulation are located in adjacent cells of any one row of the grid, merge the multiple points of articulation into a long note of the scale corresponding to that row;
According to the time-axis direction, extract the time points corresponding to the points of articulation in the pronunciation point image;
According to the scales and time points corresponding to the points of articulation in the pronunciation point image, use the virtual musical instrument corresponding to the type of performing instrument to convert the scales corresponding to the points of articulation in the pronunciation point image into audio, and generate the melody corresponding to the target image.
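The extraction along the time axis, including the merging of adjacent points of articulation into a long note, can be sketched as follows. The representation is an assumption made for illustration: occupied cells are given as (row, column) pairs, `row_scales` lists the scale label of each row, and each note is returned as (start column, scale label, duration in columns). Audio synthesis with a virtual instrument is outside the scope of this sketch.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Note = Tuple[int, str, int]  # (start column, scale label, duration in columns)


def extract_notes(cells: Iterable[Tuple[int, int]],
                  row_scales: Sequence[str]) -> List[Note]:
    """Read occupied grid cells along the time axis and merge adjacent ones."""
    by_row: Dict[int, Set[int]] = {}
    for row, col in cells:
        by_row.setdefault(row, set()).add(col)

    notes: List[Note] = []
    for row, cols in by_row.items():
        run_start = prev = None
        for col in sorted(cols):
            if prev is not None and col == prev + 1:
                prev = col  # same row, adjacent column: extend the long note
                continue
            if run_start is not None:
                notes.append((run_start, row_scales[row], prev - run_start + 1))
            run_start = prev = col
        notes.append((run_start, row_scales[row], prev - run_start + 1))
    return sorted(notes)  # ordered by time point
```

Three adjacent cells in one row therefore become a single note of duration 3 rather than three separate notes.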
In an alternative embodiment, after the step of extracting, according to the mapping relations, the scales corresponding to the points of articulation in the pronunciation point image along the set direction of the grid, converting those scales to audio using the virtual musical instrument corresponding to the type of performing instrument, and generating the melody corresponding to the target image, the method further includes:
Adjust the scale corresponding to each row of the grid, re-establish the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid, and regenerate the melody corresponding to the target image using the virtual musical instrument, obtaining N melodies corresponding to the target image in total;
Convert the N melodies corresponding to the target image into waveform diagrams respectively, obtaining N waveform diagrams in total;
For each waveform diagram, calculate its similarity to each of the multiple template waveform diagrams pre-stored in a waveform diagram template database, and take the maximum of these similarities as the reference value of that waveform diagram;
Extract the waveform diagram with the maximum reference value from the N waveform diagrams;
Take the melody corresponding to the waveform diagram with the maximum reference value as the target melody of the target image.
In the present embodiment, the scale corresponding to each row of the grid can be adjusted, the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid re-established, and the melody corresponding to the target image regenerated, so that multiple melodies are generated from the same grid. Every music style has its own characteristic scale combination, and a melody composed from the scales in such a combination carries the character of that style or national music. The scale of each row in the grid is therefore set according to the desired style, so that the composed melody has a specific music style. For example, the Chinese pentatonic scale contains the notes 123561; the Japanese hexatonic scale contains the notes 6712346; the Romanian minor scale contains the notes 671#234#56. By changing the scale combination assigned to the rows of the grid, melodies of different styles can be created. N melodies corresponding to the target image are thus obtained. The N melodies are converted into waveform diagrams and matched against multiple template waveform diagrams; for each waveform diagram, the maximum of its similarities to the template waveform diagrams is taken as its reference value, so that each waveform diagram has one reference value. The reference values are compared to find the waveform diagram with the maximum reference value, and the melody corresponding to that waveform diagram is taken as the target melody of the target image. In this way the N generated melodies are screened effectively to obtain the melody whose style is closest to existing music, which improves the quality of melody creation.
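The screening step can be sketched as below: each candidate waveform receives a reference value equal to its best similarity against the templates, and the candidate with the largest reference value is selected. The text does not specify the similarity measure, so cosine similarity over sampled amplitudes is used here purely as an assumption, and all names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity over the common length of two sampled waveforms (assumed measure)."""
    n = min(len(a), len(b))
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(y * y for y in b[:n]))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_target_melody(candidates: Dict[str, List[float]],
                         templates: List[List[float]]) -> Tuple[str, Dict[str, float]]:
    """Return the name of the best candidate and every candidate's reference value."""
    reference = {
        name: max(cosine_similarity(wave, template) for template in templates)
        for name, wave in candidates.items()
    }
    best = max(reference, key=reference.get)
    return best, reference
```

Here the keys of `candidates` would identify the N melodies, for example by the scale combination used to generate each of them.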
Referring to Fig. 2, which is a schematic diagram of an image-to-melody conversion device provided in an embodiment of the present invention, the image-to-melody conversion device includes:
Color cluster module 1, for obtaining the HSB value of each pixel in the target image, and performing color clustering on each pixel of the target image according to the HSB values to obtain the color cluster image corresponding to the target image;
Normalization module 2, for normalizing each color block in the color cluster image to obtain the pronunciation point image corresponding to the target image;
Mapping relations establishing module 3, for mapping the pronunciation point image into a pre-established grid, and establishing the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid;
Dominant hue extraction module 4, for extracting the dominant hue of the target image according to the color cluster image;
Instrument type determining module 5, for determining the type of performing instrument according to the dominant hue of the target image and a preset tone-instrument comparison table;
First melody generation module 6, for extracting, according to the mapping relations, the scales corresponding to the points of articulation in the pronunciation point image along the set direction of the grid, converting those scales to audio using the virtual musical instrument corresponding to the type of performing instrument, and generating the melody corresponding to the target image.
In the present embodiment, the pronunciation point image is obtained by performing color clustering and normalization on the target image and is mapped into a preset grid to establish the mapping relations between points of articulation and scales. Through these mapping relations, the target image can be converted into a specific melody along the time-axis direction of the grid. This greatly reduces the time and cost of producing a melody, lowers the difficulty of music production, and meets the demand for customized melodies, so the device has broad application prospects in personalized mobile phone ringtones, electronic album background music, screen saver background music, film and television soundtracks, and the like.
The dominant hue extraction module 4 is mainly used for extracting the dominant hue of the target image according to the color cluster image. Specifically, a clustering algorithm is used to cluster colors, that is, to partition the image by dominant hue: adjacent points with similar HSB values are repeatedly averaged and aggregated into the same color block, so that the target image is processed into a combination of color blocks of various dominant hues (for example, a combination of triangular, circular, and rectangular shapes). The color value of each color block is extracted as a dominant hue of the target image, and the area share of each color block in the target image is calculated. Further, the color blocks in the color cluster image whose area share is greater than a set threshold are determined to be the dominant hues of the target image.
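A minimal sketch of this area-share filter is given below. The function name and the default threshold of 0.1 are assumptions, since the text only refers to "a set threshold".

```python
from typing import Dict


def dominant_hues(block_areas: Dict[str, float], threshold: float = 0.1) -> Dict[str, float]:
    """Return the area share of every color block whose share exceeds the threshold.

    Keys identify color blocks (for example by hue name); the threshold value is
    an assumed placeholder.
    """
    total = sum(block_areas.values())
    shares = {hue: area / total for hue, area in block_areas.items()}
    return {hue: share for hue, share in shares.items() if share > threshold}
```

With color block areas in the ratio 40 : 30 : 20 : 10 and a threshold of 0.15, the first three blocks would be kept as dominant hues and the smallest one dropped.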
In an alternative embodiment, the image-to-melody conversion device further includes:
Cover image acquisition module, for acquiring the cover images corresponding to the musical works of multiple performing instruments;
Template cluster image generation module, for extracting the HSB value of each pixel in each cover image, and performing color clustering on each pixel of that cover image according to the HSB values to obtain the template color cluster image corresponding to that cover image, obtaining N template color cluster images in total;
Color distribution calculation module, for calculating the area share of each color block in the template color cluster image to obtain the dominant hues of the template color cluster image and their area shares, which together form the color distribution of the template color cluster image;
Tone-instrument comparison table generation module, for statistically analyzing the color distributions of the N template color cluster images and the performing instruments corresponding to them, establishing the mapping relations between the color distribution of each template color cluster image and its corresponding performing instrument, and generating the tone-instrument comparison table.
In the present embodiment, the cover images corresponding to a large number of instrumental musical works (such as works on CD, DVD, or digital audio sources) are collected and their HSB values extracted. Adjacent points with similar HSB values in each cover image are repeatedly averaged and aggregated, so that color clustering is performed on each collected cover image and the cover image is processed into a combination of distinct color block regions. The color value of each color block is extracted and its area share in the cover image is calculated, giving the dominant hues of the cover image and their area shares. Statistical analysis then yields the relationship between different performing instruments and the dominant hues and area shares of their cover images, that is, statistical data on the dominant hues and dominant hue area shares of the cover images of a large number of performing instruments, which is the color distribution of the cover images.
In an alternative embodiment, the instrument type determining module 5 includes:
Area share calculation unit, for calculating the area share of each color block in the color cluster image corresponding to the target image, and obtaining the area shares of the dominant hues of the target image;
Area share comparison unit, for comparing the dominant hues and dominant hue area shares of the target image with the multiple color distributions in the tone-instrument comparison table, and determining the performing instrument whose color distribution in the table differs least from the dominant hues and dominant hue area shares of the target image, as the type of performing instrument corresponding to the dominant hues in the color cluster image;
Volume allocation unit, for determining, according to the dominant hues of the target image and their area shares, the volume share of the performing instrument corresponding to each color block in the color cluster image.
In the present embodiment, specifically, the dominant hues of the target image from which the melody is to be generated are extracted, and the types of performing instruments and their playing volumes are determined according to the dominant hues and their area shares in the target image. The color distribution differs from picture to picture: some pictures express more content and have relatively rich color distributions, while others express little content and have relatively simple color distributions. A threshold is therefore set to determine the number of dominant hues of the target image, and the area share of each dominant hue determines whether performing instruments are used alone or in combination. The dominant hues and dominant hue area shares of the target image are compared with the statistical correspondence data between performing instruments and color distributions stored in advance on a server, that is, the tone-instrument comparison table, to find the performing instruments whose color distributions are closest to the dominant hue combination of the target image. This gives the combination of performing instruments for the melody to be generated. For example, if the target image includes M dominant hues, M corresponding performing instruments can be determined from these M dominant hues and their area shares, and the melody is generated by these M performing instruments playing together.
For example, when the area share of a single color block in the target image reaches 80% or more, a single instrument is used. As another example, suppose the dominant hues of the target image and their area shares are: blue for the sky, 40%; purplish gray for the snow-capped mountain, 30%; orange for the withered grass slope, 20%; and blackish green for the trees and rocks, 10%. The type of performing instrument corresponding to each dominant hue is obtained from the mapping relations between performing instruments and color distributions in the tone-instrument comparison table. The performing instruments corresponding to the 40%, 30%, 20%, and 10% dominant hues then play the scales marked in the grid together as an ensemble, and the volume is distributed among them according to the dominant hue area shares.
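The lookup and volume allocation described above can be sketched as follows. Everything concrete in this sketch is an assumption: the table entries in `EXAMPLE_TABLE` are invented placeholders (the text gives no actual hue-to-instrument pairs), dominant hues are represented as (hue in degrees, area share) pairs, and the "smallest difference" is measured with an ad hoc sum of normalized hue distance and share difference.

```python
from typing import List, Tuple

# Hypothetical table rows: (hue in degrees, typical area share, instrument name).
EXAMPLE_TABLE = [
    (220.0, 0.40, "flute"),
    (30.0, 0.20, "cello"),
    (120.0, 0.30, "harp"),
]


def hue_distance(h1: float, h2: float) -> float:
    """Shortest angular distance between two hues, in degrees."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def choose_instruments(dominant: List[Tuple[float, float]],
                       table: List[Tuple[float, float, str]],
                       solo_share: float = 0.8) -> List[Tuple[str, float]]:
    """Return (instrument, volume share) pairs for the dominant hues.

    If one dominant hue covers at least solo_share of the image, only that
    hue's instrument plays. Volume shares are proportional to area shares.
    """
    solo = [d for d in dominant if d[1] >= solo_share]
    if solo:
        dominant = solo[:1]
    total = sum(share for _, share in dominant)
    result = []
    for hue, share in dominant:
        # Assumed difference measure: normalized hue distance plus share difference.
        entry = min(table, key=lambda e: hue_distance(hue, e[0]) / 180.0 + abs(share - e[1]))
        result.append((entry[2], share / total))
    return result
```

For the four-hue example, the function would be called with four (hue, share) pairs and would return four instruments whose volume shares follow the 40%, 30%, 20%, and 10% area shares, provided the table contains suitable entries.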
In an alternative embodiment, the color cluster module 1 includes: an HSB value acquiring unit, a color change acquiring unit, a color block aggregation unit, and a color cluster image generation unit;
The HSB value acquiring unit, for obtaining the HSB value of each pixel in the target image;
The color change acquiring unit, for obtaining, according to the HSB values of the pixels in the target image, the pixels in the target image whose hue distance exceeds a first threshold, and obtaining multiple color change regions;
The color block aggregation unit, for calculating the average hue of the adjacent pixels in the color change region whose HSB values differ by less than a second threshold, and aggregating those adjacent pixels into a color block corresponding to the average hue;
The color cluster image generation unit, for generating the color cluster image from the aggregated color blocks when the hue distance between adjacent pixels in the color change region is zero.
In the present embodiment, the first threshold ranges from 60 degrees to 130 degrees and is preferably 60 degrees. The second threshold is 15 degrees. For example, when the hue distance between two pixels in the target image exceeds 60 degrees, the area is determined to be a color change region. After the color change regions are found, adjacent pixels in the target image are analyzed repeatedly, and adjacent pixels with similar HSB values are averaged and aggregated into one color block. For example, suppose adjacent pixels A and B have HSB values of H42°, S43%, B21% and H38°, S42%, B25% respectively. Their hue values are 42° and 38°, so the hue distance is within 15 degrees, and averaging the two hue values aggregates pixel A and pixel B into a color block with a hue of H40°. This selection and averaging is repeated for different adjacent pixels until all adjacent pixels with similar HSB values have been aggregated by average hue. The target image is thereby processed into multiple distinct color blocks, and the color cluster image is generated.
In an alternative embodiment, the normalization module 2 includes: a point of articulation setting unit, a point of articulation adjustment unit, and a pronunciation point image generation unit;
The point of articulation setting unit, for obtaining the color block with the smallest area in the color cluster image, and setting that smallest color block as one point of articulation;
The point of articulation adjustment unit, for adjusting the other color blocks in the color cluster image to integer multiples of the point of articulation;
The pronunciation point image generation unit, for generating the pronunciation point image according to the points of articulation corresponding to the color blocks in the color cluster image.
In an alternative embodiment, the mapping relations establishing module 3 includes: a grid establishing unit, a mapping unit, a point of articulation allocation unit, and a mapping relations establishing unit;
The grid establishing unit, for setting the grid area and establishing the grid according to the area of the point of articulation and a preset ratio; wherein each row of the grid corresponds to one scale, and each column of the grid corresponds to one time point;
The mapping unit, for mapping each point of articulation in the pronunciation point image into the grid;
The point of articulation allocation unit, for calculating, when a point of articulation lies on a grid line of the grid, the area share of the point of articulation in each of the adjacent cells that meet at that grid line, and assigning the point of articulation to the adjacent cell in which its area share is larger;
The mapping relations establishing unit, for establishing the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid according to the position of each point of articulation of the pronunciation point image in the grid and the scale corresponding to each row of the grid.
In an alternative embodiment, the first melody generation module 6 includes: a scale extraction unit, a note length setting unit, a time point extraction unit, and a melody generating unit;
The set direction is the time-axis direction formed by the time points corresponding to the columns of the grid;
The scale extraction unit, for extracting, according to the mapping relations, the scales corresponding to the points of articulation in the pronunciation point image along the time-axis direction of the grid;
The note length setting unit, for merging multiple points of articulation into a long note of the scale corresponding to a row when those points of articulation are located in adjacent cells of that row of the grid;
The time point extraction unit, for extracting, according to the time-axis direction, the time points corresponding to the points of articulation in the pronunciation point image;
The melody generating unit, for using, according to the scales and time points corresponding to the points of articulation in the pronunciation point image, the virtual musical instrument corresponding to the type of performing instrument to convert the scales corresponding to the points of articulation in the pronunciation point image into audio, and generating the melody corresponding to the target image.
In an alternative embodiment, the image-to-melody conversion device further includes:
Grid scale adjustment module, for adjusting the scale corresponding to each row of the grid, re-establishing the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid, and regenerating the melody corresponding to the target image using the virtual musical instrument, obtaining N melodies corresponding to the target image in total;
Waveform diagram generation module, for converting the N melodies corresponding to the target image into waveform diagrams respectively, obtaining N waveform diagrams in total;
Similarity calculation module, for calculating, for each waveform diagram, its similarity to each of the multiple template waveform diagrams pre-stored in a waveform diagram template database, and taking the maximum of these similarities as the reference value of that waveform diagram;
Waveform diagram extraction module, for extracting the waveform diagram with the maximum reference value from the N waveform diagrams;
Melody extraction module, for taking the melody corresponding to the waveform diagram with the maximum reference value as the target melody of the target image.
In the present embodiment, the scale corresponding to each row of the grid can be adjusted, the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid re-established, and the melody corresponding to the target image regenerated, so that multiple melodies are generated from the same grid. Every music style has its own characteristic scale combination, and a melody composed from the scales in such a combination carries the character of that style or national music. The scale of each row in the grid is therefore set according to the desired style, so that the composed melody has a specific music style. For example, the Chinese pentatonic scale contains the notes 123561; the Japanese hexatonic scale contains the notes 6712346; the Romanian minor scale contains the notes 671#234#56. By changing the scale combination assigned to the rows of the grid, melodies of different styles can be created. N melodies corresponding to the target image are thus obtained. The N melodies are converted into waveform diagrams and matched against multiple template waveform diagrams; for each waveform diagram, the maximum of its similarities to the template waveform diagrams is taken as its reference value, so that each waveform diagram has one reference value. The reference values are compared to find the waveform diagram with the maximum reference value, and the melody corresponding to that waveform diagram is taken as the target melody of the target image. In this way the N generated melodies are screened effectively to obtain the melody whose style is closest to existing music, which improves the quality of melody creation.
An embodiment of the present invention also provides an image-to-melody conversion device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the above image-to-melody conversion method when executing the computer program.
Illustratively, the computer program can be divided into one or more modules/units, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the image-to-melody conversion device. For example, the computer program can be divided into the color cluster module 1, normalization module 2, mapping relations establishing module 3, dominant hue extraction module 4, instrument type determining module 5, and first melody generation module 6 shown in Fig. 2, whose specific functions are as follows: the color cluster module 1, for obtaining the HSB value of each pixel in the target image, and performing color clustering on each pixel of the target image according to the HSB values to obtain the color cluster image corresponding to the target image; the normalization module 2, for normalizing each color block in the color cluster image to obtain the pronunciation point image corresponding to the target image; the mapping relations establishing module 3, for mapping the pronunciation point image into a pre-established grid, and establishing the mapping relations between each point of articulation in the pronunciation point image and each scale in the grid; the dominant hue extraction module 4, for extracting the dominant hue of the target image according to the color cluster image; the instrument type determining module 5, for determining the type of performing instrument according to the dominant hue of the target image and a preset tone-instrument comparison table; the first melody generation module 6, for using, according to the scales and time points corresponding to the points of articulation in the pronunciation point image, the virtual musical instrument corresponding to the type of performing instrument to convert the scales corresponding to the points of articulation in the pronunciation point image into audio, and generating the melody corresponding to the target image.
The image-to-melody conversion device may be a computing device such as a desktop computer, notebook, palmtop computer, or cloud server. The image-to-melody conversion device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the schematic diagram is only an example of the image-to-melody conversion device and does not limit it; the device may include more or fewer components than illustrated, combine certain components, or use different components. For example, the image-to-melody conversion device may also include input/output devices, network access devices, buses, and the like.
The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the image-to-melody conversion device and connects the various parts of the entire device through various interfaces and lines.
The memory may be used to store the computer program and/or modules. The processor implements the various functions of the image-to-melody conversion device by running or executing the computer program and/or modules stored in the memory and calling the data stored in the memory. The memory may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playing function or an image playing function); the data storage area may store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
If the integrated modules/units of the image-to-melody conversion device are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a computer readable storage medium. Based on this understanding, all or part of the processes in the above method embodiments of the present invention can also be completed by a computer program instructing the relevant hardware. The computer program can be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content included in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
Note that the device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment. In addition, in the drawings of the device embodiments provided by the invention, a connection between modules indicates a communication connection between them, which may be implemented as one or more communication buses or signal lines. A person of ordinary skill in the art can understand and implement this without creative effort.
An embodiment of the invention also provides a computer-readable storage medium that includes a stored computer program, wherein, when the computer program runs, it controls the device on which the computer-readable storage medium resides to execute the image-to-melody conversion method described above.
Compared with the prior art, the beneficial effect of the image-to-melody conversion method provided by the embodiments of the invention is as follows. The method comprises: obtaining the HSB value of each pixel in a target image, and performing color clustering on the pixels of the target image according to the HSB values to obtain a color cluster image corresponding to the target image; normalizing each color block in the color cluster image to obtain a pronunciation point image corresponding to the target image; mapping the pronunciation point image into a pre-established grid, and establishing a mapping between each pronunciation point in the pronunciation point image and each scale in the grid; extracting the dominant hue of the target image according to the color cluster image; determining the type of performing instrument according to the dominant hue of the target image and a preset hue-instrument lookup table; and, according to the mapping, extracting the scales corresponding to the pronunciation points along a set direction of the grid and converting them into audio using a virtual instrument corresponding to the type of performing instrument, thereby generating a melody corresponding to the target image. The method converts a target image into a specific musical melody, greatly reducing the time and cost of producing a melody and meeting people's demand for customized melodies. The embodiments of the invention also provide an image-to-melody conversion device and a computer-readable storage medium.
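As a rough illustration of the first and fourth steps summarized above, the following Python sketch converts RGB pixels to HSB and estimates a dominant hue with its area share. It is only a sketch under stated assumptions: the fixed 30-degree hue bins stand in for the color clustering, and the names `rgb_to_hsb`, `dominant_hue`, and `bin_width` are hypothetical and do not come from the patent.

```python
import colorsys
from collections import Counter

def rgb_to_hsb(r, g, b):
    """Convert 8-bit RGB to (hue in whole degrees, saturation, brightness)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 360.0) % 360, s, v

def dominant_hue(pixels, bin_width=30):
    """Return (start of the most common hue bin in degrees, its area share).

    Assumption: fixed-width hue bins replace the patent's color clustering.
    """
    bins = Counter((rgb_to_hsb(*p)[0] // bin_width) * bin_width for p in pixels)
    hue_bin, count = bins.most_common(1)[0]
    return hue_bin, count / len(pixels)

# Three reddish pixels and one blue pixel: the red bin covers 3/4 of the area.
pixels = [(255, 0, 0), (250, 10, 10), (0, 0, 255), (255, 20, 0)]
print(dominant_hue(pixels))
```

The area share returned here plays the role of the dominant hue area share that the method later compares against the hue-instrument lookup table.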
The above is a preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art may make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications are also regarded as falling within the protection scope of the invention.

Claims (10)

1. An image-to-melody conversion method, characterized by comprising:
obtaining the HSB value of each pixel in a target image, and performing color clustering on the pixels of the target image according to the HSB values to obtain a color cluster image corresponding to the target image;
normalizing each color block in the color cluster image to obtain a pronunciation point image corresponding to the target image;
mapping the pronunciation point image into a pre-established grid, and establishing a mapping between each pronunciation point in the pronunciation point image and each scale in the grid;
extracting the dominant hue of the target image according to the color cluster image;
determining the type of performing instrument according to the dominant hue of the target image and a preset hue-instrument lookup table;
according to the mapping, extracting the scales corresponding to the pronunciation points in the pronunciation point image along a set direction of the grid, and converting those scales into audio using a virtual instrument corresponding to the type of performing instrument, thereby generating a melody corresponding to the target image.
2. The image-to-melody conversion method according to claim 1, characterized in that the method further comprises:
collecting cover images corresponding to musical works played on a plurality of performing instruments;
extracting the HSB value of each pixel in any one of the cover images, and performing color clustering on the pixels of that cover image according to the HSB values to obtain a template color cluster image corresponding to that cover image, N template color cluster images being obtained in total;
calculating the area share of each color block in the template color cluster image to obtain the dominant hue of the template color cluster image and the area share of that dominant hue, as the color distribution of the template color cluster image;
performing statistical analysis on the color distributions of the N template color cluster images and the performing instruments corresponding to the template color cluster images, establishing a mapping between the color distribution of each template color cluster image and its corresponding performing instrument, and generating the hue-instrument lookup table.
3. The image-to-melody conversion method according to claim 2, characterized in that determining the type of performing instrument according to the dominant hue of the target image and the preset hue-instrument lookup table specifically comprises:
calculating the area share of each color block in the color cluster image corresponding to the target image to obtain the dominant hue area share corresponding to the dominant hue of the target image;
comparing the dominant hue and dominant hue area share of the target image with the plurality of color distributions in the hue-instrument lookup table, and determining the performing instrument corresponding to the color distribution in the lookup table that differs least from the dominant hue and dominant hue area share of the target image, as the type of performing instrument corresponding to the dominant hue in the color cluster image;
determining, according to the dominant hue of the target image and the dominant hue area share, the volume share of the performing instrument corresponding to each color block in the color cluster image.
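The smallest-difference lookup in claim 3 can be sketched as a nearest-neighbor search over the table entries. The claim does not define the difference measure, so the metric below (normalized circular hue distance plus absolute difference in area share) is an assumption, and the table contents and the names `hue_distance`, `pick_instrument`, and `TABLE` are purely illustrative.

```python
def hue_distance(a, b):
    """Shortest distance between two hues on the 0-360 degree color wheel."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def pick_instrument(dominant_hue, area_share, table):
    """Return the instrument whose (hue, share) entry is closest to the image's."""
    def difference(entry):
        hue, share, _ = entry
        # Assumed metric: normalized hue distance plus area-share difference.
        return hue_distance(dominant_hue, hue) / 180.0 + abs(area_share - share)
    return min(table, key=difference)[2]

# Hypothetical lookup table: (dominant hue, dominant hue area share, instrument).
TABLE = [
    (0, 0.60, "trumpet"),
    (210, 0.55, "piano"),
    (120, 0.70, "flute"),
]

print(pick_instrument(350, 0.50, TABLE))
print(pick_instrument(200, 0.50, TABLE))
```

Treating hue as circular matters here: a hue of 350 degrees is only 10 degrees away from the red entry at 0 degrees, which a plain subtraction would miss.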
4. The image-to-melody conversion method according to claim 1, characterized in that obtaining the HSB value of each pixel in the target image, and performing color clustering on the pixels of the target image according to the HSB values to obtain the color cluster image corresponding to the target image, specifically comprises:
obtaining the HSB value of each pixel in the target image;
according to the HSB value of each pixel in the target image, identifying the pixels in the target image whose hue distance exceeds a first threshold, and obtaining a plurality of color change regions;
calculating the hue average of adjacent pixels in the color change regions whose HSB values differ by less than a second threshold, and aggregating those adjacent pixels into a color block corresponding to the hue average;
when the hue distance between adjacent pixels in the color change regions is zero, generating the color cluster image from the aggregated color blocks.
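The aggregation step of claim 4 can be illustrated on a single row of hue values. This is a deliberately simplified sketch and not the patented procedure: it works in one dimension, compares each pixel only with its left neighbor, and ignores hue wraparound at 360 degrees. The function name `cluster_row` and the `threshold` value are assumptions.

```python
def cluster_row(hues, threshold=15.0):
    """Merge runs of adjacent pixels whose hue differs by less than threshold.

    Each run is replaced by its mean hue, so neighbors inside a block end up
    with a hue distance of zero.
    """
    runs = []
    for h in hues:
        if runs and abs(h - runs[-1][-1]) < threshold:
            runs[-1].append(h)   # close enough to the previous pixel: same block
        else:
            runs.append([h])     # hue jump: start a new color block
    clustered = []
    for run in runs:
        mean = sum(run) / len(run)
        clustered.extend([mean] * len(run))
    return clustered

row = [10, 12, 14, 200, 204, 90]
print(cluster_row(row))
```

After one pass every block is uniform, which corresponds to the stopping condition in the claim where the hue distance between adjacent pixels within a region is zero.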
5. The image-to-melody conversion method according to claim 4, characterized in that normalizing each color block in the color cluster image to obtain the pronunciation point image corresponding to the target image specifically comprises:
obtaining the color block with the smallest area in the color cluster image, and setting that smallest color block as one pronunciation point;
adjusting the other color blocks in the color cluster image to integer multiples of the pronunciation point;
generating the pronunciation point image according to the pronunciation points corresponding to the color blocks in the color cluster image.
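The normalization in claim 5 amounts to expressing every color block as a whole number of smallest-block units. The sketch below assumes that "adjusting to an integer multiple" means rounding to the nearest multiple, which the claim does not state; the name `normalize_blocks` and the sample areas are hypothetical.

```python
def normalize_blocks(areas):
    """Express each block area as a whole number of pronunciation points.

    The smallest block defines one pronunciation point; other blocks are
    rounded to the nearest multiple of it (an assumed rounding rule).
    """
    unit = min(areas.values())
    return {name: max(1, round(area / unit)) for name, area in areas.items()}

# Block areas in pixels, keyed by an illustrative block label.
areas = {"red": 40, "blue": 95, "green": 130}
print(normalize_blocks(areas))
```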
6. The image-to-melody conversion method according to claim 5, characterized in that mapping the pronunciation point image into the pre-established grid, and establishing the mapping between each pronunciation point in the pronunciation point image and each scale in the grid, specifically comprises:
setting the cell area according to the area of the pronunciation point and a preset ratio, and establishing the grid, wherein each row of the grid corresponds to a scale and each column of the grid corresponds to a time point;
mapping each pronunciation point in the pronunciation point image into the grid;
when a pronunciation point lies on a grid line of the grid, calculating the area share of the pronunciation point in each of the adjacent cells joined by that grid line, and assigning the pronunciation point to the adjacent cell in which its area share is larger;
establishing the mapping between each pronunciation point in the pronunciation point image and each scale in the grid according to the position of each pronunciation point in the grid and the scale corresponding to each row of the grid.
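The tie-breaking rule in claim 6 for a pronunciation point that straddles a grid line can be sketched with rectangle overlap areas. The sketch assumes axis-aligned rectangular pronunciation points and square cells; `overlap`, `assign_cell`, and `SCALE_BY_ROW` are hypothetical names, and the note names are only placeholders for the per-row scales.

```python
def overlap(a, b):
    """Overlap area of two axis-aligned rectangles given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def assign_cell(point, cell_size, n_rows, n_cols):
    """Return the (row, col) of the grid cell that contains most of the point."""
    best, best_area = None, 0.0
    for row in range(n_rows):
        for col in range(n_cols):
            cell = (col * cell_size, row * cell_size,
                    (col + 1) * cell_size, (row + 1) * cell_size)
            area = overlap(point, cell)
            if area > best_area:
                best, best_area = (row, col), area
    return best

# Assumed row-to-scale assignment; each column would be one time step.
SCALE_BY_ROW = ["C5", "B4", "A4", "G4"]

# This point crosses the vertical grid line x = 10; most of it lies in column 1.
row, col = assign_cell((8, 12, 14, 18), cell_size=10, n_rows=4, n_cols=4)
print(row, col, SCALE_BY_ROW[row])
```

Once every point has a cell, the row index gives its scale and the column index gives its time point, which is the mapping the claim establishes.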
7. The image-to-melody conversion method according to claim 6, characterized in that, according to the mapping, extracting the scales corresponding to the pronunciation points in the pronunciation point image along the set direction of the grid, and converting those scales into audio using the virtual instrument corresponding to the type of performing instrument to generate the melody corresponding to the target image, specifically comprises:
the set direction being the time-axis direction formed by the time points corresponding to the columns of the grid;
according to the mapping, extracting the scales corresponding to the pronunciation points in the pronunciation point image along the time-axis direction of the grid;
when a plurality of pronunciation points lie in adjacent cells of any one row of the grid, merging the plurality of pronunciation points into a long note of the scale corresponding to that row;
according to the time-axis direction, extracting the time points corresponding to the pronunciation points in the pronunciation point image;
according to the scales and time points corresponding to the pronunciation points in the pronunciation point image, converting the scales corresponding to the pronunciation points into audio using the virtual instrument corresponding to the type of performing instrument, thereby generating the melody corresponding to the target image.
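The extraction along the time axis in claim 7, including the merging of adjacent pronunciation points in one row into a long note, can be sketched on a small occupancy grid. The grid encoding (1 for an occupied cell), the tuple layout, and the names `extract_notes` and `scale_by_row` are assumptions made for illustration; audio rendering with a virtual instrument is left out.

```python
def extract_notes(grid, scale_by_row):
    """Return (start_column, length, note) tuples sorted by start time.

    Consecutive occupied cells in the same row are merged into one long note.
    """
    notes = []
    for row, cells in enumerate(grid):
        col = 0
        while col < len(cells):
            if cells[col]:
                start = col
                while col < len(cells) and cells[col]:
                    col += 1     # extend the note across adjacent occupied cells
                notes.append((start, col - start, scale_by_row[row]))
            else:
                col += 1
    return sorted(notes)

# Rows are scales, columns are time points; 1 marks a pronunciation point.
grid = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(extract_notes(grid, ["E4", "D4", "C4"]))
```

The resulting list is a simple note sequence (onset, duration in time steps, pitch) that a synthesizer or MIDI writer could then turn into audio.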
8. The image-to-melody conversion method according to claim 7, characterized in that, after extracting, according to the mapping, the scales corresponding to the pronunciation points in the pronunciation point image along the set direction of the grid and converting those scales into audio using the virtual instrument corresponding to the type of performing instrument to generate the melody corresponding to the target image, the method further comprises:
adjusting the scale corresponding to each row of the grid, re-establishing the mapping between each pronunciation point in the pronunciation point image and each scale in the grid, and regenerating the melody corresponding to the target image using the virtual instrument, N melodies corresponding to the target image being obtained in total;
converting the N melodies corresponding to the target image into waveform diagrams respectively, N waveform diagrams being obtained in total;
calculating the similarity between any one of the waveform diagrams and each of a plurality of template waveform diagrams pre-stored in a waveform diagram template database, and taking the maximum of the similarities between that waveform diagram and the plurality of template waveform diagrams as the reference value of that waveform diagram;
extracting the waveform diagram with the largest reference value from the N waveform diagrams;
taking the melody corresponding to the waveform diagram with the largest reference value as the target melody of the target image.
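The selection step in claim 8 reduces to a max-of-max search: each candidate waveform gets the similarity to its best-matching template as a reference value, and the candidate with the largest reference value wins. The claim does not specify the similarity measure, so cosine similarity over equal-length sample lists is used here as an assumption; `cosine_similarity` and `pick_target_melody` are hypothetical names.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length sample sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pick_target_melody(candidates, templates):
    """Return the index of the candidate whose best template match is highest."""
    # Reference value of a candidate: its highest similarity to any template.
    reference = [max(cosine_similarity(c, t) for t in templates)
                 for c in candidates]
    return max(range(len(candidates)), key=reference.__getitem__)

templates = [[1, 0, -1, 0], [0, 1, 0, -1]]
candidates = [[1, 1, -1, -1], [0, 2, 0, -2], [1, 0, 0, 0]]
print(pick_target_melody(candidates, templates))
```

In this toy data the second candidate is a scaled copy of the second template, so it has the highest reference value and is selected.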
9. An image-to-melody conversion device, characterized by comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the image-to-melody conversion method according to any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored computer program, wherein, when the computer program runs, it controls the device on which the computer-readable storage medium resides to execute the image-to-melody conversion method according to any one of claims 1 to 8.
CN201810427683.7A 2018-05-07 2018-05-07 Method and device for converting image into melody and computer readable storage medium Active CN108960250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810427683.7A CN108960250B (en) 2018-05-07 2018-05-07 Method and device for converting image into melody and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810427683.7A CN108960250B (en) 2018-05-07 2018-05-07 Method and device for converting image into melody and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN108960250A true CN108960250A (en) 2018-12-07
CN108960250B CN108960250B (en) 2020-08-25

Family

ID=64498915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810427683.7A Active CN108960250B (en) 2018-05-07 2018-05-07 Method and device for converting image into melody and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN108960250B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005165194A (en) * 2003-12-05 2005-06-23 Nippon Hoso Kyokai <Nhk> Music data converter and music data conversion program
CN1862656A (en) * 2005-05-13 2006-11-15 杭州波导软件有限公司 Method for converting musci score to music output and apparatus thereof
KR20070059253A (en) * 2005-12-06 2007-06-12 최종민 The method for transforming the language into symbolic melody
KR20080083433A (en) * 2007-03-12 2008-09-18 주식회사 하모니칼라시스템 Method and apparatus for converting image to sound
CN102289778A (en) * 2011-05-10 2011-12-21 南京大学 Method for converting image into music
CN104918059A (en) * 2015-05-19 2015-09-16 京东方科技集团股份有限公司 Method and device for image transmission and terminal device
CN105205047A (en) * 2015-09-30 2015-12-30 北京金山安全软件有限公司 Playing method, converting method and device of musical instrument music score file and electronic equipment
CN106203465A (en) * 2016-06-24 2016-12-07 百度在线网络技术(北京)有限公司 A kind of method and device generating the music score of Chinese operas based on image recognition
CN107239482A (en) * 2017-04-12 2017-10-10 中国科学院光电研究院 A kind of processing method and server for converting the image into music
CN107967476A (en) * 2017-12-05 2018-04-27 北京工业大学 A kind of method that image turns sound

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109872340A (en) * 2019-01-03 2019-06-11 广东智媒云图科技股份有限公司 Patterning process and its electronic equipment, computer readable storage medium
CN109872340B (en) * 2019-01-03 2023-06-27 广东智媒云图科技股份有限公司 Composition method, electronic device and computer readable storage medium
CN110322520A (en) * 2019-07-04 2019-10-11 厦门美图之家科技有限公司 Image key color extraction method, apparatus, electronic equipment and storage medium
CN112652030A (en) * 2020-12-11 2021-04-13 浙江工商大学 Color space position layout recommendation method based on specific scene
CN112652030B (en) * 2020-12-11 2023-09-19 浙江工商大学 Color space position layout recommendation method based on specific scene
CN113160781A (en) * 2021-04-12 2021-07-23 广州酷狗计算机科技有限公司 Audio generation method and device, computer equipment and storage medium
CN113160781B (en) * 2021-04-12 2023-11-17 广州酷狗计算机科技有限公司 Audio generation method, device, computer equipment and storage medium
CN113885829A (en) * 2021-10-25 2022-01-04 北京字跳网络技术有限公司 Sound effect display method and terminal equipment
CN113885829B (en) * 2021-10-25 2023-10-31 北京字跳网络技术有限公司 Sound effect display method and terminal equipment

Also Published As

Publication number Publication date
CN108960250B (en) 2020-08-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant