WO2020244215A1 - Palette generation method and system based on data distribution - Google Patents
Palette generation method and system based on data distribution
- Publication number
- WO2020244215A1 (application PCT/CN2019/130087, CN2019130087W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- palette
- color
- separation
- solution
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/904—Browsing; Visualisation therefor
Definitions
- The present invention belongs to the technical field of data visualization and specifically relates to a method and system for generating a palette based on data distribution.
- Common methods for visualizing categorical data are histograms, line charts and scatter plots. Each class is usually represented by one color, and the main task is to make different classes easy to distinguish. The perceived separation between classes is strongly affected by color, yet finding an appropriate palette remains a complex and time-consuming task, even for experts.
- For palette design for categorical data, there are currently three main approaches: palette generation, palette color assignment, and palette color optimization.
- Color harmony: This type of method generally generates palettes from existing harmony templates, as in Adobe Color CC and COLOURlovers. Although these palettes are aesthetically pleasing, they may not suit visualization tasks that require a high degree of discrimination.
- Gramazio et al. (C. C. Gramazio, D. H. Laidlaw, and K. B. Schloss. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Trans. Vis. & Comp. Graphics, 23(1):521–530, 2017. doi:10.1109/tvcg.2016.2598918) reorganized the pair-preference regression model of Schloss and Palmer and used it (Colorgorical) to generate aesthetically pleasing palettes.
- Class discriminability: This type of method is generally based on perceptual constraints, for example that colors should be well separated, should not compete with each other, and should be attractive. Healey (C. G. Healey. Choosing effective colours for data visualization. In Proc. IEEE Conf. on Visualization, pp. 263–270, 1996. doi:10.1109/visual.1996.568118) proposed dividing the Munsell space into 10 hue intervals and choosing one representative color in each interval while maximizing the perceptual distance between all colors. This method has two drawbacks: a) it ignores aesthetics; b) it is limited to geographic data. Colorgorical overcomes both drawbacks, but it does not take the data distribution into account, so it cannot separate the given data well.
- ColorBrewer: Another way to design a highly discriminable palette is to use a pre-designed palette.
- A typical example is ColorBrewer, an online tool for selecting palettes.
- Although ColorBrewer provides many high-quality palettes, it does not allow users to adjust them.
- Colorgorical allows users to generate palettes by specifying the desired hues, but it does not consider the underlying data, so it cannot design colors for specified classes.
- Palette color assignment: This kind of method generally assigns the colors of a palette to the classes by maximizing the class separation in a multi-class scatter plot, so as to make the analysis of multi-class scatter plots more efficient.
- Wang et al. (Y. Wang, X. Chen, T. Ge, C. Bao, M. Sedlmair, C.-W. Fu, O. Deussen, and B. Chen. Optimizing color assignment for perception of class separability in multiclass scatterplots. IEEE Trans. Vis. & Comp. Graphics, 25(1):820–829, 2019. doi:10.1109/TVCG.2018.2864912) proposed using KNNG to measure the separation between two classes in order to assign colors.
- However, their method requires the user to provide a palette that is itself highly discriminable.
- To overcome the shortcomings of the prior art, the present invention provides a palette generation method and system based on data distribution, suitable for visualizing categorical data such as scatter plots, line charts and bar charts. It can generate highly discriminable palettes and, by taking the data distribution into account together with the discriminability and aesthetics of colors, improves the effect of data visualization and thereby the efficiency of visual analysis.
- one or more embodiments of the present invention provide the following technical solutions:
- One or more embodiments provide a method for generating a palette based on data distribution, including the following steps:
- One or more embodiments provide a palette generation system based on data distribution, including:
- a data loading module, which receives categorical data and color data, the color data including a discretized color space;
- a data distribution determining module, which projects the categorical data into the visual space and obtains position information of the categorical data;
- a separation measurement module, which measures the separation between classes based on the position information;
- a palette optimization module, which randomly selects multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finds an approximately optimal solution with the simulated annealing algorithm and generates the palette;
- a data rendering module, which renders the categorical data based on the palette.
- One or more embodiments provide an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor.
- When the processor executes the program, it implements the aforementioned palette generation method based on data distribution.
- One or more embodiments provide a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the palette generation method based on data distribution is implemented.
- The above one or more technical solutions have the following beneficial effects:
- The present invention obtains the distribution of the categorical data and then uses a simulated annealing algorithm to generate palettes automatically. By taking the data distribution into account together with the discriminability and aesthetics of colors, it improves the effect of data visualization and thereby the efficiency of visual analysis.
- Figure 1 is a flowchart of a method for generating a palette based on data distribution in one or more embodiments of the present invention;
- Figures 2(a)-2(d) are schematic diagrams of one simulated annealing run in one or more embodiments of the present invention: at initialization, after 30 iterations, after 150 iterations, and the final result, respectively;
- Figures 3(a)-3(d) are schematic diagrams of the results with weights λ=0, λ=0.3, λ=0.6 and λ=1.0 in one or more embodiments of the present invention, respectively;
- Figures 4(a)-4(d) are schematic diagrams of the results with perturbation probabilities of 0.1, 0.5, 0.8 and 1.0 in one or more embodiments of the present invention, respectively;
- Figures 5(a)-5(f) are schematic diagrams of the effect of different weight settings in one or more embodiments of the present invention;
- Figures 6(a)-6(b) are schematic diagrams of the calculation for the histogram and the line chart in one or more embodiments of the present invention;
- Figures 7(a)-7(d) are schematic diagrams of the results for the histogram and the line chart in one or more embodiments of the present invention.
- This embodiment provides a method for generating a palette based on data distribution, as shown in FIG. 1, which specifically includes the following steps:
- Step 1: Load categorical data and color data.
- The color data contains the discretized LAB color space.
- The categorical data is a scatter plot, a line chart, a bar chart, etc.
- Step 2: Project the categorical data into the visual space (screen space) and obtain the position data of the categorical data.
- When the categorical data is a scatter plot, the position of each point in each class of scattered points is obtained;
- When the categorical data is a bar chart, the position of the geometric center of each bar is obtained, as shown in Figure 6(a);
- When the categorical data is a line chart, the lines in the chart are discretized at a set interval and the position of each discrete point on each line is obtained, as shown in Figure 6(b).
- Step 3: Calculate the class separation.
- KNNG: density-sensitive K-Nearest Neighbor Graph
- DSC: Distance Consistency
- $n_i$ represents the number of points in class $C_i$
- $\Omega(x_s)$ represents all neighbors of the point $x_s$
- $\delta(l(x_t), j) = 1$ when the class label of $x_t$ is $j$, otherwise 0
- Two classes that overlap each other get a larger degree of non-separation, which means that the color difference between them should be large.
- $kns(C_i, C_j)$ is different from $kns(C_j, C_i)$, so both need to be counted.
- The "KNNG distance" only calculates the distance between two adjacent classes that intersect and does not calculate the distances to other classes. This may result in two classes without intersection being given similar colors, as shown in Figure 2a.
- To solve this problem, the "DSC distance" based on the class center is introduced.
- $a(x_i)$ is the intra-class distance
- $b(x_i)$ is the inter-class distance.
- $ns(C_i, C_j) = \lambda\, dns(C_i, C_j) + (1 - \lambda)\, kns(C_i, C_j)$
- $ns(C_i, C_j)$ is the final degree of non-separation between any two classes.
- $\lambda$ is a weight that can be adjusted by the user (the effect of different $\lambda$ is shown in Figure 2: with KNNG alone, more similar colors are produced, and with DSC alone, the contrast between overlapping classes is weaker). The effect of different $\lambda$ weights is shown in Figures 3(a)-3(d).
- Step 4: Use the simulated annealing algorithm to quickly find an approximately optimal solution.
- The user influences the final result by setting weights, as shown in Figures 5(a)-5(f).
- Step 4.1: Randomly select m colors from the discretized color space to form the initial solution (m is the number of classes in the data), and set the initial temperature, the cooling coefficient and the minimum temperature. A complete iterative process is shown in Figures 2(a)-2(d).
- Step 4.2: If the current temperature is greater than the minimum temperature, execute the next step; otherwise exit the iteration and return the final result.
- Step 4.3: Randomly perturb the initial solution to obtain a new solution. Because a purely random perturbation is unlikely to reach the optimal solution in finite time, a color is perturbed at random when the random probability is less than 0.5; when it is greater than or equal to 0.5, the perturbation follows this principle: the new color should give the current palette a higher score. The effect of different random probabilities is shown in Figures 4(a)-4(d).
- Step 4.4: Check whether all colors in the new solution meet the (larger) JND requirement; if not, perturb the colors until they do. If no palette that meets the condition (a large color difference) can be found at the current JND, adjust the colors so that the JND decreases, until the JND cannot be decreased any further; then exit the iteration and return the current solution.
- JND: For the calculation of the JND (Just Noticeable Difference), see M. Stone, D. A. Szafir, and V. Setlur. An engineering model for color difference as a function of size. In Color and Imaging Conference, vol. 2014, pp. 253–258. Society for Imaging Science and Technology, 2014.
- Step 4.5: Score the current solution.
- The scoring function is as follows:
- Aesthetics score: We use pair preference (Pair Preference) and saturation variance (Saturation Variance) to measure the aesthetics of the palette, as follows:
- Pair preference refers to how much people like a color combination. Schloss and Palmer (K. B. Schloss and S. E. Palmer. Aesthetic response to color combinations: preference, harmony, and similarity. Attention, Perception, & Psychophysics, 73(2):551–571, 2011. doi:10.3758/s13414-010-0027-0) found a linear regression model that predicts the user's preference for color combinations. It is composed of three factors: coolness ($\kappa$), hue similarity ($\Delta H$) and lightness contrast ($\Delta L$).
- The pair preferences among all colors of the discretized LAB color space, computed with the above formula, form the final color preference matrix.
- $S(c_i)$ represents the saturation of the color $c_i$
- $\mu$ represents the mean saturation of the entire palette.
- ND: Name Difference (J. Heer and M. Stone. Color naming models for color selection, image editing and palette design. In ACM Human Factors in Computing Systems (CHI), pages 1007–1016, New York, NY, USA, 2012. ACM.). The specific calculation method is:
- T is a color name association matrix with C rows and W columns.
- The color name association matrix and the color preference matrix are constructed in advance for subsequent calls.
- CD: Class Discriminability
- $\Delta\varepsilon(c_i, c_j)$ represents the CIEDE2000 distance
- $ns(C_i, C_j)$ represents the degree of non-separation of the two classes $C_i$ and $C_j$.
- $c_i$ represents the color of class $C_i$
- $\Delta L(c_i, c_0)$ represents the lightness difference between $c_i$ and the background color $c_0$.
- $E(P) = \omega_1 CD(C, P) + \omega_2 CB(C, P, c_0) + \omega_3 ND(P) + \omega_4 PP(P) + \omega_5 SV(P)$
- $\omega_1$ to $\omega_5$ are user-defined weight coefficients, as shown in Figures 5(a)-5(f). Figures 5(a) and 5(d) are results generated with high Class Visibility and high Name Difference; Figures 5(b) and 5(e) are generated with high Pair Preference and high Contrast with Background.
- The Saturation Variance of the palette in Figure 5(c) is smaller, and the SV in Figure 5(f) is larger.
- Step 4.6: If the score of the current solution is better than that of the previous solution, perform color assignment on the current solution, assigning its colors to the classes to obtain a better result.
- The specific process is as follows:
- $C_r = \tau(l(x_i))$ and $C_s = \tau(l(x_j))$ represent the colors of $x_i$ and $x_j$, respectively
- $\Omega_i$ is the set of the k neighbors of point $x_i$
- $\Delta\varepsilon$ is the CIEDE2000 distance matrix
- $g(d(x_i, x_j))$ is a function of the distance between two points whose purpose is to give closer points a larger weight
- $g(d) = 1/d$.
- $\Delta L(C_r, C_b)$ is the difference between the lightness of the point and that of the background color
- $C_b$ is the background color
- $ns(x_i)$ is the degree of non-separation based on the position of the point.
- The final objective function is:
- $\lambda$ is the weight coefficient
- Step 4.7: If the score of the current solution is worse than that of the previous solution, accept the current solution with probability $\exp(\Delta E / T_t)$, where $\Delta E$ is the difference between the scores of the current solution and the previous solution, and $T_t$ is the current temperature.
- Step 4.8: Lower the temperature and return to Step 4.2 to continue iterating.
- Step 5 Use the generated palette to render the data.
- the purpose of this embodiment is to provide a palette generation system based on data distribution.
- this embodiment provides a palette generation system based on data distribution, including:
- a data loading module, which receives categorical data and color data, the color data including a discretized color space;
- a data distribution determining module, which projects the categorical data into the visual space and obtains position information of the categorical data;
- a separation measurement module, which measures the separation between classes based on the position information;
- a palette optimization module, which randomly selects multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finds an approximately optimal solution with the simulated annealing algorithm and generates the palette;
- a data rendering module, which renders the categorical data based on the palette.
- the purpose of this embodiment is to provide an electronic device.
- this embodiment provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor.
- When the processor executes the program, the steps of the palette generation method are implemented.
- the purpose of this embodiment is to provide a computer-readable storage medium.
- this embodiment provides a computer-readable storage medium having a computer program stored thereon, and the following steps are performed when the program is executed by a processor:
- The term "computer-readable storage medium" should be understood to include a single medium or multiple media containing one or more instruction sets; it should also be understood to include any medium that can store, encode, or carry an instruction set for execution by a processor and that causes the processor to execute any method of the present invention.
- The present invention obtains the distribution of the categorical data and then uses a simulated annealing algorithm to generate palettes automatically. By taking the data distribution into account together with the discriminability and aesthetics of colors, it improves the effect of data visualization and thereby the efficiency of visual analysis.
- The present invention uses the simulated annealing algorithm to solve automatically for the optimal combination of palette colors. The score of a solution combines the user's color preference, the color saturation variance, the color name difference, the class separation and the contrast with the background, yielding a palette that balances aesthetics and color discriminability. The visualization is therefore more reasonable, consistent with human perception, and gives a good user experience.
- The present invention sets a user-adjustable weight coefficient when computing the class separation; by adjusting this coefficient, visualization results that favor separation and match the user's preference can be generated.
- The palette generation method of the present invention can be applied to various types of categorical data, including but not limited to scatter plots, line charts, and bar charts.
- Those skilled in the art should understand that the modules or steps of the present invention can be implemented by a general-purpose computing device. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as individual integrated circuit modules, or several of the modules or steps can be fabricated as a single integrated circuit module.
- the present invention is not limited to any specific combination of hardware and software.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Image Generation (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
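A palette generation method and system based on data distribution. The method comprises the following steps: receiving categorical data and color data, the color data including a discretized color space; projecting the categorical data into visual space and obtaining position information of the categorical data; randomly selecting multiple colors from the discretized color space as an initial solution, searching for an approximately optimal solution with a simulated annealing algorithm, and generating a palette; and rendering the categorical data based on the palette. The method takes into account the data distribution, the differentiation of colors and their aesthetics, yields good visualization results, and improves the efficiency of visual analysis.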
Description
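The present invention belongs to the technical field of data visualization and specifically relates to a palette generation method and system based on data distribution.
Common methods for visualizing categorical data include histograms, line charts and scatter plots. Each class is usually represented by one color, and the main task is to make different classes easy to tell apart. The perceived separation between classes is strongly affected by color, yet finding an appropriate palette remains a complex and time-consuming task, even for experts.
For palette design for categorical data, there are currently three main approaches: palette generation, palette color assignment, and palette color optimization: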
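(1) Palette generation
Generating suitable palettes for categorical data has received considerable attention in the field of visual design. Most existing color selection methods are based on three strategies: color harmony, color discriminability, and color name relationships.
Color harmony: Methods of this type generally generate palettes from existing harmony templates, as in Adobe Color CC and COLOURlovers. Although these palettes are aesthetically pleasing, they may not suit visualization tasks that require high discriminability.
Another way to generate palettes based on harmony is to use aesthetic preference. Aesthetic preference and color harmony are often regarded as the same thing, although they are not. Schloss and Palmer (K. B. Schloss and S. E. Palmer. Aesthetic response to color combinations: preference, harmony, and similarity. Attention, Perception, & Psychophysics, 73(2):551–571, 2011. doi:10.3758/s13414-010-0027-0) found that harmonious colors are aesthetically pleasing, but harmony does not account for whether users like the combination, whereas pair preference is defined as "how much an observer likes a given pair of colors as a combination." Based on this study they fitted a linear regression model for scoring pair preference. Gramazio et al. (C. C. Gramazio, D. H. Laidlaw, and K. B. Schloss. Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Trans. Vis. & Comp. Graphics, 23(1):521–530, 2017. doi:10.1109/tvcg.2016.2598918) reorganized this model and used it (Colorgorical) to generate aesthetically pleasing palettes.
Class discriminability: Methods of this type are generally based on perceptual constraints, for example that colors should be well separated, should not compete with each other, and should be attractive. Healey (C. G. Healey. Choosing effective colours for data visualization. In Proc. IEEE Conf. on Visualization, pp. 263–270, 1996. doi:10.1109/visual.1996.568118) proposed dividing the Munsell space into 10 hue intervals and choosing one representative color in each interval while maximizing the perceptual distance between all colors. This method has two drawbacks: a) it ignores aesthetics; b) it is limited to geographic data. Colorgorical overcomes both drawbacks, but it does not take the data distribution into account and therefore cannot separate the given data well.
Another way to design a highly discriminable palette is to use a pre-designed palette. A typical example is ColorBrewer, an online tool for selecting palettes. Although ColorBrewer provides many high-quality palettes, it does not allow users to adjust them. Colorgorical allows users to generate palettes by specifying the desired hues, but it does not consider the underlying data and therefore cannot design colors for specified classes.
Color concept association: Methods of this type generally generate categorical palettes from the semantic information of certain colors. Lin et al. (S. Lin, J. Fortuna, C. Kulkarni, M. Stone, and J. Heer. Selecting semantically-resonant colors for data visualization. Computer Graphics Forum, 32(3pt4):401–410, 2013. doi:10.1111/cgf.12127) proposed a method for automatically selecting semantically meaningful colors. In scatter plots, however, most classes may have no clear semantics.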
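(2) Palette color assignment
Methods of this type generally assign the colors of a palette to the classes by maximizing the class separation in a multi-class scatter plot, so as to make the analysis of multi-class scatter plots more efficient. Wang et al. (Y. Wang, X. Chen, T. Ge, C. Bao, M. Sedlmair, C.-W. Fu, O. Deussen, and B. Chen. Optimizing color assignment for perception of class separability in multiclass scatterplots. IEEE Trans. Vis. & Comp. Graphics, 25(1):820–829, 2019. doi:10.1109/TVCG.2018.2864912) proposed using KNNG to measure the separation between two classes in order to assign colors. However, their method requires the user to provide a palette that is itself highly discriminable.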
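(3) Palette color optimization
Methods of this type optimize the colors of an original palette by applying different principles, such as data comprehension, aesthetics, energy preservation and color blindness. Lee et al. (S. Lee, M. Sips, and H.-P. Seidel. Perceptually driven visibility optimization for categorical data visualization. IEEE Trans. Vis. & Comp. Graphics, 19(10):1746–1757, 2013. doi:10.1109/tvcg.2012.315) proposed computing class separation from the visual saliency of each point, and obtained better class discriminability by optimizing the palette with this measure. However, two drawbacks limit the applicability of this method. First, it does not take the contrast between the colors and the background into account, so the optimized result does not suit all scenarios. Second, it was designed for the visualization of map data and cannot support other categorical visualization tasks. Rather than optimizing a palette for one specific visualization, Fang et al. (H. Fang, S. Walton, E. Delahaye, J. Harris, D. Storchak, and M. Chen. Categorical colormap optimization with visualization case studies. IEEE Trans. Vis. & Comp. Graphics, 23(1):871–880, 2017. doi:10.1109/tvcg.2016.2599214) proposed a method that maximizes the perceptual distance between the given colors. Although this method can incorporate different user-specified constraints, it does not consider the data distribution, so the resulting visualization may not show the data structure between different classes well.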
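Summary of the Invention
To overcome the above shortcomings of the prior art, the present invention provides a palette generation method and system based on data distribution, suitable for visualizing categorical data such as scatter plots, line charts and bar charts. It can generate highly discriminable palettes and, by taking the data distribution into account together with the discriminability and aesthetics of colors, improves the effect of data visualization and thereby the efficiency of visual analysis.
To achieve the above objective, one or more embodiments of the present invention provide the following technical solutions:
One or more embodiments provide a palette generation method based on data distribution, including the following steps:
receiving categorical data and color data, the color data including a discretized color space;
projecting the categorical data into the visual space and obtaining position information of the categorical data;
measuring the separation between classes based on the position information;
randomly selecting multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, quickly finding an approximately optimal solution with the simulated annealing algorithm and generating the palette;
rendering the categorical data based on the palette.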
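One or more embodiments provide a palette generation system based on data distribution, including:
a data loading module, which receives categorical data and color data, the color data including a discretized color space;
a data distribution determining module, which projects the categorical data into the visual space and obtains position information of the categorical data;
a separation measurement module, which measures the separation between classes based on the position information;
a palette optimization module, which randomly selects multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finds an approximately optimal solution with the simulated annealing algorithm and generates the palette;
a data rendering module, which renders the categorical data based on the palette.
One or more embodiments provide an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the program, it implements the palette generation method based on data distribution.
One or more embodiments provide a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the palette generation method based on data distribution. The above one or more technical solutions have the following beneficial effects:
The present invention obtains the distribution of the categorical data and then uses a simulated annealing algorithm to generate palettes automatically. By taking the data distribution into account together with the discriminability and aesthetics of colors, it improves the effect of data visualization and thereby the efficiency of visual analysis.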
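The accompanying drawings, which form a part of the present invention, are provided for a further understanding of the invention. The exemplary embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it.
Figure 1 is a flowchart of the palette generation method based on data distribution in one or more embodiments of the present invention;
Figures 2(a)-2(d) are schematic diagrams of one simulated annealing run in one or more embodiments of the present invention: at initialization, after 30 iterations, after 150 iterations, and the final result, respectively;
Figures 3(a)-3(d) are schematic diagrams of the results with weights λ=0, λ=0.3, λ=0.6 and λ=1.0 in one or more embodiments of the present invention, respectively;
Figures 4(a)-4(d) are schematic diagrams of the results with perturbation probabilities of 0.1, 0.5, 0.8 and 1.0 in one or more embodiments of the present invention, respectively;
Figures 5(a)-5(f) are schematic diagrams of the effect of different weight settings in one or more embodiments of the present invention;
Figures 6(a)-6(b) are schematic diagrams of the calculation for the histogram and the line chart in one or more embodiments of the present invention;
Figures 7(a)-7(d) are schematic diagrams of the results for the histogram and the line chart in one or more embodiments of the present invention.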
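It should be pointed out that the following detailed description is exemplary and is intended to further explain the present invention. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the technical field to which the present invention belongs.
It should be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should also be understood that when the terms "comprise" and/or "include" are used in this specification, they indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
Where there is no conflict, the embodiments of the present invention and the features of the embodiments may be combined with each other.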
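Embodiment 1
This embodiment provides a palette generation method based on data distribution which, as shown in Figure 1, includes the following steps:
Step 1: Load categorical data and color data; the color data contains the discretized LAB color space.
The categorical data is a scatter plot, a line chart, a bar chart, etc.
While the color data is loaded, some colors that most users dislike are removed: colors with lightness below 35 or above 95, and colors with lightness between 35 and 75 and hue between 85 and 114. Most of the latter are yellow-green.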
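The color filtering rule in Step 1 can be sketched as below. This is a minimal illustration, not the patent's implementation: it assumes that "hue" means the CIELAB hue angle in degrees, computed as `atan2(b, a)`, and that the interval bounds are inclusive. The function names `lab_hue_degrees`, `is_disliked` and `filter_color_space` are hypothetical.

```python
import math

def lab_hue_degrees(a, b):
    """CIELAB hue angle in degrees, in [0, 360). Assumed definition of 'hue'."""
    return math.degrees(math.atan2(b, a)) % 360.0

def is_disliked(lab):
    """Apply the removal rule from Step 1 to one (L, a, b) color."""
    lightness, a, b = lab
    # Rule 1: too dark or too bright.
    if lightness < 35 or lightness > 95:
        return True
    # Rule 2: mid lightness combined with a yellow-green hue (bounds assumed inclusive).
    hue = lab_hue_degrees(a, b)
    return 35 <= lightness <= 75 and 85 <= hue <= 114

def filter_color_space(colors):
    """Keep only the colors that survive both rules."""
    return [c for c in colors if not is_disliked(c)]

candidates = [(20, 0, 0), (50, 0, 40), (50, 40, 0), (90, 0, 40), (97, 0, 0)]
kept = filter_color_space(candidates)
```

In this toy input, the first and last colors fail the lightness rule and `(50, 0, 40)` has a hue of 90 degrees at mid lightness, so only the remaining two colors are kept.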
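Step 2: Project the categorical data into the visual space (screen space) and obtain the position data of the categorical data.
When the categorical data is a scatter plot, the position of each point in each class of scattered points is obtained;
when the categorical data is a bar chart, the position of the geometric center of each bar is obtained, as shown in Figure 6(a);
when the categorical data is a line chart, the lines in the chart are discretized at a set interval and the position of each discrete point on each line is obtained, as shown in Figure 6(b).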
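A minimal sketch of how bar charts and line charts could be reduced to point positions, as Step 2 describes. It assumes that a bar is given as an axis-aligned rectangle in screen space and that the "set interval" is measured as arc length along the line; the source fixes neither detail, and `bar_center` and `discretize_polyline` are hypothetical names.

```python
import math

def bar_center(x, y, width, height):
    """Geometric center of an axis-aligned bar with corner (x, y)."""
    return (x + width / 2, y + height / 2)

def discretize_polyline(vertices, interval):
    """Sample a polyline every `interval` units of arc length, starting at its first vertex."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    points = [vertices[0]]
    carry = 0.0  # distance travelled since the last emitted sample
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        d = interval - carry  # distance along this segment to the next sample
        while d <= seg:
            t = d / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += interval
        carry = seg - (d - interval)
    return points

samples = discretize_polyline([(0, 0), (4, 0), (4, 4)], 3)
```

The resulting points can then be treated like scatter points of the corresponding class when the class separation is computed.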
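Step 3: Compute the class separation.
In this embodiment, class separation is measured by combining a density-sensitive K-Nearest Neighbor Graph (KNNG) with the Distance Consistency (DSC) method: a degree of non-separation between classes is computed from KNNG and from DSC respectively, and the two are combined by linear weighting to give the non-separation of any two classes. The weight involved can be adjusted by the user as needed. Specifically:
Compute the KNNG-based degree of non-separation of two classes $C_i$ and $C_j$:
where $n_i$ is the number of points in class $C_i$, $\Omega(x_s)$ is the set of all neighbors of point $x_s$, $\delta(l(x_t), j) = 1$ when the class label of $x_t$ is $j$ and 0 otherwise, and the distance term is the Euclidean distance between the two points. Two classes that overlap each other receive a larger degree of non-separation, indicating that the color difference between them should be larger. Note that $kns(C_i, C_j)$ differs from $kns(C_j, C_i)$, so both need to be counted.
However, the "KNNG distance" only computes the distance between two adjacent classes that intersect and does not compute the distances to other classes. This may cause two classes without intersection to be given similar colors, as shown in Figure 2a. To solve this problem, we introduce the "DSC distance" based on the class center.
Here $a(x_i)$ is the intra-class distance and $b(x_i)$ is the inter-class distance. By combining the intra-class and inter-class distances we obtain the density distribution of the classes. The effect of using only the "DSC distance" is shown in Figure 2d.
Combining the KNNG-based and DSC-based non-separation proportionally, we obtain:
$ns(C_i, C_j) = \lambda\, dns(C_i, C_j) + (1 - \lambda)\, kns(C_i, C_j)$
$ns(C_i, C_j)$ is the final degree of non-separation between any two classes. $\lambda$ is a user-adjustable weight (the effect of different $\lambda$ is shown in Figure 2: with KNNG alone, more similar colors are produced, while with DSC alone the contrast between overlapping classes is weaker). The effect of different $\lambda$ weights is shown in Figures 3(a)-3(d).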
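The linear combination of the two non-separation measures can be sketched as follows. The KNNG-based and DSC-based matrices are taken as precomputed inputs, because their formulas are not reproduced here; `combine_non_separation` and the example values are hypothetical.

```python
def combine_non_separation(dns, kns, lam):
    """ns[i][j] = lam * dns[i][j] + (1 - lam) * kns[i][j] for every ordered class pair.

    dns and kns are square nested lists. kns need not be symmetric, so both
    ns[i][j] and ns[j][i] are kept.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam is expected to lie in [0, 1]")
    m = len(dns)
    return [[lam * dns[i][j] + (1.0 - lam) * kns[i][j] for j in range(m)]
            for i in range(m)]

# Made-up values for two classes, for illustration only.
dns_example = [[0.0, 0.2], [0.2, 0.0]]
kns_example = [[0.0, 0.6], [0.4, 0.0]]
ns_example = combine_non_separation(dns_example, kns_example, lam=0.5)
```

With `lam=0` the result is the KNNG-based matrix alone and with `lam=1` the DSC-based matrix alone, matching the two extremes discussed in the text.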
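Step 4: Use the simulated annealing algorithm to quickly find an approximately optimal solution; the user influences the final result by setting weights, as shown in Figures 5(a)-5(f).
The specific process is as follows:
Step 4.1: Randomly select m colors from the discretized color space to form the initial solution (m is the number of classes in the data), and set the initial temperature, the cooling coefficient and the minimum temperature. A complete iterative process is shown in Figures 2(a)-2(d).
Step 4.2: If the current temperature is greater than the minimum temperature, execute the next step; otherwise exit the iteration and return the final result.
Step 4.3: Randomly perturb the initial solution to obtain a new solution. Because a purely random perturbation is unlikely to reach the optimal solution in finite time, a color is perturbed at random when the random probability is less than 0.5; when it is greater than or equal to 0.5, the perturbation follows this principle: the new color should give the current palette a higher score. The effect of different random probabilities is shown in Figures 4(a)-4(d).
Step 4.4: Check whether all colors in the new solution meet the (larger) JND requirement; if not, perturb the colors until they do. If no palette that meets the condition (a large color difference) can be found at the current JND, adjust the colors so that the JND decreases, until the JND cannot be decreased any further; then exit the iteration and return the current solution. For the calculation of the JND (Just Noticeable Difference), see M. Stone, D. A. Szafir, and V. Setlur. An engineering model for color difference as a function of size. In Color and Imaging Conference, vol. 2014, pp. 253–258. Society for Imaging Science and Technology, 2014.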
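Step 4.5: Score the current solution. The scoring function is as follows:
(1) Aesthetics score: We use pair preference (Pair Preference) and saturation variance (Saturation Variance) to measure the aesthetics of the palette, as follows:
a. Pair preference refers to how much people like a color combination. From experimental data, Schloss and Palmer (K. B. Schloss and S. E. Palmer. Aesthetic response to color combinations: preference, harmony, and similarity. Attention, Perception, & Psychophysics, 73(2):551–571, 2011. doi:10.3758/s13414-010-0027-0) found a linear regression model that predicts how much users like a color combination. It is composed of three factors: coolness ($\kappa$), hue similarity ($\Delta H$) and lightness contrast ($\Delta L$):
$PP(c_1, c_2) = 75.15(\kappa_1 + \kappa_2) + 47.61|\Delta L| - 46.42|\Delta H|$
The pair preferences among all colors of the discretized LAB color space, computed with the above formula, form the final color preference matrix.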
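The regression above can be evaluated as in the sketch below. It reads the coolness term as the sum of the two colors' coolness values, which is an assumption, and it takes coolness, ΔL and ΔH as inputs because the source does not say how they are derived from LAB. The name `pair_preference` is hypothetical.

```python
def pair_preference(coolness_1, coolness_2, delta_l, delta_h):
    """PP = 75.15 * (k1 + k2) + 47.61 * |dL| - 46.42 * |dH|, coefficients as given in the text.

    Summing the two coolness values is an assumed reading of the coolness term.
    """
    return (75.15 * (coolness_1 + coolness_2)
            + 47.61 * abs(delta_l)
            - 46.42 * abs(delta_h))

# Made-up feature values: more lightness contrast raises the score, more hue difference lowers it.
pp_base = pair_preference(0.5, 0.5, 1.0, 0.0)
pp_more_hue_diff = pair_preference(0.5, 0.5, 1.0, 1.0)
```

Because the score depends only on the pair, it can be tabulated once for all candidate color pairs, which is what the precomputed color preference matrix provides.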
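b. Saturation variance is the variance of the saturation of all colors in a given palette. Given a palette $P = \{c_1, \ldots, c_m\}$,
where $S(c_i)$ is the saturation of color $c_i$ and $\mu$ is the mean saturation of the whole palette.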
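As a worked illustration of the saturation variance term, the sketch below computes a population variance (dividing by m). That normalization is an assumption, as is treating the saturations $S(c_i)$ as already computed inputs; `saturation_variance` is a hypothetical name.

```python
def saturation_variance(saturations):
    """Population variance of the palette's saturation values S(c_1), ..., S(c_m)."""
    m = len(saturations)
    if m == 0:
        raise ValueError("palette must contain at least one color")
    mu = sum(saturations) / m  # mean saturation of the whole palette
    return sum((s - mu) ** 2 for s in saturations) / m

uniform_sv = saturation_variance([0.5, 0.5, 0.5])
spread_sv = saturation_variance([0.2, 0.4, 0.6])
```

A palette whose colors share one saturation level has zero variance, while a palette mixing muted and vivid colors scores higher on this term.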
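(2) Discriminability score: We use Name Difference, Class Discriminability and Contrast with Background to measure the discriminability of the palette and of the final visualization, as follows:
a. Name Difference (ND) is the difference between the color names of two colors. Two colors with a large perceptual difference may share the same color name; for example, violet and indigo both count as purple although their perceptual distance is large. To give the generated palette more distinct colors, we introduce Name Difference (J. Heer and M. Stone. Color naming models for color selection, image editing and palette design. In ACM Human Factors in Computing Systems (CHI), pages 1007–1016, New York, NY, USA, 2012. ACM.). The specific calculation method is:
where c is a given color and T is a color name association matrix with C rows and W columns.
In this embodiment, the color name association matrix and the color preference matrix are both constructed in advance for subsequent calls.
b. Class Discriminability (CD) expresses the class separation between two classes. We use the CIEDE2000 color difference to measure the perceptual distance between two colors and combine the perceptual distance with the non-separation to form the class separation:
where $\Delta\varepsilon(c_i, c_j)$ is the CIEDE2000 distance and $ns(C_i, C_j)$ is the degree of non-separation of the two classes $C_i$ and $C_j$.
c. Contrast with Background (CB) is the contrast of each class with the background. Ware et al. (C. Ware. Information visualization: perception for design. Elsevier, 2012.) found that classes with larger overlap should be given colors with higher contrast to improve readability. Specifically:
where $c_i$ is the color of class $C_i$ and $\Delta L(c_i, c_0)$ is the lightness difference between $c_i$ and the background color $c_0$.
In summary, we define the final scoring function $E(P)$ as:
$E(P) = \omega_1 CD(C, P) + \omega_2 CB(C, P, c_0) + \omega_3 ND(P) + \omega_4 PP(P) + \omega_5 SV(P)$
where $\omega_1$ to $\omega_5$ are user-defined weight coefficients, as shown in Figures 5(a)-5(f). Figures 5(a) and 5(d) are results generated with high Class Visibility and high Name Difference; Figures 5(b) and 5(e) are generated with high Pair Preference and high Contrast with Background; the Saturation Variance of the palette in Figure 5(c) is smaller, and the SV in Figure 5(f) is larger.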
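The weighted sum E(P) can be sketched as below. The five term values are assumed to be computed elsewhere, and whether they are normalized to a common range is not stated in the source; `total_score` and the dictionary layout are hypothetical.

```python
TERM_KEYS = ("CD", "CB", "ND", "PP", "SV")

def total_score(terms, weights):
    """E(P) = w1*CD + w2*CB + w3*ND + w4*PP + w5*SV for precomputed term values."""
    return sum(weights[key] * terms[key] for key in TERM_KEYS)

# Made-up term values for one candidate palette.
terms_example = {"CD": 1.0, "CB": 2.0, "ND": 3.0, "PP": 4.0, "SV": 5.0}
equal_weights = {key: 1.0 for key in TERM_KEYS}
only_cd = {"CD": 1.0, "CB": 0.0, "ND": 0.0, "PP": 0.0, "SV": 0.0}

score_equal = total_score(terms_example, equal_weights)
score_cd = total_score(terms_example, only_cd)
```

Raising one weight relative to the others biases the search toward palettes that do well on that term, which is how the different results in Figures 5(a)-5(f) are described.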
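Step 4.6: If the score of the current solution is better than that of the previous solution, perform color assignment on the current solution, assigning its colors to the classes to obtain a better result. The specific process is as follows:
(1) Color assignment: optimize the assignment of the colors of the current solution. The calculation is as follows:
Compute the point distinctness (Point Distinctness):
where $C_r = \tau(l(x_i))$ and $C_s = \tau(l(x_j))$ are the colors of $x_i$ and $x_j$ respectively, $\Omega_i$ is the set of the k neighbors of point $x_i$, $\Delta\varepsilon$ is the CIEDE2000 distance matrix, and $g(d(x_i, x_j))$ is a function of the distance between the two points whose purpose is to give nearer points a larger weight, $g(d) = 1/d$.
Compute the point contrast with background (Point contrast with background):
where $\Delta L(C_r, C_b)$ is the difference between the lightness of the point and that of the background color, $C_b$ is the background color, and $ns(x_i)$ is the degree of non-separation based on the position of the point.
The final objective function is:
where $\lambda$ is the weight coefficient, set here to $\lambda = 0.3$.
For the optimization method of the color assignment, see "Y. Wang, X. Chen, T. Ge, C. Bao, M. Sedlmair, C.-W. Fu, O. Deussen, and B. Chen. Optimizing color assignment for perception of class separability in multiclass scatterplots. IEEE Trans. Vis. & Comp. Graphics, 25(1):820–829, 2019".
(2) If the result of the color reassignment is better than the current solution, assign this result to the current solution; otherwise proceed to the next step.
Step 4.7: If the score of the current solution is worse than that of the previous solution, accept the current solution with probability $\exp(\Delta E / T_t)$, where $\Delta E$ is the difference between the scores of the current solution and the previous solution, and $T_t$ is the current temperature.
Step 4.8: Lower the temperature and return to Step 4.2 to continue iterating.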
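The annealing loop of Step 4 can be sketched as follows. This is a simplified illustration under stated assumptions: colors are represented by integer indices into the discretized color space, the score is a toy function (`min_gap`) standing in for E(P), the JND check and the color reassignment step are omitted, and the parameter defaults and the number of greedy tries are invented for the example. All function names are hypothetical.

```python
import math
import random

def min_gap(palette):
    """Toy stand-in for E(P): smallest pairwise gap between color indices (higher is better)."""
    return min(abs(a - b) for i, a in enumerate(palette) for b in palette[i + 1:])

def perturb(palette, candidates, score, rng, tries=10):
    """Random move with probability 0.5, otherwise a move that does not lower the score."""
    idx = rng.randrange(len(palette))
    if rng.random() < 0.5:
        new = list(palette)
        new[idx] = rng.choice(candidates)
        return new
    best, best_score = list(palette), score(palette)
    for _ in range(tries):  # try a few replacements and keep the best-scoring one
        trial = list(palette)
        trial[idx] = rng.choice(candidates)
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best

def anneal(candidates, m, score, t_init=1.0, t_min=1e-3, cooling=0.95, seed=0):
    rng = random.Random(seed)
    current = rng.sample(candidates, m)          # pick m random colors as the initial solution
    current_score = score(current)
    t = t_init
    while t > t_min:                             # stop once the minimum temperature is reached
        new = perturb(current, candidates, score, rng)
        new_score = score(new)
        delta = new_score - current_score        # negative when the new solution is worse
        # Better solutions are always accepted; worse ones with probability exp(delta / t).
        if delta > 0 or rng.random() < math.exp(delta / t):
            current, current_score = new, new_score
        t *= cooling                             # lower the temperature
    return current, current_score

palette, palette_score = anneal(list(range(100)), m=4, score=min_gap)
```

Because ΔE is negative for a worse solution, `exp(delta / t)` lies between 0 and 1 and shrinks as the temperature falls, so worse solutions are accepted less and less often as the run proceeds.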
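Step 5: Render the data with the generated palette.
Embodiment 2
The purpose of this embodiment is to provide a palette generation system based on data distribution.
To achieve the above purpose, this embodiment provides a palette generation system based on data distribution, including:
a data loading module, which receives categorical data and color data, the color data including a discretized color space;
a data distribution determining module, which projects the categorical data into the visual space and obtains position information of the categorical data;
a separation measurement module, which measures the separation between classes based on the position information;
a palette optimization module, which randomly selects multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finds an approximately optimal solution with the simulated annealing algorithm and generates the palette;
a data rendering module, which renders the categorical data based on the palette.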
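Embodiment 3
The purpose of this embodiment is to provide an electronic device.
To achieve the above purpose, this embodiment provides an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor. When the processor executes the program, the following steps are implemented:
receiving categorical data and color data, the color data including a discretized color space;
projecting the categorical data into the visual space and obtaining position information of the categorical data;
measuring the separation between classes based on the position information;
randomly selecting multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finding an approximately optimal solution with the simulated annealing algorithm and generating the palette;
rendering the categorical data based on the palette.
Embodiment 4
The purpose of this embodiment is to provide a computer-readable storage medium.
To achieve the above purpose, this embodiment provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the following steps are performed:
receiving categorical data and color data, the color data including a discretized color space;
projecting the categorical data into the visual space and obtaining position information of the categorical data;
measuring the separation between classes based on the position information;
randomly selecting multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finding an approximately optimal solution with the simulated annealing algorithm and generating the palette;
rendering the categorical data based on the palette.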
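The steps involved in Embodiments 2, 3 and 4 above correspond to those of Embodiment 1; for specific implementations, see the relevant description of Embodiment 1. The term "computer-readable storage medium" should be understood to include a single medium or multiple media containing one or more instruction sets; it should also be understood to include any medium that can store, encode, or carry an instruction set for execution by a processor and that causes the processor to execute any method of the present invention.
The above one or more embodiments have the following technical effects:
The present invention obtains the distribution of the categorical data and then uses a simulated annealing algorithm to generate palettes automatically. By taking the data distribution into account together with the discriminability and aesthetics of colors, it improves the effect of data visualization and thereby the efficiency of visual analysis.
The present invention uses the simulated annealing algorithm to solve automatically for the optimal combination of palette colors. The score of a solution combines the user's color preference, the color saturation variance, the color name difference, the class separation and the contrast with the background, yielding a palette that balances aesthetics and color discriminability. The visualization is therefore more reasonable, consistent with human perception, and gives a good user experience.
The present invention sets a user-adjustable weight coefficient when computing the class separation; by adjusting this coefficient, visualization results that favor separation and match the user's preference can be generated.
The palette generation method of the present invention can be applied to various types of categorical data, including but not limited to scatter plots, line charts and bar charts.
Those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as individual integrated circuit modules, or several of the modules or steps can be fabricated as a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solutions of the present invention without creative effort still fall within its scope of protection.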
Claims (9)
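- A palette generation method based on data distribution, characterized by comprising the following steps: receiving categorical data and color data, the color data including a discretized color space; projecting the categorical data into the visual space and obtaining position information of the categorical data; measuring the separation between classes based on the position information; randomly selecting multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finding an approximately optimal solution with the simulated annealing algorithm and generating the palette; and rendering the categorical data based on the palette.
- The palette generation method based on data distribution according to claim 1, characterized in that the categorical data is a scatter plot, a line chart or a bar chart, and obtaining the position information of the categorical data comprises: when the categorical data is a scatter plot, obtaining the position of each point in each class of scattered points; when the categorical data is a line chart, discretizing the lines in the chart at a set interval and obtaining the position of each discrete point on each line; when the categorical data is a bar chart, obtaining the position of the geometric center of each bar.
- The palette generation method based on data distribution according to claim 1, characterized in that finding an approximately optimal solution with the simulated annealing algorithm comprises: (1) randomly selecting multiple colors from the discretized color space to form the initial solution, the number of colors being equal to the number of classes in the categorical data, and setting the initial temperature, the cooling coefficient and the minimum temperature; (2) if the current temperature is greater than the minimum temperature, executing step (3), otherwise exiting the iteration and outputting the current solution; (3) randomly perturbing the initial solution to obtain a new solution; (4) checking whether the just noticeable difference of all colors in the new solution is greater than a preset threshold and, if not, perturbing the colors until it is; if no palette whose color differences meet the requirement can be found at the current just noticeable difference, adjusting the colors so that the just noticeable difference decreases until it cannot be decreased any further, then exiting the iteration and returning the current solution; (5) scoring the current solution; if the score of the current solution is better than that of the previous solution, performing color assignment on the current solution; if the score of the current solution is worse than that of the previous solution, accepting the current solution with a certain probability; (6) lowering the temperature and returning to step (2).
- The palette generation method based on data distribution according to claim 3, characterized in that the scoring method in steps (3) and (5) comprises: scoring the aesthetics of the palette with pair preference and saturation variance; scoring the discriminability of the palette with name difference, class separation and contrast with the background; and linearly weighting pair preference, saturation variance, name difference, class separation and contrast with the background to obtain the final score of the palette.
- The palette generation method based on data distribution according to claim 4, characterized in that the class separation is computed as follows: computing the K-nearest-neighbor-graph-based degree of non-separation of two classes $C_i$ and $C_j$; computing the distance-consistency-based degree of non-separation of the two classes $C_i$ and $C_j$; combining the two degrees of non-separation proportionally: $ns(C_i, C_j) = \lambda\, dns(C_i, C_j) + (1 - \lambda)\, kns(C_i, C_j)$, where $\lambda$ is a user-adjustable weight; and where $\Delta\varepsilon(c_i, c_j)$ denotes the perceptual distance between two colors.
- The palette generation method based on data distribution according to claim 3, characterized in that the color assignment in step (5) comprises: assigning the colors of the current solution to the classes and optimizing the assignment; if the result of the color reassignment is better than the current solution, assigning this result to the current solution, otherwise proceeding to step (6).
- A palette generation system based on data distribution, characterized by comprising: a data loading module, which receives categorical data and color data, the color data including a discretized color space; a data distribution determining module, which projects the categorical data into the visual space and obtains position information of the categorical data; a separation measurement module, which measures the separation between classes based on the position information; a palette optimization module, which randomly selects multiple colors from the discretized color space as the initial solution and, taking the separation between classes into account, finds an approximately optimal solution with the simulated annealing algorithm and generates the palette; and a data rendering module, which renders the categorical data based on the palette.
- An electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that when the processor executes the program, it implements the palette generation method based on data distribution according to any one of claims 1-6.
- A computer-readable storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, it implements the palette generation method based on data distribution according to any one of claims 1-6.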
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910492234.5 | 2019-06-06 | ||
CN201910492234.5A CN110196935B (zh) | 2019-06-06 | 2019-06-06 | Palette generation method and system based on data distribution |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020244215A1 (zh) | 2020-12-10 |
Family
ID=67754092
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/130087 WO2020244215A1 (zh) | 2019-06-06 | 2019-12-30 | Palette generation method and system based on data distribution |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110196935B (zh) |
WO (1) | WO2020244215A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
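CN110196935B (zh) * | 2019-06-06 | 2021-03-30 | Shandong University | Palette generation method and system based on data distribution |
CN111124404B (zh) * | 2019-11-29 | 2023-06-20 | 武汉虹信技术服务有限责任公司 | Display method and system for custom colors |
CN112052057B (zh) * | 2020-08-12 | 2021-10-22 | University of Science and Technology Beijing | Data visualization method and system for optimizing a color table based on a spring model |
CN113345052B (zh) * | 2021-06-11 | 2023-01-10 | Shandong University | Multi-view visualization coloring method and system for categorical data based on similarity saliency |
CN115457167B (zh) * | 2022-09-21 | 2023-06-09 | Shandong University | Palette design system based on color ordering |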
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
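CN103324662A (zh) * | 2013-04-18 | 2013-09-25 | Institute of Computing Technology, Chinese Academy of Sciences | Visualization method and device for the dynamic opinion evolution of social media events |
CN103778658A (zh) * | 2014-01-23 | 2014-05-07 | Zhejiang University of Finance and Economics | Visualization method for rapidly displaying volume data features |
US20170316114A1 (en) * | 2016-04-29 | 2017-11-02 | Accenture Global Solutions Limited | System architecture with visual modeling tool for designing and deploying complex models to distributed computing clusters |
CN110196935A (zh) * | 2019-06-06 | 2019-09-03 | Shandong University | Palette generation method and system based on data distribution |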
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101645173B (zh) * | 2008-12-16 | 2012-05-16 | Institute of Acoustics, Chinese Academy of Sciences | Random palette coding system and method |
CN102262531A (zh) * | 2011-06-10 | 2011-11-30 | 上海市金山区青少年活动中心 | Palette device with two-way data binding |
WO2019022758A1 (en) * | 2017-07-28 | 2019-01-31 | Hewlett-Packard Development Company, L.P. | PALLETS OF REPRODUCIBLE COLORS IN CONFORMITY WITH COLOR HARMONY |
CN108986180B (zh) * | 2018-06-07 | 2022-09-16 | 创新先进技术有限公司 | Palette generation method, apparatus, and electronic device |
2019
- 2019-06-06 CN CN201910492234.5A patent/CN110196935B/zh active Active
- 2019-12-30 WO PCT/CN2019/130087 patent/WO2020244215A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110196935A (zh) | 2019-09-03 |
CN110196935B (zh) | 2021-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020244215A1 (zh) | Palette generation method and system based on data distribution | |
US9891796B2 (en) | User interface to media files | |
CN109618222B (zh) | Spliced video generation method, apparatus, terminal device, and storage medium | |
US11899703B2 (en) | Arrangements of documents in a document feed | |
US20220164395A1 (en) | Using Natural Language Processing for Visual Analysis of a Data Set | |
US9747009B2 (en) | User interface for creating a playlist | |
CN109688463A (zh) | Clipped video generation method, apparatus, terminal device, and storage medium | |
US10528620B2 (en) | Color sketch image searching | |
US8875021B2 (en) | Visual playlist | |
US20180348998A1 (en) | Data access interface | |
US9940551B1 (en) | Image generation using neural networks | |
AU2007299588B2 (en) | Method and system for selecting records from a database | |
Lu et al. | Palettailor: Discriminable colorization for categorical data | |
McCormack et al. | Deep learning of individual aesthetics | |
JP2002541571A (ja) | Grid display device and method | |
JP2006139810A (ja) | Video browsing apparatus using color temperature and method thereof | |
US11636251B2 (en) | Content aware font recommendation | |
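CN110263218A (zh) | Video description text generation method, apparatus, device, and medium | |
CN113345052B (zh) | Multi-view visualization coloring method and system for categorical data based on similarity saliency | |
CN103081460A (zh) | Moving image processing device, moving image processing method, and program | |
CN106469437B (zh) | Image processing method and image processing apparatus | |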
US20100141653A1 (en) | Apparatus for providing and transforming shader of 3d graphic system | |
Wang et al. | Importance driven automatic color design for direct volume rendering | |
CN104700445B (zh) | Method for deriving a BRDF reflection model based on measured data | |
JP2019061642A (ja) | Video processing device and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19931977; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 19931977; Country of ref document: EP; Kind code of ref document: A1 |