WO2023037398A1 - Information processing device, information processing method, and program - Google Patents

Info

Publication number
WO2023037398A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
evaluation
insight
information processing
context
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/032766
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
拓磨 野澤
于洋 董
昌文 榎本
昌史 小山田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to US18/686,514 (US20240354307A1)
Priority to PCT/JP2021/032766 (WO2023037398A1)
Priority to JP2023546584A (JP7740343B2)
Publication of WO2023037398A1
Anticipated expiration
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24575 Query processing with adaptation to user needs using context
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management

Definitions

  • the present invention relates to an information processing device, an information processing method, and a program.
  • Patent Literature 1 describes a method in which instance data is generated by visualizing the data to be visualized based on template data having a keyword that expresses how a data analysis result is to be visualized, an evaluation of the instance data is treated as instance metadata, and the instance data is regenerated based on the evaluation value.
  • Patent Literature 1 has a problem that when the template data does not capture the user context, the presented visualization candidate is not necessarily the visualization result desired by the user.
  • One aspect of the present invention has been made in view of the above problem, and an example of its purpose is to provide a technology that enables evaluation of whether a data visualization candidate provides an insight desired by a user.
  • An information processing apparatus includes acquisition means for acquiring an evaluation data set and context data, and evaluation means for evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set.
  • An information processing method includes: at least one processor acquiring an evaluation data set and context data; and the at least one processor evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set.
  • A program causes a computer to execute a process of acquiring an evaluation data set and context data, and a process of evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set.
  • FIG. 1 is a block diagram showing the configuration of an information processing device according to exemplary Embodiment 1 of the present invention
  • FIG. 2 is a flow diagram showing the flow of an information processing method according to exemplary embodiment 1 of the present invention
  • FIG. 3 is a diagram showing examples of insight subjects and evaluation results according to exemplary embodiment 1 of the present invention
  • FIG. 4 is a block diagram showing the configuration of an information processing apparatus according to exemplary Embodiment 2 of the present invention
  • FIG. 5 is a flow diagram showing the flow of an information processing method according to exemplary embodiment 2 of the present invention
  • FIG. 6 is a diagram showing an example of input data according to exemplary embodiment 2 of the present invention
  • FIG. 7 is a diagram showing an example of context and visualization information according to exemplary embodiment 2 of the present invention
  • FIG. 8 is a diagram showing an example of feature vector generation according to exemplary embodiment 2 of the present invention
  • FIG. 9 is a diagram showing an example of aggregated data and statistics according to exemplary embodiment 2 of the present invention
  • FIG. 10 is a diagram showing an example of an evaluation model according to exemplary embodiment 2 of the present invention
  • FIG. 10 is a diagram showing an example of displaying insight subjects with evaluation results according to exemplary embodiment 2 of the present invention
  • FIG. 10 is a diagram showing an example of displaying visualization information together with evaluation results according to exemplary embodiment 2 of the present invention
  • FIG. 10 is a diagram showing an example of displaying insight subjects with evaluation results according to exemplary embodiment 2 of the present invention
  • FIG. 10 is a diagram showing an example of displaying insight subjects with evaluation results according to exemplary embodiment 2 of the present invention
  • FIG. 11 is a block diagram showing the configuration of an information processing apparatus according to exemplary Embodiment 3 of the present invention. It is a figure which shows an example of the computer which executes the instruction
  • FIG. 1 is a block diagram showing the configuration of an information processing device 1.
  • the information processing device 1 is a device that evaluates whether a data visualization candidate provides an insight desired by a user.
  • the information processing device 1 includes an acquisition unit 11 and an evaluation unit 12 .
  • the acquisition unit 11 acquires an evaluation data set and context data.
  • the evaluation unit 12 evaluates a plurality of insight subjects generated by referring to at least the evaluation data set, according to the context data.
  • the evaluation data set is data used by the information processing apparatus 1 to evaluate visualization candidates of data.
  • the evaluation data set includes at least one of evaluation data, which is data to be visualized, and related data related to the evaluation data.
  • the data included in the evaluation data set is not limited to the examples described above, and the evaluation data set may include other information.
  • The evaluation data is data to be visualized, and is, for example, multidimensional data including multiple records. Examples of the evaluation data include data indicating monthly sales records of a certain store, data indicating the size and area of the store, data indicating product codes, product names and unit prices of products sold at the store, and/or data indicating the customer's gender, age, place of residence, occupation, and the like. However, the evaluation data is not limited to these, and may be other data.
  • the evaluation data is visualized, for example, as a chart (a pie chart, a bar graph, a line graph, etc.) representing the contents of the evaluation data.
  • Related data is data related to the evaluation data.
  • the related data includes, for example, aggregated data indicating the aggregation result of the evaluation data, statistics of the aggregated data, and/or related information that is a set of various information used for visualizing the evaluation data.
  • the related information includes, for example, a part or all of the name of the data used for visualization of the evaluation data, the data type, the type of aggregation method, and the type of chart design. Note that the data included in the related data is not limited to the examples described above, and the related data may include other data.
  • Context data is data that represents what kind of insight a user seeks.
  • the context data includes, for example, at least one of a context, which is data related to the insight desired by the user, and a feature vector representing the context in a vector space. Note that the data included in the context data is not limited to the example described above, and the context data may include other data.
  • Context is data about the insight that a user seeks, an example being linguistic information extracted from a user query or metadata.
  • the context is the words “product A” and “customer” extracted from the user query "about the customer of product A.”
  • the context is the words “sales” and “transition” extracted from the user query “about sales transition”.
  • the context is, for example, the words “product A” and “customer” extracted from the metadata whose "search history” is "customer of product A”.
  • the context is, for example, the words “sales” and “transition” extracted from the metadata whose "search history” is "sales transition”.
  • the context is not limited to language information, and may be other information.
  • the context may be, for example, location information that indicates the user's location, information that indicates the degree of association between words, or information that indicates the browsing history of the site.
  • the insight subject is data generated with reference to at least the evaluation data set.
  • the insight subject includes, for example, at least one of data representing the visualization result of the evaluation data and data used to visualize the evaluation data.
  • a visualization result obtained by visualizing the evaluation data is, for example, a chart (a pie chart, a bar graph, a line graph, etc.) representing the contents of the evaluation data.
  • the insight subject may be, for example, a part of the above-described related data, such as related information included in the related data.
  • the insight subject may be part of the evaluation data set.
  • the insight subject is not limited to the above example, and may be other data.
  • an insight refers to a visualization result that a person recognizes as useful, and data representing such a visualization result.
  • an insight is an insight subject that a person finds useful.
  • the method by which the acquisition unit 11 acquires the evaluation data set and the context data is not particularly limited.
  • The acquisition unit 11 may acquire the evaluation data set and the context data by reading them from an external storage device or an internal storage device, or may acquire them via the communication IF or the input/output IF.
  • the method by which the evaluation unit 12 evaluates multiple insight subjects according to context data is not particularly limited.
  • the evaluation unit 12 calculates, for each of a plurality of insight subjects, an evaluation value that is an evaluation result of whether or not the insight desired by the user is provided.
  • This evaluation value is also called an insight score. Even when output as is, insight scores greatly help in discovering insight subjects that give users the insights they want.
  • By using the insight score, it is also possible to automatically detect an insight subject with a high insight score, that is, an insight subject that is likely to provide the insight desired by the user.
  • the evaluation unit 12 evaluates a plurality of insight subjects using an evaluation model in which related data and context data are input and an evaluation value is output.
  • the evaluation model may be a predefined score function, or may be a learned model constructed by machine learning.
  • As an example, the evaluation unit 12 evaluates a plurality of insight subjects using a score function that outputs a higher evaluation value the more strongly the related data and the context data are related.
  • the methods of evaluation performed by the evaluation unit 12 are not limited to these, and other methods may be used.
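  • A minimal sketch of such a score function follows. It assumes, purely for illustration, that the related data and the context data have each already been reduced to a numeric vector; the function names and the choice of cosine similarity are hypothetical and are not taken from the specification.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_function(related_vec: Sequence[float], context_vec: Sequence[float]) -> float:
    """Hypothetical score function: rescale cosine similarity from [-1, 1] to [0, 1].

    The more closely the two vectors point in the same direction,
    the higher the returned evaluation value.
    """
    return (cosine_similarity(related_vec, context_vec) + 1.0) / 2.0
```

  • Any monotone measure of relatedness could take the place of cosine similarity here; the only property the sketch relies on is that stronger relatedness yields a higher value.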
  • the visualization results obtained by visualizing the evaluation data differ depending on the content of the related information used for visualization.
  • Each of a plurality of visualization results obtained by visualizing the evaluation data with a plurality of different patterns is hereinafter also referred to as a “visualization candidate”.
  • the visual features given to the user by the plurality of visualization candidates of the evaluation data are different for each of the plurality of visualization candidates.
  • the evaluation unit 12 evaluates a plurality of insight subjects according to the context data, so that a plurality of visualization candidates are evaluated according to the context data.
  • FIG. 2 is a flow diagram showing the flow of the information processing method S1.
  • At step S11 at least one processor acquires an evaluation data set and context data. Then, in step S12, at least one processor evaluates a plurality of insight subjects generated by referring to at least the evaluation data set, according to the context data. Thus, the information processing method S1 of FIG. 2 ends.
  • each processor may be provided in one information processing apparatus, or may be provided in different information processing apparatuses.
  • At least one processor that executes the processes of S11 to S12 may be included in the information processing apparatus 1.
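  • The two steps S11 and S12 can be sketched as follows. This is only an illustrative skeleton: the dictionary layout, the function names, and the keyword-overlap evaluator are assumptions made for the example, and the insight subjects are taken to be pieces of related information, which the description above permits.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Acquired:
    evaluation_data_set: Dict[str, Any]
    context_data: Dict[str, Any]


def acquire(source: Dict[str, Any]) -> Acquired:
    """Step S11: obtain the evaluation data set and the context data."""
    return Acquired(source["evaluation_data_set"], source["context_data"])


def evaluate(
    insight_subjects: List[Dict[str, Any]],
    context_data: Dict[str, Any],
    evaluator: Callable[[Dict[str, Any], Dict[str, Any]], float],
) -> Dict[str, float]:
    """Step S12: evaluate every insight subject according to the context data."""
    return {s["id"]: evaluator(s, context_data) for s in insight_subjects}


def keyword_overlap(subject: Dict[str, Any], context_data: Dict[str, Any]) -> float:
    """Hypothetical evaluator: fraction of context words found in the subject's keywords."""
    words = set(context_data["context"])
    if not words:
        return 0.0
    return len(words & set(subject["keywords"])) / len(words)


# Illustrative input; the identifiers and keywords are made up for the example.
source = {
    "evaluation_data_set": {
        "related_information": [
            {"id": "V1", "keywords": ["product A", "customer", "age"]},
            {"id": "V2", "keywords": ["sales", "month"]},
        ]
    },
    "context_data": {"context": ["product A", "customer"]},
}
acquired = acquire(source)
subjects = acquired.evaluation_data_set["related_information"]
results = evaluate(subjects, acquired.context_data, keyword_overlap)
```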
  • FIG. 3 is a diagram showing an example of insight subjects and evaluation results.
  • insight subjects V1 to V8 are data representing visualization candidates for evaluation data.
  • the evaluation result is the result of calculating the insight score by the evaluation unit 12 for each of the insight subjects V1 to V8.
  • The insight subject V1 has an insight score of "0.2" and the insight subject V2 has an insight score of "0.1".
  • The insight scores of insight subjects V3 to V8 are "0.8", "0.6", "0.3", "0.5", "0.9", and "0.7", respectively.
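  • Using the scores listed for V1 to V8, automatic detection of high-scoring insight subjects can be sketched as a simple sort. The function name and the `k` and `threshold` parameters are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List, Tuple

# Insight scores of the insight subjects V1 to V8 as given in the example of FIG. 3.
insight_scores: Dict[str, float] = {
    "V1": 0.2, "V2": 0.1, "V3": 0.8, "V4": 0.6,
    "V5": 0.3, "V6": 0.5, "V7": 0.9, "V8": 0.7,
}


def top_insight_subjects(
    scores: Dict[str, float], k: int = 3, threshold: float = 0.0
) -> List[Tuple[str, float]]:
    """Return at most k subjects whose score is >= threshold, highest score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if score >= threshold][:k]


top3 = top_insight_subjects(insight_scores, k=3)
```

  • With these values, `top3` lists V7, V3, and V8 in that order.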
  • As described above, the information processing apparatus 1 according to this exemplary embodiment adopts a configuration including the acquisition unit 11 that acquires the evaluation data set and the context data, and the evaluation unit 12 that evaluates, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set. Therefore, the information processing apparatus 1 according to this exemplary embodiment makes it possible to evaluate whether a data visualization candidate provides the insight desired by the user.
  • A program according to this exemplary embodiment causes a computer to execute a process of acquiring an evaluation data set and context data, and a process of evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set. Therefore, the program according to this exemplary embodiment makes it possible to evaluate whether a data visualization candidate provides the insight desired by the user.
  • The information processing method S1 according to this exemplary embodiment adopts a configuration in which at least one processor acquires the evaluation data set and the context data, and evaluates, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set. Therefore, the information processing method S1 according to this exemplary embodiment makes it possible to evaluate whether a visualization candidate provides the insight desired by the user.
  • FIG. 4 is a block diagram showing the configuration of the information processing device 1A.
  • the information processing apparatus 1A includes a control section 10A that controls all the sections of the information processing apparatus 1A, and a storage section 17 that stores various data used by the information processing apparatus 1A.
  • The information processing apparatus 1A also includes a communication unit 18 for the information processing apparatus 1A to communicate with other apparatuses, a display unit 19 for the information processing apparatus 1A to display and output data, and an input unit 20 that receives input to the information processing apparatus 1A.
  • Although an example in which the display unit 19 displays and outputs data is described below, the information processing apparatus 1A may output data in another form, such as print output or voice output.
  • the display unit 19 and the input unit 20 may be devices external to the information processing apparatus 1A, which are externally attached to the information processing apparatus 1A.
  • the control unit 10A includes an acquisition unit 11, an evaluation unit 12, a first generation unit 13, and a second generation unit 14.
  • the storage unit 17 also stores an evaluation data set DS, context data CD, evaluation model parameters EMP, evaluation results ER, and display data DD.
  • evaluation data set DS includes evaluation data and related data VD related to the evaluation data.
  • Evaluation data is data to be visualized. Examples include data indicating monthly sales records of a store, data indicating the size and area of the store, data indicating the product codes, product names, and unit prices of products sold at the store, and/or data indicating the sex, age, place of residence, occupation, etc. of the customer.
  • the related data VD is data related to the evaluation data.
  • The related data VD includes at least one of the following:
      • related information V related to the evaluation data
      • a feature vector dV representing the related information V in a vector space
      • aggregated data sV obtained by aggregating the data that is included in the evaluation data and corresponds to the related information V
      • statistics tV of the aggregated data sV
  • The related information V is, for example, a set of various information used for visualization of the evaluation data, and includes the following information, for example.
      • Attribute information of each data included in the evaluation data
      • Information on the aggregation method (filter applied to the evaluation data, aggregation function, column name used as the aggregation key, etc.)
      • Information on chart design (x-axis, y-axis, chart type, plot type, etc.)
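  • One way to hold the related information V in code is a small record type, sketched below. The class name, every field name, and the sample values are hypothetical; the specification only lists the kinds of information involved, not a concrete schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VisualizationInfo:
    """Hypothetical container for one piece of related information V."""

    attributes: Dict[str, str]            # column name -> data type
    aggregation_function: str             # e.g. "sum", "count", "mean"
    aggregation_key: str                  # column used as the key for aggregation
    x_axis: str
    y_axis: str
    chart_type: str                       # e.g. "bar", "pie", "line"
    filter_column: Optional[str] = None   # optional filter on the evaluation data
    filter_value: Optional[str] = None

    def as_words(self) -> List[str]:
        """Flatten the chart design into words, e.g. as input to a vectorizer."""
        words = [self.chart_type, self.aggregation_function, self.x_axis, self.y_axis]
        if self.filter_value is not None:
            words.append(self.filter_value)
        return words


# Illustrative instance: a bar chart counting customers of product A by age.
v = VisualizationInfo(
    attributes={"age": "integer", "customer code": "string"},
    aggregation_function="count",
    aggregation_key="age",
    x_axis="age",
    y_axis="customer code",
    chart_type="bar",
    filter_column="product name",
    filter_value="product A",
)
```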
  • the related information feature vector dV is a representation of the related information V in a vector space. Any vectorization method may be used, but for example, distributed representation of words may be used.
  • The aggregated data sV is data obtained by aggregating numerical values corresponding to the related information V from the evaluation data. The aggregated data sV is plotted on a chart as a visualization result of the related information V.
  • The statistic tV of the aggregated data sV is an array of various statistics about the aggregated data sV. Any statistic can be used; for example, the following can be used as the statistic tV.
      • Maximum value, minimum value, median value
      • Mean value, standard deviation, variance
      • Cardinality
      • Percentage of zero values, percentage of missing values
      • Kurtosis, skewness
      • Entropy
      • Gini coefficient
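  • The listed statistics can be computed with the Python standard library as sketched below. The function name and the dictionary keys are hypothetical, and several definitional choices are assumptions of this sketch rather than of the specification: population (not sample) variance, excess kurtosis, Shannon entropy in bits over distinct values, and a Gini coefficient defined only for non-negative data.

```python
import math
import statistics
from collections import Counter
from typing import Dict, List, Optional


def compute_statistics(values: List[Optional[float]]) -> Dict[str, float]:
    """Compute an illustrative statistic array for aggregated data; None marks a missing value."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("at least one non-missing value is required")
    n = len(present)
    mean = statistics.fmean(present)
    variance = statistics.pvariance(present, mu=mean)  # population variance
    std = math.sqrt(variance)
    if std > 0:
        # Standardized third and fourth central moments.
        skewness = sum((v - mean) ** 3 for v in present) / (n * std ** 3)
        kurtosis = sum((v - mean) ** 4 for v in present) / (n * std ** 4) - 3.0
    else:
        skewness = kurtosis = 0.0
    # Shannon entropy (bits) of the empirical distribution of distinct values.
    counts = Counter(present)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Gini coefficient via the sorted-rank formula; undefined for negative or all-zero data.
    total = sum(present)
    if total > 0 and min(present) >= 0:
        ordered = sorted(present)
        gini = sum((2 * i - n - 1) * v for i, v in enumerate(ordered, start=1)) / (n * total)
    else:
        gini = float("nan")
    return {
        "max": max(present),
        "min": min(present),
        "median": statistics.median(present),
        "mean": mean,
        "std": std,
        "variance": variance,
        "cardinality": len(counts),
        "zero_ratio": sum(1 for v in present if v == 0) / len(values),
        "missing_ratio": (len(values) - n) / len(values),
        "kurtosis": kurtosis,
        "skewness": skewness,
        "entropy": entropy,
        "gini": gini,
    }


# Illustrative aggregated values with one missing entry.
t_v = compute_statistics([10, 20, 20, 0, None, 50])
```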
  • The context data CD includes at least one of the following:
      • context C
      • a feature vector dC representing the context C in a vector space
  • Context C is data about the insight that the user seeks.
  • the context C is, for example, data expressing the insight sought by the user in natural language, and includes data relating to the quality and quantity of the insight sought by the user.
  • Context C may be extracted from user query Q and/or metadata M described below.
  • Context C includes, as an example, the words "product A" and "customer".
  • The feature vector dC of context C is a representation of context C in a vector space. Any vectorization method may be used; as an example, a distributed representation of words may be used.
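  • As a minimal sketch of vectorizing a context with a distributed representation of words, the example below averages per-word embedding vectors. The three-dimensional embedding table is entirely made up for illustration; in practice the vectors would come from whatever language model is chosen, and averaging is only one of many possible pooling methods.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embedding table (word -> 3-dimensional vector).
EMBEDDINGS: Dict[str, List[float]] = {
    "product A": [0.9, 0.1, 0.0],
    "customer": [0.1, 0.8, 0.1],
    "sales": [0.2, 0.1, 0.9],
    "transition": [0.0, 0.2, 0.8],
}


def feature_vector(
    words: Sequence[str], embeddings: Dict[str, List[float]] = EMBEDDINGS
) -> List[float]:
    """Average the embeddings of the known words; unknown words are ignored."""
    dim = len(next(iter(embeddings.values())))
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Feature vector for the context consisting of "product A" and "customer".
d_c = feature_vector(["product A", "customer"])
```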
  • a user query Q is a query about an insight that a user seeks and is provided by the user in natural language.
  • The user query Q includes, for example, the following information.
      • Information about the data to be analyzed (e.g., "Product A", "Sales")
      • Hypotheses about insights
      • Characteristics of the assumed chart (e.g., aggregation by region, pie chart)
  • Metadata M is information from which insight desired by the user can be estimated. Metadata M is, for example, automatically collected by a predetermined system.
  • The metadata M includes, for example, the following information.
      • User's search history (e.g., searched for "product A, customer")
      • User's analysis history (e.g., customer analysis of product A was performed in the past)
      • User's evaluation history (e.g., a chart about the customers of product A was highly evaluated)
      • User's action history (e.g., stayed at a site or store selling product A for xx minutes)
  • the evaluation model parameter EMP is a parameter that defines the evaluation model f.
  • the evaluation model f is a model that inputs the related data VD and the context data CD and quantitatively evaluates the insight subject corresponding to the input related data VD. Any model can be used as the evaluation model f as long as it can be used to estimate the evaluation result of the insight subject. For example, a rule-based model to be described later, a model constructed by machine learning, or the like can be used as the evaluation model f.
  • the output of the evaluation model f is, for example, a score representing the evaluation result or a label probability. The evaluation model f will be described later.
  • The evaluation result ER is data indicating the evaluation result of the insight subject by the evaluation unit 12.
  • The evaluation result ER is, for example, an insight score ŷ representing an evaluation result for each of a plurality of insight subjects.
  • The insight score ŷ is a quantitative index of goodness of visualization calculated based on the output value of the evaluation model f.
  • The insight score ŷ may be, for example, an output value of the evaluation model f, or may be a value obtained by applying processing such as normalization and/or weighting to the output value of the evaluation model f.
  • A specific example of the method for calculating the insight score ŷ will be described later.
  • the display data DD is data for presenting the insight subject's evaluation result by the information processing apparatus 1A to the user, that is, data relating to the insight subject's evaluation result as to whether the insight desired by the user is provided.
  • the acquisition unit 11 acquires the evaluation data set DS and the context data CD.
  • The acquisition unit 11 acquires the evaluation data set DS and the context data CD by reading them from the storage unit 17.
  • The method of obtaining the evaluation data set DS and the context data CD is not particularly limited.
  • The acquisition unit 11 may acquire the evaluation data set DS and the context data CD input by the user of the information processing device 1A via the input unit 20.
  • The acquisition unit 11 may acquire the evaluation data set DS and the context data CD from an external device through communication via the communication unit 18.
  • The evaluation unit 12 evaluates, according to the context data CD, a plurality of insight subjects generated by referring to at least the evaluation data set DS. As an example, the evaluation unit 12 calculates an insight score ŷ for each of a plurality of insight subjects, generates an evaluation result ER indicating the calculation result, and stores the evaluation result ER in the storage unit 17.
  • the first generation unit 13 generates a plurality of insight subjects with reference to the evaluation data set DS.
  • the first generation unit 13 also generates display data DD regarding the evaluation result of the evaluation unit 12 .
  • the second generator 14 generates at least part of the context data CD and at least part of the related data VD.
  • FIG. 5 is a flow diagram showing the flow of the information processing method.
  • In the following, a case where the related information V is visualization information used for visualizing the evaluation data is described as an example.
  • The visualization information, which is an example of the related information V, is also called "visualization information V".
  • In step S101, the acquisition unit 11 acquires the input data D and the data for context generation.
  • Input data D is an example of evaluation data according to the present specification.
  • the input data D only needs to include data to be plotted on the chart, and any format can be used as the input data D format.
  • the acquisition unit 11 acquires the input data D via the input unit 20 or the communication unit 18 .
  • FIG. 6 is a diagram showing an example of input data D.
  • the input data D includes sales data, store data, product data, and customer data.
  • Sales data, store data, product data, and customer data are all data sets of multidimensional data including multiple records.
  • Sales data is multi-dimensional data including data items of "date”, “merchandise code”, “customer code”, “store code”, and "sales”.
  • the store data is multi-dimensional data including data items of "store code”, "store name”, "area”, and “scale”.
  • the product data is multi-dimensional data including data items of "product code", “product name”, “classification”, and "unit price”.
  • the customer data is multi-dimensional data including data items of "customer code", “age”, “sex”, “place of residence", "occupation”, and "income”.
  • the context generation data is data for generating context C, and includes one or both of user query Q and metadata M, for example.
  • the context-generating data may include multiple user queries and may include multiple metadata.
  • context generation data is not limited to user queries and metadata, and may be other data.
  • the context generation data may be data that can be used as the context C as it is.
  • the acquisition unit 11 may acquire the context generation data via the input unit 20 or the communication unit 18 , or may acquire the context generation data by reading the context generation data from the storage unit 17 .
  • In step S102, the second generation unit 14 generates the evaluation data set DS and the context data CD.
  • A specific example of generating the evaluation data set DS and the context data CD will be described below.
  • the second generator 14 first acquires the visualization information V.
  • The second generation unit 14 may acquire the visualization information V by reading it from a predetermined storage area of the storage unit 17, or may acquire the visualization information V via the input unit 20 or the communication unit 18.
  • the second generation unit 14 acquires a plurality of pieces of visualization information V.
  • The visualization information V includes, for example, attribute information of each data included in the input data D, information on the relationship between each axis of the chart and the item, a filter applied to the input data D, a chart type, an aggregation method, and the like.
  • the second generation unit 14 uses an arbitrary language model to generate a feature vector dV that expresses the acquired visualization information V in a vector space.
  • a feature vector dV is generated for each of a plurality of pieces of visualization information V.
  • The second generation unit 14 also generates aggregated data sV obtained by aggregating numerical values corresponding to the visualization information V from the input data D, and a statistic tV that is a set of various statistics for the aggregated data sV.
  • The second generation unit 14 generates related data VD including the acquired visualization information V, the generated feature vector dV, the aggregated data sV, and the statistic tV, and generates an evaluation data set DS containing the related data VD and the input data D acquired by the acquisition unit 11 in step S101.
  • The related data VD may include multiple pieces of visualization information V and multiple feature vectors dV, or may include a single pair of visualization information V and feature vector dV.
  • the second generation unit 14 generates a context C by executing arbitrary natural language processing on the context generation data acquired by the acquisition unit 11 in step S101. Note that the second generation unit 14 may use the context generation data as the context C as it is.
  • the second generation unit 14 performs natural language processing on a user query "customer of product A” to generate context C of "product A” and “customer”.
  • the second generating unit 14 performs natural language processing on a user query "sales transition” to generate context C "sales” and "transition”.
  • The second generation unit 14 performs natural language processing on metadata whose "search history" is "customer of product A" to generate context C of "product A" and "customer".
  • the second generating unit 14 generates the context C of "sales” and “transition” by performing natural language processing on the metadata whose "search history" is "sales transition".
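  • The four examples above can be reproduced with a very small stand-in for the natural language processing step, sketched below as a vocabulary lookup. The vocabulary, the function name, and the matching rule are assumptions made for illustration; the specification allows arbitrary natural language processing here.

```python
from typing import List, Sequence

# Hypothetical vocabulary of terms that may appear in a context.
KNOWN_TERMS: Sequence[str] = ("product A", "customer", "sales", "transition")


def extract_context(text: str, known_terms: Sequence[str] = KNOWN_TERMS) -> List[str]:
    """Return the known terms that occur in the text, in vocabulary order (case-insensitive)."""
    lowered = text.lower()
    return [term for term in known_terms if term.lower() in lowered]


# Context from a user query.
context_from_query = extract_context("customer of product A")

# Context from metadata whose "search history" holds a past search string.
metadata = {"search history": "sales transition"}
context_from_metadata = extract_context(metadata["search history"])
```

  • A production system would more likely rely on a tokenizer or morphological analyzer, but the lookup is enough to show how a query or a metadata field is reduced to a set of context words.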
  • The second generation unit 14 uses an arbitrary language model to generate a feature vector dC expressing the generated context C in a vector space, and generates context data CD including the generated feature vector dC and the context C.
  • FIG. 7 is a diagram showing an example of context C and visualization information V.
  • FIG. 8 is a diagram showing an example of generation of the feature vector dC and the feature vector dV .
  • In the example of FIG. 7, context C includes the words "product A" and "customer".
  • The visualization information V includes attribute information of each data included in the input data D, information on the relationship between each axis of the chart and the item, filters applied to the input data D, chart type, aggregation method, and other information.
  • In the example of FIG. 8, a feature vector dV is generated from the visualization information V, and a feature vector dC is generated from the context C.
  • FIG. 9 is a diagram showing an example of total data sV and statistics tV generated by the second generation unit 14.
  • the aggregated data sV is data obtained by aggregating the data included in the input data D and corresponding to the visualization information V.
  • the statistic tV is data representing the statistic of the aggregated data sV .
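  • The relationship between the input data D, the aggregated data sV, and the statistic tV can be sketched as follows. The row layout, column names, filter, and choice of statistics are assumptions made for illustration only.

```python
from collections import defaultdict
import statistics


def aggregate(rows, group_key, value_key, row_filter=None, how="sum"):
    """Aggregate the rows matching the visualization's filter, grouped by one item."""
    groups = defaultdict(list)
    for row in rows:
        if row_filter is None or row_filter(row):
            groups[row[group_key]].append(row[value_key])
    reducers = {"sum": sum, "mean": statistics.mean, "count": len}
    return {key: reducers[how](values) for key, values in groups.items()}


def summarize(aggregated):
    """Compute simple statistics of the aggregated values."""
    values = list(aggregated.values())
    return {
        "mean": statistics.mean(values),
        "max": max(values),
        "min": min(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
    }


# Hypothetical input data D.
rows = [
    {"product": "A", "customer": "X", "sales": 100},
    {"product": "A", "customer": "Y", "sales": 50},
    {"product": "A", "customer": "X", "sales": 30},
    {"product": "B", "customer": "X", "sales": 70},
]

# Visualization: total sales per customer, filtered to product A.
s_v = aggregate(rows, "customer", "sales", lambda r: r["product"] == "A")
t_v = summarize(s_v)
```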
  • Step S103 In step S103, the first generation unit 13 generates a plurality of insight subjects.
  • the insight subjects are data indicating visualization candidates
  • the first generation unit 13 generates a plurality of insight subjects by referring to the evaluation data and the related data VD, for example.
  • the first generation unit 13 generates, for example, an insight subject representing the visualization result of plotting the aggregated data sV included in the related data VD on a chart in the display mode represented by the visualization information V.
  • the first generating unit 13 generates an insight subject for each of the plurality of visualization information V, thereby generating a plurality of insight subjects.
  • the visualization information V and the insight subject correspond one-to-one.
  • the insight subject is not limited to the data representing the visualization candidate, and for example, the visualization information V may be treated as it is as the insight subject.
  • Step S104 In step S104, the evaluation unit 12 evaluates each of the plurality of insight subjects with reference to the context data CD. At this time, the evaluation unit 12 gives a higher evaluation, for example, to an insight subject that is more relevant to the context data CD.
  • the evaluation unit 12 evaluates each of the plurality of insight subjects by referring to the related data VD and the context data CD. Since the plurality of insight subjects correspond one-to-one to the related information V (the visualization information V), the evaluation unit 12 in effect evaluates each piece of visualization information V. In other words, the evaluation unit 12 evaluates each of the plurality of insight subjects for each piece of related information V included in the related data VD.
  • the evaluation unit 12 uses the related data VD to calculate a score ŷ0, and uses the score ŷ0 to calculate the insight score ŷ. At this time, the evaluation unit 12 may use the score ŷ0 as it is as the insight score ŷ, or may calculate the insight score ŷ by applying processing such as normalization or weighting to the score ŷ0.
  • the method of calculating the score ŷ0 is not limited; the evaluation unit 12 may use, for example, a score function defined on a rule basis for each type of insight, or may calculate the score ŷ0 using a model that has learned the feature amounts of charts that provide insight.
  • the score function is, for example, a function that outputs a higher evaluation value as the relationship between the related data VD and the context data CD is higher.
  • the evaluation unit 12 evaluates the plurality of insight subjects using a predefined score function that outputs a higher evaluation value as the relationship between the related data VD and the context data CD is higher.
  • the evaluation unit 12 sets the insight score ŷ for related data VD having low relevance to the context data CD to zero or a negative value, so that the evaluation result is low.
  • the method of calculating the degree of association (similarity) between the context data CD and the related data VD is not limited. The evaluation unit 12 may use, for example, set similarity (Jaccard, Dice, Simpson, etc.), string similarity (Hamming distance, Levenshtein distance, Jaro-Winkler distance, etc.), or similarity between distributed representations (word2vec, fastText, BERT, etc.).
  • the evaluation unit 12 may also calculate the insight score ŷ using a score weighted by the degree of similarity between the context data CD and the related data VD. More specifically, for example, the insight score ŷ may be the product of the score ŷ0 calculated using the related data VD and the similarity sim(CD, Dv).
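  • A similarity-weighted insight score of this kind can be sketched with the Jaccard coefficient, one of the set similarities named above. The term sets and the base score are hypothetical values, and the function names are assumptions introduced for this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of terms."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def insight_score(base_score, context_terms, related_terms):
    """Weight the base score by the context/related-data similarity."""
    return base_score * jaccard(context_terms, related_terms)


context_terms = {"product A", "customer"}

# Hypothetical terms extracted from two pieces of visualization information.
relevant = {"product A", "customer", "sales"}
unrelated = {"region", "inventory"}

score_relevant = insight_score(0.9, context_terms, relevant)
score_unrelated = insight_score(0.9, context_terms, unrelated)
```

  • With this weighting, a visualization that shares no terms with the context receives a score of zero regardless of its base score, which matches the behaviour described for related data with low relevance.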
  • the evaluation unit 12 evaluates the plurality of insight subjects using an evaluation model f, which is a pre-trained model that receives the related data VD and the context data CD as input and outputs an evaluation value.
  • the machine learning method of the evaluation model f is not limited. As an example, a decision-tree-based method, linear regression, or a neural network may be used, and one or more of these methods may be combined.
  • Decision-tree-based methods include, for example, LightGBM (Light Gradient Boosting Machine) and XGBoost.
  • Linear regression includes, for example, support vector regression, Ridge regression, Lasso regression, and ElasticNet.
  • Neural networks include, for example, deep learning.
  • any data that can be considered to contain insight can be used as teacher data.
  • charts created by data analysts in the past may be considered to contain features that give insight, and their visualization information V may be used for learning as positive samples.
  • chart visualization information V that is considered to have no insight may be used as a negative sample for learning.
  • FIG. 10 is a diagram showing an example of the evaluation model f.
  • the input of the evaluation model f includes the feature vector dV, the feature vector dC, the aggregated data sV, and the statistic tV.
  • the output of the evaluation model f is an evaluation result, for example, a label probability indicating whether the insight desired by the user is provided.
  • Example 1 of learning-based evaluation model When a teacher label y regarding the insight of the visualization information V is given, the evaluation model can be learned as a classification model. For example, with y ∈ {0, 1}, where 1 means that there is insight and 0 means that there is none, a machine learning model that minimizes the loss function E given by the following equation (1) may be trained as a two-class classification task.
  • In equation (1), N is the number of training data items.
  • Example 2 of learning-based evaluation model
  • an evaluation model can be learned as a regression model. For example, if y is the score given by the teacher data, a machine learning model that minimizes the loss function E( ⁇ ) given by the following equation (2) may be trained.
  • In equation (2), N is the number of training data items.
  • the output of the machine learning model that minimizes the above loss function is a score that expresses the goodness of the visualization in the same way as the score of the training data, and may be used as the insight score ŷ.
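  • Equations (1) and (2) are not reproduced in this text. As a hedged illustration only, the sketch below uses the losses most commonly chosen for these two settings: binary cross-entropy for the two-class classification of Example 1 and mean squared error for the regression of Example 2. Whether the original equations take exactly these forms, including the 1/N averaging, is an assumption.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Average cross-entropy over N samples; labels are 0 or 1."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def mean_squared_error(targets, preds):
    """Average squared error over N samples; targets are teacher scores."""
    return sum((y - p) ** 2 for y, p in zip(targets, preds)) / len(targets)


# Classification: 1 = insight, 0 = no insight (hypothetical predictions).
classification_loss = binary_cross_entropy([1, 0], [0.9, 0.1])

# Regression: teacher scores versus predicted insight scores (hypothetical).
regression_loss = mean_squared_error([1.0, 2.0], [1.5, 2.0])
```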
  • Step S105 In step S105 of FIG. 5, the evaluation unit 12 outputs information related to the insight subject to the display unit 19, and the display unit 19 displays the information related to the insight subject. Specifically, for example, the display unit 19 displays at least one of the plurality of insight subjects together with the evaluation result by the evaluation unit 12 or in a display mode according to the evaluation result by the evaluation unit 12 .
  • the display mode according to the evaluation result includes, for example, display order or display size.
  • FIG. 11 is a diagram showing an example of displaying an insight subject together with an evaluation result.
  • insight subjects V7, V3, V8, .
  • the insight score ŷ of each insight subject is displayed adjacent to each of the insight subjects V7, V3, V8, ...
  • the plurality of insight subjects V7, V3, V8, ... are displayed in descending order of insight score ŷ.
  • because the plurality of insight subjects are displayed in descending order of insight score ŷ, the user can easily grasp which insight subjects are highly evaluated.
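  • The ordering used for such a display can be sketched as a sort on the insight score. The score values below are hypothetical; only the labels V7, V3, and V8 come from the figure description.

```python
def rank_insight_subjects(scores):
    """Return (subject, score) pairs in descending order of insight score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical insight scores for three insight subjects.
scores = {"V3": 0.72, "V8": 0.41, "V7": 0.93}
ranking = rank_insight_subjects(scores)

for subject, score in ranking:
    print(f"{subject}: {score:.2f}")
```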
  • FIG. 12 is a diagram showing an example of displaying the visualization information V together with the evaluation results.
  • the display unit 19 displays each related information V included in the related data in association with the evaluation by the evaluation unit 12 .
  • the display unit 19 displays the visualization information V11 to V18 and the insight score ŷ corresponding to each of the visualization information V11 to V18 in association with each other.
  • FIG. 13 is a diagram showing an example of displaying insight subjects together with evaluation results.
  • the display unit 19 displays a chart (bar graph) that is a visualization result of the input data D, and also displays the insight score ŷ corresponding to the displayed chart together with the chart.
  • the evaluation unit 12 gives a higher evaluation to an insight subject having a higher relevance to the context data. Therefore, according to the information processing device 1A according to the present exemplary embodiment, in addition to the effects of the information processing device 1 according to the first exemplary embodiment, an evaluation that makes the degree of relevance between the context data and the insight subject easy to grasp can be performed.
  • FIG. 14 is a block diagram showing the configuration of an information processing device 1B according to this exemplary embodiment.
  • the information processing apparatus 1B includes a control section 10B instead of the control section 10A of the information processing apparatus 1A according to the second exemplary embodiment.
  • the control unit 10B includes a learning unit 15 in addition to the acquisition unit 11, the evaluation unit 12, the first generation unit 13, and the second generation unit 14.
  • the input unit 20 receives user feedback on the evaluation result of the evaluation unit 12 . Also, the learning unit 15 re-learns the evaluation model f with reference to feedback from the user.
  • the learning unit 15 records, in the storage unit 17 or the like, the user's operation history regarding the information related to the insight subject displayed by the display unit 19 (insight score ŷ, visualization information V, chart, etc.) as feedback from the user.
  • the user's operation history includes, for example, the display time of the information related to the insight subject, the pressing of the evaluation button for the information related to the insight subject, and the like.
  • the learning unit 15 re-learns the evaluation model f reflecting the feedback from the user. For example, the learning unit 15 performs re-learning of the evaluation model f by using high-evaluation visualization information V as a positive sample and low-evaluation visualization information as a negative sample.
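  • Turning the recorded operation history into positive and negative samples for re-learning can be sketched as follows. The record fields, the rating values, and the display-time threshold are assumptions introduced for illustration; the actual re-training step depends on the chosen learning method and is omitted.

```python
def label_from_feedback(record, min_display_seconds=10.0):
    """Map one operation-history record to 1 (positive) or 0 (negative)."""
    rating = record.get("rating")
    if rating == "good":      # explicit evaluation button takes priority
        return 1
    if rating == "bad":
        return 0
    # Otherwise fall back to display time as implicit feedback.
    return 1 if record.get("display_seconds", 0.0) >= min_display_seconds else 0


def build_retraining_samples(history):
    """Pair each piece of visualization information with its feedback label."""
    return [(rec["visualization_id"], label_from_feedback(rec)) for rec in history]


# Hypothetical operation history recorded for displayed insight subjects.
history = [
    {"visualization_id": "V11", "rating": "good", "display_seconds": 3.0},
    {"visualization_id": "V12", "rating": "bad", "display_seconds": 40.0},
    {"visualization_id": "V13", "display_seconds": 25.0},
    {"visualization_id": "V14", "display_seconds": 1.5},
]

samples = build_retraining_samples(history)
```

  • The resulting labelled pairs would then be joined with the corresponding related data and context data and fed to the same training procedure that produced the evaluation model f.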
  • as described above, the information processing device 1B according to the present exemplary embodiment adopts a configuration in which the input unit 20 receives feedback from the user regarding the evaluation result, and the learning unit 15 re-learns the evaluation model with reference to that feedback. Therefore, according to the information processing device 1B, in addition to the effect of the information processing device 1 according to the first exemplary embodiment, the evaluation accuracy of the evaluation model can be further improved.
  • the processing performed by one information processing apparatus 1 may be shared by a plurality of information processing apparatuses. In other words, part of the processing performed by the information processing device 1 may be performed by at least one other information processing device. That is, when at least one processor performs each of the processes described above, the at least one processor may be provided in a single information processing apparatus 1 or may be distributed across different information processing apparatuses. This also applies to the information processing device 1A in the second exemplary embodiment and the information processing device 1B in the third exemplary embodiment described above.
  • Some or all of the functions of the information processing apparatuses 1, 1A, and 1B may be implemented by hardware such as integrated circuits (IC chips), or may be implemented by software.
  • the information processing apparatuses 1, 1A, and 1B are implemented by computers that execute program instructions, which are software that implements each function, for example.
  • An example of such a computer (hereinafter referred to as computer C) is shown in FIG.
  • Computer C comprises at least one processor C1 and at least one memory C2.
  • a program P for operating the computer C as the information processing apparatuses 1, 1A, and 1B is recorded in the memory C2.
  • the processor C1 reads the program P from the memory C2 and executes it, thereby realizing each function of the information processing apparatuses 1, 1A, and 1B.
  • as the processor C1, for example, a CPU (Central Processing Unit), GPU (Graphic Processing Unit), DSP (Digital Signal Processor), MPU (Micro Processing Unit), FPU (Floating point number Processing Unit), PPU (Physics Processing Unit), a microcontroller, or a combination thereof can be used.
  • as the memory C2, for example, a flash memory, HDD (Hard Disk Drive), SSD (Solid State Drive), or a combination thereof can be used.
  • the computer C may further include a RAM (Random Access Memory) for expanding the program P during execution and temporarily storing various data.
  • Computer C may further include a communication interface for sending and receiving data to and from other devices.
  • Computer C may further include an input/output interface for connecting input/output devices such as a keyboard, mouse, display, and printer.
  • the program P can be recorded on a non-transitory tangible recording medium M that is readable by the computer C.
  • as the recording medium M, for example, a tape, disk, card, semiconductor memory, programmable logic circuit, or the like can be used.
  • the computer C can acquire the program P via such a recording medium M.
  • the program P can be transmitted via a transmission medium.
  • as the transmission medium, for example, a communication network or broadcast waves can be used.
  • Computer C can also obtain program P via such a transmission medium.
  • (Appendix 1) An information processing device comprising: acquisition means for acquiring an evaluation data set and context data; and evaluation means for evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set.
  • (Appendix 2) The information processing device according to appendix 1, wherein the evaluation means gives a higher evaluation to an insight subject having a higher relevance to the context data.
  • Appendix 3 Further comprising a first generation means for generating the plurality of insight subjects by referring to the evaluation data set; 3.
  • the evaluation data set includes evaluation data and related data related to the evaluation data
  • the first generation means generates the plurality of insight subjects by referring to the evaluation data and the related data, 3.
  • the information processing apparatus according to appendix 3, wherein the evaluation means performs evaluation with reference to the related data and the context data for each of the plurality of insight subjects.
  • an evaluation is made as to whether the insight desired by the user is provided. It can be performed.
  • the insight subject can be evaluated for each related information.
  • appendix 6 The information processing apparatus according to appendix 4 or 5, further comprising second generation means for generating at least part of the context data and at least part of the related data.
  • the context data includes: context, and 7.
  • the information processing device according to any one of appendices 4 to 6, wherein at least one of the context feature vectors is included.
  • the relevant data includes: relevant information related to the evaluation data; a feature vector of the relevant information; Aggregated data obtained by aggregating the data corresponding to the related information, which is included in the evaluation data; and 8.
  • the information processing device according to any one of appendices 4 to 7, wherein at least one of the statistics of the aggregated data is included.
  • the evaluation means are The plurality of insight subjects are evaluated using a score function that is a predefined score function that outputs a higher evaluation value as the relationship between the related data and the context data is higher, 9.
  • the information processing apparatus according to any one of Appendices 4 to 8.
  • each of a plurality of insight subjects generated by referring to the evaluation data set and related data can be evaluated using the score function.
  • The information processing apparatus according to any one of appendices 4 to 8, wherein the evaluation means evaluates the plurality of insight subjects using an evaluation model that is pre-trained, receives the related data and the context data as input, and outputs an evaluation value.
  • each of a plurality of insight subjects generated by referring to the evaluation data set and related data can be evaluated using the evaluation model.
  • Appendix 11 further comprising receiving means for receiving feedback from the user on the evaluation result of the evaluation means; 11.
  • Appendix 12 12. The information processing apparatus according to any one of appendices 4 to 11, further comprising display means for displaying information related to the insight subject.
  • the user can grasp the evaluation of the insight subject from the information displayed by the display means.
  • the display means is 13.
  • the insight subject displayed by the display means makes it easier for the user to grasp the evaluation of the insight subject.
  • the display means is 13. The information processing apparatus according to appendix 12, wherein each related information included in the related data and the evaluation by the evaluation means are displayed in association with each other.
  • the user can grasp the evaluation of each of the plurality of insight subjects from the information displayed by the display means.
  • (Appendix 15) An information processing method including: at least one processor acquiring an evaluation data set and context data; and the at least one processor evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set.
  • (Appendix 16) A program for causing a computer to execute: a process of acquiring an evaluation data set and context data; and a process of evaluating, according to the context data, a plurality of insight subjects generated by referring to at least the evaluation data set.
  • the processor performs an acquisition process for acquiring an evaluation data set and context data;
  • An information processing device that executes an evaluation process for performing an evaluation according to the
  • the information processing apparatus may further include a memory, and the memory may store a program for causing the processor to execute the acquisition process and the evaluation process. This program may also be recorded on a computer-readable non-transitory tangible recording medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2021/032766 2021-09-07 2021-09-07 Information processing apparatus, information processing method, and program Ceased WO2023037398A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US18/686,514 US20240354307A1 (en) 2021-09-07 2021-09-07 Information processing apparatus, information processing method, and storage medium
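PCT/JP2021/032766 WO2023037398A1 (ja) 2021-09-07 2021-09-07 Information processing apparatus, information processing method, and program
JP2023546584A JP7740343B2 (ja) 2021-09-07 2021-09-07 Information processing apparatus, information processing method, and program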

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/032766 WO2023037398A1 (ja) 2021-09-07 2021-09-07 Information processing apparatus, information processing method, and program

Publications (1)

Publication Number Publication Date
WO2023037398A1 true WO2023037398A1 (ja) 2023-03-16

Family

ID=85507260

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/032766 Ceased WO2023037398A1 (ja) 2021-09-07 2021-09-07 Information processing apparatus, information processing method, and program

Country Status (3)

Country Link
US (1) US20240354307A1
JP (1) JP7740343B2
WO (1) WO2023037398A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220123848A1 (en) * 2019-01-21 2022-04-21 Nec Corporation Wireless communication quality visualization system, wireless communication quality visualization device, and measurement apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
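WO2015194115A1 (ja) * 2014-06-16 2015-12-23 Panasonic IP Management Co., Ltd. Customer service evaluation device, customer service evaluation system, and customer service evaluation method
JP2016153981A (ja) * 2015-02-20 2016-08-25 Mitsubishi Heavy Industries, Ltd. Analysis support device, analysis support method, and analysis support program
JP2016224873A (ja) * 2015-06-03 2016-12-28 Hitachi, Ltd. Sales support server, sales support terminal, and sales support system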
US20180095945A1 (en) * 2016-09-30 2018-04-05 Wipro Limited Methods and systems for creating new presentations using existing presentations

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004322862A (ja) * 2003-04-24 2004-11-18 Sekiya Motors:Kk Vehicle inspection and diagnosis device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220123848A1 (en) * 2019-01-21 2022-04-21 Nec Corporation Wireless communication quality visualization system, wireless communication quality visualization device, and measurement apparatus
US12155422B2 (en) * 2019-01-21 2024-11-26 Nec Corporation Wireless communication quality visualization system, wireless communication quality visualization device, and measurement apparatus

Also Published As

Publication number Publication date
JP7740343B2 (ja) 2025-09-17
JPWO2023037398A1 2023-03-16
US20240354307A1 (en) 2024-10-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21956698

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023546584

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 18686514

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21956698

Country of ref document: EP

Kind code of ref document: A1