WO2017072822A1 - Relevance evaluation system and method, program, and recording medium


Info

Publication number
WO2017072822A1
WO2017072822A1 PCT/JP2015/005479 JP2015005479W WO2017072822A1 WO 2017072822 A1 WO2017072822 A1 WO 2017072822A1 JP 2015005479 W JP2015005479 W JP 2015005479W WO 2017072822 A1 WO2017072822 A1 WO 2017072822A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
evaluation
test data
relevance
components
Prior art date
Application number
PCT/JP2015/005479
Other languages
English (en)
Japanese (ja)
Inventor
秀樹 武田
和巳 蓮子
Original Assignee
株式会社Ubic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社Ubic
Priority to PCT/JP2015/005479
Priority to JP2017547201A
Publication of WO2017072822A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to a relevance evaluation system, method, program for evaluating relevance between data, and a recording medium storing it.
  • a data aggregate (hereinafter simply referred to as “data”) composed of many data components (for example, “word” in the case of document data) always has a characteristic in its contents.
  • For data composed of a large number of data components, it may be necessary to evaluate the characteristics of the data objectively without comparing the contents in detail.
  • One such approach is to calculate, for each piece of data, a characteristic value representing similarity and to compare the data on that basis.
  • Patent Document 1 discloses an example of similar document search.
  • a feature word characterizing the description content is extracted from a document set made up of a large number of documents in advance, and a set of feature words is created.
  • a feature vector from a data component serving as a reference is calculated and stored for the feature word.
  • For an input document, the similarity is calculated by comparison with the feature words, and the document whose score value (the degree of similarity) is highest is determined to be the closest to the input document.
  • The type of data is not limited to document data as disclosed in Patent Document 1; data whose data components are other kinds of units, such as image data and audio data, can also be considered. The object is therefore to obtain, by a simple method, an index that differentiates the degree of relevance of data with respect to reference data.
  • The problem is solved by a relevance evaluation system that evaluates the relevance of test data with respect to reference data, the system including: a data acquisition unit that acquires the reference data and the test data; an evaluation component extraction unit that extracts from the test data, in order of appearance along the arrangement direction of the data components of the test data, the evaluation components, that is, those data components of the reference data that represent the characteristics of the reference data; and a relevance evaluation unit that calculates a feature coefficient based on the appearance order of the evaluation components in the arrangement direction of the test data.
  • The problem is also solved by a relevance evaluation method in which a relevance evaluation system comprising a computer evaluates the relevance between reference data and test data, the method including: acquiring the reference data and the test data; extracting from the test data, in order of appearance along the arrangement direction of the data components of the test data, the evaluation components that represent the characteristics of the reference data among the data components of the reference data; and calculating a feature coefficient based on the appearance order of the evaluation components in the arrangement direction of the test data.
  • The problem is also solved by a relevance evaluation program that is executable in a relevance evaluation system comprising a computer and evaluates the relevance between reference data and test data, the program causing the system to execute the steps of: acquiring the reference data and the test data; extracting from the test data, in order of appearance along the arrangement direction of the data components of the test data, the evaluation components that represent the characteristics of the reference data among the data components of the reference data; and calculating a feature coefficient based on the appearance order of the evaluation components in the arrangement direction of the test data.
  • The problem is also solved by a storage medium storing a relevance evaluation program that is executable in a relevance evaluation system comprising a computer and evaluates the relevance between reference data and test data, the program causing the system to execute the steps of: acquiring the reference data and the test data; extracting from the test data, in order of appearance along the arrangement direction of the data components of the test data, the evaluation components that represent the characteristics of the reference data among the data components of the reference data; and calculating a feature coefficient based on the appearance order of the evaluation components in the arrangement direction of the test data.
  • The present invention makes it possible to select, from two or more pieces of data, the data closest to the reference data.
  • FIG. 3 is a conceptual diagram showing reference data R.
  • FIG. 4 is a conceptual diagram showing test data T.
  • FIG. 5 is a diagram contrasting the evaluation components of the reference data R with the evaluation components of the test data T, taking the order of appearance into account.
  • FIG. 1 is an example of a hardware configuration of the system 1.
  • the system 1 includes a server device 10 and a client terminal 11.
  • the server device 10 includes an arithmetic device 10a that performs calculation and a storage device 10b that stores data.
  • the server device 10 can execute main processing of data analysis.
  • the client terminal 11 can execute a data analysis related process in the server device 10.
  • the storage device 10b is, for example, any recording medium (for example, a memory or a hard disk) that can store data (including digital data and analog data).
  • the arithmetic device 10a is a controller (for example, a central processing unit (CPU)) that can execute a control program stored in a recording medium.
  • the computing device 10a is a computer or a computer system (a system that realizes data analysis by operating a plurality of computers in an integrated manner) that analyzes data stored at least temporarily in a recording medium.
  • The computing device 10a may be configured as a management computer (not shown) external to the server device 10, and the storage device 10b may be configured as a data storage server device 13, that is, as an external storage device of the server device 10.
  • the management computer may include, for example, a memory, a controller, a bus, an input / output interface, and a communication interface.
  • Application programs that can control the client terminal 11, the server device 10, and the management computer (not shown) are stored in the memory provided in each of these devices.
  • The application programs (software resources) and the hardware resources cooperate to operate each device.
  • the storage device 10b is composed of, for example, a disk array system, and can include a database that records data and results of evaluation / classification of the data.
  • The server device 10 and the storage device 10b are connected by direct-attached storage (DAS) or a storage area network (SAN).
  • the client terminal 11 presents data in the middle of the processing process in the server device 10 to the user. As a result, the user can input, that is, provide classification information through bidirectional exchange via the client terminal 11.
  • The client terminal 11 includes, for example, a memory, a controller, a bus, an input/output interface (for example, a keyboard and a display), and a communication interface (for communication using a predetermined network).
  • the client terminal 11 may be configured to include an input device 12 such as a scanner.
  • the hardware configuration shown in FIG. 1 is merely an example, and the system 1 can be realized by other hardware configurations.
  • For example, part or all of the processing may be executed in the server device 10, or part or all of it may be executed in the client terminal 11.
  • The input device 12 may be connected to the client terminal 11 and transmit data to the server device 10 through it, or it may be connected directly to the server device 10 and input data to the server from there. Those skilled in the art will understand that various hardware configurations can implement the system 1 and that the configuration is not limited to the one illustrated in FIG. 1.
  • FIG. 2 is a diagram illustrating the reference data R and the test data T that are comparison targets of the relevance in the present invention.
  • Here, the relevance of test data T1 and test data T2 to the reference data R is evaluated.
  • the feature coefficient is calculated as an index for evaluating the relevance, thereby evaluating the high relevance.
  • Both the test data T1, T2 and the reference data R are aggregates of data components.
  • The test data T1 is composed of a plurality of data components t1, the test data T2 is composed of a plurality of data components t2, and the reference data R is composed of a plurality of data components r.
  • the data type of the test data T and the reference data R is not particularly limited. It may be document data, or any data aggregate, such as image data and audio data, as long as it is an aggregate of unit data.
  • The data components include, for example: morphemes, keywords, sentences, paragraphs, and/or metadata (for example, the header information of an e-mail) constituting a document; partial voices, volume (gain) information, and/or timbre information constituting audio; partial images, partial pixels, and/or luminance information constituting an image; and frame images, motion information, and/or three-dimensional information constituting a video.
  • In the following, the test data T and the reference data R are assumed to be document data, and the data components are text data, typically the words or phrases constituting the documents.
  • the types of data are typically the same, but they are not necessarily the same.
  • For example, when the reference data R is document data whose data components are words and the test data T is speech data, the data components as characters are compared with the words as speech, so that the degree of relevance can still be evaluated.
  • FIG. 3 is a diagram showing data components.
  • the arrangement direction of data constituent elements constituting the reference data R is defined.
  • the arrangement direction necessary for evaluating the contents of the reference data R is determined.
  • For example, the arrangement direction of the data components is determined to run from left to right. After the rightmost data component of a line comes the leftmost data component of the line one level down, and the arrangement direction again proceeds to the right from that position.
  • the order of character strings is the arrangement direction.
  • the arrangement direction most appropriate for evaluation is determined.
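The left-to-right, line-by-line arrangement direction described above can be sketched as follows. This is a minimal illustration under the assumption that the data are already split into lines of data components; the function name `linearize` is hypothetical and does not come from the source.

```python
def linearize(rows):
    """Flatten lines of data components into one sequence.

    Assumed arrangement direction: left to right within a line,
    then on to the leftmost component of the next line down.
    """
    return [component for row in rows for component in row]


rows = [["a", "b", "c"], ["d", "e"]]
sequence = linearize(rows)
print(sequence)  # ['a', 'b', 'c', 'd', 'e']
```

For other data types (for example, images or audio) a different traversal may be more appropriate; only the resulting one-dimensional order matters for the later steps.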
  • A plurality of data components that best represent the characteristics of the content of the reference data R are used as evaluation components and are extracted sequentially, in order of appearance along the predefined arrangement direction of the data components. In the example shown in FIG. 4, five evaluation components m1, m2, m3, m4, and m5 are selected. The evaluation components and their order of appearance are chosen so as to represent the characteristics of the content of the reference data R as accurately as possible.
  • the evaluation components m1, m2, m3, m4, and m5 of the reference data R and their appearance order function as predetermined criteria for evaluating the relevance of the test data T.
  • FIG. 4 is a diagram showing the test data T.
  • the arrangement direction of the data components constituting the test data T is determined.
  • An arrangement direction necessary for evaluating the contents of the test data T is determined.
  • For example, the arrangement direction of the data components is determined to run from left to right. After the rightmost data component of a line comes the leftmost data component of the line one level down, and the arrangement direction again proceeds to the right from that position.
  • the evaluation components m1, m2, m3, m4, and m5 previously defined in the reference data R are detected in the order of appearance.
  • the data components m1, m4, m3, and m2 of the test data T corresponding to the evaluation components are detected in the order of appearance.
  • The data component of the test data T corresponding to the evaluation component m5 is not detected. That is, of the five evaluation components m1, m2, m3, m4, and m5 previously defined in the reference data R, the data components m1, m2, m3, and m4 are extracted from the test data T, and their order of appearance is m1, m4, m3, m2.
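The detection of evaluation components in order of appearance can be sketched as below. This is an illustrative sketch only: the function name `extract_in_order` is hypothetical, and keeping only the first occurrence of a repeated component is an assumption, since the text does not say how repeats are handled.

```python
def extract_in_order(test_components, evaluation_components):
    """Return the evaluation components found in the test data,
    in the order in which they first appear along the arrangement direction."""
    wanted = set(evaluation_components)
    seen = set()
    extracted = []
    for component in test_components:
        # Assumption: a repeated component is counted at its first occurrence only.
        if component in wanted and component not in seen:
            seen.add(component)
            extracted.append(component)
    return extracted


reference_components = ["m1", "m2", "m3", "m4", "m5"]
test_data = ["x", "m1", "y", "m4", "m3", "z", "m2"]
detected = extract_in_order(test_data, reference_components)
print(detected)  # ['m1', 'm4', 'm3', 'm2']
```

As in the example in the text, m5 does not occur in the test data and is simply absent from the result.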
  • FIG. 5 shows the evaluation components m1, m2, m3, m4, and m5 of the reference data R in their order of appearance (upper side in FIG. 5) and the evaluation components m1, m4, m3, and m2 of the test data T in their order of appearance (lower side in FIG. 5).
  • a characteristic coefficient (Order) which is an index indicating the degree of relevance is defined as follows.
  • The characteristic coefficient (Order) is a ratio. Its denominator is the number of combinations of two evaluation components selected from the evaluation components detected in the test data T. Its numerator is the number of those combinations whose order of appearance is the same as the order of appearance of the evaluation components of the reference data R. When the number of evaluation components detected in the test data T is N, the denominator is N(N-1)/2.
  • FIGS. 4 and 5 since four evaluation components m1, m2, m3, and m4 are detected in the test data T, there are six patterns. Specifically, it is a combination of (m1, m2), (m1, m3), (m1, m4), (m2, m3), (m2, m4), (m3, m4).
  • The numerator is the number of combinations, among all two-component combinations selected from the evaluation components detected in the test data T, whose order of appearance is the same as in the reference data R. Only the order of appearance is considered; whether another component appears between the two components is not evaluated. In the examples of FIGS. 4 and 5, three of the above combinations, (m1, m2), (m1, m3), and (m1, m4), have the same order of appearance as in the reference data R. The presence of m4 between m1 and m3 is not subject to evaluation. Therefore, in this case, the characteristic coefficient (Order) is 3/6 = 0.5.
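The pairwise calculation described above can be sketched in a few lines. The function name `order_coefficient` is hypothetical, and returning 0.0 when fewer than two evaluation components are detected is an assumption, since the text does not define that case. The sketch also assumes that each detected component occurs once and is present in the reference order.

```python
from itertools import combinations


def order_coefficient(reference_order, detected_order):
    """Fraction of detected component pairs that appear in the same
    relative order as in the reference data."""
    rank = {component: i for i, component in enumerate(reference_order)}
    # combinations() yields (a, b) with a appearing before b in the test data.
    pairs = list(combinations(detected_order, 2))
    if not pairs:
        return 0.0  # assumption: undefined case (N < 2) treated as 0
    same = sum(1 for a, b in pairs if rank[a] < rank[b])
    return same / len(pairs)


reference = ["m1", "m2", "m3", "m4", "m5"]
detected = ["m1", "m4", "m3", "m2"]
print(order_coefficient(reference, detected))  # 0.5
```

With the four detected components of the example there are six pairs, three of which keep the reference order, which reproduces the value 0.5 given in the text.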
  • When all detected evaluation components appear in the test data T in the same order as in the reference data R, T(N)/F(N) = 1.0. That is, the more evaluation components appear in the test data T in the same order as in the reference data R, the stronger the relationship between the test data T and the reference data R, and the closer the characteristic coefficient (Order) is to 1. Conversely, when the relationship between the test data T and the reference data R is weak, the characteristic coefficient (Order) is close to zero.
  • The characteristic coefficient (Order) satisfies 0 ≤ Order ≤ 1.
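The definition of the characteristic coefficient can be written compactly as follows. The symbol C_same, for the number of pairs whose order of appearance matches the reference data R, is introduced here for illustration and does not appear in the source.

```latex
\[
\mathrm{Order} \;=\; \frac{C_{\mathrm{same}}}{N(N-1)/2},
\qquad 0 \le \mathrm{Order} \le 1
\]
% Worked example from the text: N = 4 detected components, C_same = 3
\[
\mathrm{Order} \;=\; \frac{3}{4 \cdot 3 / 2} \;=\; \frac{3}{6} \;=\; 0.5
\]
```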
  • FIG. 6 is a diagram illustrating an example of a functional block configuration of the system 1.
  • the system 1 includes, for example, a reference data acquisition unit 21, a test data acquisition unit 22, an arrangement direction determination unit 23, an evaluation component extraction unit 24, a component storage unit 25, and a component relevance evaluation unit 26.
  • a route from the reference data acquisition unit 21 to the component storage unit 25 via the arrangement direction determination unit 23 and the evaluation component extraction unit 24 is a learning process for the reference data R.
  • The route from the test data acquisition unit 22 to the component relevance evaluation unit 26 via the arrangement direction determination unit 23 and the evaluation component extraction unit 24 is the relevance evaluation process for the test data T with respect to the reference data R.
  • The reference data acquisition unit 21 acquires all the data components constituting the reference data R, whether the reference data is input from the input device 12 or the client terminal 11 or is already stored in the storage device 10b.
  • When the reference data acquisition unit 21 and the test data acquisition unit 22 have acquired all the data components, they output them to the arrangement direction determination unit 23, which determines the arrangement direction of these data components and associates the data components with it. All the data components associated with the arrangement direction are output to the evaluation component extraction unit 24.
  • the determination of the arrangement direction may be omitted depending on the data by using the data arrangement direction when the data is acquired in the reference data acquisition unit 21 and the test data acquisition unit 22 as they are. In this case, the arrangement direction determination unit 23 becomes unnecessary. Further, the determination of the alignment direction may be performed by the reference data acquisition unit 21 and the test data acquisition unit 22 or may be performed by the evaluation component extraction unit 24.
  • The evaluation component extraction unit 24 extracts a component group that best represents the content features of the reference data R.
  • the user can select a component group using the client terminal 11.
  • the “component group” is a group of data components.
  • the “component group” selected by the evaluation component extraction unit 24 is output to the component storage unit 25.
  • the component storage unit 25 stores the “component group” in the storage device 10 b or the data storage server device 13.
  • the evaluation component extraction unit 24 extracts the evaluation components m1, m2, m3, m4, and m5 from the data components that constitute the reference data R for which the arrangement direction is determined.
  • the number of evaluation components extracted by the evaluation component extraction unit 24 is arbitrarily determined according to the characteristics of the reference data R.
  • the evaluation component extraction unit 24 outputs the extracted evaluation components m1, m2, m3, m4, and m5 to the component storage unit 25.
  • The component storage unit 25 stores them in the storage device 10b or the data storage server device 13. The above is the learning process of relevance evaluation.
  • the relevance evaluation process for the test data T with respect to the reference data R will be described.
  • The arrangement direction determination unit 23 and the evaluation component extraction unit 24 described above function in the same way in the relevance evaluation process for the test data T with respect to the reference data R. That is, as shown in FIG. 6, like the reference data acquisition unit 21, the test data acquisition unit 22 acquires all the data components constituting the test data T, whether the test data is input from the input device 12 or the client terminal 11 or is already stored in the storage device 10b.
  • When the test data acquisition unit 22 has acquired all the data components, it outputs them to the arrangement direction determination unit 23.
  • the reference data acquisition unit 21 and the test data acquisition unit 22 do not need to be configured separately, and can be the same data acquisition unit.
  • the arrangement direction determination unit 23 determines the arrangement direction and associates the data components. All the data components associated with the arrangement direction are output to the evaluation component extraction unit 24.
  • The evaluation component extraction unit 24 extracts the evaluation components stored in the storage device 10b or the data storage server device 13 from all the data components of the test data T associated with the arrangement direction. Not every evaluation component is necessarily extracted: only those data components of the test data T that correspond to the evaluation components selected from the reference data R in the learning process are extracted, in order of appearance. In this example, the evaluation component extraction unit 24 extracts the evaluation components in the order of appearance m1, m4, m3, m2 according to the arrangement direction.
  • the extracted evaluation components m1, m4, m3, and m2 are output to the component relationship evaluation unit 26.
  • the component relevance evaluation unit 26 calculates the characteristic coefficient (Order) described above.
  • The component relevance evaluation unit 26 can also read, from an arbitrary memory (for example, the storage device 10b), the evaluation value associated with each component input from the evaluation component extraction unit 24, and evaluate the target data based on those evaluation values.
  • The evaluation value is a weighting value set in advance for each evaluation component selected in the reference data R in accordance with its characteristics. More specifically, the component relevance evaluation unit 26 can, for example, add up the evaluation values associated with the components that constitute at least part of the target data and thereby derive an index of the target data (for example, numerical values, letters, and/or symbols that make the data orderable). As this index, for example, a score value can be used.
  • the score value (Score) is an index for quantitatively evaluating the strength of relevance of the test data T with respect to the data components of the reference data R.
  • the calculation method of the score value (Score) is not limited.
  • The score value may be calculated by any general method as long as the content of the reference data R can be appropriately evaluated. As one example, the score value can be expressed in terms of the evaluation value defined for each evaluation component extracted from the reference data R and the frequency with which that evaluation component appears in the test data T.
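As a purely illustrative sketch of a score built from evaluation values and appearance frequencies, the following assumes a weighted sum: each evaluation component's weight multiplied by the number of times it occurs in the test data. The exact equation is not given in this text, so this form, the function name `score_value`, and the sample weights are all assumptions.

```python
from collections import Counter


def score_value(test_components, evaluation_values):
    """Assumed score: sum over evaluation components of
    (evaluation value) x (frequency of the component in the test data)."""
    counts = Counter(test_components)  # missing components count as 0
    return sum(weight * counts[component]
               for component, weight in evaluation_values.items())


# Hypothetical evaluation values (weights) defined from the reference data.
evaluation_values = {"m1": 2.0, "m2": 1.0, "m5": 3.0}
test_data = ["m1", "x", "m2", "m1"]
print(score_value(test_data, evaluation_values))  # 5.0
```

Any other scoring method could be substituted here, since the text leaves the calculation method open.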
  • the component relevance evaluation unit 26 can associate the test data T with the score value and store both in the storage device 10b.
  • reference data R is fetched (S101).
  • the arrangement direction of the data components is determined for the read reference data R (S102).
  • A plurality of data components that best represent the characteristics of the content of the reference data R are extracted, together with their order of appearance along the predefined arrangement direction, and are defined as an evaluation component group for relevance evaluation (S103).
  • the extracted evaluation component group and its appearance order data are stored in the storage device 10b (S104). The above is the learning process using the reference data R for relevance evaluation.
  • test data T is fetched (S105).
  • arrangement direction of the data components constituting the test data T is determined (S106).
  • Evaluation components for relevance evaluation that have been determined in advance in the learning process are extracted from the test data T for which the arrangement direction has been determined (S107).
  • The order of appearance of the evaluation components extracted from the test data T is compared with their order of appearance in the reference data R (S108).
  • A feature coefficient based on the appearance order of the evaluation components of the test data in the arrangement direction of the test data is calculated (S109).
  • The feature coefficient expresses the degree to which the appearance order of each pair selected from the evaluation components extracted from the test data T coincides with the predefined appearance order of the evaluation components of the reference data R.
  • With the score value (Score) alone, it may not be possible to determine which test data has the higher relevance to the reference data R.
  • By also using the characteristic coefficient (Order), it can be determined that the larger the characteristic coefficient is, the higher the relevance to the reference data R is. For example, in the case of FIG. 2, when the test data T1 and the test data T2 both have a score value of 70 with respect to the reference data R and their characteristic coefficients (Order) with respect to the reference data R are 0.6 and 0.8, respectively, it can be determined that the test data T2 is more relevant to the reference data R.
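The comparison in this example can be sketched as a sort in which the characteristic coefficient breaks ties between equal score values. Ordering by score first and by Order second is an assumption made for illustration; the text only states that, for equal scores, the larger Order indicates higher relevance. The function name `rank_by_relevance` is hypothetical.

```python
def rank_by_relevance(candidates):
    """candidates: list of (name, score, order) tuples.
    Returns them most relevant first, using Order to break score ties."""
    return sorted(candidates, key=lambda c: (c[1], c[2]), reverse=True)


# Values taken from the example in the text.
ranked = rank_by_relevance([("T1", 70, 0.6), ("T2", 70, 0.8)])
print([name for name, _, _ in ranked])  # ['T2', 'T1']
```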
  • It is also possible to display, on a display means such as a display or a printer, a distribution diagram in which one axis is assigned to the score value and the other axis to the feature coefficient (Order), thereby providing the user with information that makes it easy to judge the relevance of the test data T to the reference data R from the two elements "score value" and "feature coefficient".
  • FIG. 8 shows the algorithm of the program according to the second embodiment.
  • In the first embodiment, the component relevance evaluation unit 26 in the functional block of FIG. 6 only calculates the feature coefficient (Order).
  • In the second embodiment, the component relevance evaluation unit 26 calculates the feature coefficient (Order) as a correction value for the score value calculated in advance for the test data T.
  • the steps from the step of taking in the reference data R (S201) to the step of calculating the characteristic coefficient (Order) (S209) are the same as the steps S101 to S109 of the first embodiment.
  • After calculating the feature coefficient (Order), the component relevance evaluation unit 26 corrects the score value (Score_RAW) calculated in advance for the test data T by using the feature coefficient (S210).
  • the score value may be calculated by a general method as long as the content of the reference data R can be appropriately evaluated.
  • When both the score values and the characteristic coefficients (Order) differ between pieces of test data, comparing them directly is difficult.
  • By using the score value corrected by the feature coefficient, it can be determined that the larger the corrected score value is, the higher the relevance to the reference data R is.
  • For example, when the score values (Score_RAW) of the test data T1 and the test data T2 with respect to the reference data R are 72 and 71, respectively, and their feature coefficients (Order) with respect to the reference data R are 0.65 and 0.67, respectively, the score values corrected by the feature coefficient are 45.5 and 46.9, respectively.
  • Since the corrected score value is higher for the test data T2, it can be determined that the test data T2 is more relevant to the reference data R.
  • In the second embodiment, the score values (Score_RAW) of the test data T1 and the test data T2 with respect to the reference data R are calculated separately from the feature coefficient (Order). That is, this form can be used when the evaluation component group used to calculate the score value differs from the one used to calculate the feature coefficient (Order).
  • In the third embodiment, by contrast, the score value and the characteristic coefficient are calculated in a single series of processes using common evaluation components determined in advance from the reference data R.
  • FIG. 9 is a diagram illustrating an example of a functional block configuration of the system 1 according to the third embodiment.
  • the system 1 includes a reference data acquisition unit 21, a test data acquisition unit 22, an arrangement direction determination unit 23, an evaluation component extraction unit 24, and a component storage unit 25. Since these are the same as those in the first embodiment, description thereof is omitted.
  • the third embodiment further includes a component relevance evaluation unit 26, a score value calculation unit 27, and a score value correction unit 28.
  • The evaluation component extraction unit 24 extracts an evaluation component group that most appropriately represents the content of the reference data R and classifies it into N groups.
  • The score value calculation unit 27 calculates a score value (Score(i)_RAW) for each of the N groups.
  • the score value may be calculated by a general method as long as the content of the reference data R can be appropriately evaluated.
  • For each of the evaluation component groups, the component relevance evaluation unit 26 calculates the feature coefficient (Order), that is, the proportion of two-component combinations whose order of appearance is the same as in the reference data R. The calculation method of the characteristic coefficient (Order) is as described in the first embodiment. The score value correcting unit 28 then multiplies the score value (Score(i)_RAW) by the feature coefficient (Order) for each group and calculates the sum of the products over all groups.
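Written as a formula, the multiply-and-sum correction described above takes the following form. The notation Order(i) for the feature coefficient of the i-th group is an assumption introduced for illustration. Note that N here is the number of groups, not the number of detected evaluation components.

```latex
\[
\mathrm{Score} \;=\; \sum_{i=1}^{N} \mathrm{Score}(i)_{\mathrm{RAW}} \times \mathrm{Order}(i)
\]
```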
  • FIG. 10 shows an algorithm in the third embodiment.
  • reference data R is fetched (S301).
  • the arrangement direction of the data components is determined for the read reference data R (S302).
  • Among the data components, a plurality of data components that best represent the characteristics of the content of the reference data R are extracted together with their appearance order along the predefined arrangement direction, and are defined as evaluation components for relevance evaluation.
  • the evaluation component group is classified into N groups (S303).
  • the extracted evaluation components and their appearance order data are stored in the storage device 10b (S304). The above is the learning process using the reference data R for relevance evaluation.
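  • The learning process (S301 to S304) might be sketched as below. This is a minimal illustration under stated assumptions, not the method of the specification: the evaluation components and their N groups are taken as given (the text does not say here how they are chosen), the appearance order is taken to be the position of the first occurrence, and the names learn_reference, groups, and reverse are hypothetical.

```python
from typing import Dict, List


def learn_reference(tokens: List[str],
                    groups: Dict[str, List[str]],
                    reverse: bool = False) -> Dict[str, Dict[str, int]]:
    """Record, per group, where each evaluation component first appears in R.

    tokens  : data components of the reference data R (e.g. words), as stored
    groups  : assumed mapping of group name -> evaluation components
    reverse : True if the arrangement direction runs opposite to stored order
    """
    ordered = list(reversed(tokens)) if reverse else list(tokens)
    first_pos: Dict[str, int] = {}
    for pos, tok in enumerate(ordered):
        first_pos.setdefault(tok, pos)  # keep only the first occurrence
    # Keep, for every group, only the components that actually occur in R.
    return {
        name: {c: first_pos[c] for c in comps if c in first_pos}
        for name, comps in groups.items()
    }


# Hypothetical reference data and grouping (N = 2).
reference = "contract price negotiation secret meeting payment".split()
groups = {"deal": ["contract", "price", "payment"],
          "contact": ["secret", "meeting"]}
model = learn_reference(reference, groups)
```

  • The returned mapping plays the role of the evaluation components and appearance order data stored in S304.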
  • Test data T is fetched (S305).
  • The arrangement direction of the data components constituting the test data T is determined (S306).
  • Evaluation components for relevance evaluation that have been determined in advance in the learning process are extracted from the test data T for which the arrangement direction has been determined (S307).
  • a score value (Score (i) RAW ) is calculated for each of the N groups of evaluation components (S308).
  • Two-component combinations are selected from the evaluation components extracted from the test data T, and for each combination it is determined whether its order of appearance is the same as in the reference data R.
  • The degree of coincidence between the appearance order of the selected two-component combinations and the predefined appearance order of the evaluation component group of the reference data R is then acquired.
  • The degree of coincidence can be expressed, for example, as a match rate, namely the feature coefficient (Order).
  • The feature coefficient is calculated as Order = (number of combinations to which 1 is assigned) / (total number of two-component combinations) (S310).
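  • The feature coefficient (Order) can be illustrated with a short sketch. It assumes that every pair of evaluation components found in the test data T is used as a two-component combination, that a pair is assigned 1 when its relative order in T agrees with the reference data R, and that positions are first occurrences; the pair selection of the first embodiment is not reproduced in this section, and the function name order_coefficient is hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def order_coefficient(ref_pos: Dict[str, int], test_tokens: List[str]) -> float:
    """Share of component pairs whose relative order in T matches R.

    ref_pos     : evaluation component -> appearance position in R
    test_tokens : data components of T, already in arrangement direction
    """
    test_pos: Dict[str, int] = {}
    for pos, tok in enumerate(test_tokens):
        if tok in ref_pos:
            test_pos.setdefault(tok, pos)  # first occurrence in T
    pairs = list(combinations(sorted(test_pos), 2))
    if not pairs:
        return 0.0  # assumed convention when fewer than two components occur
    # A pair counts as 1 when both data sets put the two components
    # in the same relative order, and as 0 otherwise.
    matches = sum(
        1 for a, b in pairs
        if (ref_pos[a] < ref_pos[b]) == (test_pos[a] < test_pos[b])
    )
    return matches / len(pairs)


ref_pos = {"contract": 0, "price": 1, "payment": 5}
same_order = order_coefficient(ref_pos, "contract price payment".split())
swapped = order_coefficient(ref_pos, "price contract payment".split())
```

  • With three components there are three pairs. The swapped example reverses only the pair (contract, price), so two of the three pairs still agree with R.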
  • the score value (Score (i) RAW ) is multiplied by the feature coefficient (Order), and the sum is calculated (S311).
  • Since the score value (Score (i) RAW ) and the feature coefficient (Order) are calculated from the same evaluation component group, the calculation process is simplified and the corrected score value is easily obtained.
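  • The effect of the correction in S311 can be seen in a small numeric sketch. The raw scores and Order values below are invented for illustration, and corrected_score is a hypothetical helper, not a name from the specification.

```python
from typing import Sequence


def corrected_score(raw_scores: Sequence[float],
                    orders: Sequence[float]) -> float:
    """Sum over the N groups of Score(i)_RAW multiplied by Order(i)."""
    if len(raw_scores) != len(orders):
        raise ValueError("one raw score and one Order value are needed per group")
    return sum(s * o for s, o in zip(raw_scores, orders))


# Two test data sets with identical raw scores per group (N = 2) ...
raw = [3.0, 2.0]
# ... but different agreement with the appearance order of the reference data.
score_a = corrected_score(raw, [1.0, 1.0])  # same order as R in both groups
score_b = corrected_score(raw, [0.5, 0.0])  # order mostly differs from R
```

  • The raw scores alone cannot separate the two test data sets, while the corrected values rank the set whose appearance order follows the reference data higher.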
  • the determination based on the corrected score value is the same as in the first and second embodiments.
  • the control block of the data analysis system may be realized by a logic circuit (hardware) formed on an integrated circuit (IC chip) or the like, or may be realized by software using a CPU.
  • The system includes a CPU that executes the instructions of a program (the control program for the data analysis system), which is software that implements each function; a ROM (Read Only Memory) or a storage device (these are referred to as “recording media”) in which the program and various data are recorded so as to be readable by the computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and the like.
  • The object of the present invention is achieved when the computer (or CPU) reads the program from the recording medium and executes it.
  • As the recording medium, a “non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used.
  • the program may be supplied to the computer via an arbitrary transmission medium (such as a communication network or a broadcast wave) that can transmit the program.
  • the present invention can also be realized in the form of a data signal embedded in a carrier wave in which the program is embodied by electronic transmission. Note that the above program can be implemented in any programming language. Also, any recording medium that records the above program falls within the scope of the present invention.
  • The system may be implemented as any system that can evaluate the relevance of given cases, for example as an artificial intelligence system that analyzes big data, such as a discovery support system, forensic system, e-mail monitoring system, medical application system (e.g., pharmacovigilance support system, clinical trial efficiency system, medical risk hedging system, fall prediction (fall prevention) system, prognosis prediction system, diagnosis support system), Internet application system (e.g., smart mail system, information aggregation (curation) system, user monitoring system, social media management system), information leakage detection system, project evaluation system, marketing support system, intellectual property evaluation system, fraud monitoring system, call center escalation system, or credit check system.
  • Preprocessing may be applied to the data (for example, extracting an important part from the data and making only that part the analysis target), or the mode of displaying the data analysis result may be changed. It will be understood by those skilled in the art that a variety of such variations can exist, and all variations fall within the scope of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a relevance evaluation system designed to evaluate the relevance of test data with respect to reference data, comprising: a data acquisition unit for acquiring each of the reference data and the test data; an evaluation component extraction unit for extracting, from the test data, evaluation components that represent characteristics of the reference data among the data components of the reference data, in the order of appearance corresponding to an arrangement direction of the data components of the test data; and a relevance evaluation unit for calculating a feature coefficient based on the order of appearance of the evaluation components of the test data in the arrangement direction of the test data. Even when several groups of test data show no difference in the score values representing characteristics, it is possible to identify a group of test data that has a high degree of relevance to a group of reference data.
PCT/JP2015/005479 2015-10-30 2015-10-30 Relevance evaluation system and method, program, and recording medium WO2017072822A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2015/005479 WO2017072822A1 (fr) 2015-10-30 2015-10-30 Relevance evaluation system and method, program, and recording medium
JP2017547201A JPWO2017072822A1 (ja) 2015-10-30 2015-10-30 Relevance evaluation system, method, program, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/005479 WO2017072822A1 (fr) 2015-10-30 2015-10-30 Relevance evaluation system and method, program, and recording medium

Publications (1)

Publication Number Publication Date
WO2017072822A1 (fr) 2017-05-04

Family

ID=58629917

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/005479 WO2017072822A1 (fr) 2015-10-30 2015-10-30 Relevance evaluation system and method, program, and recording medium

Country Status (2)

Country Link
JP (1) JPWO2017072822A1 (fr)
WO (1) WO2017072822A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
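JP2010055373A (ja) * 2008-08-28 2010-03-11 Sky Co Ltd Note evaluation device or note evaluation program
JP2011113426A (ja) * 2009-11-30 2011-06-09 Fujitsu Ltd Dictionary creation device, dictionary creation program, and dictionary creation method
JP2012252484A (ja) * 2011-06-02 2012-12-20 Hitachi Systems Ltd Automatic answer generation system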

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006277413A (ja) * 2005-03-29 2006-10-12 Toshiba Corp Document classification device and document classification method


Also Published As

Publication number Publication date
JPWO2017072822A1 (ja) 2018-07-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15907186

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017547201

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15907186

Country of ref document: EP

Kind code of ref document: A1