WO2015045091A1 - Method and program for superstructure extraction in Bayesian network structure learning - Google Patents


Info

Publication number
WO2015045091A1
Authority
WO
WIPO (PCT)
Prior art keywords
edge
variable
graph
bold
storage unit
Prior art date
Application number
PCT/JP2013/076245
Other languages
English (en)
Japanese (ja)
Inventor
民平 森下
真臣 植野
Original Assignee
株式会社シーエーシー
国立大学法人電気通信大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社シーエーシー, 国立大学法人電気通信大学 filed Critical 株式会社シーエーシー
Priority to PCT/JP2013/076245 priority Critical patent/WO2015045091A1/fr
Publication of WO2015045091A1 publication Critical patent/WO2015045091A1/fr

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present invention relates to a method and a program for superstructure extraction in Bayesian network structure learning.
  • Non-Patent Document 1 is a technique that extracts a narrowed search space called a super-structure by fast CI-based learning, and then searches for a Bayesian network within the superstructure by score-based learning with high estimation accuracy. A highly accurate Bayesian network can thereby be learned efficiently.
  • The weak point of the hybrid method using a superstructure is that edges existing in the true network are often deleted during superstructure extraction, that is, missing edges occur. If a missing edge occurs in the superstructure, it cannot be restored even when the superstructure is searched by score-based learning. For this reason, the missing edge also appears in the Bayesian network produced as output.
  • The embodiment of the present invention provides a CI-based learning method for extracting a superstructure with fewer missing edges than conventional methods.
  • the embodiment of the present invention is a program and method for extracting a superstructure from input data.
  • the program and method cause a computer to execute the following steps.
  • FIG. 6 shows a flowchart of a CI-based learning method using edge direction according to an embodiment.
  • A flow is shown in which a plurality of CI-based learning results are combined and returned as a superstructure.
  • the pseudo code of the conventional RAI is shown.
  • A conventional orientation routine is illustrated.
  • FIG. 7 shows a modified version of RAI (RAIEX) processing of the embodiment called in the processing of FIG.
  • Fig. 4 illustrates a new orientation routine of an embodiment of the present invention.
  • FIG. 11 shows a flowchart of the processing of FIG. 10. FIG. 12 shows a routine for determining the orientation of the two edges of a v-structure according to an embodiment of the present invention.
  • Fig. 13 shows a flowchart of the processing of Fig. 12.
  • The flow of processing in an example of the present invention is shown.
  • The graph structures of the Bayesian networks used in the experiments are shown (one network per figure).
  • A variable set is expressed in bold, e.g. Z (bold).
  • That the variable sets X (bold) and Y (bold) are conditionally independent given Z (bold) is written Ind(X (bold); Y (bold) | Z (bold)).
  • The set symbols are omitted as appropriate; for example, Ind(X; Y | Z) is written instead of Ind({X}; {Y} | Z).
  • Z (bold) is called a separated set of X (bold) and Y (bold).
  • Testing, based on data, whether variables X and Y are conditionally independent given a variable set Z (bold) is called a conditional independence test, or simply a test, and is written Test(X; Y | Z (bold)). Test(X; Y | Z (bold)) is true if Ind(X; Y | Z (bold)) holds. Here Z (bold) is called the conditional variable set.
  • The initial state is a complete undirected graph; when a separated set of two arbitrary variables X and Y is found by a conditional independence test, the edge between X and Y is deleted from the graph.
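The edge-deletion procedure described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation; `is_independent` is an assumed oracle standing in for the statistical test Test(X; Y | Z), and the level-wise loop mirrors the growing test order n.

```python
from itertools import combinations

def learn_skeleton(variables, is_independent, max_order=2):
    """Start from a complete undirected graph; delete edge X-Y whenever a
    separated set Z with Ind(X; Y | Z) is found, recording Z."""
    edges = {frozenset(pair) for pair in combinations(variables, 2)}
    sepsets = {}
    for n in range(max_order + 1):          # test order 0, 1, 2, ...
        for edge in sorted(edges, key=sorted):
            if edge not in edges:           # may have been deleted already
                continue
            X, Y = sorted(edge)
            others = [v for v in variables if v not in (X, Y)]
            for Z in combinations(others, n):
                if is_independent(X, Y, set(Z)):
                    edges.discard(edge)     # remove X *-* Y from the graph
                    sepsets[edge] = set(Z)  # remember the separated set
                    break
    return edges, sepsets

# Toy oracle for the chain A -> B -> C: A and C are independent given {B}.
oracle = lambda X, Y, Z: {X, Y} == {"A", "C"} and "B" in Z
skeleton, seps = learn_skeleton(["A", "B", "C"], oracle)
```

With the toy oracle, only the edge A-C is deleted, leaving the true skeleton A-B, B-C.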
  • The undirected edge between vertices X and Y is written X-Y, and the directed edge from X to Y is written X → Y or Y ← X. When the direction of an edge is not distinguished, it is written X *-* Y. Vertices X and Y are said to be adjacent when an edge X *-* Y exists.
  • When a directed edge X → Y exists, X is called the parent of Y and Y is called the child of X.
  • Adj (bold)(X, g), Pa (bold)(X, g), and Ch (bold)(X, g) denote, respectively, the adjacent variable set, the parent variable set, and the child variable set of variable (vertex) X in graph g.
  • A path in which all edges are undirected is called an undirected path.
  • A path traced from the start point X(1) to the end point X(n) following the directions of the directed edges is called a directed path.
  • a path in which the start point X (1) and the end point X (n) are the same variable is called a closed path.
  • a closed undirected path is called a loop, and a closed directed path is called a cycle.
  • a directed graph that does not have a cycle is called a directed acyclic graph, or DAG for short.
  • Variable sets X (bold) and Y (bold) are said to be d-separated (directionally separated) on DAG g given variable set Z (bold) when the following condition is satisfied: for all X ∈ X (bold) and Y ∈ Y (bold), every path ρ between X and Y satisfies one of the following two properties: (1) on the path ρ there is a variable that is not a collision point (collider) and is an element of Z (bold); (2) on the path ρ there is a collider such that neither the collider nor any descendant of the collider is an element of Z (bold). That X (bold) and Y (bold) are d-separated on DAG g given Z (bold) is written Dsep_g(X (bold); Y (bold) | Z (bold)).
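As an illustrative aside (not part of the patent), d-separation of two single variables can be checked with the standard moralized-ancestral-graph construction; the function and variable names below are introduced only for this sketch.

```python
def d_separated(parents, X, Y, Z):
    """X and Y are d-separated given Z in a DAG iff they are disconnected in
    the moralized ancestral graph of {X, Y} ∪ Z with Z removed.
    `parents` maps each vertex to its set of parents."""
    # 1. Collect the ancestral set of {X, Y} ∪ Z.
    anc, stack = set(), list({X, Y} | set(Z))
    while stack:
        v = stack.pop()
        if v not in anc:
            anc.add(v)
            stack.extend(parents.get(v, ()))
    # 2. Moralize: link each vertex to its parents and link co-parents,
    #    then treat all edges as undirected.
    und = {v: set() for v in anc}
    for v in anc:
        ps = [p for p in parents.get(v, ()) if p in anc]
        for p in ps:
            und[v].add(p)
            und[p].add(v)
        for a in ps:
            for b in ps:
                if a != b:
                    und[a].add(b)
                    und[b].add(a)
    # 3. Remove Z and test reachability of Y from X.
    seen, stack = set(), [X]
    while stack:
        v = stack.pop()
        if v in seen or v in Z:
            continue
        seen.add(v)
        stack.extend(und[v])
    return Y not in seen

# v-structure A -> C <- B: A and B are d-separated by {} but not by {C}.
pa = {"A": set(), "B": set(), "C": {"A", "B"}}
```

Conditioning on the collider C opens the path, which is exactly property (2) above.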
  • The first element g = <V (bold), E (bold)> is a directed acyclic graph (DAG) consisting of a vertex set V (bold) = {X_1, ..., X_N} and a directed edge set E (bold) representing the dependencies between the variables.
  • a graph g represents an independent relationship between variables.
  • The Bayesian network B (bold) defines a unique joint probability distribution over the variable set U (bold) as follows: P(U (bold)) = ∏_{i=1}^{N} P(X_i | Pa (bold)(X_i)).
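The factorization P(U) = ∏_i P(X_i | Pa(X_i)) can be illustrated with a toy two-variable network; the variable names and probability values below are invented for the example, not taken from the patent.

```python
def joint_probability(cpts, parents, assignment):
    """cpts[X] maps (value_of_X, tuple_of_parent_values) -> probability.
    Multiplies the conditional probabilities along the factorization."""
    prob = 1.0
    for X, cpt in cpts.items():
        pa_vals = tuple(assignment[p] for p in parents[X])
        prob *= cpt[(assignment[X], pa_vals)]
    return prob

parents = {"Rain": (), "WetGrass": ("Rain",)}
cpts = {
    "Rain": {("y", ()): 0.2, ("n", ()): 0.8},
    "WetGrass": {("y", ("y",)): 0.9, ("n", ("y",)): 0.1,
                 ("y", ("n",)): 0.1, ("n", ("n",)): 0.9},
}
# P(Rain=y, WetGrass=y) = P(Rain=y) * P(WetGrass=y | Rain=y) = 0.2 * 0.9
p = joint_probability(cpts, parents, {"Rain": "y", "WetGrass": "y"})
```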
  • Bayesian network learning is roughly divided into a score-based learning method and a CI (conditional independence) -based learning method.
  • the CI -based learning method is also called a constraint-based learning method.
  • The score-based learning method calculates a statistical score for each candidate model (directed graph) and takes the model with the maximum score as the solution; its structure estimation accuracy is relatively high.
  • However, score-based learning of a Bayesian network is NP-hard.
  • The CI-based learning method learns a Bayesian network by conditional independence tests between random variables, and relatively fast learning is possible.
  • Many CI-based methods first learn the Bayesian network skeleton, i.e., the undirected graph obtained by removing the edge orientations from the Bayesian network's graph structure, by conditional independence testing, and then orient the edges using the obtained conditional independence relations and the DAG constraints.
  • hybrid methods have been proposed that take advantage of high-speed CI-based learning and high-precision score learning.
  • In the hybrid method, a narrowed search space called a super-structure is first extracted by CI-based learning.
  • the superstructure means an undirected graph including a skeleton of a true Bayesian network as a subgraph.
  • Next, a Bayesian network is searched for within the superstructure by score-based learning. That is, the Bayesian network is learned under the restriction that any directed edge in the output Bayesian network must also exist in the superstructure as an undirected edge with its direction removed. Ordyniak et al. considered the tree width of the superstructure.
  • The tree width is a value indicating the computational complexity of a graph.
  • The tree width increases as the number of parent variables increases and as the number of loops in the skeleton of the Bayesian network increases.
  • MMPC (Max-Min Parents and Children): Tsamardinos, I., Brown, L. E., and Aliferis, C. F., "The max-min hill-climbing Bayesian network structure learning algorithm," Machine Learning, Vol. 65, pp. 31-78 (2006).
  • HPC (Hybrid Parents and Children): Rodrigues de Morais, S. and Aussem, A., "An Efficient and Scalable Algorithm for Local Bayesian Network Structure Discovery," in ECML PKDD '10: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, Part III, pp. 164-179, Springer, Berlin (2010).
  • Although the superstructure extracted by HPC has more surplus edges than that of MMPC, HPC can reduce the number of missing edges.
  • MMPC and HPC are CI-based methods that extract undirected graphs based on conditional independence tests, but they do not perform edge orientation.
  • FIG. 1 shows a schematic diagram of a computer 100 that executes the method of the embodiment of the present invention or executes the method of the present invention by the program of the embodiment of the present invention.
  • the method of the embodiment of the present invention may be executed by the computer 100 or the processor of FIG. 1 or may be executed by causing a computer-executable instruction to operate as each component shown in FIG.
  • An embodiment of the present invention may be a computer-readable storage medium storing such computer-executable instructions.
  • the input to the computer 100, the output from the computer 100, and each component of the computer 100 will be described below.
  • the computer 100 may read the following data and data specification description file as inputs.
  • the data store for storing data may be a file, a relational database, a two-dimensional array on a memory, or the like.
  • Each column corresponds to each random variable, and each row includes the state (realized value) of the corresponding random variable.
  • For example, suppose the types of coupons used by a customer are T1 and T2 (they cannot be used together; n means no coupon is used), a purchased product is represented by y, and a non-purchased product is represented by n. Then purchase data for six people is represented as shown in Table 1.
  • Data specification description file This is a file that describes what random variables and their states (realized values) are included in the above-mentioned “data”.
  • The data specification describes, in CSV format, each row as: random variable name, state 1, state 2, ..., state n.
  • For example, when the types of coupons used by customers are T1 and T2 (they cannot be used together, and n means that no coupon is used), a purchased product is represented by y, and a non-purchased product is represented by n, the data specification description file for the customers' purchase behavior history data and its realized values is written as follows:
  • Coupon, T1, T2, n
  • A, y, n
  • B, y, n
  • C, y, n
  • D, y, n
  • A superstructure description file is output by the method or program of the embodiment of the present invention. This file describes the estimated superstructure: each line contains one variable pair between which an edge exists, separated by a comma (,). For example, if it is estimated that edges exist between Coupon and A, between Coupon and D, between A and B, and between B and C, the superstructure description file is written as follows:
  • Coupon, A
  • Coupon, D
  • A, B
  • B, C
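A minimal sketch of emitting that file format follows; the writer function name and the use of an in-memory stream are choices of this example, not the patent's.

```python
import io

def write_superstructure(edges, stream):
    """Write one comma-separated variable pair per line."""
    for X, Y in edges:
        stream.write(f"{X},{Y}\n")

buf = io.StringIO()
write_superstructure(
    [("Coupon", "A"), ("Coupon", "D"), ("A", "B"), ("B", "C")], buf)
text = buf.getvalue()
```

Passing an open file object instead of the `io.StringIO` buffer writes the same content to disk.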
  • the program control unit 102 is a component that controls the overall processing flow.
  • the program control unit 102 checks the arguments and parameters of the program as preprocessing, and if they are normal, causes the data specification analysis unit 104 to analyze the data specification.
  • the program control unit 102 further causes the CI base structure learning unit 106 to execute main processing.
  • the data specification analysis unit 104 reads the data specification description file and prepares to analyze the data that is the main input.
  • The data specification analysis unit 104 provides other components with information such as the name of each random variable, the number of random variables, the state names of each random variable, the number of states of each random variable, and the total number of data items included in the data.
  • the CI-based structure learning unit 106 is a component that executes a CI-based learning algorithm and extracts a superstructure from data, and executes main processing.
  • The conditional independence test execution unit 108 is a component that executes conditional independence tests.
  • The conditional independence test execution unit 108 has a function of caching execution results; when the result of a test that has already been executed is requested, the cached result is returned.
  • The separated set holding unit 110 holds and manages, for each variable pair determined to satisfy Ind(X; Y | Z (bold)), the separated set Z (bold), using a hash keyed by the variable pair.
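The hash keyed by a variable pair can be sketched as follows; the class name is hypothetical. Using a frozenset as the key makes the pair unordered, so the separated set stored for X-Y is found again when queried as Y-X.

```python
class SeparatedSetStore:
    """Holds separated sets keyed by the unordered variable pair."""

    def __init__(self):
        self._sep = {}

    def put(self, X, Y, Z):
        self._sep[frozenset((X, Y))] = set(Z)

    def get(self, X, Y):
        return self._sep.get(frozenset((X, Y)))

store = SeparatedSetStore()
store.put("X", "Y", {"Z1", "Z2"})
```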
  • the graph structure construction unit 112 constructs a super structure graph structure estimated by the CI base structure learning unit 106.
  • the graph structure construction unit 112 constructs 1) an array of nodes representing random variables, and 2) an array of directed edges or undirected edges representing dependencies between random variable pairs as a data structure shared with other components. And manage this.
  • the calculation results obtained in each component may be appropriately stored in a storage device such as a memory and used for subsequent calculations.
  • the results obtained so far may be stored in a storage device and used for subsequent calculations by the same component.
  • The method of the present invention is an improved CI-based learning technique that uses edge directions (i.e., parent-child relationships between variables) to select conditional independence tests.
  • Many CI-based learning methods perform edge orientation only once at the end.
  • In contrast, CI-based learning methods that use edge orientation perform edge orientation multiple times during the procedure, and based on the resulting parent-child relationships between variables they determine which conditional independence tests are performed thereafter.
  • An example of a CI-based learning method that uses the direction of an edge is RAI described later.
  • the present invention can be applied not only to RAI but also to a CI-based learning method that uses the direction of an edge.
  • a CI-based learning method using edge directions according to an embodiment of the present invention will be described.
  • FIG. 2 shows a flowchart of a CI-based learning method using the edge direction according to the embodiment.
  • the method of FIG. 2 may be executed by the computer 100 or the processor of FIG. 1 or computer-executable instructions may be executed by causing the computer to operate as each component shown in FIG.
  • An embodiment of the present invention may be a computer-readable storage medium storing such computer-executable instructions.
  • In step 202, the output graph G is initialized as a complete undirected graph.
  • In step 204, Sep (bold), the collection of all separated sets, is initialized as an empty set.
  • In step 206, the order n of the conditional independence test is set to zero.
  • In step 208, the following steps (1) to (3) are repeated for every variable pair {X, Y} having an edge in the graph G.
  • (1) The latent parent variable set Z (bold) = Pa_p (bold)(X, G) ∪ Pa_p (bold)(Y, G) of X and Y is specified (step 210).
  • (2) The test Test(X; Y | S (bold)) is performed for every S (bold) with |S (bold)| = n and S (bold) ⊆ Z (bold) (step 212).
  • (3) In step 214, if the result of a test is Ind(X; Y | S (bold)) ("Yes" in step 214), the separated set collection is updated to Sep (bold) ∪ {<{X, Y}, S (bold)>} (step 216), and the edge X *-* Y is deleted from the graph G (step 218).
  • In step 220, the edges in the graph G are oriented. The new orientation approach of the present invention is used here.
  • A contradiction in v-structure estimation is detected by the occurrence of a collision edge, and the direction of the collision edge is determined to one of the two possibilities with a predetermined probability.
  • The direction determined for a collision edge is maintained and managed separately for each individual execution (thread).
  • The orientation of the present invention takes, in addition to the graph G to be oriented and the separated sets Sep (bold) of the deleted edges (variable pairs), an edge parent set E_p (bold) whose elements are pairs of a collision edge and the parent variable on that edge; the direction of each collision edge is determined and maintained separately for each thread.
  • the edge parent set may be implemented as a hash with an edge whose direction is ignored as a key and a parent variable in the edge as a value.
  • In the orientation of the embodiment of the present invention, when searching for a set of three variables X *-* Z *-* Y that are candidates for a v-structure, the directions of the edges between these variables are ignored. As a result, whether the three variables of interest form a v-structure can be determined without being affected by previously estimated v-structures.
  • Next, the orientation of the two edges of the v-structure is determined.
  • When the estimation of the v-structure is correct, the parent variable S_p and the child variable S_c are obtained. If orientation is executed for an edge of the v-structure for the first time, the edge is still undirected, so the edge is directed from the estimated parent variable S_p to the child variable S_c. If this is not the first time the edge is oriented, and the edge has already been oriented in the direction opposite to the estimated v-structure, the edge is detected as a collision edge. When a collision edge is detected for the first time, its direction in this thread is selected according to a predetermined probability, and the decision is added to the edge parent set E_p (bold) and held.
  • Thereafter, the directions of edges are determined as far as possible according to rules, called orientation rules, derived so as not to contradict the DAG constraints.
  • In step 222, the order n of the conditional independence test is incremented by one.
  • In steps 224 and 226, latent parent variable sets are specified for all variable pairs {X, Y} of the remaining edges on the graph G. If there is a variable pair whose latent parent variable set is larger than the order n, the process returns to step 208; otherwise, the process ends.
  • The embodiment of the present invention performs a new edge orientation process in a CI-based learning method that uses edge directions to select the conditional variable set S (bold) of the conditional independence test, and thereby detects orientation errors.
  • The embodiment of the present invention further extracts a superstructure with few missing edges by synthesizing the execution results of the above processing.
  • Since the present invention superimposes a plurality of the above new CI-based learning results that output different graphs, even if a missing edge occurs in one CI-based learning result, the edge can be supplemented by another CI-based learning result.
  • FIG. 3 shows a flow of main processing in which a plurality of CI-based learnings of the present invention having different operations are executed, and the results are combined and returned as a superstructure.
  • This process may be executed by the computer 100 or the processor of FIG. 1, or may be executed by causing a computer-executable instruction to operate the computer as each component shown in FIG.
  • step 302 initial processing is executed.
  • The program control unit 102 checks operating parameters including at least one of: the database connection information, the data specification description file name, the significance level α of the conditional independence test, the number of threads t (the number of parallel executions of CI-based learning in the embodiment of the present invention), and the superstructure description file name. If there is an error, the program control unit 102 displays the error on a display device or the like and ends the program. If the parameters are normal, the program control unit 102 continues the processing and causes the data specification analysis unit 104 to analyze the data specification.
  • the data specification analysis unit 104 reads the data specification description file, and holds the names of the random variables, the number of random variables, the names of all the states that can be taken by the random variables, and the number of states. Next, the data specification analyzing unit 104 accesses the database using the database connection information, acquires the number of all data, and holds it.
  • the program control unit 102 transfers control to the CI base structure learning unit 106.
  • the CI base structure learning unit 106 performs CI base learning according to the above-described embodiment of the present invention.
  • In step 304, CI-based learning execution threads of the specified degree of parallelism are generated.
  • In step 306, the above-described CI-based learning threads of the present invention are executed in parallel.
  • In step 308, the parent thread that manages the CI-based learning waits until any execution thread ends.
  • In step 310, the union of the edge set in the graph obtained from the terminated thread and the edge sets obtained so far is generated and used as the superstructure. If there is an unprocessed thread ("Yes" in step 312), the process returns to step 308.
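Step 310's combination can be sketched as plain set union over undirected edges represented as frozensets; the representation and function name are assumptions of this illustration.

```python
def combine_results(edge_sets):
    """Union of the edge sets returned by the individual learning threads."""
    superstructure = set()
    for edges in edge_sets:
        superstructure |= edges
    return superstructure

t1 = {frozenset({"A", "B"}), frozenset({"B", "C"})}
t2 = {frozenset({"A", "B"}), frozenset({"A", "C"})}  # restores an edge t1 lost
ss = combine_results([t1, t2])
```

An edge missing from one thread's result survives in the union as long as any other thread retained it, which is the point of synthesizing the results.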
  • In step 314, the graph structure construction unit 112 receives the superstructure graph structure from the CI base structure learning unit 106 and generates output in accordance with the specifications of the superstructure description file. Processing ends at step 316.
  • RAI (Recursive Autonomy Identification) uses the results of edge orientation to improve the estimation accuracy and computational efficiency of CI-based methods: Yehezkel, R. and Lerner, B., "Bayesian Network Structure Learning by Recursive Autonomy Identification," Journal of Machine Learning Research, Vol. 10, pp. 1527-1570 (2009).
  • RAI uses the result of edge orientation in graph structure learning to improve estimation accuracy and calculation efficiency.
  • Since the orientation of edges depends on the results of statistical tests (conditional independence tests), edges can be misdirected under the realistic condition of finite samples. Such misorientation causes conditional independence tests that are inherently unnecessary, thereby producing missing edges.
  • RAI can be used as the underlying CI-based learning.
  • The embodiment of the present invention realizes extraction of a superstructure with fewer missing edges than the conventional method that simply uses RAI.
  • Edge orientation depends on the results of statistical tests (conditional independence tests). This can lead to incorrect orientation under the realistic condition of finite samples.
  • Misorientation produces missing edges by causing conditional independence tests that are inherently unnecessary.
  • The present invention detects erroneous orientations from orientation contradictions and synthesizes superstructures based on the possible orientations, thereby preventing the occurrence of missing edges in the superstructure.
  • RAI, which can be used as the CI-based learning method underlying one embodiment of the present invention, is described below, together with the disadvantages of using RAI as-is for superstructure extraction.
  • RAI is a technique that avoids high-order conditional independence tests, which have low reliability and high computational cost, by using the parent-child relationships between variables obtained by edge orientation, thereby improving accuracy and reducing the amount of computation.
  • RAI uses the parent-child relationships of vertices (variables) produced by orientation to recursively decompose the graph into a descendant substructure g_D and ancestor substructures g_A consisting of the other variables.
  • a descendant substructure is more formally defined as an autonomous sub-structure (Definition 2).
  • the parent variable in the ancestor partial structure is defined as an exogenous cause (Definition 1).
  • Given a DAG g = <V (bold), E (bold)>, V_A (bold) ⊆ V (bold), and E_A (bold) ⊆ E (bold), the structure g_A = <V_A (bold), E_A (bold)> is said to be autonomous in g given an exogenous cause set V_ex (bold) ⊆ V (bold) of g_A if, for all X ∈ V_A (bold), Pa (bold)(X, g) ⊆ {V_A (bold) ∪ V_ex (bold)}.
  • Reducing the number of conditional independence tests contributes to the suppression of missing edges.
  • The most extreme case is one in which no conditional independence test is performed; consequently no edge is deleted, and a complete undirected graph is output.
  • If this complete undirected graph is regarded as a superstructure, the search space is not narrowed at all; instead, no missing edge occurs in the superstructure.
  • Although RAI is not a technique developed for the purpose of superstructure extraction, it reduces conditional independence tests and, as a result, suppresses missing edges when used as a superstructure estimation technique.
  • The first mechanism is control of the test order based on graph decomposition.
  • RAI recursively decomposes a graph into descendant substructures and ancestor substructures using the parent-child relationships between variables estimated by edge orientation.
  • RAI selects the edges (variable pairs) subject to conditional independence tests in the following order: (1) edges inside an ancestor substructure, (2) edges connecting an ancestor substructure and the descendant substructure, and (3) edges inside the descendant substructure.
  • The second mechanism is reduction of the conditional variable set size based on edge orientation, which rests on the following lemmas.
  • Lemma 1: In a DAG, if X and Y are not adjacent and X is not a descendant of Y, then X and Y are d-separated given Pa (bold)(Y).
  • According to Lemma 2, derived from Lemma 1, if variables X and Y are both elements of g_D, the existence of edge X-Y can similarly be determined by checking only the restricted variable sets S (bold) ⊆ Pa_p (bold)(Y) \ {X}.
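The restriction of Lemma 2 can be illustrated by enumerating only condition sets drawn from Pa_p(Y) \ {X} at a given order; the helper name below is hypothetical.

```python
from itertools import combinations

def candidate_condition_sets(potential_parents_of_Y, X, order):
    """Enumerate S ⊆ Pa_p(Y) \\ {X} with |S| = order, in a fixed order."""
    pool = sorted(set(potential_parents_of_Y) - {X})
    return [set(S) for S in combinations(pool, order)]

# With Pa_p(Y) = {X, P1, P2}, only {P1} and {P2} need testing at order 1,
# instead of all size-1 subsets of the full variable set.
sets1 = candidate_condition_sets({"X", "P1", "P2"}, "X", 1)
```

Because the pool excludes X and everything outside Pa_p(Y), the number of tests grows with the potential parent set size rather than with the total number of variables.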
  • The embodiment of the present invention replaces the orientation routine called at lines 10 and 15 of the RAI pseudo code, and runs a plurality of RAIs whose operations are changed.
  • edge orientation in the existing CI-based method such as RAI will be described.
  • a conventional orientation routine orientEdgeTrad is shown in FIG.
  • the edge orientation routine uses as input the graph to be oriented and the entire separated set Sep (bold) of the deleted edge (variable pair).
  • The basic idea of edge orientation is to first determine the directions of particular edges from the separated sets obtained in the conditional independence tests, and then to determine the directions of as many edges as possible according to rules, called orientation rules, derived so as not to contradict the DAG constraints.
  • The target of determining direction from a separated set is a set of three variables and two edges connected as X-Z-Y (with no edge between X and Y).
  • If Z is not contained in the separated set of X and Y, X-Z-Y can be oriented as X → Z ← Y.
  • This X → Z ← Y is called a v-structure.
  • the v-structure is estimated from line 2 to line 5, and thereafter, orientation is performed according to the orientation rule.
  • the problem with conventional edge orientation is that orientation errors tend to occur at the time of v-structure estimation.
  • A new orientation method is used instead of the conventional orientation to detect orientation errors during v-structure estimation.
  • The embodiment of the present invention detects the erroneous edge orientations that cause missing edges in RAI, and extracts a superstructure with few missing edges by combining the execution results of a plurality of RAIs that try the possible orientations.
  • On the premise that learning errors exist, the embodiment of the present invention mainly 1) extracts a superstructure with few missing edges and 2) detects orientation errors.
  • multi-threading is used for parallel execution of CI-based learning.
  • parallel execution of CI-based learning may be implemented as a multi-process, or distributed parallel by a plurality of computers may be used for load distribution.
  • the modified RAI is executed in parallel, but another method may be executed.
  • the same learning technique does not have to be performed on individual threads.
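The parallel execution described above can be sketched with a thread pool; `run_learner` is a stub standing in for a real RAIEX thread, so the edge sets it returns are toy data introduced for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_learner(seed):
    """Stub learner: a real implementation would run RAIEX with its own
    thread-local collision-edge decisions; here each thread just returns
    a slightly different toy edge set."""
    return {frozenset({"A", "B"}), frozenset({"B", chr(ord("C") + seed)})}

superstructure = set()
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(run_learner, s) for s in range(2)]
    for fut in as_completed(futures):    # step 308: wait for a thread to end
        superstructure |= fut.result()   # step 310: union into superstructure
```

The same structure works with a process pool or distributed workers, as the text notes; only the executor changes.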
  • FIG. 7 explains the modified RAI (RAIEX) processing of the embodiment of the present invention called in the processing of FIG. 3.
  • This embodiment differs from conventional RAI in that the extended edge orientation routine of the embodiment of the present invention (described later in FIG. 10) is called at lines 10 and 15 instead of the conventional orientation routine (FIG. 5).
  • If there is a variable set S (bold) with |S (bold)| = n that is a subset of the union (excluding X) of the latent parent variable set of Y included in G_start and the parent variable set of Y included in G_ex, and X and Y are independent given S (bold), then <{X, Y}, S (bold)> is added as an element to the separated set collection Sep (bold), and the edge X *-* Y is removed from G_all.
  • The separated variable set of each deleted edge X *-* Y is obtained. If Z is not included in this separated variable set, then for each edge of X *-* Z *-* Y: if the direction of the edge is not yet determined, the edge is directed as estimated; if the edge is already oriented in the opposite direction, it is determined to be a collision edge; if the collision edge is detected for the first time, it is directed with a predetermined probability; if it is not the first detection of the collision, it is left as is. This corresponds to lines 8 to 13 of the orientation routine.
  • The variable set having the lowest topological order is set as the descendant subset G_D, which is temporarily deleted from G_start; the remaining unconnected variable sets become the ancestor subsets G_A1, G_A2, ..., G_Ak (once the ancestor subsets are identified, G_D temporarily deleted from G_start is restored).
  • G_ex_D is set to {G_A1, G_A2, ..., G_Ak, G_ex}.
  • Edge orientation collision detection: here, the cause of edge orientation errors, the reason orientation errors lead to missing edges, and the method for detecting edge orientation errors are described.
  • The embodiment of the present invention detects orientation errors at the time of v-structure estimation in order to suppress the occurrence of missing edges due to the orientation errors described above.
  • embodiments of the present invention provide v-structures that give different orientations to X *-* Y when multiple v-structures with a single side X *-* Y are inferred. If there is, it is determined that any v-structure estimation is incorrect.
  • an orientation collision a situation in which different directions are given to the side X *-* Y by a plurality of v-structures is called an orientation collision, and this state is represented by sides X ⁇ Y in both directions.
  • X ⁇ Y is called a collision side.
  • a contradiction in v-structure estimation is detected by the occurrence of a collision side, and the direction of the collision side is determined to be one of the probabilities.
  • the direction determined for the collision edge is maintained and managed separately for each individual execution (thread).
  • The new orientation routine orientEdge of the embodiment of the present invention is shown as Algorithm 5 in FIG. 10.
  • FIG. 11 shows a flowchart of the processing of FIG. 10.
  • The arguments of the orientation routine orientEdge are all thread-local and are initialized for each thread so that each thread holds its own state.
  • In addition to the arguments of the conventional edge orientation routine, the orientEdge routine takes as an argument an edge parent set E_p (bold), whose elements are pairs of a collision edge and the parent variable on that edge, and which holds the direction of each collision edge determined separately for each thread.
  • The edge parent set may be implemented as a hash whose keys are edges with their direction ignored and whose values are the parent variable of the edge.
  • Note that when the orientation routine orientEdge of the embodiment of the present invention searches for sets of three variables X *-* Z *-* Y that are v-structure candidates, the direction of the edges between these variables is ignored. As a result, whether the three variables currently under consideration form a v-structure can be determined without being affected by previously estimated v-structures.
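A minimal sketch of such a hash in Python, illustrative only: a plain dict keyed by a direction-ignoring frozenset is one of several possible realizations, and the names here are invented for the example.

```python
def edge_key(u, v):
    """Key that ignores edge direction, so (u, v) and (v, u) coincide."""
    return frozenset((u, v))

# one edge parent set E_p per thread
ep = {}
ep[edge_key('X', 'Y')] = 'X'     # collision edge X <-> Y resolved with parent X
parent = ep[edge_key('Y', 'X')]  # lookup succeeds in either vertex order
```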
  • The orientVStructure routine, which determines the orientation of the two edges of a v-structure, is shown as Algorithm 6 in FIG. 12.
  • FIG. 13 shows a flowchart of the processing of FIG. 12. The orientVStructure routine is called for each edge estimated to belong to a v-structure, and is passed the parent variable S_p and the child variable S_c that would hold if the estimated v-structure were correct. The first time orientVStructure is called for an edge, the edge is still undirected at the time of the call, so the edge is oriented as estimated, from the parent variable S_p to the child variable S_c (lines 8 and 9).
  • If the edge has instead already been oriented in the opposite direction, it is determined to be a collision edge. If this collision edge is detected for the first time (line 3), its direction in this thread is selected stochastically (line 4), and the decision is added to and held in the edge parent set E_p (bold).
  • The direction is determined with a probability of 1/2 (line 4 in FIG. 12).
  • Alternatively, the direction supported by the larger number of v-structures may be selected with a correspondingly higher probability.
  • For the set of v-structures giving the edge one direction and the set of v-structures giving the opposite direction, the average of the p-values of the corresponding tests may be computed for each set, and the direction may be assigned with a probability according to the ratio of the two averages.
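The p-value-ratio variant can be sketched as follows. This is a non-authoritative sketch: whether the larger average p-value should favor a direction, and the exact weighting, are design choices the text leaves open, and all names here are illustrative.

```python
import random

def choose_direction(p_vals_to_y, p_vals_to_x, rng=None):
    """Weight the two possible directions of a collision edge X <-> Y by the
    average p-values of the tests behind the v-structures supporting each
    direction, and draw the direction with probability equal to the ratio."""
    rng = rng or random.Random()
    avg_to_y = sum(p_vals_to_y) / len(p_vals_to_y)
    avg_to_x = sum(p_vals_to_x) / len(p_vals_to_x)
    p = avg_to_y / (avg_to_y + avg_to_x)     # probability of choosing X -> Y
    return 'X->Y' if rng.random() < p else 'Y->X'
```

With equal averages this degenerates to the 1/2 coin flip of the simple scheme.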
  • FIG. 14 shows a processing flow of the embodiment of the present invention.
  • The following processing may be executed by the computer 100 of FIG. 1 or by its processor, or may be executed by computer-executable instructions that cause a computer to operate as each of the components shown in FIG. 1.
  • An embodiment of the present invention may be a computer-readable storage medium storing such computer-executable instructions.
  • In step 1402, initial processing is executed.
  • Preparations for learning, such as checking the program operation parameters, are performed.
  • The program control unit 102 checks the operation parameters passed from the command-line arguments or the like, such as the database connection information, the data specification description file name, the significance level α of the conditional independence test, the number of threads t (the CI-based learning of the embodiment is executed in parallel), and the superstructure description file name. If an error is found during initial processing, the error is displayed on the display device and the process is terminated.
  • If the initial processing completes normally, the process continues, and the data specification analysis unit 104 analyzes the data specification.
  • The data specification analysis unit 104 reads the data specification description file and holds the names of the random variables, the number of random variables, the names of all states each random variable can take, and the number of states. Next, the data specification analysis unit 104 accesses the database using the database connection information, acquires the total number of data records, and holds it. Next, the program control unit 102 transfers control to the CI-based structure learning unit 106. In step 1404, the CI-based structure learning unit 106, having received control after the initial processing, executes the main processing of the present invention (for example, processing corresponding to FIG. 6). RAIEX execution threads are generated for the specified number of parallel executions.
  • RAIEX using the new edge orientation technique is executed in parallel in each thread.
  • The parent thread executing the main processing waits until one of the RAIEX execution threads terminates.
  • The superstructure is generated as the union of the edge set E (bold)(g_out) of the graph g_out, the RAIEX execution result obtained from the terminated thread, and the edge sets obtained so far. If there is an unprocessed child thread ("Yes" in step 1412), the process returns to the wait for an execution thread in step 1408.
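The union step can be sketched as follows (illustrative Python only; modeling each thread's result simply as an iterable of undirected edges is an assumption of this example):

```python
def superstructure_union(thread_edge_sets):
    """Union of the edge sets E(g_out) returned by the RAIEX threads.
    An edge is kept in the superstructure if any thread retained it,
    which is what makes missing edges less likely than in a single run."""
    union = set()
    for edge_set in thread_edge_sets:
        union |= {frozenset(e) for e in edge_set}   # ignore edge direction
    return union
```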
  • The graph structure construction unit 112 receives the superstructure graph structure from the CI-based structure learning unit 106 and outputs it according to the specification of the superstructure description file.
  • The process then ends. The more specific processing of FIG. 14 is as already described.
  • Table 1 shows the Bayesian networks used as experiment targets.
  • The network structures are shown in FIGS. These are based on widely published networks.
  • Win95pts was obtained from the GeNIe & SMILE network repository (http://genie.sis.pitt.edu/index.php/network-repository), and the remaining Alarm, Insurance, and Water were obtained from the Bayesian Network Repository (http://www.cs.huji.ac.il/~galel/Repository/).
  • The number of variables was reduced to 25 in the three networks Alarm, Win95pts, and Water.
  • Each column from Missing Edge to SS Time in the table describes the superstructure extracted by the algorithm named at the left of the table: Missing Edge represents the number of missing edges, Extra Edge the number of extra edges, Degree the average degree, Max Degree the maximum degree, and SS Time the number of seconds required from the start to the output of the superstructure.
  • The two columns Score Time and BDeu in the table show the results of an exact solution search using the output superstructure: Score Time represents the number of seconds required for the exact solution search, and BDeu represents the BDeu score of the Bayesian network obtained as the solution of the exact solution search. All values are averages over 10 data sets.
  • The method of the embodiment of the present invention is denoted Proposed-2 to Proposed-10, where the number following the hyphen represents the parallel execution number t. For example, Proposed-3 denotes the method of the embodiment of the present invention executed with a parallel execution number of 3.
  • Each vertex represents a random variable, and the number in parentheses at each vertex represents the number of states of the variable.
  • The experimental results show that the method of the present invention suppresses missing edges in the superstructure the most, and that the exact solution search of score learning using the superstructure of the method of the present invention finds the Bayesian network with the highest score.

Abstract

In the present embodiment, a CI-based learning method is provided by which a superstructure with fewer missing edges than in the conventional approach is extracted. A program control unit (102) causes a data specification analysis unit (104) to analyze a data specification. The data specification analysis unit (104) accesses a database, and acquires and holds the total number of data entries. The program control unit (102) transfers control to a CI-based structure learning unit (106). The CI-based structure learning unit (106) executes CI-based learning according to the present embodiment. A designated number of CI-based learning execution threads are generated and executed in parallel. A union is generated from the edge set of the graph obtained from each terminated thread and the edge set obtained so far, and this union is treated as the superstructure.
PCT/JP2013/076245 2013-09-27 2013-09-27 Method and program for superstructure extraction in Bayesian network structure learning WO2015045091A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2013/076245 WO2015045091A1 (fr) 2013-09-27 2013-09-27 Method and program for superstructure extraction in Bayesian network structure learning


Publications (1)

Publication Number Publication Date
WO2015045091A1 true WO2015045091A1 (fr) 2015-04-02

Family

ID=52742295



Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019527413A (ja) * 2016-07-07 2019-09-26 アスペン テクノロジー インコーポレイテッド Computer system and method for performing root cause analysis and building a predictive model of rare event occurrences in plant-wide operations
JP7461440B2 (ja) 2016-07-07 2024-04-03 アスペンテック・コーポレーション Computer system and method for performing root cause analysis and building a predictive model of rare event occurrences in plant-wide operations
JP7422946B2 (ja) 2020-07-02 2024-01-26 三菱電機株式会社 Automatic construction of neural network architectures using Bayesian graph search

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011016281A2 * 2009-08-06 2011-02-10 株式会社シーエーシー Information processing device and program for learning a Bayesian network structure
JP2013206016A (ja) * 2012-03-28 2013-10-07 Sony Corp Information processing apparatus and method, and program




Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13894896

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13894896

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP