openasce.discovery package

class openasce.discovery.CausalGraph(names=[], bn=None, w: ndarray = None)[source]

Bases: object

Causal Graph Class

Represent the causal graph

Constructor

Parameters
  • names – the node names

  • bn – basic causal graph

  • w – the connection matrix for causal graph

DEFAULT_COLUMN_NAME_PREFIX = 'x'

add_edge(parent: Union[int, str], child: Union[int, str], max_parents=None) bool[source]

Adds the edge if it respects the max parents constraint and does not create a cycle

Parameters
  • parent (int or str) – id or name of the parent node

  • child (int or str) – id or name of the child node

  • max_parents (int) – maximal number of parents per node; None means no constraint

Returns

True if the edge was added, False if it could not be added
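
The two checks described above can be sketched on a plain dict of parent sets. This is an illustration only, not the openasce implementation: the names try_add_edge and would_create_cycle are hypothetical, and treating an already existing edge as "not added" is an assumption.

```python
from typing import Dict, Optional, Set


def would_create_cycle(parents: Dict[int, Set[int]], parent: int, child: int) -> bool:
    """Return True if the edge parent -> child would close a cycle."""
    # The new edge closes a cycle exactly when child is parent itself
    # or already one of parent's ancestors, so walk upwards from parent.
    stack, seen = [parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def try_add_edge(
    parents: Dict[int, Set[int]],
    parent: int,
    child: int,
    max_parents: Optional[int] = None,
) -> bool:
    """Add parent -> child if allowed; report whether the edge was added."""
    if parent in parents[child]:
        return False  # assumption: an existing edge counts as "not added"
    if max_parents is not None and len(parents[child]) >= max_parents:
        return False  # the child already has as many parents as allowed
    if would_create_cycle(parents, parent, child):
        return False
    parents[child].add(parent)
    return True
```

Starting from three nodes without edges, adding 0 -> 1 and 1 -> 2 succeeds, while 2 -> 0 is rejected because it would close a cycle.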

calculate_parameter(data: ndarray, rd: Dict[int, int] = None)[source]

Calculate the edge weights in the graph

Parameters
  • data – samples

  • rd – dict with rd[i] = r_i for node i (see compute_r)

compute_r(data: ndarray) dict[source]

Compute the number of distinct values for each node

Parameters

data (np array) – samples, shape (nsamples, nfeatures)

Returns

r (dict): r[i] = r_i, the number of values of node i
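
A minimal sketch of this count, assuming r_i is the number of distinct values observed in column i. The function name count_values is hypothetical, and nested lists stand in for the ndarray so that the example needs no third-party package.

```python
from typing import Dict, Hashable, Sequence


def count_values(data: Sequence[Sequence[Hashable]]) -> Dict[int, int]:
    """Map each column index i to the number of distinct values in column i."""
    n_features = len(data[0])
    return {i: len({row[i] for row in data}) for i in range(n_features)}
```

For the three samples [0, 1], [1, 1], [2, 0] the first column takes three values and the second takes two.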

copy(cg) None[source]

Copies the structure of cg inside self and erases everything else

Parameters

cg (CausalGraph) – the graph to copy from

is_cyclic() bool[source]

Returns True if a cycle is found else False.

Iterates over the nodes and collects each node's parents, the parents' parents, and so on. A cycle is found if a node belongs to its own set of ancestors.
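
The ancestor expansion described above can be sketched as follows. The dict-of-sets representation and the function name has_cycle are assumptions for illustration; the actual method works on the graph's own attributes.

```python
from typing import Dict, Set


def has_cycle(parents: Dict[int, Set[int]]) -> bool:
    """Return True if some node is one of its own ancestors."""
    for node in parents:
        ancestors: Set[int] = set()
        frontier = set(parents[node])
        while frontier:
            current = frontier.pop()
            if current == node:
                return True  # node reached itself by following parent links
            if current not in ancestors:
                ancestors.add(current)
                frontier.update(parents[current])
    return False
```

Each node's ancestor set is expanded at most once per ancestor, so the loop terminates even when the graph contains a cycle.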

load(file_name: str) None[source]

Loads structure from file. See save method

Parameters

file_name – the path of the file to be loaded

merge(g1, g2, p1=1, p2=1, max_parents: int = None, mut_rate: float = 0.0) None[source]

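Picks edges from both g1 and g2 according to a random policy

Parameters
  • g1 (CausalGraph) – first source graph

  • g2 (CausalGraph) – second source graph

  • p1 (float in [0,1]) – probability of an edge in g1 being in self

  • p2 (float in [0,1]) – probability of an edge in g2 being in self; p1 + p2 = 1

  • max_parents (int) – maximal number of parents per node

  • mut_rate (float in [0,1]) – probability of mutation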

mutate(mut_rate: float = 0) None[source]

Introduces new edges with a probability mut_rate

Parameters

mut_rate (float in [0,1]) – probability of mutation

names_init(names: List[str]) None[source]

Initialize the graph with feature names

Initializes the names_to_index and index_to_names attributes, and sets parents[i] = set() for every node (no edges for the moment).

Parameters

names (list of string) – the names of the nodes

Returns

None

num_save(file_name: str) None[source]

Saves the graph in number format

Example

parent1, child1
parent2, child2

Parameters

file_name – saved file path

parents_exclude(name_list: List[str]) None[source]
random_init(max_parents: int = None) None[source]

Add edges randomly

For each node, pick a random desired number of parents. Then, for each candidate parent, draw another random number to decide whether to add it. On average, the node ends up with the desired number of parents.

Parameters

max_parents – maximal number of one node’s parents
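
One way to read the procedure above is sketched below. It is not the openasce code: the function name random_parents is hypothetical, and restricting candidates to lower-indexed nodes is an assumption made here only to keep the result acyclic without a separate cycle check.

```python
import random
from typing import Dict, Optional, Set


def random_parents(
    n_nodes: int, max_parents: Optional[int], rng: random.Random
) -> Dict[int, Set[int]]:
    """Draw a random parent set for every node."""
    parents: Dict[int, Set[int]] = {i: set() for i in range(n_nodes)}
    for child in range(n_nodes):
        candidates = list(range(child))  # assumption: only lower indices
        if not candidates:
            continue
        cap = len(candidates) if max_parents is None else min(max_parents, len(candidates))
        desired = rng.randint(0, cap)  # desired number of parents for this node
        keep_prob = desired / len(candidates)
        for candidate in candidates:
            # Each candidate is kept with probability desired / len(candidates),
            # so the expected number of parents is close to `desired`.
            if len(parents[child]) < cap and rng.random() < keep_prob:
                parents[child].add(candidate)
    return parents
```

Passing a seeded random.Random instance makes the draw reproducible.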

random_merge(g1, g2, p1, p2) None[source]

Creates the graph from the edges of both g1 and g2. Adds edges according to the probabilities p1 and p2

Parameters
  • g1 (CausalGraph) – first source graph

  • g2 (CausalGraph) – second source graph

  • p1 (float in [0,1]) – probability of an edge in g1 being in self

  • p2 (float in [0,1]) – probability of an edge in g2 being in self
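
The sampling idea can be sketched on plain edge sets. The function name merge_edges is hypothetical, and the sketch leaves out the acyclicity and max parents checks that a real merge would still need, for example by adding each sampled edge through add_edge.

```python
import random
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]  # (parent, child)


def merge_edges(
    edges1: Iterable[Edge],
    edges2: Iterable[Edge],
    p1: float,
    p2: float,
    rng: random.Random,
) -> Set[Edge]:
    """Keep each edge of edges1 with probability p1 and of edges2 with p2."""
    merged: Set[Edge] = set()
    for edge in edges1:
        if rng.random() < p1:
            merged.add(edge)
    for edge in edges2:
        if rng.random() < p2:
            merged.add(edge)
    return merged
```

With p1 = 1 and p2 = 0 the result is exactly the edge set of the first graph, because rng.random() always lies in [0, 1).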

remove_edge(parent: int, child: int, force: bool = True) None[source]
remove_extra_parents(max_parents: int = None) None[source]

Removes extra edges from nodes that do not respect the max parents constraint

Parameters

max_parents – the maximal number of the node’s parents

save(file_path: str) None[source]

Saves the graph in the desired format

Example

parent1, child1
parent2, child2

Parameters

file_path – saved file path
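
A sketch of writing and reading the one-edge-per-line layout shown in the example. The helper names write_edges and read_edges are hypothetical, and the exact separator and whitespace handling of the real save and load methods are assumptions.

```python
from typing import Iterable, List, Tuple


def write_edges(path: str, edges: Iterable[Tuple[str, str]]) -> None:
    """Write one "parent, child" pair per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for parent, child in edges:
            handle.write(f"{parent}, {child}\n")


def read_edges(path: str) -> List[Tuple[str, str]]:
    """Read back the pairs written by write_edges, skipping blank lines."""
    edges: List[Tuple[str, str]] = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            parent, child = (part.strip() for part in line.split(","))
            edges.append((parent, child))
    return edges
```

The same layout would serve the number format of num_save if node indices are written instead of names; read_edges returns them as strings.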

score(data: ndarray, rd: Dict[int, int] = None) float[source]

Computes the Bayesian score of the structure given some data, assuming a uniform prior

Example

s = cg.score(data)

Parameters
  • data – samples, shape (nsamples, nfeatures)

  • rd – dict with rd[i] = r_i for node i (see compute_r)

Returns

s (float): Bayesian score

score_node(i, data: ndarray, r) float[source]

Compute the score of node i

Parameters
  • i (int) – node

  • data (np array) – (nsamples, nfeatures)

  • r (dict of np array) – r[i] = number of possible values of node i

Returns

s (float): contribution to log score of node i
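
For orientation, one common form of a per-node Bayesian log score under a uniform Dirichlet prior is the Cooper-Herskovits (K2) metric. Whether openasce computes exactly this quantity is an assumption; the function name k2_node_score and the explicit parent set argument are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


def k2_node_score(
    i: int, data: Sequence[Sequence[int]], parents_of_i: Iterable[int], r_i: int
) -> float:
    """Log K2 score of node i, which takes r_i values, given its parents."""
    parent_columns = sorted(parents_of_i)
    joint = Counter()   # (parent configuration, value of i) -> N_ijk
    totals = Counter()  # parent configuration -> N_ij
    for row in data:
        config = tuple(row[p] for p in parent_columns)
        joint[(config, row[i])] += 1
        totals[config] += 1
    score = 0.0
    for n_ij in totals.values():
        # log((r_i - 1)! / (N_ij + r_i - 1)!)
        score += math.lgamma(r_i) - math.lgamma(r_i + n_ij)
    for n_ijk in joint.values():
        # log(N_ijk!)
        score += math.lgamma(1 + n_ijk)
    return score
```

Parent configurations that never occur in the data contribute zero, so only observed configurations are visited. Summing this value over all nodes would give the log score of the whole structure.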

class openasce.discovery.CausalRegressionDiscovery[source]

Bases: Discovery

Execute causal discovery using the NOTEARS method

fit(*, X: Union[ndarray, Callable], **kwargs)[source]

Feed the sample data

Parameters

X (np.ndarray of shape (num of samples, num of features), or a callable returning np.ndarray) – samples

get_result() Tuple[CausalGraph, float][source]

Get the discovered causal graph

Returns

Tuple[CausalGraph, float]

class openasce.discovery.CausalSearchDiscovery[source]

Bases: Discovery

Execute causal discovery using a search method

fit(*, X: Union[ndarray, Callable], **kwargs) None[source]

Feed the sample data

Parameters

X (np.ndarray of shape (num of samples, num of features), or a callable returning np.ndarray) – samples

get_result() Tuple[CausalGraph, float][source]

Get the discovered causal graph

Returns

Tuple[CausalGraph, float]

Subpackages

Submodules

openasce.discovery.causal_graph module

class openasce.discovery.causal_graph.CausalGraph(names=[], bn=None, w: ndarray = None)[source]

Bases: object

Causal Graph Class

Represent the causal graph. This is the same class that is exposed as openasce.discovery.CausalGraph; its constructor parameters, attributes and methods are documented under that entry above.

openasce.discovery.discovery module

class openasce.discovery.discovery.Discovery[source]

Bases: Runtime

Discovery Class

Base class of the causal discovery

node_names

the name of graph node, which should be set before fit

Type

List[str]

__init__() None[source]
fit(*, X: Union[ndarray, Callable], **kwargs) None[source]

Feed the sample data and search the causal relation on them

Parameters

X – Features of the samples.

Returns

None

get_result()[source]

Output the causal graph

Returns

None

property node_names

openasce.discovery.graph_node_form module

class openasce.discovery.graph_node_form.GraphNodeForm(input_data: List[List[float]], columns: List[str])[source]

Bases: object

SCORE_COLUMN_NAME = 'node_score_value'
__str__()[source]

Return str(self).


property columns
property data
get_score_deviation(addition)[source]

Calculate the score deviation between this form and another GraphNodeForm

Parameters

addition – Another GraphNodeForm used to calculate the deviation

Returns

Calculation result

get_score_value(target_key: str, target_value: int)[source]

Get the score value for the rows where the target_key column equals target_value

Parameters
  • target_key – the column name

  • target_value – the column value

index(key: str)[source]
multiply_score_column(key: str, ext) None[source]

Multiply the local score column by ext's score column for rows that have the same value in the key column

Parameters
  • key – the column name

  • ext (GraphNodeForm) – another GraphNodeForm

Returns

None

property score_column_index
set_flag_zero(key: str, value_list: List[int]) None[source]

Set the score column to 0 for every row whose value in the key column is not in value_list

Parameters
  • key – the column name

  • value_list – the key column values whose scores are kept

Returns

None
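
The masking rule can be sketched on the same inputs the constructor takes, a list of rows plus a list of column names. The standalone function zero_other_scores is hypothetical; only the score column name 'node_score_value' comes from SCORE_COLUMN_NAME.

```python
from typing import List

SCORE_COLUMN_NAME = "node_score_value"


def zero_other_scores(
    data: List[List[float]], columns: List[str], key: str, value_list: List[int]
) -> None:
    """Zero the score of every row whose key value is not in value_list."""
    key_index = columns.index(key)
    score_index = columns.index(SCORE_COLUMN_NAME)
    allowed = set(value_list)
    for row in data:
        if row[key_index] not in allowed:
            row[score_index] = 0  # rows are modified in place
```

Rows whose key value is listed keep their score unchanged.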

set_groupby_sum(key: str)[source]

Group the rows by the key column and sum the score column within each group

Parameters

key – the column name

set_norm() None[source]

Normalize the values of the score column

property size
sort_by_column(key: str) None[source]

Sort by the specified column

Parameters

key – the column name

Returns

None