openasce.discovery package¶

class openasce.discovery.CausalGraph(names=[], bn=None, w: ndarray = None)[source]¶

Bases: object

Causal Graph Class

Represents the causal graph.

Constructor

Parameters

names – the node names
bn – basic causal graph
w – the connection matrix for causal graph

DEFAULT_COLUMN_NAME_PREFIX = 'x'¶

__init__(names=[], bn=None, w: ndarray = None)[source]¶

Constructor

Parameters

names – the node names
bn – basic causal graph
w – the connection matrix for causal graph

__module__ = 'openasce.discovery.causal_graph'¶

__weakref__¶: list of weak references to the object (if defined)

add_edge(parent: Union[int, str], child: Union[int, str], max_parents=None) → bool[source]¶

Adds the edge if it respects the max-parents constraint and does not create a cycle.

Parameters

parent (int or str) – id of the parent
child (int or str) – id of the child
max_parents (int) – maximum number of parents per node; None means no constraint

Returns: True if the edge was added, False if it could not be added
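The two checks described for add_edge can be sketched in plain Python. This is an illustration only, assuming the graph is stored as a dict mapping each child index to a set of parent indices (the parents[i] layout mentioned under names_init). The helper names creates_cycle and try_add_edge are hypothetical and are not part of openasce.

```python
from typing import Dict, Optional, Set


def creates_cycle(parents: Dict[int, Set[int]], parent: int, child: int) -> bool:
    """Return True if adding parent -> child would close a directed cycle."""
    # A cycle appears exactly when `child` is already an ancestor of `parent`
    # (or the two are the same node), so walk upwards from `parent`.
    stack, seen = [parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return False


def try_add_edge(
    parents: Dict[int, Set[int]],
    parent: int,
    child: int,
    max_parents: Optional[int] = None,
) -> bool:
    """Add parent -> child only if both constraints hold; report success."""
    current = parents.setdefault(child, set())
    if parent in current:
        return False  # assumption: an already existing edge counts as "not added"
    if max_parents is not None and len(current) >= max_parents:
        return False
    if creates_cycle(parents, parent, child):
        return False
    current.add(parent)
    return True


graph: Dict[int, Set[int]] = {0: set(), 1: set(), 2: set()}
results = [
    try_add_edge(graph, 0, 1),                 # 0 -> 1 is accepted
    try_add_edge(graph, 1, 2),                 # 1 -> 2 is accepted
    try_add_edge(graph, 2, 0),                 # rejected: would close 0 -> 1 -> 2 -> 0
    try_add_edge(graph, 0, 2, max_parents=1),  # rejected: node 2 already has one parent
]
```

Checking the parent limit before the cycle test keeps the cheaper check first; the order does not change the outcome.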

calculate_parameter(data: ndarray, rd: Dict[int, int] = None)[source]¶

Calculate the edge weight in the graph

Parameters

data – samples
rd – dict of per-node value counts, r[i] = r_i (see compute_r)

compute_r(data: ndarray) → dict[source]¶

Compute the number of possible values for each node

Parameters: data (np array) – (nsamples, nfeatures)

Returns: r (dict): r[i] = r_i
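As a rough illustration of the r[i] = r_i mapping, the sketch below counts the distinct values observed in each column. It assumes discrete data given as a list of rows and assumes r_i is the number of distinct observed values; openasce may derive r_i differently (for example from the maximum value). The function name count_distinct_values is hypothetical.

```python
from typing import Dict, List, Sequence


def count_distinct_values(data: Sequence[Sequence[int]]) -> Dict[int, int]:
    """Map each column index i to the number of distinct values seen in column i."""
    n_features = len(data[0]) if data else 0
    return {i: len({row[i] for row in data}) for i in range(n_features)}


# Four samples, three features: shape (nsamples, nfeatures) = (4, 3).
samples: List[List[int]] = [
    [0, 1, 2],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 2],
]
r = count_distinct_values(samples)
```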

copy(cg) → None[source]¶

Copies the structure of cg into self and erases everything else

Parameters: cg (CausalGraph) – model

is_cyclic() → bool[source]¶

Returns True if a cycle is found else False.

Iterates over the nodes to collect each node’s parents, the parents’ parents, and so on. A cycle is found if a node belongs to its own set of ancestors.
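The ancestor expansion described for is_cyclic might look like the following sketch. It assumes the same dict-of-parent-sets layout as parents[i] and is not taken from the openasce source; has_cycle is a hypothetical name.

```python
from typing import Dict, Set


def has_cycle(parents: Dict[int, Set[int]]) -> bool:
    """Return True if some node is among its own ancestors."""
    for node in parents:
        ancestors: Set[int] = set()
        frontier = set(parents[node])
        # Repeatedly step to the parents of the current frontier until
        # nothing new shows up or the starting node is reached again.
        while frontier:
            if node in frontier:
                return True
            ancestors |= frontier
            frontier = {p for a in frontier for p in parents.get(a, ())} - ancestors
    return False


dag = {0: set(), 1: {0}, 2: {0, 1}}   # 0 -> 1, 0 -> 2, 1 -> 2
loop = {0: {2}, 1: {0}, 2: {1}}       # 0 -> 1 -> 2 -> 0
dag_is_cyclic = has_cycle(dag)
loop_is_cyclic = has_cycle(loop)
```

Subtracting the already visited ancestors from the frontier guarantees termination even when the graph contains a cycle that does not pass through the starting node.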

load(file_name: str) → None[source]¶

Loads structure from file. See save method

Parameters: file_name – the path of the file to be loaded

merge(g1, g2, p1=1, p2=1, max_parents: int = None, mut_rate: float = 0.0) → None[source]¶

Pick up edges from both g1 and g2 according to some random policy

Parameters

g1 (CausalGraph) –
g2 (CausalGraph) –
p1 (float in [0,1]) – probability of an edge in g1 being in self
p2 (float in [0,1]) – probability of an edge in g2 being in self; p1 + p2 = 1
max_parents (int) –

mutate(mut_rate: float = 0) → None[source]¶

Introduces new edges with a probability mut_rate

Parameters: mut_rate (float in [0,1]) – probability of mutation

names_init(names: List[str]) → None[source]¶

Initialize the graph with feature names

Initializes the names_to_index and index_to_names attributes, and sets parents[i] = set() for every node (no edges for the moment).

Parameters: names (list of string) – the names of the nodes
Returns: None

num_save(file_name: str) → None[source]¶

Saves the graph in number format

Example:

parent1, child1
parent2, child2

Parameters: file_name – saved file path

parents_exclude(name_list: List[str]) → None[source]¶

random_init(max_parents: int = None) → None[source]¶

Add edges randomly

For each node, draw a random desired number of parents. Then, for each candidate parent, draw another random number to decide whether to add it. On average, the node ends up with the desired number of parents.

Parameters: max_parents – maximal number of one node’s parents
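One way to obtain the "desired number of parents on average" behaviour described for random_init is to accept each candidate independently with probability desired / (number of candidates). The sketch below assumes this scheme and takes the desired count as an argument; in the documented method it would itself be drawn at random, presumably bounded by max_parents. It also skips the cycle check a real graph would need. sample_parents is a hypothetical helper, not openasce code.

```python
import random
from typing import Set


def sample_parents(node: int, n_nodes: int, desired: int, rng: random.Random) -> Set[int]:
    """Pick each other node as a parent with probability desired / (n_nodes - 1)."""
    candidates = [c for c in range(n_nodes) if c != node]
    if not candidates:
        return set()
    prob = min(1.0, desired / len(candidates))
    return {c for c in candidates if rng.random() < prob}


rng = random.Random(0)
all_parents = sample_parents(0, 4, 3, rng)   # probability 1: every candidate is kept
no_parents = sample_parents(0, 4, 0, rng)    # probability 0: no candidate is kept

# With 4 candidates and desired = 2, each is kept with probability 0.5,
# so the average parent count over many draws is close to 2.
trials = [len(sample_parents(0, 5, 2, rng)) for _ in range(20000)]
mean_parents = sum(trials) / len(trials)
```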

random_merge(g1, g2, p1, p2) → None[source]¶

Creates a graph from edges in both g1 and g2. Adds edges according to probabilities p1 and p2.

Parameters

g1 (CausalGraph) –
g2 (CausalGraph) –
p1 (float in [0,1]) – probability of an edge in g1 being in self
p2 (float in [0,1]) – probability of an edge in g2 being in self
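A minimal sketch of the probabilistic edge selection described for random_merge, assuming each graph is reduced to a set of (parent, child) pairs and each edge is kept independently. The exact policy used by openasce is not stated in this reference, and the sketch ignores acyclicity and parent limits. sample_merged_edges is a hypothetical name.

```python
import random
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]  # (parent, child)


def sample_merged_edges(
    edges1: Iterable[Edge],
    edges2: Iterable[Edge],
    p1: float,
    p2: float,
    rng: random.Random,
) -> Set[Edge]:
    """Keep each edge of the first graph with probability p1 and each edge
    of the second graph with probability p2."""
    merged: Set[Edge] = set()
    for edge in edges1:
        if rng.random() < p1:
            merged.add(edge)
    for edge in edges2:
        if rng.random() < p2:
            merged.add(edge)
    return merged


g1_edges = {(0, 1), (1, 2)}
g2_edges = {(0, 2)}
only_g1 = sample_merged_edges(g1_edges, g2_edges, 1.0, 0.0, random.Random(0))
both = sample_merged_edges(g1_edges, g2_edges, 1.0, 1.0, random.Random(0))
```

Passing an explicit random.Random instance makes a merge reproducible, which helps when comparing candidate graphs across runs.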

remove_edge(parent: int, child: int, force: bool = True) → None[source]¶

remove_extra_parents(max_parents: int = None) → None[source]¶

Removes extra edges from nodes that violate the max-parents constraint

Parameters: max_parents – the maximal number of the node’s parents

save(file_path: str) → None[source]¶

Saves the graph in the desired format

Example:

parent1, child1
parent2, child2

Parameters: file_path – saved file path
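The "parent, child" layout shown in the example can be written and read back with a few lines of standard-library Python. This is a sketch of the format as it appears in the example only; the exact separator, encoding, and any header lines used by save and load are assumptions. write_edges and read_edges are hypothetical helpers.

```python
import os
import tempfile
from typing import List, Tuple


def write_edges(path: str, edges: List[Tuple[str, str]]) -> None:
    """Write one 'parent, child' pair per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for parent, child in edges:
            fh.write(f"{parent}, {child}\n")


def read_edges(path: str) -> List[Tuple[str, str]]:
    """Read the pairs back, ignoring blank lines."""
    edges = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                parent, child = (part.strip() for part in line.split(","))
                edges.append((parent, child))
    return edges


edges = [("x0", "x1"), ("x1", "x2")]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "graph.txt")
    write_edges(path, edges)
    restored = read_edges(path)
```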

score(data: ndarray, rd: Dict[int, int] = None) → float[source]¶

Computes the Bayesian score of the structure given some data, assuming a uniform prior

Example: s = cg.score(data)

Parameters: data – (nsamples, nfeatures)

Returns: s (float): Bayesian score
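For intuition, a Bayesian structure score with a uniform prior is often computed as the K2 (Cooper-Herskovits) log score, where every Dirichlet hyperparameter equals 1. The sketch below assumes that variant; this reference does not confirm which formula openasce implements. The data are discrete rows, parents maps each node to its parent set, and r[i] is the number of possible values of node i. k2_log_score is a hypothetical name.

```python
from collections import Counter
from math import lgamma
from typing import Dict, Sequence, Set


def k2_log_score(
    data: Sequence[Sequence[int]],
    parents: Dict[int, Set[int]],
    r: Dict[int, int],
) -> float:
    """Sum over nodes i and observed parent configurations j of
    lgamma(r_i) - lgamma(N_ij + r_i) + sum_k lgamma(N_ijk + 1)."""
    total = 0.0
    for i, pa in parents.items():
        order = sorted(pa)
        # N_ijk: count of (parent configuration j, value k) for node i.
        n_ijk = Counter((tuple(row[p] for p in order), row[i]) for row in data)
        # N_ij: count of parent configuration j regardless of node i's value.
        n_ij = Counter(tuple(row[p] for p in order) for row in data)
        for count in n_ij.values():
            total += lgamma(r[i]) - lgamma(count + r[i])
        for count in n_ijk.values():
            total += lgamma(count + 1)
    return total


# Feature 1 always copies feature 0, so the edge 0 -> 1 should score higher.
rows = [[0, 0], [1, 1], [0, 0], [1, 1]]
r = {0: 2, 1: 2}
score_empty = k2_log_score(rows, {0: set(), 1: set()}, r)
score_edge = k2_log_score(rows, {0: set(), 1: {0}}, r)
```

Unobserved parent configurations and unobserved values contribute zero to the sum, so iterating over the observed counts is sufficient.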

score_node(i, data: ndarray, r) → float[source]¶

Compute the score of node i

Parameters

i (int) – node
data (np array) – (nsamples, nfeatures)
r (dict of np array) – r[i] = number of possible values of node i

Returns: s (float): contribution to log score of node i

class openasce.discovery.CausalRegressionDiscovery[source]¶

Bases: Discovery

Executes causal discovery using the NOTEARS method

Attributes:

Constructor

Arguments:

Returns:

__annotations__ = {}¶

__init__() → None[source]¶

Constructor

Arguments:

Returns:

__module__ = 'openasce.discovery.regression_discovery.regression_discovery'¶

fit(*, X: Union[ndarray, Callable], **kwargs)[source]¶

Feed the sample data

Parameters: X (np.ndarray of shape (num of samples, features), or a callable returning np.ndarray) – samples

Returns:

get_result() → Tuple[CausalGraph, float][source]¶

Get the causal graph discovered from the sample data

Returns:
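The NOTEARS method named in the CausalRegressionDiscovery description replaces the combinatorial acyclicity requirement with a smooth function of the weighted adjacency matrix W: h(W) = trace(exp(W ∘ W)) − d, which is zero exactly when W describes a DAG. The sketch below evaluates h with a truncated power series in pure Python. It illustrates the published constraint only and is not code from openasce; notears_h and matmul are hypothetical helpers.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def notears_h(w: Matrix, terms: int = 20) -> float:
    """Approximate h(W) = trace(exp(W * W)) - d, where W * W is elementwise."""
    d = len(w)
    squared = [[w[i][j] ** 2 for j in range(d)] for i in range(d)]
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # identity
    trace_sum = float(d)  # k = 0 term of the series: trace(I) = d
    factorial = 1.0
    for k in range(1, terms):
        power = matmul(power, squared)
        factorial *= k
        trace_sum += sum(power[i][i] for i in range(d)) / factorial
    return trace_sum - d


h_dag = notears_h([[0.0, 1.5], [0.0, 0.0]])   # single edge 0 -> 1, acyclic
h_loop = notears_h([[0.0, 1.0], [1.0, 0.0]])  # edges 0 -> 1 and 1 -> 0, cyclic
```

For the acyclic matrix every power of W ∘ W has a zero diagonal, so h is zero; the two-node loop gives a strictly positive value.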

class openasce.discovery.CausalSearchDiscovery[source]¶

Bases: Discovery

Executes causal discovery using the search method

Attributes:

Constructor

Arguments:

Returns:

__annotations__ = {}¶

__init__() → None[source]¶

Constructor

Arguments:

Returns:

__module__ = 'openasce.discovery.search_discovery.search_discovery'¶

fit(*, X: Union[ndarray, Callable], **kwargs) → None[source]¶

Feed the sample data

Parameters: X (np.ndarray of shape (num of samples, features), or a callable returning np.ndarray) – samples

Returns:

get_result() → Tuple[CausalGraph, float][source]¶

Get the causal graph discovered from the sample data

Returns:

Subpackages¶

Submodules¶

openasce.discovery.causal_graph module¶

class openasce.discovery.causal_graph.CausalGraph(names=[], bn=None, w: ndarray = None)[source]¶

Bases: object

Causal Graph Class

Represents the causal graph.

Constructor

Parameters

names – the node names
bn – basic causal graph
w – the connection matrix for causal graph

DEFAULT_COLUMN_NAME_PREFIX = 'x'¶

__annotations__ = {}¶

__init__(names=[], bn=None, w: ndarray = None)[source]¶

Constructor

Parameters

names – the node names
bn – basic causal graph
w – the connection matrix for causal graph

__module__ = 'openasce.discovery.causal_graph'¶

__weakref__¶: list of weak references to the object (if defined)

add_edge(parent: Union[int, str], child: Union[int, str], max_parents=None) → bool[source]¶

Adds the edge if it respects the max-parents constraint and does not create a cycle.

Parameters

parent (int or str) – id of the parent
child (int or str) – id of the child
max_parents (int) – maximum number of parents per node; None means no constraint

Returns: True if the edge was added, False if it could not be added

calculate_parameter(data: ndarray, rd: Dict[int, int] = None)[source]¶

Calculate the edge weight in the graph

Parameters

data – samples
rd – dict of per-node value counts, r[i] = r_i (see compute_r)

compute_r(data: ndarray) → dict[source]¶

Compute the number of possible values for each node

Parameters: data (np array) – (nsamples, nfeatures)

Returns: r (dict): r[i] = r_i

copy(cg) → None[source]¶

Copies the structure of cg into self and erases everything else

Parameters: cg (CausalGraph) – model

is_cyclic() → bool[source]¶

Returns True if a cycle is found else False.

Iterates over the nodes to collect each node’s parents, the parents’ parents, and so on. A cycle is found if a node belongs to its own set of ancestors.

load(file_name: str) → None[source]¶

Loads structure from file. See save method

Parameters: file_name – the path of the file to be loaded

merge(g1, g2, p1=1, p2=1, max_parents: int = None, mut_rate: float = 0.0) → None[source]¶

Pick up edges from both g1 and g2 according to some random policy

Parameters

g1 (CausalGraph) –
g2 (CausalGraph) –
p1 (float in [0,1]) – probability of an edge in g1 being in self
p2 (float in [0,1]) – probability of an edge in g2 being in self; p1 + p2 = 1
max_parents (int) –

mutate(mut_rate: float = 0) → None[source]¶

Introduces new edges with a probability mut_rate

Parameters: mut_rate (float in [0,1]) – probability of mutation

names_init(names: List[str]) → None[source]¶

Initialize the graph with feature names

Initializes the names_to_index and index_to_names attributes, and sets parents[i] = set() for every node (no edges for the moment).

Parameters: names (list of string) – the names of the nodes
Returns: None

num_save(file_name: str) → None[source]¶

Saves the graph in number format

Example:

parent1, child1
parent2, child2

Parameters: file_name – saved file path

parents_exclude(name_list: List[str]) → None[source]¶

random_init(max_parents: int = None) → None[source]¶

Add edges randomly

For each node, draw a random desired number of parents. Then, for each candidate parent, draw another random number to decide whether to add it. On average, the node ends up with the desired number of parents.

Parameters: max_parents – maximal number of one node’s parents

random_merge(g1, g2, p1, p2) → None[source]¶

Creates a graph from edges in both g1 and g2. Adds edges according to probabilities p1 and p2.

Parameters

g1 (CausalGraph) –
g2 (CausalGraph) –
p1 (float in [0,1]) – probability of an edge in g1 being in self
p2 (float in [0,1]) – probability of an edge in g2 being in self

remove_edge(parent: int, child: int, force: bool = True) → None[source]¶

remove_extra_parents(max_parents: int = None) → None[source]¶

Removes extra edges from nodes that violate the max-parents constraint

Parameters: max_parents – the maximal number of the node’s parents

save(file_path: str) → None[source]¶

Saves the graph in the desired format

Example:

parent1, child1
parent2, child2

Parameters: file_path – saved file path

score(data: ndarray, rd: Dict[int, int] = None) → float[source]¶

Computes the Bayesian score of the structure given some data, assuming a uniform prior

Example: s = cg.score(data)

Parameters: data – (nsamples, nfeatures)

Returns: s (float): Bayesian score

score_node(i, data: ndarray, r) → float[source]¶

Compute the score of node i

Parameters

i (int) – node
data (np array) – (nsamples, nfeatures)
r (dict of np array) – r[i] = number of possible values of node i

Returns: s (float): contribution to log score of node i

openasce.discovery.discovery module¶

class openasce.discovery.discovery.Discovery[source]¶

Bases: Runtime

Discovery Class

Base class of the causal discovery

node_names¶

the names of the graph nodes, which should be set before fit

Type: List[str]

__annotations__ = {}¶

__init__() → None[source]¶

__module__ = 'openasce.discovery.discovery'¶

fit(*, X: Union[ndarray, Callable], **kwargs) → None[source]¶

Feed the sample data and search the causal relation on them

Parameters: X – Features of the samples.
Returns: None

get_result()[source]¶

Output the causal graph

Returns: None

property node_names¶
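The documented Discovery interface has two details that are easy to miss: fit accepts X only as a keyword argument, and X may be either the samples or a callable that returns them. The stand-in class below mirrors those signatures to show the calling convention. It is not openasce code; the zero-argument callable, the node_names setter, and the length check are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Union

Rows = Sequence[Sequence[float]]


class SketchDiscovery:
    """Stand-in that mirrors the documented signatures; not openasce code."""

    def __init__(self) -> None:
        self._node_names: List[str] = []
        self._data: Rows = []

    @property
    def node_names(self) -> List[str]:
        return self._node_names

    @node_names.setter
    def node_names(self, names: List[str]) -> None:
        self._node_names = list(names)

    def fit(self, *, X: Union[Rows, Callable[[], Rows]], **kwargs) -> None:
        # X may be the samples themselves or a zero-argument loader (assumption).
        data = X() if callable(X) else X
        # Assumption: a mismatch between names and columns is treated as an error.
        if data and len(self._node_names) != len(data[0]):
            raise ValueError("set node_names to one name per feature before fit")
        self._data = data


d = SketchDiscovery()
d.node_names = ["x0", "x1"]
d.fit(X=lambda: [[0, 1], [1, 0]])  # callable form, passed by keyword

try:
    d.fit([[0, 1], [1, 0]])        # positional call: Python raises TypeError
    positional_ok = True
except TypeError:
    positional_ok = False
```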

openasce.discovery.graph_node_form module¶

class openasce.discovery.graph_node_form.GraphNodeForm(input_data: List[List[float]], columns: List[str])[source]¶

Bases: object

SCORE_COLUMN_NAME = 'node_score_value'¶

__init__(input_data: List[List[float]], columns: List[str]) → None[source]¶

__module__ = 'openasce.discovery.graph_node_form'¶

__str__()[source]¶: Return str(self).

__weakref__¶: list of weak references to the object (if defined)

property columns¶

property data¶

get_score_deviation(addition)[source]¶

Calculates the score deviation between this form and another GraphNodeForm

Parameters: addition – Another GraphNodeForm used to calculate the deviation
Returns: Calculation result

get_score_value(target_key: str, target_value: int)[source]¶

Gets the score value for the row whose target_key column equals target_value

Parameters

target_key – the column name
target_value – the column value

Returns:

index(key: str)[source]¶

multiply_score_column(key: str, ext) → None[source]¶

Multiplies the local score column by ext’s score column for rows with the same value in the key column

Parameters

key – the column name
ext (GraphNodeForm) – another GraphNodeForm

Returns

None
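The key-matched multiplication described for multiply_score_column can be sketched on plain lists of rows. The layout is an assumption: each table is a list of rows plus a list of column names, with the score stored under the name given by SCORE_COLUMN_NAME, and ext holds one row per key value. multiply_scores is a hypothetical helper and does not reproduce the GraphNodeForm implementation.

```python
from typing import Dict, List

SCORE = "node_score_value"  # mirrors GraphNodeForm.SCORE_COLUMN_NAME


def multiply_scores(
    rows: List[List[float]],
    columns: List[str],
    ext_rows: List[List[float]],
    ext_columns: List[str],
    key: str,
) -> None:
    """Scale each local score in place by the ext score whose key value matches."""
    ext_key, ext_score = ext_columns.index(key), ext_columns.index(SCORE)
    factor: Dict[float, float] = {row[ext_key]: row[ext_score] for row in ext_rows}
    key_idx, score_idx = columns.index(key), columns.index(SCORE)
    for row in rows:
        # Assumption: a key value missing from ext leaves the score unchanged.
        row[score_idx] *= factor.get(row[key_idx], 1.0)


local = [[0, 0, 0.5], [0, 1, 0.5], [1, 0, 0.2], [1, 1, 0.8]]
local_cols = ["x0", "x1", SCORE]
ext = [[0, 0.3], [1, 0.7]]
ext_cols = ["x0", SCORE]
multiply_scores(local, local_cols, ext, ext_cols, key="x0")
scores = [row[2] for row in local]
```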

property score_column_index¶

set_flag_zero(key: str, value_list: List[int]) → None[source]¶

Sets the score column to 0 if the value of the key column is not in the input value_list

Parameters

key – the column name
value_list – the key-column values to keep; rows with any other value get a score of 0

Returns

None
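The zeroing rule described for set_flag_zero, sketched on a plain list of rows. The table layout (rows plus a list of column names, score stored under 'node_score_value') is an assumption for illustration, and zero_unlisted is a hypothetical helper rather than the GraphNodeForm implementation.

```python
from typing import List


def zero_unlisted(
    rows: List[List[float]],
    columns: List[str],
    key: str,
    value_list: List[int],
    score_name: str = "node_score_value",
) -> None:
    """Set the score to 0 in place for rows whose key value is not listed."""
    key_idx, score_idx = columns.index(key), columns.index(score_name)
    keep = set(value_list)
    for row in rows:
        if row[key_idx] not in keep:
            row[score_idx] = 0.0


table = [[0, 0.4], [1, 0.6], [2, 0.5]]
cols = ["x0", "node_score_value"]
zero_unlisted(table, cols, "x0", [0, 2])  # only the row with x0 == 1 is zeroed
```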

set_groupby_sum(key: str)[source]¶

Groups the rows by the key column and sums the score column within each group

Parameters: key – the column name

Returns:

set_norm() → None[source]¶: Normalizes the values of the score column
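A common way to normalize a score column is to divide every score by the column total so that the scores sum to 1. The sketch below assumes that meaning; this reference does not say which normalization set_norm applies. normalize_scores is a hypothetical helper operating on a plain list of rows.

```python
from typing import List


def normalize_scores(rows: List[List[float]], score_idx: int) -> None:
    """Divide every score in place by the column total so the scores sum to 1."""
    total = sum(row[score_idx] for row in rows)
    if total == 0:
        return  # nothing to scale; avoids dividing by zero
    for row in rows:
        row[score_idx] /= total


table = [[0, 2.0], [1, 6.0]]
normalize_scores(table, score_idx=1)
normalized = [row[1] for row in table]
```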

property size¶

sort_by_column(key: str) → None[source]¶

Sorts by the specified column

Parameters: key – the column name
Returns: None