Root Cause Analysis

After learning the causal graph and estimating the causal effects, we may want to know the cause of a given performance anomaly. This is root cause analysis, which is implemented in rca.py. You can execute the script directly to parse the workflow config file.

Usage:

python rca.py [-h] [--causal_graph CAUSAL_GRAPH] 
              [--train_data TRAIN_DATA] 
              [--test_data TEST_DATA] 
              [--model_dir MODEL_DIR] 
              [--normal_start NORMAL_START]
              [--normal_end NORMAL_END] 
              [--abnormal_time ABNORMAL_TIME] 
              [--selected_node SELECTED_NODE] 
              [--method {PerfCE,cause_infer}]
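
The time arguments are UNIX timestamps, so it can help to derive them from readable dates before assembling the command. The following is a minimal sketch, not part of the repository: the helpers `to_unix` and `build_rca_command` are hypothetical, every path, date, and node name is a placeholder, and the timestamps are assumed to be in UTC.

```python
import shlex
from datetime import datetime, timezone


def to_unix(ts: str) -> int:
    """Convert a 'YYYY-MM-DD HH:MM:SS' string (assumed UTC) to a UNIX timestamp."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def build_rca_command(args: dict) -> list:
    """Turn a mapping of argument names to values into an argv list for rca.py."""
    cmd = ["python", "rca.py"]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# All values below are hypothetical placeholders, not files shipped with the project.
example_args = {
    "causal_graph": "graph.txt",
    "train_data": "train/combined.csv",
    "test_data": "test/combined.csv",
    "model_dir": "models",
    "normal_start": to_unix("2023-01-01 00:00:00"),
    "normal_end": to_unix("2023-01-01 00:10:00"),
    "abnormal_time": to_unix("2023-01-01 00:15:00"),
    "selected_node": "latency",  # hypothetical KPI node name
    "method": "PerfCE",
}

# Print a shell-safe command line that can be copied into a terminal.
print(shlex.join(build_rca_command(example_args)))
```

The printed command can be pasted into a shell, or the list returned by `build_rca_command` can be passed to `subprocess.run`.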

The script accepts the following arguments:

Argument	Description	Special Remark
causal_graph	Path to the causal graph file (should be a txt file).	`str`, the causal graph generated by `run_blip.py`.
train_data	Path to the training data file (should be a csv file).	`str`, should be the `combined.csv` generated by `collect_data.py`.
test_data	Path to the test data file (should be a csv file).	`str`, should be the `combined.csv` generated by `collect_data.py`.
model_dir	Path to the learned model directory.	`str`, should be `estimate.py`’s `output_dir`.
normal_start	Start of the period of normal database performance during the test (UNIX timestamp).	`int`, marks the beginning of the time period in which database performance is normal.
normal_end	End of the period of normal database performance during the test (UNIX timestamp).	`int`, marks the end of the time period in which database performance is normal.
abnormal_time	Time of abnormal database performance during the test (UNIX timestamp).	`int`, a specific point in time at which the user observes a performance anomaly.
selected_node	Node to be explained.	`str`, a node in the causal graph selected by the user. It should be a high-level KPI.
method	RCA method to be used.	`str`, either `PerfCE` or `cause_infer`.
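
The types and choices in the table map naturally onto an `argparse` declaration. The sketch below illustrates how such an interface could be declared. It is not the actual code of rca.py: `build_parser` and `check_times` are hypothetical names, and the check that `normal_start` precedes `normal_end` is an assumed sanity check rather than documented behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the arguments from the table above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="rca.py")
    # Path-like arguments are plain strings.
    for name in ("causal_graph", "train_data", "test_data", "model_dir"):
        parser.add_argument(f"--{name}", type=str)
    # Time arguments are integer UNIX timestamps.
    for name in ("normal_start", "normal_end", "abnormal_time"):
        parser.add_argument(f"--{name}", type=int)
    parser.add_argument("--selected_node", type=str)
    # Only the two methods listed in the usage string are accepted.
    parser.add_argument("--method", type=str, choices=["PerfCE", "cause_infer"])
    return parser


def check_times(ns: argparse.Namespace) -> None:
    """Assumed sanity check: the normal period must have positive length."""
    if ns.normal_start >= ns.normal_end:
        raise ValueError("normal_start must be earlier than normal_end")


# Example with hypothetical values.
ns = build_parser().parse_args(
    ["--normal_start", "100", "--normal_end", "200", "--method", "cause_infer"]
)
check_times(ns)
```

Restricting `--method` with `choices` makes the parser reject any value other than the two listed methods before the analysis starts.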