Root Cause Analysis

After learning the causal graph and estimating the causal effect, we may want to know the cause of a given performance anomaly. This step is called root cause analysis, and it is implemented in rca.py. You can execute the script directly to parse the workflow config file.

Usage:

python rca.py [-h] [--causal_graph CAUSAL_GRAPH] 
              [--train_data TRAIN_DATA] 
              [--test_data TEST_DATA] 
              [--model_dir MODEL_DIR] 
              [--normal_start NORMAL_START]
              [--normal_end NORMAL_END] 
              [--abnormal_time ABNORMAL_TIME] 
              [--selected_node SELECTED_NODE] 
              [--method {PerfCE,cause_infer}]
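
For readers who want to see how the options fit together, the following is a minimal sketch of an argparse parser that accepts the same flags as the usage line above. It is not the actual source of rca.py: the function name build_parser, the argument types (taken from the arguments table), and every example value (graph.txt, combined.csv, models, latency) are assumptions made for illustration, and the real script may set defaults or required flags differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags of rca.py."""
    parser = argparse.ArgumentParser(prog="rca.py")
    # File and directory paths are plain strings.
    parser.add_argument("--causal_graph", type=str)
    parser.add_argument("--train_data", type=str)
    parser.add_argument("--test_data", type=str)
    parser.add_argument("--model_dir", type=str)
    # Time arguments are UNIX timestamps, so they are parsed as integers.
    parser.add_argument("--normal_start", type=int)
    parser.add_argument("--normal_end", type=int)
    parser.add_argument("--abnormal_time", type=int)
    # The node to explain and the RCA method.
    parser.add_argument("--selected_node", type=str)
    parser.add_argument("--method", choices=["PerfCE", "cause_infer"])
    return parser


# Example values below are placeholders, not files shipped with the project.
args = build_parser().parse_args([
    "--causal_graph", "graph.txt",
    "--train_data", "combined.csv",
    "--test_data", "combined.csv",
    "--model_dir", "models",
    "--normal_start", "1672531200",
    "--normal_end", "1672534800",
    "--abnormal_time", "1672536600",
    "--selected_node", "latency",
    "--method", "PerfCE",
])
print(args.method, args.normal_start, args.abnormal_time)
```

With a parser of this shape, a value outside PerfCE and cause_infer for --method is rejected before any analysis starts.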

The script accepts the following arguments:

| Argument | Description | Special Remark |
|---|---|---|
| causal_graph | Path to the causal graph file (should be a txt file). | str, the causal graph generated by run_blip.py. |
| train_data | Path to the training data file (should be a csv file). | str, should be the combined.csv generated by collect_data.py. |
| test_data | Path to the test data file (should be a csv file). | str, should be the combined.csv generated by collect_data.py. |
| model_dir | Path to the learned model directory. | str, should be estimate.py’s output_dir. |
| normal_start | Start time of the database’s normal performance during the test (UNIX timestamp). | int, the start of the period in which the database performs normally. |
| normal_end | End time of the database’s normal performance during the test (UNIX timestamp). | int, the end of the period in which the database performs normally. |
| abnormal_time | Time of the database’s abnormal performance during the test (UNIX timestamp). | int, a specific point in time at which the user observes a performance anomaly. |
| selected_node | Node to be explained. | str, a specific node in the causal graph selected by the user. It should be a high-level KPI. |
| method | RCA method to be used. | str, one of two choices: PerfCE or cause_infer. |
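
Because normal_start, normal_end, and abnormal_time are UNIX timestamps, it can help to derive them from human-readable times before calling the script. The sketch below shows one way to do that and to assemble the command line. The helpers to_unix and build_command, the assumption that the wall-clock times are in UTC, and all file names and the node name latency are hypothetical; adjust them to match how your data was collected.

```python
from datetime import datetime, timezone
from typing import List


def to_unix(ts: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS', interpreted as UTC, to a UNIX timestamp."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def build_command(normal_start: int, normal_end: int, abnormal_time: int,
                  selected_node: str, method: str) -> List[str]:
    """Assemble an rca.py command line from the documented arguments."""
    if normal_start >= normal_end:
        raise ValueError("normal_start must be earlier than normal_end")
    if method not in ("PerfCE", "cause_infer"):
        raise ValueError("method must be PerfCE or cause_infer")
    return [
        "python", "rca.py",
        "--causal_graph", "graph.txt",    # placeholder path
        "--train_data", "combined.csv",   # placeholder path
        "--test_data", "combined.csv",    # placeholder path
        "--model_dir", "models",          # placeholder path
        "--normal_start", str(normal_start),
        "--normal_end", str(normal_end),
        "--abnormal_time", str(abnormal_time),
        "--selected_node", selected_node,
        "--method", method,
    ]


cmd = build_command(
    normal_start=to_unix("2023-01-01 00:00:00"),
    normal_end=to_unix("2023-01-01 01:00:00"),
    abnormal_time=to_unix("2023-01-01 01:30:00"),
    selected_node="latency",  # hypothetical high-level KPI node
    method="PerfCE",
)
# Print the command; pass cmd to subprocess.run(cmd, check=True) to execute it.
print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if a path or node name contains spaces.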