Root Cause Analysis
After learning the causal graph and estimating the causal effects, we may want to find the cause of a given performance anomaly. This task is called root cause analysis, and it is implemented in rca.py. You can execute the script directly to parse the workflow config file.
Usage:

    python rca.py [-h] [--causal_graph CAUSAL_GRAPH]
                  [--train_data TRAIN_DATA]
                  [--test_data TEST_DATA]
                  [--model_dir MODEL_DIR]
                  [--normal_start NORMAL_START]
                  [--normal_end NORMAL_END]
                  [--abnormal_time ABNORMAL_TIME]
                  [--selected_node SELECTED_NODE]
                  [--method {PerfCE,cause_infer}]
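
As a rough illustration of the usage above, the following sketch assembles an rca.py command line from Python. The helper names `to_unix` and `build_rca_command`, all file paths, the node name, and the example dates are hypothetical. It also assumes that the timestamps are in seconds, that the dates are given in UTC, and that `normal_start` must come before `normal_end`; none of this is confirmed by the script itself.

```python
from datetime import datetime, timezone


def to_unix(text: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed to be UTC) to a UNIX timestamp in seconds."""
    dt = datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def build_rca_command(causal_graph, train_data, test_data, model_dir,
                      normal_start, normal_end, abnormal_time,
                      selected_node, method="PerfCE"):
    """Return the argument list for one rca.py run (hypothetical helper)."""
    # The usage line lists exactly these two choices for --method.
    if method not in ("PerfCE", "cause_infer"):
        raise ValueError(f"unknown method: {method}")
    # Assumption: the normal period must be a non-empty interval.
    if normal_start >= normal_end:
        raise ValueError("normal_start must be earlier than normal_end")

    options = {
        "causal_graph": causal_graph,
        "train_data": train_data,
        "test_data": test_data,
        "model_dir": model_dir,
        "normal_start": normal_start,
        "normal_end": normal_end,
        "abnormal_time": abnormal_time,
        "selected_node": selected_node,
        "method": method,
    }
    cmd = ["python", "rca.py"]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# All values below are placeholders, not files shipped with the project.
cmd = build_rca_command(
    causal_graph="causal_graph.txt",
    train_data="train/combined.csv",
    test_data="test/combined.csv",
    model_dir="models",
    normal_start=to_unix("2024-01-01 00:00:00"),
    normal_end=to_unix("2024-01-01 00:30:00"),
    abnormal_time=to_unix("2024-01-01 00:45:00"),
    selected_node="latency",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains rca.py.
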
Here is the arguments table for the script:
| Argument | Description | Special Remark |
|---|---|---|
| causal_graph | Path to the causal graph file (should be a txt file). | str, the causal graph generated by run_blip.py. |
| train_data | Path to the training data file (should be a csv file). | str, should be combined.csv generated by collect_data.py. |
| test_data | Path to the test data file (should be a csv file). | str, should be combined.csv generated by collect_data.py. |
| model_dir | Path to the learned model directory. | str, should be the output_dir of estimate.py. |
| normal_start | Start time of the database’s normal performance period during the test (UNIX timestamp). | int, the start of the time period in which the database performs normally. |
| normal_end | End time of the database’s normal performance period during the test (UNIX timestamp). | int, the end of the time period in which the database performs normally. |
| abnormal_time | Time of the database’s abnormal performance during the test (UNIX timestamp). | int, a specific point in time at which the user finds a performance anomaly. |
| selected_node | Node to be explained. | str, a specific node in the causal graph selected by the user. It should be a high-level KPI. |
| method | RCA method to use. | str, one of two choices: PerfCE or cause_infer. |
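
To make the table concrete, here is a minimal `argparse` sketch that mirrors the flags and types listed above. It is an illustration only and not the actual parser in rca.py: the function name `build_parser` and the help strings are assumptions, and no defaults are set because the real defaults are not documented here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented rca.py flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="rca.py",
        description="Root cause analysis for a performance anomaly.",
    )
    # Paths are plain strings, as stated in the Special Remark column.
    parser.add_argument("--causal_graph", type=str, help="Path to the causal graph file (txt).")
    parser.add_argument("--train_data", type=str, help="Path to the training data file (csv).")
    parser.add_argument("--test_data", type=str, help="Path to the test data file (csv).")
    parser.add_argument("--model_dir", type=str, help="Path to the learned model directory.")
    # Times are integer UNIX timestamps.
    parser.add_argument("--normal_start", type=int, help="Start of the normal period.")
    parser.add_argument("--normal_end", type=int, help="End of the normal period.")
    parser.add_argument("--abnormal_time", type=int, help="Time of the anomaly.")
    parser.add_argument("--selected_node", type=str, help="Node to be explained.")
    # argparse rejects any value outside the two documented choices.
    parser.add_argument("--method", type=str, choices=["PerfCE", "cause_infer"],
                        help="RCA method to use.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--normal_start", "1700000000", "--normal_end", "1700000600", "--method", "PerfCE"]
)
print(args.normal_start, args.normal_end, args.method)
```

Flags that are not passed come back as `None` in this sketch, so a caller would still need to check that the required paths and timestamps were supplied.
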