Hi Han Qi,
Thanks so much for reading and commenting. These are really thoughtful questions:
1. I addressed this when building the product but didn't spell it out in the article. The Prophet algorithm used to catch anomalies already has some lag in its reporting period, so I added a 1-hour time window before the reported anomaly period to better catch those potential causal event transitions, as you suggested.
2. Exactly, I am talking about the latter case, in which user over- or under-activity on some action on the site (both of which trigger an anomaly) could indicate a glitch or an overload on the site. I wager that both can be assessed with this method.
3. That's a good point. To address that, I also added a 1-hour time window after the end of the anomaly period to catch some of these downstream impacts. However, I did not look specifically at the effect of these time windows in my analysis, but maybe someone can do this in the future!
Also, sparse matrices would be a great way to do information reduction on the data and would help speak to the point I made about figuring out more 'typical routes' customers are taking on the site.
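
For anyone curious, the padding described in points 1 and 3 can be sketched roughly as below. This is only an illustration, not the product's actual code: the function names, the event tuples, and the timestamps are all made up for the example, and it assumes the anomaly period is already available as a start and end time.

```python
from datetime import datetime, timedelta


def pad_anomaly_window(start, end,
                       before=timedelta(hours=1),
                       after=timedelta(hours=1)):
    """Widen a reported anomaly period by a margin on each side.

    Hypothetical helper: `before` is meant to catch lead-up events,
    `after` is meant to catch downstream impacts.
    """
    if end < start:
        raise ValueError("anomaly end must not precede its start")
    return start - before, end + after


def events_in_window(events, window):
    """Keep the (timestamp, action) pairs that fall inside the window."""
    lo, hi = window
    return [event for event in events if lo <= event[0] <= hi]


# Made-up anomaly period and user events, for illustration only.
anomaly_start = datetime(2020, 1, 1, 12, 0)
anomaly_end = datetime(2020, 1, 1, 13, 0)

events = [
    (datetime(2020, 1, 1, 10, 30), "search"),
    (datetime(2020, 1, 1, 11, 15), "add_to_cart"),
    (datetime(2020, 1, 1, 12, 30), "checkout_error"),
    (datetime(2020, 1, 1, 13, 45), "page_reload"),
    (datetime(2020, 1, 1, 14, 30), "logout"),
]

window = pad_anomaly_window(anomaly_start, anomaly_end)
selected = events_in_window(events, window)
```

With these sample values the padded window runs from 11:00 to 14:00, so the events at 11:15 and 13:45 are kept even though they fall outside the anomaly period itself. Making the two margins separate parameters would also make it easy to study how the window sizes affect the results.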