Security Alerts: To Slack or not to Slack
The latest release of SnowAlert supports sending alert notifications from Snowflake to Slack. We added this integration alongside the Jira support we’ve had from the start. That order might seem backwards for a contemporary security tool, so it’s worth exploring why we took this approach.
The security engineers at Snowflake like Slack. A lot. We have channels for team chat, for sharing articles, for syncing with other teams and for debating the optimal cut of cucumbers in the lunch salad bar (obviously crescent cut with peel). But we weren’t using Slack for security alerts.
This is no accident. In fact, a year ago we were getting so many alerts a day that we needed to rely on dashboards in order to stay on top of security issues in our environment. Dashboards can be helpful to give a sense of direction for a security team, indicating for example if the laptop patching situation is getting better or worse. Dashboards are good aggregators, able to synthesize thousands of data points into a single digestible view.
However, we saw major limitations in relying on dashboards for our threat detection. As our security analytics improved and threat detection fidelity increased, we reduced alerts to only a few dozen a day. At that level, it became feasible to open a tracking ticket in Jira for each alert. This was great because it established a queue that could ensure that every detection was investigated and closed out by a member of the team. Tickets hold a state (To Do, In Progress, Done) and an assignee, both of which are crucial for metrics and for managing the security operation over time.
If alert tickets are so great, why add Slack integration? As our security analytics become increasingly accurate, we’ve identified specific alerts whose “noise rate” is particularly low. These are alerts that we always want to know about, the sooner the better. For example, while some endpoint threat detection cases are familiar and can be handled routinely, we maintain a statistical baseline of these detections in order to identify when an agent picks up an unusual kind of malware. Malware that is statistically unusual for our environment might indicate a targeted attack that we want to make everyone aware of right away. For this subset of alerts, a handful a day at most, it’s worth sending an instant message to the entire team.
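The baseline idea above can be sketched in a few lines of Python. This is only an illustration, not SnowAlert's actual logic: the function name `is_unusual`, the 1% threshold, and the sample malware family names are all assumptions made up for the example. It flags a detection as unusual when its malware family makes up less than a small share of historical detections.

```python
from collections import Counter

def is_unusual(family, baseline_counts, min_share=0.01):
    """Return True when `family` makes up less than `min_share` of baseline detections.

    `baseline_counts` maps a malware family name to how often it was seen
    historically. The 1% default threshold is an illustrative assumption.
    """
    total = sum(baseline_counts.values())
    if total == 0:
        # No history yet: treat every detection as unusual.
        return True
    return baseline_counts.get(family, 0) / total < min_share

# Hypothetical baseline of past endpoint detections.
baseline = Counter({"adware.generic": 180, "pup.toolbar": 15, "trojan.rare": 1})

for family in ("adware.generic", "trojan.rare", "never.seen.before"):
    label = "notify the team" if is_unusual(family, baseline) else "ticket only"
    print(family, "->", label)
```

A share-of-total check is about the simplest possible baseline; a real deployment would likely compute it over a time window and in the warehouse rather than in application code.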
Our next notification integration will require greater alert fidelity than ever before. When we add PagerDuty as a notification option, someone might be dragged out of bed on a Saturday night. For alerts with a 24x7 “hit me on the hip” SLA, we need to be very confident in the relevance of the detection. One such alert might be our detection for a terminated employee signing into a production AWS account.
This process starts with security analytics that reliably call out when action is needed. The more reliable the alerts, the more confident you can be in building notifications that drive immediate action from the people who can act on them. This is a kind of maturity model and can be represented in the Alert Delivery Pyramid shown below.
As with any maturity model, start at the beginning. If you rely exclusively on Slack messages for your alerts, for example, you’re missing out on the accountability and metrics of tickets that will push you to improve the fidelity of your detections. By working down this pyramid model, your team will hear about the right threats at the right time. That would be something worth Slacking about.
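One way to picture this progression is as a routing rule keyed on each alert's noise rate. The sketch below is hypothetical: the function `choose_channels`, the channel names, and both thresholds are assumptions for illustration, not SnowAlert configuration. Every alert gets a ticket, low-noise alerts also go to chat, and only urgent alerts with very low noise page someone.

```python
# Illustrative thresholds (assumptions): the share of past alerts from a
# detection that turned out to be noise.
SLACK_MAX_NOISE = 0.10
PAGER_MAX_NOISE = 0.01

def choose_channels(noise_rate, urgent=False):
    """Pick delivery channels for an alert based on how noisy its detection is."""
    channels = ["jira"]  # every alert gets a ticket for accountability and metrics
    if noise_rate <= SLACK_MAX_NOISE:
        channels.append("slack")  # reliable enough to message the whole team
    if urgent and noise_rate <= PAGER_MAX_NOISE:
        channels.append("pagerduty")  # reliable and urgent enough to wake someone
    return channels

print(choose_channels(0.5))                 # noisy detection
print(choose_channels(0.05))                # low-noise detection
print(choose_channels(0.005, urgent=True))  # high-fidelity, time-critical detection
```

Keeping the ticket as the unconditional base layer reflects the point that chat and paging add to the tracking queue rather than replace it.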