Ten Reasons to Bring Your Own Snowflake and One to Use Your Vendor’s

Omer Singer
9 min read · Oct 2, 2023

When choosing a cybersecurity solution that is powered by Snowflake, you have an important decision to make: whose Snowflake do you use? This question doesn’t come up with traditional SaaS products, which run on custom backends that the vendor manages for performance and reliability. In the brave new world of data platforms delivered as a service, customers now have the option to bring their own, and I can think of at least ten good reasons to do so.

What is the “connected application” deployment model?

In a typical SaaS product, the data is stored and analyzed in a backend that the vendor manages on behalf of their customers. The users don’t know or care what’s under the hood. This model is referred to as a managed app deployment. In contrast, what had initially been referred to as “Bring Your Own Snowflake” is now called the warehouse-native, or connected app, model:

“Connected apps store and process customers’ data on the customers’ data platforms rather than on the SaaS vendor’s own. They’re called connected applications because they connect to external data platforms instead of loading data into the SaaS vendor’s managed data platform.”

Source: snowflake.com

An example from the marketing industry is Castled, which touts their connected app deployment model as a key benefit:

Source: castled.io

In the cybersecurity context, connected apps include next-gen SIEMs, compliance automation tools and vulnerability management solutions. The integration is bi-directional, meaning the app writes and reads from the customer’s Snowflake as its analytics backend. Users can then interact with their data via the solution’s UI or directly via worksheets, notebooks or BI.

Typical connected app architecture for cybersecurity

There has been a lot of discussion lately as to what this new model means for the cybersecurity industry, from The Great Splunkbundling to the Lego blocks of the modern (security) data stack. Many leading vendors now offer both managed and connected options. As the customer, which approach is likely to drive better ROI for your project? What are the benefits that would justify moving from a vertically-integrated solution to a stack where you need to buy and operate multiple solutions?

Source: ROSS HALELIUK

Reason 1: Scale up during an incident

Working in cybersecurity can be an exciting gig: you never know when a headline event or a suspected breach will kick your day into high gear. With traditional security tooling, however, you don’t have control over the engine powering your searches and investigations. As a result, you are likely to overpay (via your vendor) for more power than needed during normal operations and be under-provisioned when you need to crunch through months of log data in a rush. Security teams that deployed their tooling in the connected application model have been taking advantage of Snowflake’s ability to 10x their search performance for a few hours or days when dealing with an urgent issue.

Resizing your connected app with one click
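For readers who prefer a script to a click, here is a minimal sketch of the same idea. It only builds the SQL text and does not connect to anything. The warehouse name SECURITY_WH is hypothetical, and the credits-per-hour figures are the standard warehouse rates as I understand them, so check them against your own account before relying on the math.

```python
# Credits per hour for standard Snowflake warehouses, smallest to largest.
# Each step up roughly doubles the compute (and the hourly cost).
# Assumption: these figures match your edition and warehouse type.
SIZES = [
    ("XSMALL", 1), ("SMALL", 2), ("MEDIUM", 4), ("LARGE", 8),
    ("XLARGE", 16), ("XXLARGE", 32), ("XXXLARGE", 64), ("X4LARGE", 128),
]


def size_for_speedup(current: str, factor: float) -> str:
    """Return the smallest size with at least `factor` times the compute of `current`."""
    credits = dict(SIZES)
    target = credits[current] * factor
    for name, rate in SIZES:
        if rate >= target:
            return name
    return SIZES[-1][0]  # cap at the largest size in the table


def resize_statement(warehouse: str, size: str) -> str:
    """Build the ALTER WAREHOUSE statement for the given (hypothetical) warehouse."""
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = {size};"


if __name__ == "__main__":
    incident_size = size_for_speedup("MEDIUM", 10)
    print(resize_statement("SECURITY_WH", incident_size))  # scale up for the incident
    print(resize_statement("SECURITY_WH", "MEDIUM"))       # scale back down afterwards
```

Because sizes double at each step, asking for 10x from MEDIUM lands on the next size that offers at least that much compute, which is 16x. More compute does not guarantee a proportional speedup for every query, so treat the factor as a starting point.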

Reason 2: Choose your own retention period

Cheap and limitless storage is one of the great things about a modern data lake. But if your data resides with your vendor, then their considerations will determine how long that data is available. Vendors tend to prefer cost savings and can choose to flush that data after a month or six months, forcing you to create and manage data copies for compliance and IR. The vendor’s retention policy is also broadly applied to all datasets. In the connected app model, you can choose which records get dropped and when, according to your own policies. You can also selectively delete records if some sensitive details got mixed up in your logs… not that that would ever happen, right?
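To make per-dataset retention concrete, here is a rough sketch that turns a policy table into DELETE statements. The table names, the EVENT_TIME column and the retention periods are all made up for illustration, and the script only prints SQL rather than running it.

```python
# Hypothetical per-dataset retention policy, in days.
RETENTION_DAYS = {
    "SECURITY.LOGS.VPN_EVENTS": 90,
    "SECURITY.LOGS.CLOUD_AUDIT": 395,
}


def retention_statement(table: str, days: int, ts_column: str = "EVENT_TIME") -> str:
    """Build a DELETE that drops rows older than `days` days.

    Assumes each table has a timestamp column named `ts_column`.
    """
    if days <= 0:
        raise ValueError("retention period must be a positive number of days")
    return (
        f"DELETE FROM {table} "
        f"WHERE {ts_column} < DATEADD(day, -{days}, CURRENT_TIMESTAMP());"
    )


if __name__ == "__main__":
    for table, days in RETENTION_DAYS.items():
        print(retention_statement(table, days))
```

The point is that each dataset gets its own number, instead of one vendor-wide setting applied to everything.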

Reason 3: Verify data quality

Speaking of things that never happen (but actually do), your vendor’s data pipeline could fail. They could drop records that exceed a certain size or come from a certain region, and they might not catch this until days or months have gone by. In a managed app, your visibility into collection outages and data quality issues is minimal to none. With a connected app, on the other hand, you have an extensive array of built-in, open source, and off-the-shelf options for ensuring that the data you expect is arriving in the right shape and on time.

Source: https://www.montecarlodata.com/blog-snowflake-data-quality-features/
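One of the simplest checks you can run yourself is a gap check on event timestamps. The sketch below is not taken from any of the tools mentioned above. It assumes you have already pulled a list of timestamps for one log source, and the 30-minute threshold is an arbitrary example.

```python
from datetime import datetime, timedelta


def find_gaps(event_times, max_gap):
    """Return (last_seen, next_seen) pairs where consecutive events are more than max_gap apart."""
    ordered = sorted(event_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


if __name__ == "__main__":
    # Made-up timestamps for a single log source.
    times = [
        datetime(2023, 10, 1, 0, 0),
        datetime(2023, 10, 1, 0, 5),
        datetime(2023, 10, 1, 0, 10),
        datetime(2023, 10, 1, 3, 0),  # nearly three hours of silence before this event
        datetime(2023, 10, 1, 3, 5),
    ]
    for start, end in find_gaps(times, timedelta(minutes=30)):
        print(f"No events between {start} and {end}")
```

A gap is not always an outage, since some sources are naturally quiet overnight, so the threshold should be tuned per source.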

Reason 4: There’s no maintenance overhead to worry about

Sometimes a security team considering which deployment option makes sense for them will express concerns about care and feeding for another database. Why take on the hassle when the vendor is willing to do it themselves in a managed app deployment? Don’t fall for the FUD. One of the reasons why Snowflake has reawakened the security data lake concept is that, unlike in the bad old days of Hadoop, there is no maintenance overhead. You can create an account, start loading data and never stop to think about the nuts and bolts under the hood. Or whether those nuts and bolts need to be patched. It’s also likely that you already have a team that supports Snowflake at scale for other departments and they can provision resources, set up guardrails and manage users for security the way they do for finance and marketing. Either way, the admin work for deployment can be tightly scoped and is unlikely to break anyone’s quarterly plan.

Reason 5: Tailor your reporting

Canned dashboards can only take you so far. Some SIEM solutions support creating custom reports, and advanced security teams tend to invest heavily in building them out. But these reports suffer from missing data since they run separately from the central enterprise data stack, and they tend to be accessible only to security and IT personnel. Contrast this approach with the connected app architecture where all the SIEM data lives alongside the rest of the enterprise datasets and the data team can help create BI reports that anyone in the enterprise can access.

Security reporting with PowerBI at CSAA. Source: https://www.snowflake.com/en/resources/case-study/csaas-data-driven-security-transformation-with-snowflake/

Reason 6: Combine your data in new ways

As shown in the example above from CSAA’s security dashboards, there are lots of use cases for your SIEM data. And your SIEM can benefit from lots of untraditional datasets. For example, your daily average alert volume is a metric that can play a role in board-level budget planning. And data from HR on who recently changed teams or left the company can impact the risk score of data leak alerts. Running your SIEM on top of your existing Snowflake environment unlocks all of those use cases, improving ROI and TDIR outcomes.
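As a toy illustration of the HR example, here is a sketch that raises the risk score of an alert when the user involved recently left the company or changed teams. The field names, the weights and the 30-day window are all assumptions made for the example. In practice this would be a join between the alerts table and an HR dataset living in the same Snowflake environment.

```python
from datetime import date, timedelta

# Hypothetical multipliers for recent HR events.
HR_WEIGHTS = {"departed": 3.0, "changed_team": 1.5}


def adjusted_risk(alert, hr_events, today, window_days=30):
    """Scale an alert's base score by any HR events for that user within the window."""
    score = alert["base_score"]
    cutoff = today - timedelta(days=window_days)
    for event in hr_events.get(alert["user"], []):
        if event["date"] >= cutoff:
            score *= HR_WEIGHTS.get(event["type"], 1.0)
    return score


if __name__ == "__main__":
    today = date(2023, 10, 2)
    hr_events = {
        "alice@example.com": [{"type": "departed", "date": date(2023, 9, 22)}],
    }
    alerts = [
        {"user": "alice@example.com", "base_score": 20},
        {"user": "bob@example.com", "base_score": 20},
    ]
    for alert in alerts:
        print(alert["user"], adjusted_risk(alert, hr_events, today))
```

Two otherwise identical data leak alerts end up with different priorities once the HR context is available next to the security data.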

Reason 7: Each tool extends your source of truth

Some security organizations have gone beyond their initial connected app to running multiple tools on their security data lake. I recently worked with a CISO who started out with three connected apps: one for TDIR, one for identity security and one for vulnerability prioritization and security metrics. While traditional tools come as a silo that needs to be broken, each connected app is an extension of the others. And any future in-house project then builds on that source of truth for faster implementation and time to value.

Reason 8: Avoid vendor lock-in

It’s been a few years since Snowflake customers started deploying cybersecurity connected applications. Since requirements and expectations evolve, it’s not surprising that over time some of those customers have moved from one vendor to another. Uniquely, connected app customers are able to maintain control of their data while swapping one connected app for another. With a lower bar for migration, SIEM buyers gain negotiation power and the flexibility to choose solutions that match where they’re at today.

Reason 9: Get started with data science and AI responsibly

Where your data lives is more than just a storage consideration: the modern security data lake is also a platform for data science and AI. This means that logs collected to your Snowflake by the connected app SIEM can be used together with open source large language models within your environment. On the other end of the spectrum, a solution that runs against the vendor’s backend and integrates with external LLM services will be severely limited in what it can train and infer, at least without exposing your logs to third-party shared models. As security use cases emerge for data science in general and LLMs in particular, the speed at which you can adopt them will depend on governance and control.

LLM running inside a Snowflake account. Source: https://www.snowflake.com/blog/container-services-llama2-snowpark-ml/

Reason 10: Accelerate automation initiatives

Your security team probably has any number of automation initiatives that they’d like to implement to reduce risk and make their lives more pleasant. These might involve a SOAR tool or custom apps like the one that Coinbase shared recently. As described in their post,

In the screenshots [below] you can see that by providing just my email address, an analyst can immediately see who I am, whether my account is active, what team I work in, what devices I have, my login history, historical detections relating to my user entity, what applications I have installed, my related IP addresses etc. All of this information is pulled each time an analyst performs a search and allows them to see key relevant information in a matter of seconds.

The astute reader may wonder how this low-code app is able to pull together those relevant details about users, their devices and their activity. It’s all about having the data centralized in advance, as made possible with the connected app model. Coinbase CSIRT has previously shared their experience with connected apps.

An early joint customer is Coinbase, which is using Snowflake and Material Security to centralize their email security data into their existing security data lake. As Matt Muller, Director of Security Operations at Coinbase says: “A security team is only as good as the data that fuels it. Integrating Material Security directly on top of an existing Snowflake security data lake enables faster incident response, reduces investigation times, and creates new correlations for threat detection.”

When a security team owns their data and can access it directly within a flexible and scalable analytics platform, there is no limit to the automation they can create.

Source: https://www.coinbase.com/blog/scaling-detection-and-response-operations-at-coinbase
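To show why centralization matters for this kind of lookup, here is a small sketch of an email-to-profile search. It is not Coinbase’s implementation. It uses Python’s built-in sqlite3 module as a stand-in for the security data lake, and the table names, columns and sample rows are invented for the example.

```python
import sqlite3


def build_demo_lake():
    """Create an in-memory database standing in for a centralized security data lake."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (email TEXT PRIMARY KEY, team TEXT, active INTEGER);
        CREATE TABLE devices (email TEXT, hostname TEXT);
        CREATE TABLE logins (email TEXT, ip TEXT, ts TEXT);
        INSERT INTO employees VALUES ('ana@example.com', 'Payments', 1);
        INSERT INTO devices VALUES ('ana@example.com', 'ana-laptop');
        INSERT INTO devices VALUES ('ana@example.com', 'ana-phone');
        INSERT INTO logins VALUES ('ana@example.com', '203.0.113.7', '2023-09-30T08:15:00');
        INSERT INTO logins VALUES ('ana@example.com', '198.51.100.4', '2023-10-01T09:02:00');
        """
    )
    return conn


def lookup_user(conn, email):
    """Collect identity, device and login details for one email address."""
    row = conn.execute(
        "SELECT team, active FROM employees WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        return None
    devices = [r[0] for r in conn.execute(
        "SELECT hostname FROM devices WHERE email = ? ORDER BY hostname", (email,)
    )]
    recent_ips = [r[0] for r in conn.execute(
        "SELECT ip FROM logins WHERE email = ? ORDER BY ts DESC", (email,)
    )]
    return {
        "email": email,
        "team": row[0],
        "active": bool(row[1]),
        "devices": devices,
        "recent_ips": recent_ips,
    }


if __name__ == "__main__":
    print(lookup_user(build_demo_lake(), "ana@example.com"))
```

Every query here hits the same store with the same key. If the HR, device and login data lived in three different vendor backends, the same lookup would need three integrations and three sets of credentials.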

And one reason to choose the managed app option

If your company doesn’t have Snowflake, you might be better off going with a managed app deployment. Otherwise you need to buy two new products, the connected app and the Data Cloud. But before you opt for the vendor to own your data in their environment, ask around the office. You may find that your data or IT teams do have a Snowflake account that they’ve been using for traditional data warehouse use cases. First of all, it’s good for you to know about that because it may contain sensitive data that you should be monitoring. Secondly, to capture all those benefits described above, you can easily create a new account within the existing Snowflake organization to host your security data and power the connected app. With secure data sharing, it’s easy to make data available between your business and security accounts. What would that look like? Here’s one final screenshot from the SIEM at Workrise which they deployed as a connected application:

Source: https://www.snowflake.com/blog/workrise-builds-strong-security-program/

As application deployment models evolve and security teams get more comfortable with the modern data stack, connected apps will become increasingly prevalent. Hopefully this post will help you navigate the considerations between managed and connected deployment options.


Omer Singer

I believe that better data is the key to better security. These are personal posts that don’t represent Snowflake.