
Capacity Planning for Snort IDS

Snort is a very capable network intrusion detection system, but planning a first-time hardware purchase can be difficult. It requires fairly deep knowledge of x86 server performance and of the network usage patterns at your site, along with some Snort-specific knowledge. Documentation is poor, and current planning guides tend to focus on one or two factors in depth without addressing other broad issues that can cause serious performance problems. This post aims to be a comprehensive but high-level overview of the issues that must be considered when sizing a medium-to-large Snort deployment.

A Note About Small Sites

Small Snort deployments don’t require much planning. Almost any system or virtual machine will suffice to experiment with Snort on a DSL or cable internet connection with a bandwidth of 5-10Mbits/sec, so just jump right in. If you need to monitor 50-100Mbits/sec or even 5-10Gbits/sec of network traffic, then this guide can help you size your sensor hardware.

Know Your Site

It helps to know a few things about your site before you start planning.

The most common way to get started is to monitor your internet link(s). Many organizations also expand to monitor some internal links: data-center routers, site-to-site links, or networks with VIP workstations. Unless you know what you’re doing, I suggest starting with your internet links and expanding once you’ve got that performing well. There are generally far fewer internet links to consider, and they are often much lower bandwidth than internal links, which can make your first deployment simpler.

Life is simple if you have a single internet connection at a single site. If your network is more complicated, then you’ll need to work with the team that manages your routers. They can help you figure out how many locations will need to get a sensor and how many capture interfaces each of those sensors will need to monitor the links at that site.

How much traffic do you need to monitor?

The single biggest factor when sizing your Snort hardware is the amount of traffic that it must monitor. The values to consider are the maximum burst speed of each link and its average daily peak. It’s common to have burst capacity well in excess of actual usage, and when you design your sensors you must decide which traffic level you’re going to plan for. Planning for the burst value ensures that you won’t drop packets even in a worst-case scenario, but it may be much more expensive than planning for the average daily peak.

For example, it’s common to contract with an ISP for 100Mbits/sec of bandwidth that is delivered over a 1000Mbits/sec link. The average daily peak for such a link may be 60Mbits/sec, but on rare occasions it may burst up to the full 1000Mbits/sec for short durations. A sensor designed for the relatively small amount of daily peak traffic is inexpensive and simple to manage, but may drop 80% of packets or more during bursts.
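
To make the worst case concrete, the expected drop rate during a burst is roughly the traffic that exceeds the sensor’s inspection capacity divided by the total offered traffic. Here is that arithmetic as a small Python sketch for the example above; the 200Mbits/sec inspection capacity is an assumption drawn from the single-CPU figures discussed later in this post.

    def burst_drop_rate(burst_mbps, inspect_capacity_mbps):
        # Fraction of traffic dropped while the link runs at burst speed.
        excess = max(burst_mbps - inspect_capacity_mbps, 0)
        return excess / burst_mbps

    # Sensor sized around the ~60Mbit/sec daily peak, tuned to inspect ~200Mbits/sec:
    print(burst_drop_rate(1000, 200))  # 0.8 -> roughly 80% of packets dropped during a full burst
    print(burst_drop_rate(60, 200))    # 0.0 -> no drops at the average daily peak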

If MRTG or Nagios graphs of router utilization are available, they can be very helpful in capacity planning.

Inline, Tap, Span, or VACL Capture

There are various ways to extract traffic for examination. Inline deployments, where Snort is used as an intrusion prevention system, should be treated with great caution because sizing problems and configuration issues related to Snort can cause network problems and outages for all your users. When running a detection configuration in conjunction with taps, spans, or VACL captures, Snort problems generally don’t cause user-facing network outages and are a much lower risk.

Security teams generally favor taps due to their consistent performance even when a router is overloaded, but there are successful Snort deployments that utilize all of the above methods of obtaining traffic for inspection. Ntop.org has a good document on making the tap vs span decision, and the Wikipedia page on network taps provides informative background as well.

Operating System Expertise

Consider what operating systems your technical staff have expertise in. It is common to run high-performance Snort deployments on various Linux distributions or on FreeBSD. At one time FreeBSD had a considerable performance advantage over equivalent Linux systems, but it is currently possible to build a 10Gbit/sec deployment on Linux- or BSD-based systems using roughly equivalent hardware.

I recommend against deploying on Windows because not all Snort features are supported on that platform. Notably, shared-object rules do not function on Windows as of Snort 2.9.0.5. While there are far fewer shared-object rules than normal “GID 1” rules, and they are released less frequently, they can still be a useful source of intelligence.

I also recommend against deploying Snort on *nixes other than Linux or BSD. Although Snort may work well on these platforms, the community employing them is much smaller. It will be much more difficult to find guidance on any platform-specific issues that you encounter.

It’s worth mentioning that my own experience is with high-performance Snort deployments on Linux, and parts of this post reflect that bias.

Single-Threading vs Multiple-CPUs

Snort is essentially single-threaded, which means that out of the box it doesn’t make effective use of multiple CPUs (technically there is more than one thread in a Snort process, but the others are used for housekeeping tasks that don’t require much CPU power, not for scaling traffic analysis across multiple CPUs). As of August 2011, Snort on a single CPU can be tuned to examine 200-500Mbits/sec, depending on the size of the ruleset used.

It’s possible to scale to 10Gbits/sec by running multiple copies of Snort on the same computer, each using a different CPU. A multi-snort/multi-CPU configuration is quite a lot more complex to manage than a single-CPU deployment. Traffic from high-capacity links must be divided up into 200-500Mbit/sec chunks that can be examined by a single CPU; techniques to perform this load-balancing are discussed in the next section. Additionally, startup scripts often must be customized, and it can be difficult to manage multiple configuration files and log files. In spite of the management complexity, large organizations have successfully managed high-performance Snort deployments this way for many years.

Suricata is a relatively new project that is well worth keeping an eye on. It has a multi-threaded architecture that makes effective use of multiple CPUs, but it is not as CPU-efficient as Snort as of Suricata 1.0.0. As such, Suricata on a large multi-core system is much faster than Snort running on a single CPU, but about 4x slower than many Snort instances running on that same multi-core system. As Suricata matures, performance will improve. Additionally, managing a single Suricata instance is simpler than managing many Snort instances. Update 2013-11: Suricata seems to have addressed its performance issues; it can now inspect several hundred Mbits/sec per core, which is on par with Snort.

Traffic Capture Frameworks

Snort is a modular system that supports many frameworks for capturing traffic, but not all of them scale equally well.

AFPACKET

The default capture framework on Linux since Snort 2.9, afpacket provides no features to load-balance traffic between multiple instances of snort running on multiple CPUs. As such, it can’t scale beyond 200-500Mbits/sec of throughput without some external technique to balance the load between several network interfaces. Even with this limitation, afpacket is the simplest and best choice for snort deployments with less than 200Mbits/sec of traffic.

Libpcap 0.9.x

The default capture framework on Linux for the Snort 2.8.x series and prior, libpcap is very similar to afpacket from a user perspective. It also lacks a built-in load-balancing feature, and can scale to a few hundred Mbits/sec of traffic. Consider upgrading Snort and using afpacket instead.

Libpcap >= 1.x.x

Around version 1.0.0, libpcap introduced an mmap-based capture feature designed to improve performance. Unfortunately, the feature backfired and reduced performance due to a hard-coded buffer size that is too small for most sites. Use afpacket instead unless you know what you’re doing.

PFRING and TNAPI/DNA

Pfring is a Linux kernel module that provides load-balancing through its ring clusters feature. It additionally supports several capture cards through its TNAPI/DNA high-performance drivers, which are available for $200-250 from the ntop store. Pfring, used in conjunction with a TNAPI-compatible network interface, is the least expensive method available to load-balance traffic to several instances of Snort running on several CPUs, and it can scale to 10G on appropriate hardware.
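
The idea behind these load-balancing features is to hash each packet’s flow identifiers so that both directions of a connection always reach the same Snort instance. The Python sketch below is a conceptual illustration of that symmetric 5-tuple hashing, not pfring’s actual algorithm; the function name and example addresses are made up.

    # Conceptual sketch of flow-based load-balancing, not pfring's actual hash:
    # both directions of a connection must land on the same Snort instance.

    def instance_for_flow(src_ip, src_port, dst_ip, dst_port, proto, num_instances):
        # Order the endpoints so (A -> B) and (B -> A) produce the same key.
        a = (src_ip, src_port)
        b = (dst_ip, dst_port)
        key = (min(a, b), max(a, b), proto)
        return hash(key) % num_instances

    # Both directions of the same connection map to the same instance:
    print(instance_for_flow("10.0.0.1", 51515, "192.0.2.10", 80, "tcp", 4))
    print(instance_for_flow("192.0.2.10", 80, "10.0.0.1", 51515, "tcp", 4))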

High-Performance Capture Cards

Endace and other companies manufacture high-performance capture cards with integrated drivers that have load-balancing features. Depending on speed and features these cards can cost anywhere from $2,000-$25,000, and at the high end scale to 10Gbits/sec. Most of my high-performance Snort experience is on Endace hardware, which has its niggles but generally works very well.

Sourcefire 3D Hardware

Last, but certainly not least, Sourcefire sells Snort hardware that is throughput-rated and can simplify much of your planning. Managing a multi-snort deployment is a lot of work, and Sourcefire has designed their systems to provide the power of Snort with an easy-to-manage interface, plus some features like RNA that are only available via Sourcefire. They’re more expensive than similar hardware running open-source Snort, but they may be more cost-effective in the long run unless your organization has a do-it-yourself culture with the time and technical expertise to tackle a complex open-source Snort deployment.

Traffic Management Techniques

The following traffic management techniques can be used in conjunction with the capture frameworks above to provide additional flexibility.

Hardware Load-Balancers

Gigamon, CPacket, and Top-Layer produce specialized network switches that can perform load-balancing to multiple network interfaces. The port-channeling feature of retired Cisco routers can be used to similar effect. These devices can be used to distribute traffic to multiple network interfaces in a single server or even to multiple servers, possibly scaling beyond 10G (I haven’t tested beyond 10G). I’ve worked with both Gigamon and Top-Layer hardware and found that they both do what they claim, although only Gigamon offers many 10Gbit/sec interfaces in one device. CPacket has been used by knowledgeable peers of mine and offers a unique feature that allows you to use any vanilla network switch to expand the port count of their load-balancer by using mac-address rewriting. These systems are fairly expensive, typically carrying 5-figure price tags, but often can be put to many uses in a large organization.

Manual Load-Balancing

Sometimes, traffic can be manually divided simply by configuring routers to send about half of your networks over one port and half over another. This “poor man’s” load-balancing can be cost-effective for links that are just a bit too large for one network interface.
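
If you’re unsure how to divide things, a greedy split by average utilization usually gets you two roughly even halves. The Python sketch below is a hypothetical helper with made-up subnets and traffic figures, just to illustrate the approach.

    # Hypothetical helper with made-up subnets and rates: greedily assign the
    # busiest networks first so both capture ports end up with similar load.

    def split_networks(subnet_rates):
        port_a, port_b = [], []
        load_a = load_b = 0
        for subnet, mbps in sorted(subnet_rates, key=lambda x: x[1], reverse=True):
            if load_a <= load_b:
                port_a.append(subnet)
                load_a += mbps
            else:
                port_b.append(subnet)
                load_b += mbps
        return (port_a, load_a), (port_b, load_b)

    networks = [("10.1.0.0/16", 120), ("10.2.0.0/16", 90),
                ("10.3.0.0/16", 60), ("10.4.0.0/16", 40)]
    print(split_networks(networks))  # two groups with roughly equal Mbits/sec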

Linux Bonded Interfaces

Bonding is the opposite of load-balancing: if you have several low-bandwidth interfaces that you would like to inspect without the overhead of managing multiple copies of Snort, you can use Linux interface bonding to aggregate them together, as long as the total throughput isn’t more than a few hundred Mbits/sec.

Sizing Hardware

Now that you know how many locations you need to place a server at, how many links there are to monitor at each location, and what capture-frameworks can work for you, it’s time to choose your servers.

CPU

A very rough and conservative rule of thumb is that Snort running on a single CPU can examine 200Mbits/sec of traffic without dropping an appreciable number of packets. Snort can examine 500Mbits/sec of traffic or even much more on a single CPU with the right networking hardware and a very small or very well-tuned ruleset, but don’t count on achieving that kind of throughput unless you have tested and measured it in your environment. Martin has posted a more detailed CPU sizing exercise on his blog if you’d like to dig a little deeper.

Remember that Snort is single-threaded. Unless you plan to use a load-balanced capture framework, single-CPU performance is more important than the number of cores. Alternatively, if you know that you have lots of traffic to monitor, you’ll need a multi-core system paired with a load-balanced capture framework. Snort scales nearly linearly with the number of cores you throw at it, so don’t worry about diminishing returns as you add cores.

RAM

Each Snort process can occupy 2-5Gbytes of RAM. How much depends on:

  • Traffic - The more traffic a sensor handles, the more state it must track. Stream5 can use anywhere from a few Mbytes to 1Gbyte to track TCP state.
  • Pattern Matcher - Some pattern matchers are very CPU-efficient, and others are very memory-efficient. The ac-nq matcher is the most CPU-efficient, reducing CPU usage by up to 30% over ac-split but adding over 1Gbyte of RAM usage per process. The ac-bnfa matcher is quite memory-efficient, reducing RAM usage by several hundred Mbytes per process but increasing CPU usage by up to 20%.
  • Number of rules - The more rules that are active, the more memory the pattern matcher uses.
  • Preprocessor configs - The stream5 memcap is one crucial factor for controlling memory usage, but all preprocessors occupy memory and many can be configured to be conservative or resource-hungry.

A Snort process inspecting 400Mbits/sec of traffic, with 7000 active rules, using the ac-nq pattern matcher (which is memory-hungry), and a stream5 memcap of 1Gbyte uses about 4.5Gbytes of RAM. With a smaller ruleset and the ac-bnfa pattern matcher (which is memory-efficient), I’ve observed snort processes use about 2.5Gbytes of RAM.
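
If you need a planning number before you can measure, the figures above can be folded into a rough estimate. The Python sketch below is a back-of-envelope approximation assembled from the numbers in this post; the per-matcher overheads and the base allowance are assumptions, not a model of Snort’s actual memory allocation.

    # Back-of-envelope RAM estimate per Snort process, built from the rough
    # figures in this post. Treat the output as a planning number only.

    MATCHER_OVERHEAD_GB = {"ac-bnfa": 0.5, "ac-split": 1.5, "ac-nq": 2.5}  # assumed overheads

    def estimate_ram_gb(pattern_matcher, stream5_memcap_gb, base_gb=1.0):
        # base_gb is an assumed allowance for the engine, rules, and other preprocessors.
        return base_gb + MATCHER_OVERHEAD_GB[pattern_matcher] + stream5_memcap_gb

    print(estimate_ram_gb("ac-nq", 1.0))    # ~4.5 Gbytes, in line with the example above
    print(estimate_ram_gb("ac-bnfa", 1.0))  # ~2.5 Gbytes, in line with the smaller config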

Note that the operating system and other applications will need some RAM as well, and if you don’t have unusual needs 2G is generally plenty. A detailed discussion of RAM sizing for the database is beyond the scope of this post, but generally for a multi-snort deployment it’s worth putting the database on a separate server that has 1-4Gbytes of RAM.

Disk Capacity and I/O

Snort generates very little disk I/O when outputting unified2 logs. Similarly, barnyard2 generates very little I/O when reading them. Any hard-disk configuration, even a single low-RPM disk, will meet Snort’s performance needs.

A detailed discussion of the database I/O needs is beyond the scope of this post. Again, most multi-snort sites should consider putting the database on a different server. I/O needs will vary depending on the alert rate, the number of users querying the database, and the front-end used, but in general a 4-disk RAID-10 will suffice even for a large multi-gigabit deployment. Small sites with only a few hundred megabits/sec of traffic could even use a single disk if it meets their availability requirements.

Administrative Network Interface

Snort doesn’t generate a notable amount of network traffic on the administrative interface unless you’re connecting to the database over a low-bandwidth WAN link. Any network interface that is supported under Linux will suffice for even the largest 10Gbit/sec deployments.

Capture Network Interfaces

Each site has widely varying requirements for capture interfaces, so it’s difficult to make generic recommendations. Consider the following factors:

  • Have enough servers to put one at each site where there is a link to be monitored.
  • Have enough interfaces in each server to monitor the number of links at its site.
  • Ensure that each interface is fast enough to monitor the link assigned to it without dropping packets.
  • If any individual link exceeds about 200Mbits/sec, employ a capture framework that features load-balancing and select a compatible interface.

PCI Bus Speed

At multi-Gbit/sec traffic rates, it is possible to saturate the PCI Express bus. Each PCI Express 16x slot has a bandwidth of 32Gbits/sec (4Gbytes/sec), 8x slots have half that, and 4x slots half again. Theoretically, each slot has dedicated bandwidth, such that two PCI Express 16x slots should have a combined bandwidth of 64Gbits/sec, but in practice the uplink between the PCI Express bus and the main memory bus is different in each motherboard chipset and may not be fast enough to provide the full theoretical bandwidth to every slot.
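
To sanity-check a proposed card layout, sum the theoretical bandwidth of the slots you plan to populate and compare it against the traffic you intend to capture. A quick Python sketch using the per-slot figures above; the example card layout is made up.

    # Theoretical per-slot bandwidth from the figures above: a 16x slot is about
    # 32Gbits/sec, an 8x slot half that, and a 4x slot half again. The chipset
    # uplink may not sustain the sum across every slot at once.

    SLOT_GBIT = {16: 32, 8: 16, 4: 8}

    def total_slot_bandwidth_gbit(slot_widths):
        return sum(SLOT_GBIT[w] for w in slot_widths)

    # Example: two 8x capture cards, each watching a full-duplex 10Gbit/sec link
    # (up to ~20Gbits/sec of captured traffic per link in the worst case).
    print(total_slot_bandwidth_gbit([8, 8]))  # 32 -> theoretical slot bandwidth in Gbits/sec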

Bus saturation is only a potential issue at very high traffic rates, either involving multiple 10Gbit/sec links or inspection of a single 10Gbit/sec link with multiple sensor applications. Be prepared to split sensor functionality across multiple servers if testing shows unexpected performance bottlenecks that might be related to bus saturation. Hardware load-balancers such as those sold by Gigamon can be useful to duplicate and load-balance very high traffic rates to multiple 10Gbit/sec sensors.

Putting It All Together

There are many factors listed above, but 80% or more of cases fall into a few broad classes that can be summed up briefly:

  1. One or two links, 200Mbits/sec or slower - Almost any server you buy today can handle this. Get 2-4 cores, 8Gbytes of RAM, and 2-4 network interfaces of any type if you want to maximize your options.
  2. One or two links, 200-400Mbits/sec - You should consider multi-snort load-balancing with PFRING or another suitable capture framework. If you’re going to try to feed this traffic to a single Snort instance in order to avoid the maintenance overhead of multi-snort, get the highest-clocked single CPU that you can find; otherwise any system with sufficient RAM will work well.
  3. One or two links, 500-1000Mbits/sec - You need multi-snort; consider pfring with a TNAPI-compatible network interface listed on ntop.org. You’ll need 2-4 Snort processes, which means 10-20Gbytes of RAM and a quad-core system.
  4. One or two links, 1-10Gbits/sec - You definitely need multi-snort with high-performance capture hardware. I’m partial to Endace, but pfring with a 10G TNAPI-compatible card should also work. You need one core and 4Gbytes of RAM for every 250Mbits/sec of traffic that you need to inspect (see the sketch after this list). Alternatively, consider a Sourcefire system. If you’re just getting started with Snort, this is going to be a big project to do on your own.
  5. Many links or greater than 10Gbit/sec of traffic - Try to break the problem down into multiple instances of the above cases. A Gigamon box at each site may give you the flexibility that you need to split the problem across multiple servers effectively. You might also need a moderately high-performance database server, properly tuned and sized.
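
As a final back-of-envelope check, the rule from case 4 (one core and 4Gbytes of RAM per 250Mbits/sec of inspected traffic) can be turned into a tiny sizing helper. The Python sketch below uses that rule plus an assumed 2Gbytes for the operating system; treat the output as a planning estimate to validate against your own ruleset and traffic.

    import math

    # Rough sizing from case 4: one core and ~4Gbytes of RAM per 250Mbits/sec of
    # inspected traffic, plus an assumed 2Gbytes for the operating system.

    def size_sensor(traffic_mbps, per_core_mbps=250, ram_per_core_gb=4, os_ram_gb=2):
        instances = math.ceil(traffic_mbps / per_core_mbps)
        return {"snort_instances": instances,
                "cores": instances,
                "ram_gb": instances * ram_per_core_gb + os_ram_gb}

    print(size_sensor(400))    # 2 instances, ~10 Gbytes of RAM
    print(size_sensor(10000))  # 40 instances for a fully utilized 10Gbit/sec link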

Wrapping Up

Good luck with your new Snort server. Now go get some rules:

  • Emerging Threats: Excellent for detecting trojans and malware that have successfully compromised systems on your network and are “phoning home”. The ET rules are available free of charge and anyone can contribute fixes or new rules if you find a gap or problem with the ruleset.
  • VRT Subscriber Feed: Excellent for detection of exploits and attacks before they become compromised systems. The subscriber feed is developed and maintained by the experts on Sourcefire’s Vulnerability Research Team, and they charge $30/yr for a personal subscription or $500/yr for a business subscription.
  • VRT Registered Feed: The registered feed contains the same rules as the subscriber feed, but updates are released 30-days after subscribers receive them. The registered feed is a reasonable alternative for personal use, but if you’re protecting a business I recommend the subscriber feed.
  • ETPro: ETPro aims to supplement the ET community sigs with attack/exploit sigs similar to what the VRT provides. Pricing is $35/yr for personal use or $350/yr for businesses. I haven’t used it, though it’s on my todo list to try.

Once you’ve got things running, consider reading my slides on monitoring Snort performance with Zabbix to see how well you sized your system.

License and Feedback

If you find errors in this guide or know of additions that would improve it, leave a comment below.

Capacity Planning for Snort IDS by Mike Lococo is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Permissions beyond the scope of this license may be available at http://mikelococo.com/2011/08/snort-capacity-planning/#license.

If you’d like to reuse the contents of this post but the cc-by-sa license doesn’t work for you for some reason, I’m happy to discuss offering the contents of this guide under almost any reasonable terms at no cost to individuals and corporations alike. Whether you work for Sourcefire, the OISF, or are just another community member writing a Snort guide, I’m happy to work something out that lets you use any portion of this post you need. Leave a comment below or contact me using the information on the about page if you’d like to discuss.