Author Profile - Michael Patterson is currently the Product Manager of Scrutinizer NetFlow and sFlow Analyzer at Plixer International. Prior to Plixer, Michael worked for Cabletron Systems as the Director for outsourced network management.
I’ve wanted to blog about this for a while, ever since a customer approached me with the issue at Cisco Networkers. I’ve pieced the issue together from information I have collected over time from customers. Below I outline how I compared NetFlow reports to the status of the TCAM tables on a Cisco Catalyst 6513. Some feedback from a knowledgeable reader would be fantastic. I think this is an important topic.
Apparently there is a condition on the Cisco Catalyst 6513 revolving around TCAM tables and excessive connection requests. If the TCAM tables in the Supervisor Engine 720 get full, a NetFlow overflow can occur, caused by excessive entries.
What is a TCAM?
Ternary Content Addressable Memory. Before we get into TCAMs, let’s start with a plain old ‘CAM’ definition which is used to search only for ones and zeros; a simple operation. Charlie Schluting explained this well: MAC address tables in switches commonly get stored inside binary CAMs. You can bet that just about any switch capable of forwarding Ethernet frames at line-speed gigabit is using CAMs for lookups. If they were using RAM, the operating system would have to remember the address where everything is stored. With CAMs, the operating system can find what it needs in a single operation.
According to Charlie, TCAMs allow the hardware to calculate forwarding decisions based on subnet address using a bit mask and then apply logical AND operations. Routers can store their entire routing table in TCAMs, allowing for very quick lookups.
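To make the mask-and-AND idea concrete, here is a minimal Python sketch of a ternary lookup. It is only an illustration of the concept, not how the Supervisor Engine implements it: a real TCAM compares every entry in parallel in hardware, while this loop checks them one by one. The route table, next-hop names and function names are hypothetical.

```python
import ipaddress

def ternary_match(key: int, value: int, mask: int) -> bool:
    """Mask bits set to 1 must match; mask bits set to 0 are 'don't care'."""
    return (key & mask) == (value & mask)

def lookup(table, address: str):
    """Return the next hop of the longest matching prefix, or None."""
    key = int(ipaddress.IPv4Address(address))
    best = None  # (prefix length, next hop) of the best match so far
    for prefix, next_hop in table:
        net = ipaddress.IPv4Network(prefix)
        value, mask = int(net.network_address), int(net.netmask)
        if ternary_match(key, value, mask):
            if best is None or net.prefixlen > best[0]:
                best = (net.prefixlen, next_hop)
    return best[1] if best else None

# Hypothetical routing entries, for illustration only.
ROUTES = [
    ("10.0.0.0/8", "hop-A"),
    ("10.1.0.0/16", "hop-B"),
    ("0.0.0.0/0", "default"),
]
```

Calling `lookup(ROUTES, "10.1.2.3")` picks the /16 entry over the /8 because it is the more specific match, which is the decision a router needs from its forwarding table.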
How is this related to NetFlow?
As stated earlier, if the TCAM tables get full in the Supervisor Engine 720, a NetFlow overflow can occur, caused by excessive entries.
The Supervisor Engine turns on aggressive aging when the table size reaches almost 90 percent. The idea behind aggressive aging is that the table is nearly full, so new active flows cannot be created. At this point, users could start calling the helpdesk yelling “I can’t connect to blah blah blah. HELP ME!”
Therefore, it could make sense to aggressively age-out the less active flows (or inactive flows) in the table in order to make space for more active flows. I’m sure there are pros and cons to this. Comments are welcome.
This example shows the console output that is displayed when this problem occurs:
Aug 24 12:30:53: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization [97%]
Aug 24 12:31:53: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization [97%]
The above message indicates that the NetFlow ternary content addressable memory (TCAM) is almost full. Aggressive aging will be temporarily enabled. If you change the NetFlow mask to FULL mode, TCAM for NetFlow can overflow because there are so many entries. Issue the "show mls netflow ip count" command in order to check this information.
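If you collect syslog from the switch, a small script can pull the utilization figure out of these messages. The sketch below assumes the message format is exactly what is shown in the two sample lines above; the function name is my own.

```python
import re

# Pattern based only on the two sample messages shown above.
TCAM_RE = re.compile(
    r"%EARL_NETFLOW-SP-4-TCAM_THRLD:.*TCAM Utilization \[(\d+)%\]"
)

def tcam_utilization(line: str):
    """Return the utilization percentage from a TCAM_THRLD message, else None."""
    match = TCAM_RE.search(line)
    return int(match.group(1)) if match else None
```

Run over a log file, this gives a quick timeline of how often the threshold message fires and how full the table was each time.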
The Supervisor Engine 720 checks how full the NetFlow table is every 30 seconds. The Supervisor Engine turns on aggressive aging when the table size reaches almost 90 percent. The idea behind aggressive aging is that the table is nearly full, so there are new active flows that cannot be created. Therefore, it makes sense to aggressively age-out the less active flows (or inactive flows) in the table in order to make space for more active flows.
The capacity for each policy feature card (PFC) NetFlow table (IPv4), for PFC3a and PFC3b, is 128,000 flows. For the PFC3bXL, the capacity is 256,000 flows.
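As a rough sketch of the arithmetic, the snippet below turns a flow count into a utilization figure using the capacities quoted above. The 0.90 cutoff is my stand-in for "almost 90 percent" and is an assumption, as are the function names; the exact trigger point on the switch may differ.

```python
# Capacities are the IPv4 figures quoted above.
PFC_CAPACITY = {"PFC3A": 128_000, "PFC3B": 128_000, "PFC3BXL": 256_000}

# Assumption: "almost 90 percent" is treated as exactly 0.90 here.
AGGRESSIVE_AGING_THRESHOLD = 0.90

def netflow_utilization(flow_count: int, pfc: str = "PFC3B") -> float:
    """Fraction of the NetFlow table in use for the given PFC type."""
    return flow_count / PFC_CAPACITY[pfc]

def aggressive_aging_expected(flow_count: int, pfc: str = "PFC3B") -> bool:
    """True if the count is at or above the assumed aggressive-aging threshold."""
    return netflow_utilization(flow_count, pfc) >= AGGRESSIVE_AGING_THRESHOLD
```

For example, a hypothetical count of 124,160 flows is 97% of a 128,000 entry table, so `aggressive_aging_expected(124_160)` is true for a PFC3b, while the same count on a PFC3bXL is under half full.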
In order to prevent this problem, disable the FULL NetFlow mode. Issue the "no mls flow ip" command.
Note: Generally, the "no mls flow ip" command does not affect packet forwarding because TCAM for packet forwarding and the TCAM for NetFlow accounting are separate.
In order to recover from this issue, enable MLS fast aging. My customer enabled MLS fast aging time which was initially set to 128 seconds. If the size of the MLS cache continues to grow over 32K entries, decrease the setting until the cache size remains less than 32K. If the cache continues to grow over 32K entries, decrease the normal MLS aging time. Any aging-time value that is not a multiple of 8 seconds is adjusted to the closest multiple of 8 seconds.
Router#configure terminal
Router(config)#mls aging fast threshold 64 time 30
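The multiple-of-8 adjustment mentioned above can be sketched as follows. This only models the rule as stated in the text. How the switch breaks ties (for example 12, which is equally close to 8 and 16) is not stated, so rounding ties upward here is my assumption, and the function name is hypothetical.

```python
def adjusted_aging_time(seconds: int) -> int:
    """Round an aging time to the closest multiple of 8 seconds.

    Assumption: values exactly halfway between two multiples round up.
    """
    return 8 * ((seconds + 4) // 8)
```

Under this reading, a configured value of 30 would behave as 32 seconds, while 128 is already a multiple of 8 and stays as it is.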
How can NetFlow Reporting Help?
We may have a report in Scrutinizer v7 that will help troubleshoot the issue. It would be great if some of you could test the accuracy.
Issue the "show mls netflow ip count" command on your 6513 in order to check this information:
c6513>show mls netflow ip count
Displaying Netflow entries in Active Supervisor EARL in module 7
Number of shortcuts = 60942
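If you want to track this number over time rather than eyeball it, the count can be scraped from the command output. The sketch below assumes the output looks exactly like the sample above; how you collect the output (SSH, a script on a jump host, etc.) is up to you and not shown.

```python
import re

def shortcut_count(output: str):
    """Return the 'Number of shortcuts' value from the command output, else None."""
    match = re.search(r"Number of shortcuts\s*=\s*(\d+)", output)
    return int(match.group(1)) if match else None

# Sample output copied from the session above.
SAMPLE = """c6513>show mls netflow ip count
Displaying Netflow entries in Active Supervisor EARL in module 7
Number of shortcuts = 60942
"""
```

Here `shortcut_count(SAMPLE)` returns the integer 60942, which is the figure to compare against the flow report.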
In Scrutinizer v7, I think the Connections report for ALL interfaces, bidirectional, over 1-2 minutes of time will give you a number that roughly matches the count from the command above.
Above I used two minutes because, let’s say, I had the MLS fast aging time value set to 128 seconds.
Top 10 x 4973 pages = ~49730 flows active in this minute. Don’t worry about outbound as what goes in usually goes out and the numbers will be identical. Notice the pagination in our graphs which allows access to all the flows.
I’m sure my lack of experience troubleshooting this issue is partly to blame. Notice that Scrutinizer sometimes understates by a considerable amount: 60942 – 49730 = ~11K. I expect some aggregation, but the Connections report doesn’t aggregate much. The results from my comparison were sometimes pretty much right on the money and other times way off, and I want to know why.
The next time I troubleshoot this I will use the Flow View, which doesn’t aggregate the flows. In the meantime, can anyone try this, compare the numbers and let me know?
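For anyone repeating the comparison, the arithmetic can be sketched as below. It assumes every report page holds a full 10 rows, so the page-based estimate is an upper bound if the last page is partial. The function names are my own.

```python
def estimated_flows(pages: int, rows_per_page: int = 10) -> int:
    """Flows implied by the report pagination (upper bound if the last page is partial)."""
    return pages * rows_per_page

def shortfall(switch_count: int, report_count: int):
    """Return (absolute gap, gap as a fraction of the switch count)."""
    gap = switch_count - report_count
    return gap, gap / switch_count

report = estimated_flows(4973)          # 10 x 4973 pages
gap, fraction = shortfall(60942, report)
```

With the numbers from this post, the gap is 11,212 flows, or roughly 18% of what the switch reported.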
About Scrutinizer NetFlow Analyzer
Below is an incomplete list of features in Scrutinizer v7:
- Detailed Baseline Reports on which hosts, applications, protocols, etc. are using the most network traffic. Drill in and learn how to optimize the network.
- Captures Cisco NetFlow, sFlow and other flow technologies and uses that data to monitor the overall network health. Check out the Activate NetFlow page. Exclude data such as IP addresses, ports and transport layer protocol types per router, interface or even globally across all routers / switches. Useful for excluding VPN traffic which Cisco routers sometimes double export in NetFlow.
- Network maps in Flash or Google with clickable links that change color based on utilization.
- IP Grouping support and subnet trends.
- IPv6 Support as well as support for Flexible NetFlow, NBAR and NSEL (NetFlow Security Event Logs such as those on the Cisco ASA).
- Applications defined by combination of ports and IP addresses.
- Extensive flexibility for VoIP reports.
It does all this and was recently released for free. Download Scrutinizer here.







