November 30, 2020

Packet Loss: Modelling Microbursts and Packet Queues

What is a primary cause of slow throughput in today’s wide-bandwidth networks?


Packet loss

How just a few packet losses can have such a debilitating effect on throughput is another matter, and one that has been very well studied. It takes us into the fields of TCP flow control and congestion avoidance.


But what is the primary cause of packet loss?

Today we’re less likely to find problems in our physical infrastructure, such as corroded cables and connectors or EM interference, and must face the more ephemeral possibility of buffer overflow. This is a consequence of the severe congestion that occurs, quite simply, when increasingly common large bursts of packets arrive at a router and must wait their turn to be transmitted. Long queues are even more likely if the egress link is over-subscribed or feeds a WAN that is slower than the ingress links.

The subject of microbursts is never more prominent than in the securities-trading industry, where huge sums depend on receiving news in time and placing orders first. How can we tell if our network has microburst problems and is prone to dropping packets? My search of the web has turned up only one approach. To detect over-subscription you must measure bandwidth utilisation and, to detect microbursts that might appear for only a few hundred microseconds, you measure utilisation over correspondingly small intervals. Many tools compete on their ability to do this better and better. But really?
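For reference, the interval-based measurement described above can be sketched in a few lines of Python. This is only an illustration, not the method of any particular tool; the function name `utilisation` and the `(timestamp_s, length_bytes)` packet tuples are assumptions made for the example.

```python
def utilisation(packets, link_bps, interval_s):
    """Return {interval_index: fraction of link capacity used}.

    packets: iterable of (timestamp_s, length_bytes) taken from a capture.
    link_bps: nominal link speed in bits per second.
    interval_s: width of each measurement interval in seconds.
    """
    bits = {}
    for timestamp, length in packets:
        index = int(timestamp // interval_s)          # which interval the packet falls in
        bits[index] = bits.get(index, 0) + length * 8
    capacity = link_bps * interval_s                  # bits the link can carry per interval
    return {index: b / capacity for index, b in sorted(bits.items())}


# Three 1250-byte packets measured against a 10 Mbps link in 1 ms intervals
sample = [(0.0001, 1250), (0.0002, 1250), (0.0015, 1250)]
for index, fraction in utilisation(sample, 10_000_000, 0.001).items():
    print(index, f"{fraction:.0%}")
```

Shrinking `interval_s` exposes shorter bursts, but the result is still only a rate: it says nothing about how much was already waiting in the queue.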

How many false-positives do you get? It is quite possible to drive a link at 100% utilisation for very long periods without overflowing a buffer. An over-subscribing burst can be handled without loss because that is the whole point of buffers. The severity of a microburst depends on the length of the packet queue when the burst arrives. In other words, you have to take into account all the preceding flows into the queue.
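A small worked example may make the point concrete. The sketch below assumes, purely for illustration, that a burst arrives at a constant rate and that the egress link drains the queue at full speed for the duration of the burst; the name `peak_queue_bytes` and all the numbers are hypothetical.

```python
def peak_queue_bytes(queue_at_arrival, burst_bytes, burst_duration_s, link_bps):
    """Peak queue occupancy (bytes) reached while a burst is arriving.

    Simplifying assumptions: the burst arrives at a constant rate and the
    egress link transmits continuously at link_bps while it does.
    """
    drained = link_bps / 8 * burst_duration_s   # bytes sent during the burst
    return queue_at_arrival + max(0.0, burst_bytes - drained)


BUFFER_BYTES = 100_000      # hypothetical buffer size
LINK_BPS = 8_000_000        # hypothetical egress speed

# The same 150 Kbyte burst over 100 ms (150% utilisation) hits two different backlogs
for backlog in (0, 80_000):
    peak = peak_queue_bytes(backlog, 150_000, 0.1, LINK_BPS)
    print(backlog, peak, "overflow" if peak > BUFFER_BYTES else "no loss")
```

The utilisation figure is identical in both cases; only the backlog already in the queue decides whether packets are dropped.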

Some network devices understand our requirements very well – they record the high-water marks of packet-queue lengths over time so that we can tell when, how often, and how close we get to overflowing a buffer. Measures of utilisation, over any length of time interval, are little more than a curiosity.

What can we do if we don’t have such sensible network devices, or we are designing a new facility? With a capture file we can, of course, measure bandwidth utilisation, perhaps averaged over time intervals as short as we like. What isn’t widely known is that it is very easy to calculate and display what we really need – the length of the packet queue as it changes with the arrival and departure of every packet. Simple arithmetic tells us the time to transmit a packet of known length over a link of known speed, and tracking queue length involves only incrementing and decrementing a counter as packets arrive and leave a queue.
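The arithmetic described above can be sketched as follows. This is a minimal model under stated assumptions, not NetData's implementation: the function name `model_queue` is hypothetical, packets are `(arrival_s, length_bytes)` tuples taken from a capture, the queue is first-in first-out, and a packet counts as queued until its last bit has been transmitted.

```python
from collections import deque


def model_queue(packets, link_bps):
    """Track queue length as each packet arrives at a FIFO egress queue.

    packets: list of (arrival_s, length_bytes), sorted by arrival time.
    Returns a list of (arrival_s, queued_packets, queued_bytes, departure_s),
    where the queue figures include the packet that has just arrived.
    """
    in_queue = deque()      # (departure_s, length_bytes) of packets not yet sent
    queued_bytes = 0
    link_free_at = 0.0      # time the link finishes its current backlog
    samples = []
    for arrival, length in packets:
        # Decrement the counters for packets that have left by now
        while in_queue and in_queue[0][0] <= arrival:
            queued_bytes -= in_queue.popleft()[1]
        # Transmission starts when the link is free; it lasts length*8/speed
        departure = max(arrival, link_free_at) + length * 8 / link_bps
        link_free_at = departure
        in_queue.append((departure, length))
        queued_bytes += length
        samples.append((arrival, len(in_queue), queued_bytes, departure))
    return samples


# Three back-to-back 100-byte packets, then one after the queue has drained
for sample in model_queue([(0.0, 100), (0.0, 100), (0.0, 100), (0.5, 100)], 8000):
    print(sample)
```

Plotting `queued_packets` and `queued_bytes` against time gives the two views of queue length used in the charts discussed here.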

Thanks to a capture provided by Sake Blok for a networking challenge at SharkFest 2020, we have a record of all the packets heading for a downstream queue behind a slow link. The large numbers of selective acks (SACKs) indicate quite severe packet losses, and we might want to confirm that they are the result of buffer overflow. Furthermore, can we tell the size of the buffer, and what is the speed of the bottleneck?

If we model a queue with an output (egress) speed of exactly 916 Kbps we get this chart of the queue length indicated in two ways: the number of packets; and the space occupied by packets in the queue.



The model requires us to know which packets were lost after they passed the sniffer and our NetData tool infers all the losses from the appearance of selective acks and retransmissions.

Each red marker indicates the length of the queue when a packet was lost, and the fact that packet losses occurred only and always when the modelled length reached a peak confirms the accuracy of the model. The peak in the first burst was lower, probably because the queue was handling unseen traffic. The peak in the second burst is particularly interesting because packets were not lost at a uniform queue length measured in Kbytes, but when the number of packets reached a peak. The following chart shows that a proportion of packets in this burst were very small, and the queue-length evidence suggests that all queued packets were allocated the same buffer space, irrespective of their length.
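One way to put a number on that last observation is to compare how tightly the loss points cluster when the queue is measured in packets and when it is measured in bytes. The sketch below is an illustrative heuristic, not a feature of NetData; the function names and the sample figures are hypothetical.

```python
from statistics import mean, pstdev


def relative_spread(values):
    """Standard deviation as a fraction of the mean (0 means perfectly uniform)."""
    return pstdev(values) / mean(values)


def likely_limit(at_loss):
    """Guess whether the buffer limit is counted in packets or in bytes.

    at_loss: list of (queued_packets, queued_bytes) read from the model at
    each inferred packet loss. The unit with the tighter clustering is the
    more plausible limit.
    """
    packets_spread = relative_spread([p for p, _ in at_loss])
    bytes_spread = relative_spread([b for _, b in at_loss])
    return "packets" if packets_spread < bytes_spread else "bytes"


# Hypothetical readings: packet count is steady while byte occupancy varies
print(likely_limit([(325, 480_000), (325, 410_000), (324, 455_000)]))
```

A steady packet count with a varying byte count is what we would expect if every queued packet occupies one fixed-size buffer slot regardless of its length.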



This fish-net chart focuses only on the second burst and overlays the queue-length graphs with near-vertical short strips arrayed in steeply sloping diagonal lines, to indicate which packets and TCP connections occupied the egress link at any time. Horizontal ticks indicate the ends of packets, and the heights of the strips are proportional to packet length according to the scale on the left.


The popup points to a short packet with only 44 bytes of data and the prevalence of short packets is clearly visible where the horizontal ticks are close together.

Thick red strips in the second half of the chart indicate when the lost packets would have been transmitted if not dropped off the queue. Yes, this chart is hard to read, but it gives us a picture of all the packets in the queue at any time. If the capture contained packets from multiple connections, strip colours would reveal how packets from different connections were interleaved on the link.


Further validation of the model arises from the measurements of round-trip times plotted as black markers at times when the packets left the queue, not when captured by the sniffer. RTTs should increase when packets spend more time in a queue, and that seems to be the case here.



We can judge the effect on RTTs quite precisely because our queueing model knows exactly when each packet arrived and departed the queue. For the next chart NetData overlays calculated, relative queuing times as green circles and we see how closely the black and green markers track each other.
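The comparison can be sketched as below, assuming a FIFO queue and one measured RTT per packet. The names `queuing_delays` and `rtt_offsets` are hypothetical, and the RTT values in the usage lines are invented for illustration.

```python
def queuing_delays(packets, link_bps):
    """Time each packet spends between arriving at the queue and leaving the link.

    packets: list of (arrival_s, length_bytes), sorted by arrival time.
    """
    link_free_at = 0.0
    delays = []
    for arrival, length in packets:
        # Departure = start of transmission + serialisation time
        link_free_at = max(arrival, link_free_at) + length * 8 / link_bps
        delays.append(link_free_at - arrival)
    return delays


def rtt_offsets(rtts, delays):
    """Difference between each measured RTT and the modelled queuing delay."""
    return [rtt - delay for rtt, delay in zip(rtts, delays)]


# Three 125-byte packets arriving together at a 1000 bps link
delays = queuing_delays([(0.0, 125), (0.0, 125), (0.0, 125)], 1000)
print(delays)
print(rtt_offsets([1.04, 2.04, 3.04], delays))   # invented RTTs
```

If the modelled link speed is right, the offsets should stay roughly constant through a burst, since what remains is the part of the round trip that does not depend on the queue.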



We can be confident in our assessment of the queue’s link speed by experimenting with the model; here we increase the speed by just one percent and regenerate the chart:

Packets are no longer lost at a uniform queue length, and the difference between RTTs and waiting times no longer remains constant throughout a burst. The modelling on the earlier charts tells us that all the lost packets were victims of an overflowing queue limited to 325 packets and about 500 Kbytes, waiting to be transmitted over a link with an effective speed of 916 Kbps.
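The same experiment can be automated. The sketch below tries several candidate speeds and keeps the one at which losses occur at the most uniform queue length. It is an illustration of the idea rather than a description of how NetData works: the function names are hypothetical, packets are assumed to be `(arrival_s, length_bytes, lost)` tuples with the `lost` flag already inferred from SACKs and retransmissions, and at least one loss is assumed to be present.

```python
from collections import deque
from statistics import pstdev


def queue_len_at_losses(packets, link_bps):
    """Modelled queue length (in packets) at the moment each lost packet arrived.

    packets: list of (arrival_s, length_bytes, lost), sorted by arrival time.
    Lost packets are assumed to be dropped on arrival and never queued.
    """
    departures = deque()    # departure times of packets still in the queue
    link_free_at = 0.0
    lengths = []
    for arrival, length, lost in packets:
        while departures and departures[0] <= arrival:
            departures.popleft()
        if lost:
            lengths.append(len(departures))
            continue
        link_free_at = max(arrival, link_free_at) + length * 8 / link_bps
        departures.append(link_free_at)
    return lengths


def best_speed(packets, candidates_bps):
    """Candidate speed whose modelled loss points are the most uniform."""
    return min(candidates_bps,
               key=lambda bps: pstdev(queue_len_at_losses(packets, bps)))
```

With the right speed the spread collapses towards zero; a candidate that is too fast or too slow smears the loss points across different queue lengths, as in the regenerated chart.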


Could you predict the downstream packet losses from only the rate of bits passing the sniffer – that is, if all you had was the black line on the chart below?


Bob Brownell has more than 45 years’ experience in communications and IT, initially designing and building networks, computer-controlled systems and packet-switching systems in Australia and Europe. He is a founder and director of Measure IT Pty Ltd, a firm that specialises in diagnostic analysis of IT systems, and over the last 20 years has been developing NetData, a powerful network analyser with unique visualisation capabilities that is now freely available. It characterises virtually all transactions with a broad range of application decoders that includes all the major database protocols. NetData has been licensed by IBM, major Australian banks and government departments to diagnose the most complex IT performance problems around the world. ... from the University of Tasmania.


Contact

If you would like to understand microbursts in your network and investigate the behaviour of packet queues, packet shapers and packet policers – if you have any intractable performance problem or would like to extend your Wireshark skills and learn more about NetData – please send an email to

NetData Lite can be downloaded free from-


Or see Phil’s YouTube channel that includes training videos:

November 23, 2020

What you didn't know about DDoS attacks (Tom Bienkowski)

 Even before the current pandemic, the types and velocity of distributed denial of service (DDoS) attacks were on the rise!

And with the architectural changes brought about by COVID-19, such as greater reliance on VPN gateways as more employees work from home, organizations are at increased risk of disruption. In fact, according to NETSCOUT’s most recent Threat Intelligence Report, we have seen a 15 percent increase in DDoS attacks in 2020 compared to the same period in 2019, and a 25 percent increase over the height of the pandemic lockdown. At present, we are on track to experience more than 9 million attacks this year.

As organizations consider the steps needed to mitigate the risk from DDoS attacks and maintain resilience and availability, they should keep the following five areas in mind:

Be mindful of stateful attacks. When most people think about DDoS attacks, they think first of volumetric attacks. But state-exhaustion DDoS attacks that block stateful devices such as firewalls, load balancers, and VPN concentrators from serving incoming connections from legitimate clients can also negatively impact vital applications, services, infrastructure, and data. This problem is particularly acute now, when we are increasingly reliant on remote connections through VPN concentrators. To protect against state-exhaustion attacks, it is important to design network infrastructure, including applications and service delivery stacks, to minimize state wherever possible. There is a common misconception that firewalls are sufficient to protect against DDoS attacks. This is simply not true, as they are vulnerable to state-exhaustion attacks. This is why best practices (including from firewall vendors) recommend that companies deploy stateless DDoS protection in front of firewalls to protect them from state-exhaustion DDoS attacks.

Cloud-based protection is not enough. The most common form of DDoS attack protection is a cloud-based mitigation service, often from ISPs or independent providers. And while such services are indeed vital to stop large, volumetric DDoS attacks that outstrip the volume of internet circuits, that is only one part of a comprehensive protection strategy. For state-exhaustion and application-layer attacks, which are just as common, the industry best practice is a stateless, on-premises solution that can automatically detect and stop such attacks.

Be aware of shifting tactics. Many savvy DDoS attackers use attack performance management tools to monitor the effectiveness of their attack in real time. These tools help determine whether defenses are deployed when attack vectors are altered. This can lead to the launch of multivector attacks, which are far more challenging to mitigate without the right solution in place.

Size doesn’t always matter. The vast majority of DDoS attacks today are not massive in scale, but rather are smaller-sized and short-lived. It’s important to keep in mind that a DDoS attack does not need to be big and last a long time to have a negative impact. In fact, the overwhelming majority of DDoS attacks last one hour or less, and nearly a quarter of them last less than five minutes. This means organizations need DDoS attack protection that can instantaneously detect and mitigate attacks before the damage is done.

Consider a hybrid approach to DDoS protection. At NETSCOUT, we recommend a hybrid approach to DDoS protection. The cloud-based model, which relies on a service provider to deliver DDoS mitigation services against volumetric DDoS attacks, can be highly effective. However, to adequately protect the dynamic nature of most organizations from smaller application-layer DDoS attacks, we recommend augmenting with on-premises DDoS protection. This allows organizations to rapidly deploy customized DDoS protection as new applications or services are rolled out.

The fact is, DDoS attacks can be mitigated, if you are prepared! A key part of that preparation lies in a regular reassessment of your DDoS attack protection strategy. After all, today’s DDoS attacks are ever-changing, and traditional methods of protection may not be enough. Organizations should keep up with the latest trends in DDoS attacks, know what the current best practices are for defense, and test those defenses on a regular basis.

Author - Tom Bienkowski - NETSCOUT Product Marketing Director. Tom Bienkowski has been involved in the network and security field for more than 20 years. During his tenure in the industry, he has worked for large enterprises as a network engineer as well as for multiple network management and security vendors in sales engineering/management, technical field marketing, and product management roles. In his current role as director of product marketing at NETSCOUT, he focuses on NETSCOUT’s industry-leading DDoS protection solutions.

November 15, 2020

Sentient Stuff (Paul Smith)

 


Those of us who have worked in the data storage industry often wonder how our computers match up to the processor we carry around in our own heads. Comparisons are difficult to come by – we can estimate the average number of neurons in a human brain (~ 86 billion), but they are quite different from the “bits” that comprise computer memory. If they functioned in the same binary way, we would have the storage equivalent of a typical flash drive, and we’d have to start deleting less important memories by the time we reached sixth grade.


Each neuron shares information with about 1000 others, putting the total number of connections in the tens of trillions. We also know that neurons cooperate with one another in storing memories, resulting in an overall estimated capacity of several Petabytes. This amount of computer memory would store about 3 million hours of video. If you think of your life as one big reality TV show, that’s about 300 years’ worth of narcissistic binge watching.


The size of an individual human memory is difficult to estimate; our more detailed memories probably take up the most room. As we grow and learn, some memories are discarded to clear up space, while others are just too good to let slip away. A great deal of information we consume is just not worth remembering in the first place. Computers and brains have much in common.


The more interesting comparison which has intrigued great thinkers for centuries involves consciousness. How sentient can a non-human entity be? We like to think that a computer doesn’t have feelings and can only mimic them. But is it aware of itself and if so, how does it feel about that? If you tell a new computer that it will never amount to much because its memory is too small or its processor is too slow, will it eventually need counseling?


The presence of consciousness is more than just idle conjecture. Neuroscientist Alysson Muotri of UCSD maintains Petri dishes in his lab where hundreds of tiny sesame seed-sized brains float around. Known as brain organoids, they have been connected to walking robots, used as models for advanced AI systems, and lately employed in the testing of SARS-CoV-2 drugs. None of this seems too alarming - except perhaps for the walking robots.


The point where things get a bit disconcerting is documented in the Muotri Group’s August 2019 Cell Stem Cell article. In this research, the little organoids began to generate coordinated waves of activity much like that seen in a conscious brain. Anticipating the philosophical and moral questions that would surely arise, Dr. Muotri shut down the experiment after a few months. In the meantime, other researchers were having their own epiphanies.


Developmental Biologist Madeline Lancaster knows that, like a computer, a brain without input and output isn’t worth much. Her research team tried growing brain organoids next to the spinal column of a mouse. Once a connection was established, the muscles began to contract. Harvard Molecular Biologist Paola Arlotta was able to induce light sensitivity in some brain organoids. She then observed that their neurons started firing when illuminated. These discoveries and others like them have produced some attention-grabbing research papers – and put many ethicists and theologians on notice – but where do we go from here?


There are some uniquely human conditions (e.g. autism) that cannot be studied in animal models. Effective research on these could benefit greatly from “consciousness in a jar”. In a culture that still debates the dangers of genetically modified tomatoes, this is a heavy lift. Both for the research itself, and for the ethical guidelines that must be developed, a standard way to define and measure consciousness is required. So far this has proven elusive.


Peter Singer, a philosopher and advocate for living things, famously noted that a particularly brilliant chicken might surpass some humans in certain capacities. A quick stroll through the meat department should convince you that this isn’t a very good metric.


Computers and brains have some similarities, but comparisons are sketchy at best. Our silicon tools start with simple, Boolean logic gates and build on those to produce striking complexity. Similarly, brain organoids grown in the laboratory start out as simple multi-cellular structures which can be coaxed into some very human-like behaviors. Whether or not consciousness is one of those remains to be seen, but how will we ever know if that collection of organoids in a jar is sentient?


Someday we may be able to just ask it.


Author Profile - Paul W. Smith - leader, educator, technologist, writer - has a lifelong interest in the countless ways that technology changes the course of our journey through life. In addition to being a regular contributor to NetworkDataPedia, he maintains the website Technology for the Journey and occasionally writes for Blogcritics. Paul has over 40 years of experience in research and advanced development for companies ranging from small startups to industry leaders. His other passion is teaching - he is a former Adjunct Professor of Mechanical Engineering at the Colorado School of Mines. Paul holds a doctorate in Applied Mechanics from the California Institute of Technology, as well as Bachelor’s and Master’s Degrees in Mechanical Engineering from the University of California, Santa Barbara.
