Routing questions to remember

1) Black hole (networking)
In networking, black holes refer to places in the network where incoming traffic is silently discarded (or "dropped"), without informing the source that the data did not reach its intended recipient.
When examining the topology of the network, the black holes themselves are invisible, and can only be detected by monitoring the lost traffic; hence the name.

Black hole issue: anything that sends large packets will be unable to work.

MTU (Maximum Transmission Unit) is the maximum IP packet size that can be transmitted without fragmentation - including IP headers but excluding headers from lower levels in the protocol stack.
The smallest MTU in general use is 576 bytes, so you should be able to safely start with an ICMP buffer of 548, then work up from there.
For normal Ethernet the MTU is 1500 bytes. This is the maximum amount of data available to IP, TCP, and the application; it excludes the bytes of the Ethernet header and trailer.


The Maximum Segment Size (MSS) is the largest amount of TCP data, specified in bytes, that a computer or communications device can handle in a single, unfragmented piece. In other words, it is the maximum TCP payload that still fits inside one layer 2 (Ethernet) frame.
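
The relation between MTU and MSS can be sketched in a few lines. This is a minimal illustration, assuming IPv4 and TCP headers of 20 bytes each with no options; the function name is made up for this example.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(mss_for_mtu(1500))  # standard Ethernet
print(mss_for_mtu(576))   # smallest MTU in general use
```

With TCP or IP options present, the usable payload is smaller than these figures.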

Null route: a Cisco IOS router also has an interface called null0. When traffic goes to that interface, the router just discards it. Thus, the null interface on the Cisco router is the “black hole.”

Causes
- Dead IP addresses (the host is down, or the IP address has not been assigned to any host),
- Firewalls and "stealth" ports (firewalls can be configured to silently discard packets addressed to forbidden hosts or ports),
- Black hole filtering (intentionally dropping packets at the routing level, used against DoS/DDoS attacks),
- Path MTU Discovery black holes (communication over some routes may fail if an intermediate network segment has a maximum packet size that is smaller than the maximum packet size of the communicating hosts, and the router does not send an appropriate Internet Control Message Protocol (ICMP) response to this condition).
Path MTU discovery overcomes a significant performance bottleneck in TCP - fragmentation. 

Detect
When a network router receives a packet that is larger than the size of the Maximum Transmission Unit (MTU) of the next segment of a communications network, and that packet's IP layer "don't fragment" bit is flagged, the router is expected to send an ICMP "destination unreachable" message back to the sending host.
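
The expected router behaviour described above can be written as a small decision function. This is only a sketch: the function name, the `sends_icmp` flag, and the returned strings are illustrative assumptions, not any router's real API.

```python
def router_action(packet_len: int, df_set: bool, next_hop_mtu: int,
                  sends_icmp: bool = True) -> str:
    """Decide what a router does with an IPv4 packet of packet_len bytes."""
    if packet_len <= next_hop_mtu:
        return "forward"
    if not df_set:
        # Too large, but fragmentation is allowed.
        return "fragment"
    # Too large and DF is set: a well-behaved router reports it,
    # a black hole router (or a firewall filtering ICMP) stays silent.
    return "icmp type 3 code 4" if sends_icmp else "silent drop"


print(router_action(1500, df_set=True, next_hop_mtu=1400))
print(router_action(1500, df_set=True, next_hop_mtu=1400, sends_icmp=False))
```

The second call is the black hole case: the sender never learns that its packets are too large.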
MTU for Ethernet = 1,500 bytes.
Largest ICMP echo payload that fits = 1500 - 28 = 1472 bytes (20-byte IP header + 8-byte ICMP header).

ping computer_name_or_IP_address -f -l 1472
-f sets the "do not fragment" bit.
-l sets the payload size of the ICMP echo packet.
By increasing the -l parameter on successive pings, you can identify how large an unfragmented packet can travel a specific route. The smallest MTU that is in general use is 576 bytes, so you can safely start with an ICMP buffer of 548 and then work up from there.
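
The search for the largest working payload can be automated. The sketch below uses a binary search instead of stepping the size up by hand, and it assumes a caller-supplied `probe(payload)` function (hypothetical) that returns True when a don't-fragment ping with that payload size gets a reply. Here the probe is simulated so the example runs without a network.

```python
from typing import Callable

ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def find_path_mtu(probe: Callable[[int], bool],
                  low: int = 548, high: int = 1472) -> int:
    """Return the path MTU, given probe(payload) -> True if the ping got through."""
    if not probe(low):
        raise ValueError("even the smallest payload did not get through")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid        # mid works, try larger payloads
        else:
            high = mid - 1   # mid is too big
    return low + ICMP_OVERHEAD


# Simulated path whose narrowest link has an MTU of 1400 bytes.
print(find_path_mtu(lambda payload: payload + ICMP_OVERHEAD <= 1400))
```

In real use, `probe` would wrap the ping command shown above and check whether a reply came back.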

When using a VPN, keep the encryption overhead in mind. It varies with the transform set:
- an additional 85 bytes of IPSec overhead for ah-(sha or md5)-hmac with esp-aes-(128, 192, or 256),
- an additional 45 bytes for esp-(3des or des).
In general, an MTU of 1400 bytes will be enough.
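
A quick check of why 1400 bytes is a safe choice, using only the overhead figures quoted above (actual overhead depends on the transform set and mode, so treat the table as an assumption):

```python
ETHERNET_MTU = 1500

# Overhead figures taken from the notes above; real values may differ.
OVERHEAD = {
    "ah-hmac + esp-aes": 85,
    "esp-3des/des": 45,
}


def inner_mtu(overhead: int, link_mtu: int = ETHERNET_MTU) -> int:
    """Room left for the original packet once the tunnel overhead is added."""
    return link_mtu - overhead


for name, extra in OVERHEAD.items():
    print(f"{name}: {inner_mtu(extra)} bytes")
```

Both results are above 1400, so a 1400-byte packet fits inside the tunnel in either case.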

Set the TCP MSS to 1400 bytes for IPsec tunnel sessions and to 1364 bytes for GRE tunnel sessions.

Fixing or working around
1) Enable PMTU Black Hole Detection on the Windows-based hosts.
2) Configure intermediate routers to send ICMP Type 3 Code 4 messages ("destination unreachable, fragmentation needed and don't fragment (DF) bit set").
The most common cause of PMTUD breaking is a misconfigured firewall dropping ICMP unreachable (type 3, code 4) messages.
3) Set the MTU of the host interface to the largest size that the black hole router can handle, so that no packet sent over that connection is larger than the router can forward.

Links:
MTU and Fragmentation Considerations in an IPsec VPN -
http://www.iphelp.ru/doc/3/Cisco.Press.Comparing.Designing.and.Deploying.VPNs.Apr.2006/1587051796/ch07lev1sec4.html
http://en.wikipedia.org/
http://support.microsoft.com/kb/314825
http://linuxtechres.blogspot.com/2010/03/black-hole-routers-issue-on-internet.html 

2) MTU tale

Useful links
http://cciethebeginning.wordpress.com/tag/eompls/
mtu = 1530 bytes

http://stack.nil.si/ipcorner/IP_Fragmentation/
http://thenetworksherpa.com/ospf-master-the-mtu-madness/

I found that tcp mss adjust is a must-use command to avoid fragmentation and reassembly by the routers when IPSec and/or GRE is involved. The only time I had to set the IP MTU to 1500 inside a GRE tunnel was when my customer had a home-grown app that needed to see the entire 1500-byte packet intact.

http://www.cisco.com/en/US/docs/solutions/Enterprise/WAN_and_MAN/IPSec_Over.html
The best way to avoid fragmentation issues in a VPN environment is to manually set the MTU on all client and server Network Interface Cards (NIC) to a smaller value than the Ethernet standard of 1500 bytes. The "tried and true" value to use is 1300 bytes. 
If 2 routers with Ethernet interfaces supporting a physical MTU of 1526 are connected through an Ethernet switch, then for an application that produces such big Ethernet frames to work, the switch must also support forwarding them.

https://learningnetwork.cisco.com/thread/48437
Encapsulating Ethernet over MPLS requires that you place a new layer-2 header on the packet. So let's look at the extremes:

1500 bytes for the original packet
4 bytes for primary vlan
4 bytes for secondary vlan
So far 1508

6 bytes for the new source mac
6 bytes for the new destination mac
12 bytes for up to 3 MPLS labels
2 bytes for a new ethertype and length
4 bytes PW ethernet control word

That's 1538

Hope this helps
- Darren
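
The worst-case sum quoted above is easy to re-check in code. The byte counts are the ones listed in the post; the names are just labels for this sketch.

```python
# Worst-case extra bytes when carrying Ethernet over MPLS, per the list above.
OVERHEAD = [
    ("primary VLAN tag", 4),
    ("secondary VLAN tag", 4),
    ("new source MAC", 6),
    ("new destination MAC", 6),
    ("up to 3 MPLS labels", 12),
    ("new ethertype and length", 2),
    ("PW Ethernet control word", 4),
]


def eompls_frame_size(payload: int = 1500) -> int:
    """Original packet plus every encapsulation field in OVERHEAD."""
    return payload + sum(size for _, size in OVERHEAD)


print(eompls_frame_size())
```

Dropping entries from the list (for example, the secondary VLAN tag) gives the figure for less extreme cases.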


Baby jumbo frames
A standard Ethernet frame should not exceed 1518 bytes: 1500 bytes of payload plus 18 bytes of header and trailer.
If an Ethernet frame is larger than 1518 bytes, it is called a baby giant frame or a jumbo frame:
1519 to 1600 bytes is a baby giant frame,
1601 to 9216 bytes is a jumbo frame.
Ethernet frames are mainly handled by switches,
but most switches do not accept baby giant and jumbo frames.
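
The size ranges above translate directly into a small classifier. The label "oversized" for anything beyond 9216 bytes is an assumption added for completeness, not a term from the text.

```python
def classify_frame(size: int) -> str:
    """Classify an Ethernet frame by its total size in bytes."""
    if size <= 1518:
        return "standard"
    if size <= 1600:
        return "baby giant"
    if size <= 9216:
        return "jumbo"
    return "oversized"  # beyond the ranges listed above


for size in (1518, 1522, 1538, 9000):
    print(size, classify_frame(size))
```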

- So what? Why would I want to increase the size of a frame?
Take this situation: when you use VLANs (802.1Q tagging), four extra bytes are inserted into the frame, making it a baby giant frame (1518 + 4 = 1522 bytes).
Yes, yes, but how do we sort out this problem?
If a port is configured as a TRUNK PORT (instead of an ACCESS PORT), it will be able to handle these 1522-byte frames carrying the VLAN tag.

 - Is there any other method?
The MTU (Maximum Transmission Unit) can also be increased by configuration. You can configure Gigabit Ethernet ports to support frames larger than 1500 bytes by using:
Switch(config)#system mtu 1552
Because the 802.1Q tunneling feature increases the frame size by 4 bytes when the metro tag is added, you must configure all switches in the service-provider network to be able to process maximum frames by increasing the switch system MTU size to at least 1504 bytes. The maximum allowable system MTU for Gigabit Ethernet interfaces is 9000 bytes; the maximum system MTU for Fast Ethernet interfaces is 1998 bytes.
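
The minimum system MTU for tagged traffic follows from the 4 bytes added per tag. A minimal sketch, using the platform limits quoted above (function and dictionary names are illustrative):

```python
TAG_BYTES = 4  # each 802.1Q tag adds 4 bytes

# Maximum system MTU per port type, as quoted in the text above.
MAX_SYSTEM_MTU = {"gigabit": 9000, "fastethernet": 1998}


def required_system_mtu(extra_tags: int, base_mtu: int = 1500) -> int:
    """System MTU needed to carry base_mtu plus the given number of extra tags."""
    return base_mtu + extra_tags * TAG_BYTES


def is_supported(mtu: int, port_type: str) -> bool:
    """True if the requested system MTU is within the limit for the port type."""
    return mtu <= MAX_SYSTEM_MTU[port_type]


print(required_system_mtu(1))                 # one metro tag
print(is_supported(1552, "fastethernet"))     # value from the command above
```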

Examples:
<S5328C-1>dis ver
Huawei Versatile Routing Platform Software
VRP (R) software, Version 5.110 (S5300 V200R001C00SPC300)
Quidway S5328C-EI Routing Switch uptime is 10 weeks, 5 days, 20 hours, 12 minutes

<S5328C-1>display interface GigabitEthernet0/0/2
GigabitEthernet0/0/2 current state : UP
Line protocol current state : UP
Switch Port, PVID : 3355, TPID : 8100(Hex), The Maximum Frame Length is 1600

3) Why do many protocols (SNMP, DNS) use UDP (unreliable)?
UDP is faster than TCP for a simple reason: it has no acknowledgment (ACK) packets, which permits a continuous packet stream, whereas TCP acknowledges sets of packets, paced according to the TCP window size and round-trip time (RTT).

SNMP 
SNMP uses the User Datagram Protocol (UDP) as the transport protocol for passing data between managers and agents. UDP, defined in RFC 768, was chosen over the Transmission Control Protocol (TCP) because it is connectionless; that is, no end-to-end connection is made between the agent and the NMS when datagrams (packets) are sent back and forth. This aspect of UDP makes it unreliable, since there is no acknowledgment of lost datagrams at the protocol level. It's up to the SNMP application to determine if datagrams are lost and retransmit them if it so desires. This is typically accomplished with a simple timeout. The NMS sends a UDP request to an agent and waits for a response. The length of time the NMS waits depends on how it's configured. If the timeout is reached and the NMS has not heard back from the agent, it assumes the packet was lost and retransmits the request. The number of times the NMS retransmits packets is also configurable.
In a heavily congested and managed network, SNMP over TCP is a bad idea. 
http://oreilly.com/catalog/esnmp/chapter/ch02.html
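
The timeout-and-retransmit behaviour described above can be sketched with a plain UDP socket. This is not an SNMP implementation: the payload is opaque bytes, the function name is made up, and a real NMS would also match replies to requests by request ID.

```python
import socket


def request_with_retries(sock, payload, addr, timeout=1.0, retries=3):
    """Send a UDP request and wait for a reply, retransmitting on timeout.

    Returns (reply_bytes, attempts_used). Raises TimeoutError when every
    attempt goes unanswered.
    """
    sock.settimeout(timeout)
    for attempt in range(1, retries + 2):  # first try plus `retries` retransmissions
        sock.sendto(payload, addr)
        try:
            reply, _ = sock.recvfrom(65535)
            return reply, attempt
        except socket.timeout:
            continue  # assume the datagram was lost and send it again
    raise TimeoutError(f"no reply from {addr} after {retries + 1} attempts")
```

Both `timeout` and `retries` are parameters here because, as noted above, the wait time and the number of retransmissions are configurable on the NMS.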

DNS
Why UDP rather than TCP? It's simply a matter of efficiency. To start a TCP connection a minimum of three packets are required (SYN out, SYN+ACK back, ACK out). By the time you add a data packet into that and close the session off correctly you will have sent several packets. In contrast, UDP can get away with a minimum of two packets (one question, one reply). 
(DNS does support a TCP mode).

4) Why is area 0 (backbone) needed in OSPF?
You can have multiple OSPF areas without an area zero, they just won't share routes.

Area zero, by design, is special in that other areas will accept and provide routes to it.  (NB: area types and special OSPF statements define and control route exchanges between a non-zero area and area zero.)

The reason for OSPF areas and area zero is to provide scalability.  This two-level hierarchy limits the footprint of Dijkstra's algorithm topology (per area) and allows control of route distribution between areas.

A router with multiple OSPF areas, but without an area zero, would behave similarly to a router running multiple routing protocols without any distribution between those protocols.

OSPF acts as a distance vector protocol between areas. Because of this, OSPF uses a central area, area 0, to exchange routes between other areas. This is part of the reason area 0 exists: to stop routing loops.

But I think you are then making the assumption that without an area 0 you are in danger of a routing loop. You would only be in danger of a routing loop in OSPF if areas that are not area 0 could actually exchange routes between themselves. But they can't, and this is a specific design choice.
https://supportforums.cisco.com/thread/2095492
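
The rule that non-zero areas only exchange routes through area 0 can be modelled with a toy check. This is a deliberately simplified sketch: it ignores virtual links and area types, and the router names and function names are invented for the example.

```python
def backbone_attached_areas(routers):
    """Return the areas that share at least one router with area 0.

    `routers` maps a router name to the set of area IDs it participates in.
    """
    attached = set()
    for areas in routers.values():
        if 0 in areas:
            attached |= set(areas)
    return attached


def can_exchange_routes(area_a, area_b, routers):
    """True if the two areas can learn each other's routes in this toy model."""
    if area_a == area_b:
        return True  # intra-area routing does not need the backbone
    attached = backbone_attached_areas(routers)
    return area_a in attached and area_b in attached


topology = {"R1": {0, 1}, "R2": {0, 2}, "R3": {3, 4}}
print(can_exchange_routes(1, 2, topology))  # both reach area 0
print(can_exchange_routes(3, 4, topology))  # share R3 but no area 0
```

R3 sits in areas 3 and 4 without touching area 0, which is the "multiple routing protocols without distribution" case described above.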