Tuesday, February 19, 2008

Everything you need to know about IPv6

Once upon a time...

When the ARPANET was designed in the late 1960s, it was outfitted with a Network Control Protocol (NCP) that made it possible for the very different types of hosts connected to the network to talk with each other. However, it soon became clear that NCP was limiting in some ways, so work started on something better. The engineers decided that it made sense to split the monolithic NCP protocol into two parts: an Internet Protocol that allows packets to be routed between the different networks connected to the ARPANET, and a Transmission Control Protocol that takes a data stream, splits it into segments and transmits the segments using the Internet Protocol. On the other side, the receiving Transmission Control Protocol makes sure the segments are put together in the right order before they're delivered as a data stream to the receiving application. An important implication of this approach is that unlike, for instance, a phone connected to a wired or wireless phone network, a host connected to the ARPANET then and the Internet now must know its own address.

TCP/IP has served us well since it was born in 1981, but for some time now it has been clear that the IP part has a limitation that makes continued growth of the Internet for decades to come problematic. In order to accommodate a large number of hosts but not waste too much space in the IP packet on overhead, the TCP/IP designers settled on an address size of 32 bits. With 32 bits, it's possible to express 4,294,967,296 different values. Over half a billion of those are unusable as addresses for various reasons, giving us a total of 3.7 billion possible addresses for hosts on the Internet. As of January 1, 2007, 2.4 billion of those were in (some kind of) use. 1.3 billion were still available and about 170 million new addresses are given out each year. So at this rate, 7.5 years from now, we'll be clean out of IP addresses; faster if the number of addresses used per year goes up.
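The run-out estimate is simple enough to check yourself. Here is a small Python sketch that plugs in the figures quoted above; the names are mine, and it assumes the yearly consumption stays constant, which it probably won't.

```python
# Back-of-the-envelope check of the IPv4 run-out estimate quoted in the text.
TOTAL_VALUES = 2 ** 32          # every possible 32-bit value
USABLE = 3_700_000_000          # usable addresses (figure from the text)
IN_USE = 2_400_000_000          # in use on January 1, 2007 (figure from the text)
PER_YEAR = 170_000_000          # handed out per year (figure from the text)


def years_until_exhaustion(usable, in_use, per_year):
    """Years left, assuming consumption stays at per_year (an assumption)."""
    return (usable - in_use) / per_year


if __name__ == "__main__":
    print(f"32 bits give {TOTAL_VALUES:,} possible values")
    years = years_until_exhaustion(USABLE, IN_USE, PER_YEAR)
    print(f"roughly {years:.1f} years of addresses left at this rate")
```

That lands at a bit over seven and a half years, in line with the estimate above.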









This is usually when someone brings up NAT. Home routers (and a lot of enterprise equipment) use a technique called "network address translation" so that a single IP address can be shared by a larger number of hosts. The discussion usually goes like this:

"Use NAT, n00b. All 1337 of my Linux boxes share a single IP and it's safer, too!"

"NAT is not a firewall."

"NAT sucks."

"You suck."

So what about NAT?

Hosts behind a NAT device get addresses in the 10.0.0.0, 172.16.0.0, or 192.168.0.0 address blocks that have been set aside for private use in RFC 1918. The NAT device replaces the private address in packets sent by the hosts in the internal network with its own address, and the reverse for incoming packets. This way, multiple computers can share a single public address. However, NAT has several downsides. First of all, incoming connections don't work anymore, because when a session request comes in from the outside, the NAT device doesn't know which internal host this request should go to. This is largely solvable with port mappings and protocols like UPnP and NAT-PMP.
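To make the mechanism a bit more concrete, here is a toy model in Python. It is only a sketch: the ToyNat class, the port-based mapping and the starting port number are my own simplifications (real NAT devices track much more state), and 192.0.2.1 is a documentation address standing in for the public one.

```python
import ipaddress

# The three private blocks set aside by RFC 1918.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(addr):
    """True if addr falls inside one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


class ToyNat:
    """Hypothetical, heavily simplified NAT with a single public address."""

    def __init__(self, public_addr):
        self.public_addr = public_addr
        self.next_port = 40000   # arbitrary starting port for this sketch
        self.out = {}            # (private addr, private port) -> public port
        self.back = {}           # public port -> (private addr, private port)

    def outgoing(self, src_addr, src_port):
        """Rewrite the source of an outgoing packet to the public address."""
        key = (src_addr, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_addr, self.out[key]

    def incoming(self, dst_port):
        """Find the internal host for an incoming packet, or None if unknown."""
        return self.back.get(dst_port)


if __name__ == "__main__":
    nat = ToyNat("192.0.2.1")
    print(nat.outgoing("192.168.0.10", 5000))   # mapping is created here
    print(nat.incoming(40000))                  # return traffic finds its way back
    print(nat.incoming(80))                     # unsolicited request: no mapping
```

The last call is the problem described above: without an existing mapping (or a manually configured port mapping), the NAT has nowhere to send the incoming request.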

IPv4 address ranges
Class A: 1.0.0.1 to 126.255.255.254
Class B: 128.1.0.1 to 191.255.255.254
Class C: 192.0.1.1 to 223.255.254.254
Class D: 224.0.0.0 to 239.255.255.255 — reserved for multicast groups
Class E: 240.0.0.0 to 254.255.255.254 — reserved

Things get even trickier for applications that need referrals. NAT also breaks protocols that embed IP addresses. For instance, with VoIP, the client computer says to the server, "Please send incoming calls to this address." Obviously this doesn't work if the address in question is a private address. Working around this requires a significant amount of special case logic in the NAT device, the communication protocol, and/or the application. For this reason and a few others, most of the people who participate in the Internet Engineering Task Force (IETF) don't care much for NAT.

More to the point, NAT is already in wide use, and apparently we still need 170 million new IP addresses every year.

In the early days of the Internet, some organizations got excessively large address blocks. For instance, IBM, Xerox, HP, DEC, Apple and MIT all received "class A" address blocks of nearly 17 million addresses. (So HP, which acquired DEC, has more than 33 million addresses.) However, reclaiming those blocks would be a huge effort and only buy us a few more years: we currently burn through a class A block in five weeks. It's debatable how long we can make the IP address space last, especially as more and more devices, such as VoIP phones, become Internet-connected, but you can only keep squeezing the toothpaste tube for so long before it makes sense to buy a new one, even if the old one isn't technically empty. So in the early 1990s, the IETF started its "IP next generation" effort.

Larger addresses

The IPng project eventually resulted in IPv6 in 1995. In addition to the source and destination addresses and other housekeeping information, each IP packet contains a version number. For reasons lost in the mists of time, current IP packets have version number 4, and the first version number available for the new protocol was 6. So the old IP is now called IPv4, and the new IP IPv6. Apart from autoconfiguration and a lot of minor details that are best left to another article, IPv6 first and foremost sports larger addresses. Much larger addresses. 40 or 48 bits would have given us more than a trillion or even 281 trillion addresses, respectively, and 64 bits would have been a nice round number. But as the saying goes, once bitten, twice shy, so the IETF opted for 128 bits this time around. The total number of possible addresses that this gives us:

340,282,366,920,938,463,463,374,607,431,768,211,456

To put this into perspective: there are currently 130 million people born each year. If this number of births remains the same until the sun goes dark in 5 billion years, and all of these people live to be 72 years old, they can all have 53 times the address space of the IPv4 Internet for every second of their lives. Let nobody accuse the IETF of being frugal this time around.
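If you want to check that number, the calculation fits in a few lines of Python. This is just a sketch of the arithmetic with the figures from the paragraph above; the function name and the 365.25-day year are my own choices.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: average year length


def ipv4_internets_per_person_second(births_per_year, years, lifespan_years):
    """How many IPv4-sized address spaces fit in each second of each life."""
    people = births_per_year * years
    person_seconds = people * lifespan_years * SECONDS_PER_YEAR
    return (2 ** 128) / person_seconds / (2 ** 32)


if __name__ == "__main__":
    ratio = ipv4_internets_per_person_second(130_000_000, 5_000_000_000, 72)
    print(f"{ratio:.1f} IPv4 address spaces per person per second")
```

The result comes out between 53 and 54, which matches the figure above.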

IPv4 addresses are written down by splitting them into four 8-bit values and putting periods between those, for instance, 192.0.2.31. IPv6 addresses on the other hand, are written down as eight 16-bit values with colons between them, and each 16-bit value is displayed in hexadecimal, i.e., using numbers and the letters A - F. For example, 2001:db8:31:1:20a:95ff:fef5:246e. It's not uncommon for IPv6 addresses to have a sequence of consecutive zeroes. In these cases, exactly one of those sequences can be left out. So 2001:db8:31:0:0:0:0:1 becomes 2001:db8:31::1 and the IPv6 loopback address 0:0:0:0:0:0:0:1 becomes ::1.
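Python's standard ipaddress module applies these notation rules for you, which is handy for experimenting. A minimal sketch (the helper names compress and expand are mine):

```python
import ipaddress


def compress(addr):
    """Shortest form: leading zeroes dropped, one run of zero groups as '::'."""
    return ipaddress.IPv6Address(addr).compressed


def expand(addr):
    """Full form: all eight 16-bit groups, four hex digits each."""
    return ipaddress.IPv6Address(addr).exploded


if __name__ == "__main__":
    print(compress("2001:db8:31:0:0:0:0:1"))
    print(compress("0:0:0:0:0:0:0:1"))
    print(expand("::1"))
```

The first two calls print 2001:db8:31::1 and ::1, the same shortened forms as in the examples above.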

Stateless autoconfiguration



Although in most regards, IPv6 is still IP and works pretty much the same as IPv4, the new protocol departs from IPv4 in some ways. With IPv4, you need a DHCP server to tell you your address if you don't want to resort to manual configuration. This works very well if there's a single DHCP server, but not so much when there's more than one and they supply conflicting information. It can also be hard to get a system to have the same address across reboots with DHCP.

With IPv6, DHCP is largely unnecessary because of stateless autoconfiguration. This is a mechanism whereby routers send out "router advertisements" (RAs) that contain the upper 64 bits of an IPv6 address, and hosts generate the lower 64 bits themselves in order to form a complete address.

Traditionally, the bottom 64 bits of an IPv6 address are generated from a MAC address by flipping a bit and inserting the bytes ff:fe in the middle. So the Ethernet MAC address 00:0a:95:f5:24:6e results in 20a:95ff:fef5:246e as the lower 64 bits of an IPv6 address, called the "interface identifier" in IPv6 parlance. This way, if all the routers send out the same prefix for the upper 64 bits, the host will always configure the same IPv6 address for itself. No configuration is required, either on the host or a DHCP server. Alternatively, a host may generate its IPv6 address using a random number so its MAC address remains hidden from the rest of the Internet. Windows uses this type of address for outgoing sessions to aid privacy. Other operating systems can also generate these temporary addresses (a new one is generated every 24 hours) but don't do so by default.
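Here is a rough Python sketch of the MAC-based method, for those who want to see the bit fiddling. The function names are mine, and 2001:db8:31:1::/64 is simply the prefix from the example address earlier in this article; a real host gets its prefix from a router advertisement.

```python
import ipaddress


def interface_identifier(mac):
    """Build the 64-bit interface identifier from a 48-bit MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                       # flip the universal/local bit
    return bytes(octets[:3] + b"\xff\xfe" + octets[3:])   # ff:fe in the middle


def autoconfigured_address(prefix, mac):
    """Combine a 64-bit prefix with the MAC-derived interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    iid = int.from_bytes(interface_identifier(mac), "big")
    return ipaddress.IPv6Address(int(net.network_address) | iid)


if __name__ == "__main__":
    print(autoconfigured_address("2001:db8:31:1::/64", "00:0a:95:f5:24:6e"))
```

With the MAC address from the text, this prints 2001:db8:31:1:20a:95ff:fef5:246e.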

When a router sends out several address prefixes, or several routers send out different address prefixes, hosts simply create addresses from each of those prefixes. Routers can make the hosts connected to them renumber their IPv6 addresses by removing the old prefix and advertising a new one. When done right, this is completely seamless.

Although the DHCPv6 protocol (the IPv6 version of DHCP) can give out IPv6 addresses the same way IPv4 DHCP servers give out IPv4 addresses, I haven't encountered any DHCPv6 servers or DHCPv6 clients that support this capability. With IPv6, DHCP is mostly used to distribute additional information, such as DNS server addresses, although there will be a way to do this through router advertisements as well soon, further diminishing the need for DHCP in IPv6.

Special address types

In addition to regular "global unicast" addresses as discussed earlier, IPv6 has several other types of addresses. I don't want to mention them all, but the three most important special purpose address types are:

Link local

Link local addresses are used to communicate over a single physical or logical subnetwork, such as an Ethernet. These addresses start with fe80 and are extensively used for IPv6's internal housekeeping.

Site local

This is the IPv6 equivalent of the RFC 1918 private address space in IPv4. However, the IETF found the situation where different organizations use the same address space undesirable, so they created "unique local" addresses (RFC 4193), where everyone takes a randomly selected block out of the IPv6 address space starting with fd.

Multicast

A multicast address is a group address, so every packet sent to a multicast address is received by all members of the group. Multicast addresses start with ff and can be used for applications where several hosts must receive the same information at the same time, such as live video broadcasts and also for autoconfiguration and discovery.

When running over Ethernet or WiFi, IPv4 hosts use broadcasts for discovery functions. For instance, in order to be able to send a packet over Ethernet, it's necessary to know the destination MAC address. So IPv4 simply broadcasts "who has 192.0.2.31?" to all systems on the network in question. IPv6, on the other hand, sends these packets to a multicast address, so only IPv6 hosts listening for these requests get to see them; on other systems the Ethernet hardware simply ignores the packets, and it's even possible for switches to filter them out by keeping track of the multicast groups hosts are listening on for each switch port.
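Telling these address types apart is just a matter of looking at the first bits. A minimal Python sketch, assuming the prefixes fe80::/10, fd00::/8 and ff00::/8 for the three types above; the function name and the catch-all "other" label are mine.

```python
import ipaddress

# Prefixes for the three special purpose types discussed above.
LINK_LOCAL = ipaddress.ip_network("fe80::/10")
UNIQUE_LOCAL = ipaddress.ip_network("fd00::/8")
MULTICAST = ipaddress.ip_network("ff00::/8")


def address_type(addr):
    """Classify an IPv6 address by its leading bits."""
    ip = ipaddress.IPv6Address(addr)
    if ip in LINK_LOCAL:
        return "link local"
    if ip in UNIQUE_LOCAL:
        return "unique local"
    if ip in MULTICAST:
        return "multicast"
    return "other"   # global unicast and everything this sketch doesn't cover


if __name__ == "__main__":
    for addr in ("fe80::20a:95ff:fef5:246e", "fd12:3456:789a::1",
                 "ff02::1", "2001:db8:31::1"):
        print(addr, "->", address_type(addr))
```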

IPv6 security

There is a lot of talk about how IPv6 is more secure than IPv4. This boils down to two things; one of them is real, the other isn't. The good news is that because the IPv6 address space is so large, randomly scanning for systems that are vulnerable is completely infeasible. The story goes that at the height of the self-propagating malware explosion a few years ago, an unpatched Windows system would be infected faster than it could download the necessary security updates. With IPv6, that is simply impossible: even with a billion infected hosts each scanning a billion IPv6 addresses per second, it takes more than a hundred million years to scan just the IPv6 address space that's given out to ISPs right now, which is about 0.01 percent of what's available. However, targeted scanning, although not easy, is still possible, so security measures like those used with IPv4 are still necessary.
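The scanning estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic using the numbers from the paragraph above; the function name and the 365.25-day year are my assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: average year length


def years_to_scan(fraction_of_space, hosts, addresses_per_second):
    """Years needed to probe a fraction of the 128-bit address space."""
    targets = (2 ** 128) * fraction_of_space
    return targets / (hosts * addresses_per_second) / SECONDS_PER_YEAR


if __name__ == "__main__":
    # 0.01 percent of the space, a billion hosts, a billion probes per second each.
    years = years_to_scan(0.0001, 10 ** 9, 10 ** 9)
    print(f"about {years:.2e} years")
```

By this arithmetic the answer is on the order of a billion years, so "more than a hundred million" is, if anything, on the conservative side.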

The idea was to give IPv6 security a big push by making IPsec support mandatory. IPsec encrypts each individual packet, so it can be applied to all IP traffic, unlike the widely used SSL, which only works on top of TCP. However, for a number of reasons, it's very difficult to build IPsec support into applications, so it never gained much real-world use except as a mechanism to implement VPNs. And despite the fact that IPsec was developed for IPv6 or at least with IPv6 in mind, it also works with IPv4. All in all, IPsec can't be considered a security advantage for IPv6.

Let me reiterate a point I made earlier: a host that has IPv6 turned on will create a link local address for itself. This means that any host that has IPv6 enabled (out of the box for Windows Vista, Mac OS X, and most Linux and BSD distributions) is reachable over IPv6 for hosts connected to the same Ethernet, even if there's no IPv6 router sending out router advertisements. By monitoring IPv6 autoconfiguration traffic or by trying link local addresses created from MAC addresses seen in other types of traffic, it's not too difficult to find the addresses in question. An even easier method is sending out a multicast ping and seeing what comes back. Windows blocks these, but BSD/Mac/Linux generally send back replies. The command line on these systems is:

ping6 -I interface-name ff02::1

Use the ifconfig command to find interface names. On systems where the IPv6 networking stack derives from the KAME implementation, such as the BSD family and Mac OS X, there are additional ping6 options that are even more helpful for nosy types. Type man ping6 to find out more.

With IPv4, there will generally be a NAT device that functions as a simple firewall by blocking incoming sessions (although there are ways to trick NATs into allowing them). Since there are more than enough public addresses to go around in IPv6, along with the dislike for NAT in IETF circles, there is almost never any NAT with IPv6, so no automatic protection against incoming sessions. This lack of automatic basic firewalling that comes with NAT is only the beginning, though. Many software firewalls that run on the to-be-firewalled host itself only support IPv4 and don't get in the way of IPv6 packets at all. The Windows and Mac OS built-in firewalls don't have this problem, but if you're doing any firewalling on Linux or BSD (or command line firewalling with Mac OS X), make sure that your services are firewalled over IPv6, too. On the BSD side, a good choice in this regard is the pf firewalling package, because unlike ipfw or ipf, it supports both IPv4 and IPv6 and allows rules that apply to both. On Linux, iptables only filters IPv4, so IPv6 rules have to be set up separately with ip6tables. If you have a router or home gateway that supports IPv6, make sure that it, too, filters IPv6. A stateful filter that allows outgoing connections and return traffic, but not incoming connections, comes closest to the IPv4 NAT filtering functionality.

Running IPv6

Although designing a new protocol isn't exactly trivial, the hard part is getting it deployed. Neither putting an entirely new infrastructure in place nor flipping a switch from "IPv4" to "IPv6" for the current Internet is feasible. To avoid these issues as much as possible, the IETF came up with a number of transition techniques. The most important ones are dual stack and tunneling. Dual stack is nothing more than the notion that a host can run both IPv4 and IPv6 side by side, so it can talk to IPv4 hosts over IPv4 and to IPv6 hosts over IPv6. Tunneling means that when IPv6 packets must cross part of the network that only supports IPv4, the IPv6 packets are put inside IPv4 packets, transmitted across the IPv4-only part of the network, and then the IPv4 part is removed and the packets continue on their way over IPv6.

As mentioned earlier, most modern operating systems are set up for dual-stack operation by default. So if there's an IPv6 router on the local network that advertises an IPv6 prefix, a host will generate an IPv6 address for itself so it can talk to the IPv6 Internet. Now that Microsoft has enabled IPv6 by default in Vista (it can be turned on and off with ipv6 install and ipv6 uninstall in XP), we can probably expect more IPv6-enabled home routers like Apple's draft-802.11n Airport Extreme in the future.




Note that there's no requirement that your ISP supports the new protocol in order to use IPv6: an IPv6-enabled router or a host itself can use a tunnel to reach the IPv6 Internet. There are several tunneling techniques, but the most common ones are "manual" IPv6 in IP tunnels where the exact path of the tunneled IPv6 packets is set up through manual configuration, and 6to4 automatic tunneling. With 6to4, a host or router can create a range of IPv6 addresses from its IPv4 address. 6to4 addresses are easily recognizable because they always start with 2002. Because every 6to4-derived IPv6 address maps to an IPv4 address, it's easy for a system that understands 6to4 to tunnel the IPv6 packets to the right place over IPv4. Gateways make it possible for native IPv6 systems to communicate with 6to4 systems.
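The mapping between an IPv4 address and its 6to4 address range is mechanical: 2002 followed by the 32 bits of the IPv4 address. Here is a Python sketch of it. The function names are mine, and 192.0.2.31 is the documentation address used earlier in this article, standing in for a real public address.

```python
import ipaddress


def six_to_four_prefix(ipv4):
    """Derive the 6to4 /48 prefix that belongs to an IPv4 address."""
    v4 = ipaddress.IPv4Address(ipv4)
    # 16 bits of 2002, then the 32 bits of the IPv4 address, then zeroes.
    value = (0x2002 << 112) | (int(v4) << 80)
    return ipaddress.IPv6Network((value, 48))


def embedded_ipv4(ipv6):
    """Recover the IPv4 address from a 6to4 address, or None if it isn't one."""
    return ipaddress.IPv6Address(ipv6).sixtofour


if __name__ == "__main__":
    print(six_to_four_prefix("192.0.2.31"))
    print(embedded_ipv4("2002:c000:21f::1"))
```

The second function is what a 6to4-aware system does when it needs to know where to send the tunneled packet over IPv4.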

6to4 is easy to use because it doesn't require any configuration, and has the added bonus that it comes with built-in IPv6 address space. However, only public IPv4 addresses can be used for 6to4, so hosts behind NAT can't do 6to4 tunneling, and another limitation is the dependence on public gateways, which makes 6to4 slower and less reliable than other forms of IPv6 connectivity. If you're serious about IPv6, you'll want to set up a manual tunnel. If your ISP offers this service, that's the best choice to avoid unnecessary tunnel detours, but one of the many tunnel brokers is a good alternative.

Note that Windows Vista (and Windows XP with IPv6 enabled) have 6to4 enabled by default when the system has a public IPv4 address. The same is true for the new Airport Extreme, which will send out router advertisements with its 6to4 IPv6 address prefix so hosts connected to it will configure an IPv6 address and be tunneled over 6to4 by the router. 6to4 is also relatively easy to turn on with Mac OS X and BSD/Linux.

Systems with IPv6 connectivity (regardless of the type) decide whether to use IPv4 or IPv6 to reach a destination by consulting the DNS. Communication over the Internet requires addresses, but we generally work with domain names. The DNS takes care of the difference by having one or more A (address) records that contain an IPv4 address associated with a given name. If a system also has an IPv6 address, this is added to the DNS with an AAAA (quad-A) record. Hosts that only have IPv4 connectivity ignore the AAAA records, but dual stack hosts ask the DNS for both the A and AAAA records. They will then generally prefer to connect to a destination over IPv6 if possible, and use IPv4 if there's no AAAA record in the DNS or connecting over IPv6 doesn't work. Some applications and/or OSes always ask for AAAA records when IPv6 is turned on, which creates a problem with some (increasingly rare) buggy DNS servers that return an error after an AAAA query. In these cases, turning off IPv6 can make surfing the web a lot faster.
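In code, the "prefer IPv6, fall back to IPv4" behavior looks roughly like the Python sketch below. It is a simplification under my own assumptions: real operating systems apply more elaborate address selection rules, the function names are mine, and lookup() needs a working resolver, so it isn't called here.

```python
import ipaddress
import socket


def order_candidates(addresses):
    """Put IPv6 addresses first, keeping the original order within each family."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    # sorted() is stable, so addresses of the same family keep their order.
    return [str(a) for a in sorted(parsed, key=lambda a: a.version != 6)]


def lookup(name, port=80):
    """Ask the resolver for both A and AAAA records for name."""
    infos = socket.getaddrinfo(name, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


if __name__ == "__main__":
    # Made-up answers using documentation addresses, as if from A and AAAA records.
    answers = ["192.0.2.31", "2001:db8:31::1", "192.0.2.32"]
    print(order_candidates(answers))
```

A dual stack application would then try the addresses in this order and move on to the next one when a connection attempt fails.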

You can see if your computer has working IPv6 connectivity by connecting to www.kame.net or www.apnic.net. KAME is a Japanese project that built an IPv6 networking stack for BSD and Mac OS. Their mascot is a turtle, which dances if you connect over IPv6. APNIC is responsible for giving out IP addresses in the Asia-Pacific region, and their web site will tell you your IP address (IPv4 or IPv6) in the top left corner of the page. Internet Explorer under Windows, Safari on Mac OS X 10.4, and Firefox under Windows, Linux and BSD will use IPv6 when available on the system, but Firefox on the Mac has IPv6 turned off in about:config.

IPv6 and the future of home networking

Although stateless autoconfig works very differently from DHCP, in practice IPv6 works much the same as IPv4 in a home network: computers and other devices automatically get an address from a router, modem or gateway so they can connect to the 'Net without manual intervention. Firewalling is a bit different, because with IPv4, most people don't have the option to keep their network completely open.

When IPv6 takes off, we'll probably see a new class of home firewall products that allow more granular blocking of services and devices in a home IPv6 network than the all-or-nothing choice in today's first IPv6 home routers, which either block all incoming sessions or allow everything. The abundance of address space also makes it possible to have separate subnetworks for different purposes, which will be helpful as more and more devices connect to the network. And we still have a lot to look forward to: the IETF is currently working on mobility and multihoming extensions to IPv6. Mobility means moving from one network to another while keeping the same IP address. So a VoIP call could start on your home network, continue over wireless service and then finish at work. Multihoming means connecting to more than one ISP at the same time, so that when one fails, communication sessions automatically move over to the other.

Moral of the story

Although IPv6 is taking its sweet time to conquer the world, it's now showing up in more and more places, so you may actually run into it one of these days. If you're working on security, keep your eye out for IPv6 because if overlooked, IPv6 could allow things that are blocked over IPv4. And if you're buying expensive equipment, you may want to make sure that if it doesn't do IPv6 today, it's at least upgradable, so you can still use your gear if IPv6 picks up more quickly than expected as IPv4 addresses run out. And it never hurts to experiment a bit with the new protocol so you know how it works by the time you need it.
