Bill Frezza's mail on IPv6 and privacy
This is a note Bill Frezza sent out after his last column
on IPv6 and privacy. Jamie
----
>From frezza@alum.MIT.EDU Sun Oct 10 21:16:25 1999
Date: Sun, 10 Oct 1999 18:27:39 -0400
From: Bill Frezza <frezza@alum.MIT.EDU>
To: Address List Suppressed <frezza@interramp.com>
Subject: Re: My InternetWeek op-ed column on IPv6 privacy issues
Gentle readers,
Thank you for your feedback on my recent column "Where's All The Outrage
About The IPv6 Privacy Threat"
http://www.internetwk.com/columns/frezz100499.htm
Normally I respond to every letter I receive, but the volume of mail on
this one broke all records so I'm afraid I have to send out a group note.
(It seems that peeved IETF geeks are even more vociferous than peeved
Mac-heads. :)
Let me, in particular, address some of the comments from the folks who
insist that there is no privacy problem with IPv6 or that the problem is
identical with IPv4 or that the problem has already been solved by the IETF
or that the use of EUI-64 addressing is optional and not mandatory,
therefore there's no problem. (I do not intend to reply to the folks who
just plain don't care about privacy as this is a philosophical issue beyond
the scope of this note. As for readers who don't like my inflammatory
writing style, feel free not to read my column. I write op-eds, not news
articles.)
1) Yes, IPv4 has its own privacy problems. In fact, any time one uses a
static or long-lived IP address, the possibility exists for abusive
systematic surveillance. The fact that a central registry of addresses does
or does not exist has no bearing on the potential threat as such a data
base can be built over time, particularly for individuals or groups that
have been specifically targeted. In addition, countries like China or
Singapore could easily require registration. Why give them the tools in the
first place?
2) Yes, dynamic assignment of IP addresses for dial-up users as well as
Network Address Translation (NAT) help mitigate (though they do not
eliminate) the privacy problem. As we move towards an "always connected"
Internet, with more and more users communicating via Cable Modems or DSL,
more IP addresses will become long lived, hence the problem will get worse.
3) It's too late to change IPv4 while it is not too late to change IPv6. My
column was intended as a call to action to get the IETF off its duff. A
fault in my column is that I did not specifically describe proposals being
circulated to address the issue, one of which is attached below. Mea culpa,
and my apologies to the good guys for ignoring their efforts. May the force
be with you.
4) Be that as it may, draft-ietf-ipngwg-addrconf-privacy-00.txt is just a
proposal. It has not been and may not be adopted. Absent further action, it
will go away. If it is adopted, it may or may not be implemented by major
vendors, especially if the final standard offers a Chinese menu of choices.
5) Note that the proposal ACKNOWLEDGES THE PROBLEM. In addition it points
out that
> The desires of protecting individual privacy vs. the desire to
> effectively maintain and debug a network can conflict with each
> other.
The same can be said across the board for many aspects of law enforcement
in a society that values liberty. Just think how much safer the streets
would be if we all walked around with electronic radio ID collars
registering our movements. Fortunately, we have chosen not to construct
such a society (although if you follow the development of the CALEA laws,
this is not for want of the FBI trying).
6) The solutions proposed in draft-ietf-ipngwg-addrconf-privacy-00.txt
cause a major problem with one of the other goals of IPv6:
> The IPv6 addressing architecture goes to great lengths to ensure that
> interface identifiers are globally unique. During the IPng
> discussions of the GSE proposal [GSE], it was felt that keeping
> interface identifiers globally unique in practice might prove useful
> to future transport protocols. Usage of the algorithms in this
> document would eliminate that future flexibility.
The random assignment algorithms look very promising. I hope they are
adopted, but no one knows yet how this conflict is going to be resolved.
7) At the end of the day, what matters to the average netizen is not the
menu of possible alternatives described in IETF standards, but the actual
default implementation in popular products, e.g. Windows. Just because an
educated and motivated geek can get into the plumbing of his machine and
find a way to solve his own privacy problem doesn't mean the problem has
been solved for the bulk of average users. If the folks at Microsoft don't
properly address this in their future products, I can positively,
absolutely guarantee that it will blow up in their face.
8) Readers who are interested in registering their concerns should contact
the CDT at www.CDT.org. I got a very nice note from them indicating that
they are now wading into the issue. One hopes that EPIC and the EFF will
follow suit.
Cheers,
Bill Frezza
InternetWeek
frezza@alum.MIT.edu
PS Special thanks to Scott Bradner and Mike O'Dell for helping me
understand some of the technical nuances of this issue that I had
previously misunderstood. Notwithstanding, I stand by my column and hope
all you geeks do something about it. The opinions above are my own.
------------
INTERNET-DRAFT Thomas Narten
IBM
<draft-ietf-ipngwg-addrconf-privacy-00.txt> June 1999
Privacy Extensions for Stateless Address Autoconfiguration in IPv6
<draft-ietf-ipngwg-addrconf-privacy-00.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
Nodes use IPv6 stateless address autoconfiguration to generate
addresses without the necessity of a DHCP server. Addresses are
formed by combining network prefixes with a constant interface
identifier derived from the interface's IEEE Identifier. This
document describes an optional extension to IPv6 stateless address
autoconfiguration that results in a node generating addresses from an
interface identifier that changes over time. Changing the interface
identifier over time makes it more difficult for eavesdroppers and
other information collectors to identify when different addresses
used in different transactions actually correspond to the same node.
draft-ietf-ipngwg-addrconf-privacy-00.txt [Page 1]
INTERNET-DRAFT June 24, 1999
Contents
Status of this Memo.......................................... 1
1. Introduction............................................. 2
2. Background............................................... 3
3. Protocol Description..................................... 5
4. Implications of Changing Interface Identifiers........... 7
5. Open Issues and Future Work.............................. 7
6. Security Considerations.................................. 8
7. References............................................... 8
8. Authors' Addresses....................................... 9
1. Introduction
Stateless address autoconfiguration [ADDRCONF] defines how an IPv6
node generates addresses without the need for a DHCP server. Network
interfaces typically come with an embedded IEEE Identifier (i.e., a
link-layer MAC address), and stateless address autoconfiguration uses
the IEEE identifier to generate a 64-bit interface identifier
[ADDRARCH]. By design, the interface identifier will typically be
globally unique. The interface identifier is in turn appended to a
prefix to form a 128-bit IPv6 address. All nodes use this technique
to generate link-local addresses for their attached interfaces.
Additional addresses, including site-local and global-scope
addresses, are then created by combining prefixes advertised in
Router Advertisements via Neighbor Discovery [DISCOVERY] with the
interface identifier.
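
As an illustration of the derivation described above, the following is a
minimal sketch, assuming the modified EUI-64 procedure of [ADDRARCH] as
commonly described for 48-bit MAC addresses (insert 0xFFFE between the two
halves of the MAC and invert the universal/local bit). The function names
and the sample MAC address are hypothetical and chosen only for this
example.

```python
import ipaddress


def eui64_interface_id(mac: str) -> bytes:
    """Build a 64-bit interface identifier from a 48-bit MAC address."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Invert the universal/local bit (0x02 in the first octet).
    first = octets[0] ^ 0x02
    # Insert 0xFFFE between the upper and lower halves of the MAC.
    return bytes([first]) + octets[1:3] + b"\xff\xfe" + octets[3:6]


def form_address(prefix: str, interface_id: bytes) -> ipaddress.IPv6Address:
    """Append a 64-bit interface identifier to a 64-bit prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64 or len(interface_id) != 8:
        raise ValueError("this sketch assumes a /64 prefix and a 64-bit identifier")
    return ipaddress.IPv6Address(
        int(net.network_address) | int.from_bytes(interface_id, "big")
    )


if __name__ == "__main__":
    iid = eui64_interface_id("00:a0:c9:12:34:56")  # sample MAC, not a real host
    print(form_address("fe80::/64", iid))          # link-local address
```

The same identifier would be combined with each advertised prefix, which is
why it reappears unchanged in every address the node forms.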
As mobile devices (e.g., laptops, PDAs, etc.) move topologically,
they form new addresses for their current topological point of
attachment. While the node's address changes as it moves, however,
the interface identifier contained within the address remains the
same. Because the interface identifier associated with a node can
potentially remain fixed for a long period of time (e.g., months or
years) concern has been voiced that the interface identifier could in
some cases be used to track the movement and usage of a particular
machine. For example, a server that logs the source addresses of
incoming connections would simultaneously collect identical
information keyed on the interface id, allowing one to correlate
activities based on interface identifiers in addition to addresses.
This is of particular concern with the expected proliferation of
next-generation network-connected devices (e.g., PDAs, cell phones,
etc.) in which large numbers of devices are in practice associated
with a single user. Thus, the interface identifier embedded within an
address could be used to track activities of an individual.
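
To make the correlation concern concrete, here is a small sketch of how
logged source addresses could be grouped by their low-order 64 bits. It
assumes 64-bit interface identifiers; the function names and the sample
addresses (taken from a documentation prefix) are hypothetical and are not
part of the proposal.

```python
import ipaddress
from collections import defaultdict

LOW_64_BITS = (1 << 64) - 1


def interface_id(addr: str) -> int:
    """Return the low-order 64 bits of an IPv6 address."""
    return int(ipaddress.IPv6Address(addr)) & LOW_64_BITS


def correlate(logged_addresses):
    """Group addresses sharing an interface identifier across prefixes."""
    groups = defaultdict(list)
    for addr in logged_addresses:
        groups[interface_id(addr)].append(addr)
    # Only identifiers seen under more than one address are interesting.
    return {iid: addrs for iid, addrs in groups.items() if len(addrs) > 1}


if __name__ == "__main__":
    log = [
        "2001:db8:1:1:2a0:c9ff:fe12:3456",  # node seen at one attachment point
        "2001:db8:2:7:2a0:c9ff:fe12:3456",  # same identifier, different prefix
        "2001:db8:1:1:210:4bff:fe00:1",     # unrelated node
    ]
    for iid, addrs in correlate(log).items():
        print(f"{iid:016x}: {addrs}")
```

Even though the two addresses differ, the shared identifier links them to
one interface, which is the tracking risk the text describes.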
This document discusses concerns associated with the embedding of
interface identifiers within IPv6 addresses and describes optional
extensions to stateless address autoconfiguration that can help
mitigate those concerns in environments where such concerns are
significant.
2. Background
This section discusses the problem in more detail and provides
context for evaluating the significance of the concerns in specific
environments and makes comparisons with existing practices.
2.1. Extended Use of the Same Identifier
The use of a non-changing interface identifier to form addresses is a
specific instance of the more general case where a constant
identifier is reused over an extended period of time and in multiple
independent activities. Anytime the same identifier is used in
multiple contexts, it becomes possible for that identifier to be used
to correlate seemingly unrelated activity. For example, a network
sniffer placed strategically on a link across which all traffic
to/from a particular host crosses could keep track of which
destinations a node communicated with and at what times. Such
information can in some cases be used to infer things, such as what
hours an employee was active, when someone is at home, etc.
One of the requirements for correlating seemingly unrelated
activities is the use (and reuse) of an identifier that is
recognizable over time within different contexts. IP addresses
provide one obvious example, but there are more. Many nodes also have
DNS names associated with their addresses, in which case the DNS name
serves as a similar identifier. Although the DNS name associated with
an address is more work to obtain (it may require a DNS query) the
information is often readily available. In such cases, changing the
address on a machine over time would do little to address the concern
raised in this document, as the DNS name would become the correlating
identifier.
The use of a constant identifier within an address is of special
concern because addresses are a fundamental requirement of
communication and cannot easily be hidden from eavesdroppers and
other parties. Even when higher layers encrypt their payloads,
addresses in packet headers appear in the clear. Consequently, if a
mobile host (e.g., laptop) accessed the network from several
different locations, an eavesdropper might be able to track the
movement of that mobile host from place to place, even if the upper
layer payloads were encrypted [SERIALNUM].
2.2. Not a New Issue
Although the topic of this document may at first appear to be an
issue new to IPv6, similar issues already exist in today's Internet.
That is, addresses used in today's Internet are often
constant in practice for extended periods of time. In many sites,
addresses are assigned statically; such addresses typically change
infrequently. However, many sites are moving away from static
allocation to dynamic allocation via DHCP. In theory, the address a
client gets via DHCP can change over time, but in practice servers
return the same address to the same client (unless addresses are in
such short supply that they are reused immediately by a different
node when they become free). Thus, although many sites use DHCP,
clients end up using the same address for months at a time.
Nodes that need a (non-changing) DNS name generally have static
addresses assigned to them to simplify the configuration of DNS
servers. Although Dynamic DNS [DDNS] can be used to update the DNS
dynamically, it is not widely deployed today. In addition, changing
an address but keeping the same DNS name does not really address the
underlying concern, since the DNS name becomes a non-changing
identifier. Servers generally require a DNS name (so clients can
connect to them), and clients often do as well (e.g., some servers
refuse to speak to a client whose address cannot be mapped into a DNS
name that also maps back into the same address).
Many network services require that the client authenticate itself to
the server before gaining access to a resource. The authentication
step binds the activity (e.g., TCP connection) to a specific entity
(e.g., an end user). In such cases, a server already has the ability
to track usage by an individual, independent of the address they
happen to use. Indeed, such tracking is an important part of
accounting.
Web browsers and servers typically exchange "cookies" with each
other. Such cookies allow web servers to correlate a current activity
with a previous activity. One common usage is to send back targeted
advertising to a browser by noting that a transaction that it is
performing was started by an entity that previously requested
information that had the side-effect of indicating the interest of
the querier.
2.3. Possible Approaches
One way to avoid some of the problems discussed above would be to use
DHCP for obtaining addresses. With DHCP, the DHCP server could
arrange to hand out addresses that change over time.
Another approach, one compatible with the stateless address
autoconfiguration architecture, would be to change the interface id
portion of an address over time. For example, upon each system
restart, select a new interface identifier different from the ones
used previously. Changing the interface identifier makes it more
difficult to look at the IP addresses in independent transactions and
identify which ones actually correspond to the same node.
In order to make it difficult to make educated guesses as to whether
two different interface identifiers belong to the same node, the
algorithm for generating alternate identifiers must include input
that has an unpredictable component from the perspective of the
outside entities collecting information. Picking identifiers from a
pseudorandom sequence suffices, so long as the specific sequence
cannot be determined by an outsider examining just the identifiers
that appear in addresses. This document proposes the use of an MD5
hash, using a per-interface "key" that varies from one interface to
another. Specifically, we use the interface identifier generated
using the normal procedure [ADDRARCH] as the key.
3. Protocol Description
The goal of this section is to define procedures that:
1) Result in a different interface identifier being generated at each
system restart or attachment to a network.
2) Produce a sequence of interface identifiers that appear to be
random in the sense that it is difficult for an outside observer
to predict a future identifier based on a current one and it is
difficult to determine previous identifiers knowing only the
present one.
We describe two approaches. The first assumes the presence of stable
storage that can be used to record state history for use as input
into the next iteration of the algorithm, i.e., after a system
restart. A second approach addresses the case where stable storage is
unavailable and the interface identifier must be generated at random.
3.1. When Stable Storage is Present
The following algorithm assumes the presence of a 64-bit "history
value" that is used as input in generating an interface identifier.
The very first time the system boots (i.e., out-of-the-box), any
value can be used including all zeros. Whenever a new interface
identifier is generated, its value is saved as the history value for
the next iteration of the process.
Section 5.3 of [ADDRCONF] describes the steps for generating a link-
local address when an interface becomes enabled. This document
modifies that step in the following way. Rather than use interface
identifiers generated as described in [ADDRARCH], the identifier is
generated as follows:
1) Take the history value from the previous iteration (or 0 if there
is no previous value) and append to it the interface identifier
generated as described in [ADDRARCH].
2) Compute the MD5 message digest [MD5] over the quantity created in
step 1).
3) Take the left-most 64 bits of the MD5 digest and set bit 6 (the
left-most bit is numbered 0) to zero. This creates an interface
identifier with the universal/local bit indicating local
significance only. Use the resultant identifier for generating
addresses as outlined in [ADDRCONF]. That is, use the interface
identifier to generate a link-local and other appropriate
addresses.
4) Save the interface identifier created in step 3) in stable storage
as the history value to be used in the next iteration of the
algorithm.
MD5 was chosen for convenience, not because of strict requirements.
IPv6 nodes are already required to implement MD5 as part of IPsec
[IPSEC], thus the code will already be present on IPv6 machines.
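
The four steps of this section can be sketched as follows. This is only an
illustration under stated assumptions: the history value and the
[ADDRARCH] identifier are each treated as 8 octets, the caller is
responsible for saving the result to stable storage (step 4), and the
function name and sample identifier are hypothetical.

```python
import hashlib


def next_interface_id(history: bytes, eui64_id: bytes) -> bytes:
    """Derive the next interface identifier from the saved history value."""
    if len(history) != 8 or len(eui64_id) != 8:
        raise ValueError("both inputs are assumed to be 64 bits long")
    # Step 1: append the [ADDRARCH] identifier to the history value.
    # Step 2: compute the MD5 digest over that quantity.
    digest = hashlib.md5(history + eui64_id).digest()
    # Step 3: keep the left-most 64 bits and clear bit 6 (counting the
    # left-most bit as 0), which is mask 0x02 in the first octet.
    ident = bytearray(digest[:8])
    ident[0] &= 0xFD
    # Step 4 is left to the caller: store the result as the new history.
    return bytes(ident)


if __name__ == "__main__":
    eui64_id = bytes.fromhex("02a0c9fffe123456")  # sample identifier
    history = bytes(8)                            # all zeros on first boot
    for boot in range(3):
        history = next_interface_id(history, eui64_id)
        print(boot, history.hex())
```

Because each output feeds the next iteration, an observer who sees only the
identifiers in addresses would need to invert MD5 to link one to the next.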
3.2. In The Absence of Stable Storage
In the absence of stable storage, no history information will be
available to generate a pseudo-random sequence of interface
identifiers. Consequently, identifiers will need to be generated at
random. A number of techniques might be appropriate. Consult [RANDOM]
for suggestions on good sources for obtaining random numbers. Note
that even though a machine may not have stable storage for storing
the previously used interface identifier, it will in many cases
have configuration information that differs from one machine to
another (e.g., user identity, security keys, etc.). One approach to
generating random interface identifiers in such cases is to use the
configuration information to generate some data bits (which may
remain constant for the life of the machine, but will vary from one
machine to another), append some random data and compute the MD5
digest as before. The remaining details for generating addresses
would be analogous to those of the previous section.
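
One possible reading of the approach above is sketched below. It assumes
the operating system's random source (os.urandom) is an acceptable stand-in
for the sources discussed in [RANDOM], and that the per-machine
configuration information is available as a byte string. The function name
and the sample configuration string are hypothetical.

```python
import hashlib
import os


def random_interface_id(config_info: bytes, random_data: bytes = None) -> bytes:
    """Generate an interface identifier without any saved history."""
    if random_data is None:
        # Assumption: the OS random source stands in for [RANDOM] sources.
        random_data = os.urandom(8)
    # Append random data to the per-machine bits and hash as before.
    digest = hashlib.md5(config_info + random_data).digest()
    ident = bytearray(digest[:8])
    ident[0] &= 0xFD  # clear bit 6: identifier has local significance only
    return bytes(ident)


if __name__ == "__main__":
    config = b"hostname=example;user=alice"  # sample per-machine data
    print(random_interface_id(config).hex())
```

The optional random_data parameter exists only so the sketch can be
exercised deterministically; a real node would always draw fresh data.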
4. Implications of Changing Interface Identifiers
The IPv6 addressing architecture goes to great lengths to ensure that
interface identifiers are globally unique. During the IPng
discussions of the GSE proposal [GSE], it was felt that keeping
interface identifiers globally unique in practice might prove useful
to future transport protocols. Usage of the algorithms in this
document would eliminate that future flexibility.
The desires of protecting individual privacy vs. the desire to
effectively maintain and debug a network can conflict with each
other. Having clients use addresses that change over time will make
it more difficult to track down and isolate operational problems. For
example, when looking at packet traces, it could become more
difficult to determine whether one is seeing behavior caused by a
single errant machine, or by a number of them.
5. Open Issues and Future Work
This document specifies that a node generate a new interface
identifier each time it autoconfigures an interface. The same
identifier is used to generate all addresses, including link-local,
site-local and global. However, the concerns this document addresses
are most likely relevant only to global-scope addresses. Thus, it may
make sense for a node to have two interface identifiers, the standard
one [ADDRCONF] used for link-local and site-local addresses, with a
changing one used only for global-scope addresses. This would appear
to require only small changes from the current specification.
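
The two-identifier idea could look roughly like the sketch below, which
picks the standard identifier for link-local and site-local prefixes and
the changing one for everything else. This is an assumption about how such
a rule might be written, not part of the current specification, and the
function name and sample values are hypothetical.

```python
import ipaddress


def pick_interface_id(prefix: str, stable_id: bytes, changing_id: bytes) -> bytes:
    """Choose which identifier to combine with a given prefix."""
    base = ipaddress.IPv6Network(prefix).network_address
    if base.is_link_local or base.is_site_local:
        # Addresses that never leave the site keep the standard identifier.
        return stable_id
    # Global-scope prefixes get the identifier that changes over time.
    return changing_id


if __name__ == "__main__":
    stable = bytes.fromhex("02a0c9fffe123456")    # sample [ADDRARCH] identifier
    changing = bytes.fromhex("1c3f5a7e9b0d2446")  # sample changing identifier
    for prefix in ("fe80::/64", "fec0:0:0:1::/64", "2001:db8:1::/64"):
        print(prefix, pick_interface_id(prefix, stable, changing).hex())
```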
In some cases, one could imagine the need to change an address more
frequently than upon reboot or movement to a new location. For
example, for machines that do not restart for months at a time, one
might change addresses every few days or weeks. In extreme cases, one
might even want to change addresses upon the initiation of each new
TCP connection. Doing frequent changes would appear to add
significant issues and possible implementation complications. For
example, an implementation might need to support a significant number
of addresses on an interface simultaneously. An implementation would
also need to keep track of which addresses were being used so as to
be able to stop using an address once no upper layer protocols are
using it (but not before). This is in contrast to current approaches
where addresses are removed from an interface when they become
invalid [ADDRCONF], independent of whether or not upper layer
protocols are still using them.
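
The bookkeeping described above might be sketched with a simple use count
per address. This is a hypothetical illustration only: the class and method
names are invented here, and a real stack would tie the counts to sockets
or connections rather than to explicit calls.

```python
class AddressTable:
    """Track which addresses on an interface are still in use."""

    def __init__(self):
        self._users = {}       # address -> number of upper-layer users
        self._retired = set()  # addresses no longer offered to new users

    def add(self, addr: str) -> None:
        self._users.setdefault(addr, 0)

    def acquire(self, addr: str) -> None:
        """Record that an upper-layer protocol started using addr."""
        if addr not in self._users or addr in self._retired:
            raise ValueError("address is not available for new use")
        self._users[addr] += 1

    def release(self, addr: str) -> None:
        """Record that an upper-layer protocol stopped using addr."""
        self._users[addr] -= 1
        self._remove_if_idle(addr)

    def retire(self, addr: str) -> None:
        """Stop offering addr; drop it once nothing is using it."""
        self._retired.add(addr)
        self._remove_if_idle(addr)

    def _remove_if_idle(self, addr: str) -> None:
        if addr in self._retired and self._users.get(addr) == 0:
            del self._users[addr]
            self._retired.discard(addr)

    def addresses(self):
        return sorted(self._users)
```

A retired address with an open connection stays on the interface until the
last user releases it, which is the "but not before" condition in the text.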
Some machines serve as both clients and servers. In such cases, the
server would need a DNS name. Whether the address stays fixed or
changes doesn't matter since the DNS name remains constant.
Simultaneously, when acting as a client (e.g., initiating
communication) it may want to vary the address it uses. In such
environments, one might need multiple addresses. Source address
selection rules would need to take into account the policy aspects of
which addresses would be acceptable for use when initiating
communication.
6. Security Considerations
The motivation for this document stems from privacy concerns for
individuals. This document does not appear to add any security issues
beyond those already associated with stateless address
autoconfiguration [ADDRCONF].
7. References
[ADDRARCH] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 2373, July 1998.
[ADDRCONF] Thomson, S. and T. Narten, "IPv6 Address
Autoconfiguration", RFC 2462, December 1998.
[DHCP] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131,
March 1997.
[DDNS] Vixie et al., "Dynamic Updates in the Domain Name System (DNS
UPDATE)", RFC 2136, April 1997.
[DISCOVERY] Narten, T., Nordmark, E. and W. Simpson, "Neighbor
Discovery for IP Version 6 (IPv6)", RFC 2461, December 1998.
[GSE-ANALYSIS] Crawford et al., "Separating Identifiers and Locators
in Addresses: An Analysis of the GSE Proposal for IPv6",
draft-ietf-ipngwg-esd-analysis-04.txt.
[IPSEC] Kent, S., Atkinson, R., "Security Architecture for the
Internet Protocol", RFC 2401, November 1998.
[MD5] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April
1992.
[SERIALNUM] Moore, K., "Privacy Considerations for the Use of
Hardware Serial Numbers in End-to-End Network Protocols",
draft-iesg-serno-privacy-00.txt.
8. Authors' Addresses
Thomas Narten
IBM Corporation
P.O. Box 12195
Research Triangle Park, NC 27709-2195
USA
Phone: +1 919 254 7798
EMail: narten@raleigh.ibm.com