MSN Cookie Data Crosses Domains
And, MSN GUIDs Are Accessible to Anyone
Thursday, 31 August 2000
Updated Sunday, 3 September 2000
Utilized By Numerous Microsoft Domains
Denies Access Without MSN Identifier
Violates "Trusted Zone" Settings
When I visited this link today, http://www.linkexchange.com, I found that the connection results in a fascinating series of events. I'm sure this isn't news to many hard-core techies, but I think it's well worth explaining for the benefit of all. And I think it raises some questions that ought to be asked of Microsoft about at least one of their servers.
Re-Re-Re-direction
The initial link to www.linkexchange.com immediately results in a redirection to another domain, which is, for the moment, invisible to the user. The URL is: http://www.bcentral.com/?leindex
This connection in turn responds with an interesting cookie. Here's the "raw" cookie as it's returned in the http header:
Set-Cookie: CheckCookieTest=1; ; expires=Sat, 04-Oct-2003 00:00:00 GMT; path=/
Note it's explicitly a test cookie. Meanwhile, the link still displays nothing to the user, and it redirects the browser to yet another URL, which is: http://msid.msn.com/mps_id_sharing/redirect.asp?www.bcentral.com/?leindex
Now, note that this link is to a machine on yet another entirely different domain: MSN.COM.
The response from msid.msn.com does two things. It sets this cookie:
Set-Cookie: MC1=V=2&GUID=F4DF22E57F5F11D4A7FC00805F7786DE; expires=Sat, 04-Oct-2003 19:00:00 GMT; domain=.msn.com; path=/
...and it redirects to yet another URL, which is the same as the first redirection, but with a shiny new unique identifier appended: http://www.bcentral.com/?leindex&newguid=F4DF22E57F5F11D4A7FC00805F7786DE
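The match between the cookie and the redirect URL can be checked mechanically. What follows is only a small sketch in Python, using the two strings quoted above as input; the helper name parse_set_cookie is my own invention for illustration, not part of any server or browser.

```python
from urllib.parse import urlsplit, parse_qs

def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes).

    Hypothetical helper for illustration; it handles only the simple
    'name=value; attr=value; ...' shape seen in the headers above.
    """
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")
    attrs = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        attrs[key.strip().lower()] = val.strip()
    return name, value, attrs

# The two strings exactly as they appeared in the msid.msn.com response.
set_cookie = ("MC1=V=2&GUID=F4DF22E57F5F11D4A7FC00805F7786DE; "
              "expires=Sat, 04-Oct-2003 19:00:00 GMT; domain=.msn.com; path=/")
location = ("http://www.bcentral.com/?leindex"
            "&newguid=F4DF22E57F5F11D4A7FC00805F7786DE")

name, value, attrs = parse_set_cookie(set_cookie)

# The cookie value is itself shaped like a query string: V=2&GUID=...
cookie_guid = parse_qs(value)["GUID"][0]

# The redirect target carries the same identifier as 'newguid'.
url_guid = parse_qs(urlsplit(location).query)["newguid"][0]

print(name, attrs["domain"], cookie_guid == url_guid)
```

The cookie is scoped to .msn.com, yet the same 32-character value is riding along in a URL addressed to www.bcentral.com.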
The ID That Keeps On Giving
At long last, the browser is sent a displayable page. This page appears to be the bCentral.com main home page.
The ordinary user, who thought he was going to LinkExchange.com, probably sees the bCentral.com name in the Location indicator, but the LinkExchange name is prominently displayed as well. Nothing seems particularly amiss.
The user doesn't know it, but he's just visited MSN.COM and received a cookie from MSN.COM. Not only that, but MSN.COM has passed that cookie's unique identifier back to www.bCentral.com.
The bCentral.com server, as it delivers that viewable page, sets a new cookie -- containing the MSN identifier! If his cookies are enabled, the result is cookies stored on the user's machine for both sites using the same identifier. Both servers now have corresponding tracking records and the same identifier.
The most interesting thing about all this is that it accomplishes cross-domain exchange of cookie information. Cookies ordinarily don't get sent back to any but the originating domain. This mechanism of redirects allows cookie data to be carried invisibly from one domain to another, and for matching cookies to be created. It is a very clever technique.
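To make the mechanism concrete, here is a toy simulation in Python. It is a sketch under my own assumptions, not Microsoft's code: the functions id_server, site_server and browse are hypothetical, the cookie jar is a plain dictionary keyed by domain, and the test cookie and the final "clean URL" redirect are left out. Each simulated server touches only its own domain's cookies, just as a browser would enforce.

```python
import uuid
from urllib.parse import urlsplit, parse_qs

ID_SERVER = "http://msid.msn.com/mps_id_sharing/redirect.asp"

def id_server(own_cookies, target):
    """Hypothetical ID server: reuse or mint a GUID, then redirect back."""
    guid = own_cookies.get("GUID")
    if guid is None:
        guid = uuid.uuid4().hex.upper()   # stand-in for the real generator
        own_cookies["GUID"] = guid        # cookie on the ID server's domain
    return "http://" + target + "&newguid=" + guid

def site_server(own_cookies, url):
    """Hypothetical site server: demand a GUID, then copy it to a cookie."""
    guid = parse_qs(urlsplit(url).query).get("newguid", [None])[0]
    if guid is None:
        # No identifier in the URL: bounce the browser through the ID server.
        return ID_SERVER + "?" + url.split("://", 1)[1]
    own_cookies["GUID"] = guid            # cookie on the site's own domain
    return None                           # None means "serve the page"

def browse(url, jar):
    """Follow redirects the way a browser would, recording each hop."""
    hops = [url]
    while url is not None:
        if urlsplit(url).hostname == "msid.msn.com":
            url = id_server(jar.setdefault(".msn.com", {}),
                            url.split("?", 1)[1])
        else:
            url = site_server(jar.setdefault(".bcentral.com", {}), url)
        if url is not None:
            hops.append(url)
    return hops

jar = {}
hops = browse("http://www.bcentral.com/?leindex", jar)
print(len(hops), jar[".msn.com"]["GUID"] == jar[".bcentral.com"]["GUID"])
```

Neither simulated server ever reads the other's cookies. The identifier crosses domains only inside the redirect URL, which is the whole trick.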
An important aspect of this is its invisibility. Any ordinary web browser follows the trail it's forced to follow by the redirections, displaying nothing, while the user is none the wiser.
A user who's merely checking the HTML source of the pages he's visiting will see no indication of this exchange. Furthermore, caching is disabled by meta tags and by http headers at strategic points, so the user's browser cache doesn't retain any evidence of what was done. Only the matching cookies remain to attest to the data exchange. Who checks their cookies for matching data across domains?
The trick utilizes two different types of redirection. The redirect from bCentral to msn.com is announced in the header thusly:
HTTP/1.1 302 Redirect
The one from MSN back to bCentral has this header:
HTTP/1.1 302 Object moved
These two redirection methods don't behave in quite the same way. The reason for their use becomes clear on inspection of Referers.
The Referer is the URL of the current web page being viewed. It is sent along with most browser requests, thus informing the receiving server of the source of the link that's being followed. In Netscape browsers, Referers can be turned off using a rather obscure trick; but in Internet Explorer, there is no option whatsoever to block Referers. While the Referer has a number of practical uses, it's also a powerful tool for tracking Web users.
Careful examination reveals that, when fetching page components (images, etc.) for the bCentral home page, the browser is continually directed back through the MSN server. The Referer carries right through the Redirect, delivering its unique GUID back to the MSN server; whereas the "Object Moved" type redirection appears not to carry a Referer through its request for the "moved" page.
Particularly with cookies disabled, the Referer becomes vitally important to the MSN server's cross-domain data sharing. The numeric identifier appears in the Referer with every request for every page component, including those to ad servers on other domains.
However, if cookies are enabled, the GUID is absent from the Referer. This is because the bCentral.com server, having verified cookies are working by way of its temporary test cookie, and having set a new cookie containing the MSN GUID, then does an additional redirect to its own URL -- sans-GUID.
When a link that leads to any page on bCentral is followed by the user, the same behavior repeats. All requests follow the same path via the MSN machine. The Referer is passed back and forth as needed, and the servers continue to share the same GUID.
So, if the browser allows either cookies or Referers, the cross-domain correlation is consistent. The GUID remains unchanged, even when a new page on bCentral.com is requested.
If cookies are enabled, the GUID is persistent indefinitely. It will continue to identify the user in all subsequent visits to either domain.
Because the GUID is part of all URLs on the bCentral.com domain whenever cookies are disabled, it will also be included in any bookmarks (favorites) or shortcuts the user may create. Thus, the unique identifier may often survive across sessions despite the lack of cookies.
There's additional ad-related tracking too. Along with the cookies from MSN and bCentral that are set when the main page is accessed, a cookie is set by LinkExchange.com. Cookies are also set by at least two additional domains, because ad images are retrieved from their servers.
I ran a number of tests to confirm the mechanism. For instance, using a fresh new default installation of Netscape 4.7 with cookies enabled, I visited the MSN home page and set a few custom options. Subsequent visits produced my local weather, my stock picks, and so forth. MSN had set a GUID along with a number of other cookies to retain this configuration. All subsequent visits to MSN and to bCentral.com used the same GUID. This demonstrates that the GUID that's being set and/or utilized by msid.msn.com and passed to other domains is the very same one MSN uses routinely to identify its portal users.
These are the cookies left in Netscape's cookies.txt file after one of my tests. Note the identical GUID in the MSN.com and bCentral.com cookies.
.msn.com TRUE / FALSE 1065294000 MC1 V=2&GUID=3213FFC77F7211D498880008C7D9E3DB
.bcentral.com TRUE / FALSE 1065225600 MC1 V=2&GUID=3213FFC77F7211D498880008C7D9E3DB
.linkexchange.com TRUE / FALSE 1125516028 LE_COOKIE 1.2a.09aac48157e4b91a580deb5258bbf2e7c387e6df81e11ade08ddd1ba9b1dc4bbe043664e73357f91801773c5b271cdf0f62a5d372ba8b9f1
.focalink.com TRUE / FALSE 1293796800 SB_ID 096774985400004255221929384950
.avenuea.com TRUE / FALSE 1283040000 AA002 00967749864-152731125/968959464
(Something else I found to be an interesting puzzle: the closely similar strings of digits in the last two cookies [967749854 and 967749864]. Those numbers appear to encode the time of my visits -- expressed as the number of seconds since January 1, 1970 -- Greenwich time, I do believe.)
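The seconds-since-1970 reading is easy to test. This short Python sketch converts the two digit strings, plus the expiry field of the MSN cookie from the same listing, to UTC; the helper name to_utc is mine.

```python
from datetime import datetime, timezone

def to_utc(seconds):
    """Interpret an integer as seconds since 1 January 1970, in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()

# Digits embedded in the focalink.com and avenuea.com cookie values.
print(to_utc(967749854))    # 2000-08-31T19:24:14+00:00
print(to_utc(967749864))    # 2000-08-31T19:24:24+00:00

# Expiry field of the .msn.com cookie in the same cookies.txt listing.
print(to_utc(1065294000))   # 2003-10-04T19:00:00+00:00
```

Both embedded values fall on 31 August 2000, ten seconds apart, which fits the timestamp interpretation. The expiry field decodes to the same 4 October 2003, 19:00 GMT moment given in the raw MC1 Set-Cookie header quoted earlier.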
I found that if cookies are not enabled, there are huge numbers of attempts to set a cookie from the bCentral and MSN servers. In two test browsing sessions which were identical except for cookie settings, there were 133 set-cookie headers delivered while browser cookies were disabled; and only 7 with cookies enabled.
While nothing overtly nefarious is happening in this particular instance (MSN owns LinkExchange and bCentral, no secret is made of the connections between the companies and domains, and personal data wasn't involved in this case), the technique is certainly of huge interest. It can pass cookie-type data (in other words, practically any data) across domains without any notice whatsoever to the ordinary user that he ever contacted some other domain.
The only thing the user sees in this case is that surfing to www.linkexchange.com led his browser to www.bCentral.com. He sees MSN.com's presence -- it's overtly advertised -- but he has no idea he visited an MSN server.
Utilized By Numerous Microsoft Domains
LinkExchange and bCentral are not unique. A brief examination reveals consistent use of the "MSID" server by at least two other major Microsoft-controlled domains.
The most prominent is MSNBC, where accessing practically any page or article, including the main home page, forces the same invisible redirections through the MSN ID server. Again, the GUID is passed by the same technique and an MSNBC cookie is created, containing the same GUID as the MSN cookie.
Also, links throughout the pages encountered by users of HotMail.com lead ultimately to the MSN ID server.
Denies Access Without MSN Identifier
An aspect of this which is particularly vexing is the fact that the bCentral server constantly redirects almost all requests through the MSN server if they do not carry the GUID tag. The effect is that any effort to thwart the "tagging" by blocking the server selectively results in denial of access to the bCentral.com website. Blocking redirects has the same effect. It amounts to "let us track you or go away." But they don't say this to the sites' visitors; they merely make it an unavoidable fact. You will contact the MSN ID server or else you will not access any bCentral or MSNBC page that requires it.
Violates "Trusted Zone" Settings
Savvy Internet Explorer users often use IE's "Trusted Zone" options to help provide protection against cookies and other intrusions. Users can browse the Net at large with very tight security settings, while allowing the convenience of cookies and active content on sites they believe they can trust. As a result, millions of users have cookies disabled for ordinary browsing, but enabled in their Trusted Zone.
This data-passing tactic allows Microsoft to take undue advantage of those users (a huge number of them) who have placed MSN.com in their Trusted Zone. The fact that a trusted domain is in the data-sharing "loop" means the GUID will be retained indefinitely via the MSN cookie. Microsoft can reliably track those users on its other enterprises' domains using their MSN GUID.
(Many thanks to Milly for pointing out this fascinating fact! It provides one very likely explanation for Microsoft's decision to use such a sneaky trick.)
GUID, Anyone?
Finally, I performed one more little experiment, which led to a very significant result. I created and tried this URL, pointing to a page on my own site:
http://msid.msn.com/mps_id_sharing/redirect.asp?www.pc-help.org/trace.htm?ID=Hi_There
The MSN machine, obviously running a very simple script, obliged; creating its unique identifier and directing the browser to the page I had specified, with the GUID appended. As a result, I visited my own page using this URL:
http://www.pc-help.org/trace.htm?ID=Hi_There&newguid=3213FFC77F7211D498880008C7D9E3DB
The identifier was of course logged by my server as part of the http request string.
Since the MSN server returns the ID found in pre-existing cookies, anyone, anywhere can create links to his own pages which will deliver visiting users' MSN GUIDs to his own server.
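The two halves of that experiment can be written down in a few lines of Python. This is a sketch based only on the URLs shown above: the function names are hypothetical, and the URL pattern is what I observed, not a documented interface.

```python
from urllib.parse import urlsplit, parse_qs

# Address of the redirect script as observed in the URLs above.
ID_SERVER = "http://msid.msn.com/mps_id_sharing/redirect.asp"

def make_link(target):
    """Build a link through the ID server.

    'target' is host, path and query with no scheme, following the
    pattern of the observed URLs (an assumption, not a specification).
    """
    return ID_SERVER + "?" + target

def guid_from_request(request_string):
    """Pull the 'newguid' value out of a logged request string, if any."""
    values = parse_qs(urlsplit(request_string).query).get("newguid")
    return values[0] if values else None

link = make_link("www.pc-help.org/trace.htm?ID=Hi_There")
logged = "/trace.htm?ID=Hi_There&newguid=3213FFC77F7211D498880008C7D9E3DB"

print(link)
print(guid_from_request(logged))
```

Nothing here requires any cooperation from the visitor or any special access to the MSN machine. An ordinary server log is enough.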
I don't know precisely how many of Microsoft's servers may behave this way, nor whether this practice is widespread on the Web. But to the degree that such identifiers might lead to personal information, this indiscriminate handing-out of GUIDs could have very undesirable consequences to users' privacy.
Notice: I would very much appreciate it if some of those reading this article would examine their IE and Netscape cookies for the duplicate GUIDs described here. I would like to hear from individuals who find that these "cross-domain cookies" have been placed on their system during normal browsing without their knowledge, and who are willing to provide me with copies of their cookie files.
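For readers who would rather not compare cookies by eye, the following Python sketch reports any GUID that appears under more than one domain in a Netscape-style cookies.txt. It assumes seven whitespace-separated fields per line (domain, flag, path, secure, expiry, name, value) and a value shaped like V=2&GUID=..., as in my test file above; the function name shared_guids is my own. IE stores its cookies differently, so this applies to the Netscape file only.

```python
from collections import defaultdict
from urllib.parse import parse_qs

def shared_guids(cookie_lines):
    """Map each GUID seen under two or more domains to those domains."""
    domains_by_guid = defaultdict(set)
    for line in cookie_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comment lines
        fields = line.split(None, 6)      # domain flag path secure expiry name value
        if len(fields) != 7:
            continue                      # not a cookie record
        domain, value = fields[0], fields[6]
        for guid in parse_qs(value).get("GUID", []):
            domains_by_guid[guid].add(domain)
    return {g: sorted(d) for g, d in domains_by_guid.items() if len(d) > 1}

# Three of the records from my own test, as listed earlier in this article.
sample = [
    ".msn.com TRUE / FALSE 1065294000 MC1 V=2&GUID=3213FFC77F7211D498880008C7D9E3DB",
    ".bcentral.com TRUE / FALSE 1065225600 MC1 V=2&GUID=3213FFC77F7211D498880008C7D9E3DB",
    ".focalink.com TRUE / FALSE 1293796800 SB_ID 096774985400004255221929384950",
]

print(shared_guids(sample))
```

To check a real file, pass the open file object instead of the sample list, for example shared_guids(open("cookies.txt")). The file's location varies by installation.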