Cookies Are One Piece of a Larger Puzzle
There has been an odd preoccupation with cookies for some time now, to the exclusion of other forms of browser tracking, some of which are much more flexible and more robust in their data collection capabilities than cookies. Even so, these non-cookie tracking technologies are often not referenced in privacy policies and cookie policies, even though they are used to “store information” and / or “gain access to information stored in the terminal equipment” for purposes of the ePrivacy Directive, and the data they hold will presumably qualify as personal information under the CCPA as well.
LocalStorage and Cookies—The Similarities
Cookies are typically sent by a remote host (a server on the Internet) and stored in the user’s browser. The end-user’s browser then transmits them back to the remote host whenever an HTTP request is subsequently made to that host. HTML5 localStorage is a more flexible form of persistent data storage in browsers (e.g. for storing such things as tracking IDs, location, preferences, purchase history and the like). It is part of the “Web Storage” specification created by the W3C standards body and, like cookies, stores data persistently in “name-value pairs” in the browser.
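To make the “name-value pair” similarity concrete, here is a minimal sketch in TypeScript. It assumes a small in-memory stand-in for localStorage (createMemoryStore, a hypothetical helper) so that it runs outside a browser, and the names visitor_id and lang are illustrative rather than taken from any real site.

```typescript
// Shape shared by this sketch's stand-in and the browser's localStorage.
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in (an assumption for illustration) so the sketch runs
// outside a browser. In a page, window.localStorage plays this role.
function createMemoryStore(): StringStore {
  const items: { [key: string]: string } = {};
  return {
    getItem(key: string): string | null {
      return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
    },
    setItem(key: string, value: string): void {
      items[key] = String(value);
    },
  };
}

// Split a Cookie request header such as "a=1; b=2" into name-value pairs.
function parseCookieHeader(header: string): { [pairName: string]: string } {
  const pairs: { [pairName: string]: string } = {};
  const parts = header.split(";");
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i];
    const separator = part.indexOf("=");
    if (separator === -1) continue; // skip fragments without a value
    pairs[part.slice(0, separator).trim()] = part.slice(separator + 1).trim();
  }
  return pairs;
}

// The same hypothetical tracking ID, held first as a cookie and then in storage.
const cookiePairs = parseCookieHeader("visitor_id=abc123; lang=en");
const store = createMemoryStore();
store.setItem("visitor_id", "abc123");
store.setItem("lang", "en");

console.log(cookiePairs["visitor_id"], store.getItem("visitor_id"));
```

In both cases the tracking ID is looked up by name. What differs is how the data travels: the cookie string is what the browser would send with a request, while the stored value stays in the browser until a script reads it.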
LocalStorage and Cookies—The Differences
There are, however, some important differences. Unlike HTTP cookies, localStorage can store much larger quantities of data: a single cookie is typically limited to about 4 kilobytes, whereas localStorage typically allows around 5 megabytes per origin, depending on the browser. LocalStorage can easily be used to store data like calendar invites, a series of long messages, shopping history and other potentially sensitive information in relatively large amounts compared to cookies. LocalStorage is also more flexible insofar as a remote host’s localStorage contents are not transmitted from the end-user’s browser with every network request, but are instead accessed only when needed through JavaScript. Lastly, although the format of storage is the “name-value” pair, each value is a string that can itself contain “nested” names and values, creating a data structure that is more difficult for the layperson to untangle than cookies. (For technical readers, the nested data is typically serialized as JSON.)
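The nesting and size points can be illustrated with a short sketch. It assumes an in-memory stand-in for localStorage, a hypothetical VisitorProfile shape, and an approximate 4096-character per-cookie limit. Actual limits vary by browser, and the sketch counts characters rather than bytes.

```typescript
// Minimal in-memory stand-in for localStorage (an assumption for illustration).
const storedItems: { [key: string]: string } = {};
const store = {
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(storedItems, key) ? storedItems[key] : null;
  },
  setItem(key: string, value: string): void {
    storedItems[key] = String(value);
  },
};

// Approximate per-cookie size limit, in characters; an assumption, not a spec value for every browser.
const APPROX_COOKIE_LIMIT = 4096;

// Hypothetical nested record of the kind a site might keep about a visitor.
interface VisitorProfile {
  visitorId: string;
  preferences: { region: string; newsletter: boolean };
  messages: string[];
}

const visitorProfile: VisitorProfile = {
  visitorId: "abc123",
  preferences: { region: "EU", newsletter: true },
  messages: [],
};
for (let i = 0; i < 100; i++) {
  visitorProfile.messages.push("Message " + i + ": a fairly long note kept for later reading.");
}

// Storage values are strings, so the nested object is serialized to JSON first.
const serialized = JSON.stringify(visitorProfile);
store.setItem("profile", serialized);

// Reading it back requires parsing the JSON string under the single name "profile".
const restored: VisitorProfile = JSON.parse(store.getItem("profile") || "{}");

console.log(serialized.length > APPROX_COOKIE_LIMIT, restored.preferences.region);
```

A reviewer who looks only at the list of names would see a single entry, “profile”, while the value behind it holds an identifier, preferences and a hundred messages.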
The Potential Privacy Issues With LocalStorage
The potential privacy and security issues associated with persistent, flexible, high-density storage are not lost on the W3C in its standards document:
“A third-party advertiser (or any entity capable of getting content distributed to multiple sites) could use a unique identifier stored in its local storage area to track a user across multiple sessions, building a profile of the user’s interests to allow for highly targeted advertising. In conjunction with a site that is aware of the user’s real identity (for example an e-commerce site that requires authenticated credentials), this could allow oppressive groups to target individuals with greater accuracy than in a world with purely anonymous Web usage.” See https://www.w3.org/TR/webstorage/#privacy.
Often Excluded From Cookie Policies and Privacy Policies
So the bottom line is that localStorage is “like” cookies, except more powerful. Notwithstanding this fact, few websites that use localStorage disclose that use in their cookie policies or privacy policies. There are several reasons for this. First, there is a lack of education, even among privacy professionals, regarding arcane aspects of persistent browser storage that may trigger relevant privacy laws. Second, the technical demands of cataloging HTML5 localStorage on websites outstrip the capabilities of the cookie scanning solutions available to companies. Third, in the United States, one can persuasively argue that companies did not have to worry about this type of thing until the CCPA was passed. Lastly, even if companies could catalog their own use of localStorage by looking at their own first-party JavaScript, they do not have access to the third-party JavaScript served in real time during browser sessions, because this happens entirely on the client side (i.e. intermediated entirely by the user’s browser).
What to Do
Cataloging HTML5 localStorage involves using a modified browser to track the execution of JavaScript commands in real time. This allows the cataloger to know on which web pages localStorage events occur, which JavaScript code is responsible, and what data is stored or accessed. This has been done in the research context by Dr. Steven Englehardt (see https://senglehardt.com/papers/openwpm_03-2015.pdf) but does not appear to be addressed in the commercial context. As a result, for NT Analyzer, the testing and analysis platform we use with clients, we rely in part on tools created in the university context in order to isolate and identify localStorage data at scale.
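The general idea of recording localStorage events can be sketched by wrapping the read and write methods of a storage object. This is a simplified illustration under stated assumptions, not the instrumentation used in the cited research or in NT Analyzer. The names instrumentStore and StorageAccessRecord, the in-memory store and the example page URL are all hypothetical.

```typescript
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// One logged event: what was done, on which page, and the call stack at that moment.
interface StorageAccessRecord {
  operation: "get" | "set";
  key: string;
  value: string | null;
  page: string;
  stack: string;
}

// In-memory stand-in for localStorage so the sketch runs outside a browser.
function createMemoryStore(): StringStore {
  const items: { [key: string]: string } = {};
  return {
    getItem(key: string): string | null {
      return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
    },
    setItem(key: string, value: string): void {
      items[key] = String(value);
    },
  };
}

// Replace getItem/setItem with wrappers that log each call before delegating.
// In a real browser, the comparable step would patch Storage.prototype instead.
function instrumentStore(target: StringStore, page: string, log: StorageAccessRecord[]): void {
  const originalGet = target.getItem;
  const originalSet = target.setItem;
  target.getItem = function (key: string): string | null {
    const value = originalGet.call(target, key);
    // The stack trace hints at which script made the call.
    log.push({ operation: "get", key: key, value: value, page: page, stack: new Error().stack || "" });
    return value;
  };
  target.setItem = function (key: string, value: string): void {
    log.push({ operation: "set", key: key, value: value, page: page, stack: new Error().stack || "" });
    originalSet.call(target, key, value);
  };
}

const accessLog: StorageAccessRecord[] = [];
const store = createMemoryStore();
instrumentStore(store, "https://example.com/checkout", accessLog);

// Simulated third-party script activity.
store.setItem("visitor_id", "abc123");
store.getItem("visitor_id");

for (let i = 0; i < accessLog.length; i++) {
  console.log(accessLog[i].operation, accessLog[i].key, accessLog[i].value, accessLog[i].page);
}
```

Each record captures the three things described above: the page on which the event occurred, a stack trace pointing toward the responsible code, and the data stored or accessed.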