Francois Marier: How Tracking Protection works in Firefox
Firefox 42, which was released last week, introduced a new feature in its Private Browsing mode: tracking protection.
If you are interested in how the blocklist behind this feature is put together and then used in Firefox, this post is for you.
There are many possible ways to download a URL list to the browser and check against it before loading anything. One of them is already implemented as part of our malware and phishing protection: the Safe Browsing v2.2 protocol.
In a nutshell, each URL on the block list is hashed (using SHA-256) and then that list of hashes is downloaded by Firefox and stored in a data structure on disk:
~/.cache/mozilla/firefox/XXXX/safebrowsing/mozstd-track* on Linux
~/Library/Caches/Firefox/Profiles/XXXX/safebrowsing/mozstd-track* on Mac
C:\Users\XXXX\AppData\Local\mozilla\firefox\profiles\XXXX\safebrowsing\mozstd-track* on Windows
This sbdbdump script can be used to extract the hashes contained in these files and will output something like this:
$ ~/sbdbdump/dump.py -v .
- Reading sbstore: mozstd-track-digest256
[mozstd-track-digest256] magic 1231AF3B Version 3 NumAddChunk: 1 NumSubChunk: 0 NumAddPrefix: 0 NumSubPrefix: 0 NumAddComplete: 1696 NumSubComplete: 0
[mozstd-track-digest256] AddChunks: 1445465225
[mozstd-track-digest256] SubChunks:
...
[mozstd-track-digest256] addComplete[chunk:1445465225] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5
...
[mozstd-track-digest256] MD5: 81a8becb0903de19351427b24921a772
The name of the blocklist being dumped here (mozstd-track-digest256) is set in the urlclassifier.trackingTable preference, which you can find in about:config. The most important part of the output shown above is the addComplete line, which contains a hash that we will see again in a later section.
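To work with that output programmatically, the following is a minimal sketch that pulls the addComplete hashes out of the dump text. The regular expression is inferred only from the sample output shown above, so it is an assumption about the format rather than something taken from sbdbdump itself, and `parse_add_complete` is a made-up helper name.

```python
import re

# Pattern inferred from the sample output above (assumption, not sbdbdump's spec).
ADD_COMPLETE = re.compile(
    r"^\[(?P<table>[^\]]+)\] addComplete\[chunk:(?P<chunk>\d+)\] "
    r"(?P<hash>[0-9a-f]{64})$"
)


def parse_add_complete(dump_text):
    """Return a {table name: set of hex hashes} mapping from dump output."""
    tables = {}
    for line in dump_text.splitlines():
        match = ADD_COMPLETE.match(line.strip())
        if match:
            tables.setdefault(match.group("table"), set()).add(match.group("hash"))
    return tables


SAMPLE = """\
[mozstd-track-digest256] AddChunks: 1445465225
[mozstd-track-digest256] addComplete[chunk:1445465225] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5
[mozstd-track-digest256] MD5: 81a8becb0903de19351427b24921a772
"""

hashes_by_table = parse_add_complete(SAMPLE)
```

Lines that do not look like addComplete entries (chunk lists, the MD5 line, the elided parts) are simply skipped.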
Once it's time to load a resource, Firefox hashes the URL, as well as a few variations of it, and then looks for it in the local lists.
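As a rough illustration of what "a few variations" can look like, the sketch below builds host-suffix and path-prefix combinations in the style of the Safe Browsing protocol and hashes each one with SHA-256. It is a simplified assumption, not Firefox's actual code: real canonicalization also deals with escaping, IP addresses and other edge cases, and the names `host_suffixes`, `path_prefixes`, `lookup_keys` and `lookup_hashes` are invented for this example.

```python
import hashlib
from urllib.parse import urlsplit


def host_suffixes(host):
    """Exact host plus up to four shorter suffixes (never the bare TLD)."""
    parts = host.split(".")
    suffixes = [host]
    # Start from at most the last five components, then drop the leading one.
    start = max(1, len(parts) - 5)
    for i in range(start, len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate not in suffixes:
            suffixes.append(candidate)
    return suffixes


def path_prefixes(path, query, max_prefixes=4):
    """Exact path (with and without query) plus up to four directory prefixes."""
    variants = []
    if query:
        variants.append(path + "?" + query)
    variants.append(path)
    # Walk down from the root, one directory at a time.
    prefixes = ["/"]
    prefix = "/"
    for segment in [s for s in path.split("/") if s][:-1]:
        prefix += segment + "/"
        prefixes.append(prefix)
    for candidate in prefixes[:max_prefixes]:
        if candidate not in variants:
            variants.append(candidate)
    return variants


def lookup_keys(url):
    """All host/path combinations that would be looked up for this URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    return [h + p for h in host_suffixes(host) for p in path_prefixes(path, parts.query)]


def lookup_hashes(url):
    """SHA-256 hex digests of every lookup key."""
    return [hashlib.sha256(k.encode("utf-8")).hexdigest() for k in lookup_keys(url)]
```

For a URL such as https://trackertest.org/tracker.js, this yields the keys trackertest.org/tracker.js and trackertest.org/, each of which would then be hashed and compared against the local list.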
If there's no match, then the load proceeds. If there's a match, then we do an additional check against a pairwise allowlist.
The pairwise allowlist (hardcoded in the urlclassifier.trackingWhitelistTable pref) is designed to encode what we call "entity relationships". The list groups related domains together for the purpose of checking whether a load is first or third party (e.g. twitter.com and twimg.com both belong to the same entity).
Entries on this list (named mozstd-trackwhite-digest256) look like this:
twitter.com/?resource=twimg.com
which translates to "if you're on the twitter.com site, then don't block resources from twimg.com".
If there's a match on the second list, we don't block the load. It's only when we get a match on the first list and not the second one that we go ahead and cancel the network load.
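The two-list decision can be summarised in a few lines. This is only a sketch under simplifying assumptions: it works on bare domains rather than full URLs and their variations, it assumes both lists store plain SHA-256 hex digests of the entry strings shown in this post, and `digest` and `should_block` are hypothetical names.

```python
import hashlib


def digest(entry):
    """SHA-256 hex digest of a list entry (assumed storage format)."""
    return hashlib.sha256(entry.encode("utf-8")).hexdigest()


def should_block(page_domain, resource_domain, blocklist, allowlist):
    """Block only when the resource is on the blocklist and the pair is not allowlisted."""
    if digest(resource_domain + "/") not in blocklist:
        return False  # no match on the first list: the load proceeds
    pair = f"{page_domain}/?resource={resource_domain}"
    return digest(pair) not in allowlist  # a match on the second list cancels the block


# Toy lists built from the example entries used in this post.
blocklist = {digest("twimg.com/")}
allowlist = {digest("twitter.com/?resource=twimg.com")}

on_twitter = should_block("twitter.com", "twimg.com", blocklist, allowlist)
elsewhere = should_block("example.com", "twimg.com", blocklist, allowlist)
```

Here `on_twitter` comes out as `False` because the pair is on the allowlist, while `elsewhere` is `True` because only the blocklist matches.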
If you visit our test page, you will see tracking protection in action with a shield icon in the URL bar. Opening the developer tools console will reveal the URL of the resource that was blocked:
The resource at "https://trackertest.org/tracker.js" was blocked because tracking protection is enabled.
The blocklist is created by Disconnect according to their definition of tracking.
The Disconnect list is on their GitHub page, but the copy we use in Firefox is the one in our own repository. Similarly, the Disconnect entity list is maintained upstream by Disconnect, but the copy we use is in our repository. Should you wish to be notified of any changes to the lists, you can simply subscribe to this Atom feed.
To convert this JSON-formatted list into the binary format needed by the Safe Browsing code, we run a custom list generation script whenever the list changes on GitHub.
If you run that script locally using the same configuration as our server stack, you can see the conversion from the original list to the binary hashes.
Here's a sample entry from the mozstd-track-digest256.log file:
[m] twimg.com >> twimg.com/
[canonicalized] twimg.com/
[hash] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5
and one from mozstd-trackwhite-digest256.log:
[entity] Twitter >> (canonicalized) twitter.com/?resource=twimg.com, hash a8e9e3456f46dbe49551c7da3860f64393d8f9d96f42b5ae86927722467577df
This, in combination with the sbdbdump script mentioned earlier, will allow you to audit the contents of the local lists.
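One way such an audit could be scripted is sketched below. It reads the `[canonicalized]` and `[hash]` lines of the blocklist log, recomputes SHA-256 over each canonicalized entry, and checks that the logged hash also appears in a set of hashes extracted from the local files. The log layout is assumed from the single sample entry above, the check assumes the stored value is the plain SHA-256 of the canonicalized string, and `parse_log` and `audit` are illustrative names, so treat this as a starting point rather than a verified tool.

```python
import hashlib


def parse_log(log_text):
    """Yield (canonicalized entry, logged hash) pairs from a blocklist log."""
    entry = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("[canonicalized] "):
            entry = line[len("[canonicalized] "):]
        elif line.startswith("[hash] ") and entry is not None:
            yield entry, line[len("[hash] "):]
            entry = None


def audit(log_text, local_hashes):
    """Return (entry, problem) tuples; an empty list means everything matched."""
    problems = []
    for entry, logged in parse_log(log_text):
        recomputed = hashlib.sha256(entry.encode("utf-8")).hexdigest()
        if recomputed != logged:
            problems.append((entry, "logged hash is not the SHA-256 of the entry"))
        elif logged not in local_hashes:
            problems.append((entry, "hash missing from the local list"))
    return problems
```

The `local_hashes` argument would be the set of addComplete hashes obtained from the sbdbdump output for the same table.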
The way that the binary lists are served to Firefox is through a custom server component written by Mozilla: shavar.
Every hour, Firefox requests updates from shavar.services.mozilla.com. If new data is available, then the whole list is downloaded again. Otherwise, all it receives in return is an empty 204 response.
Should you want to play with it and run your own server, follow the installation instructions and then go into about:config to change these preferences to point to your own instance:
browser.trackingprotection.gethashURL
browser.trackingprotection.updateURL
Note that in Firefox 43 and later, these prefs have been renamed to:
browser.safebrowsing.provider.mozilla.gethashURL
browser.safebrowsing.provider.mozilla.updateURL
If you want to learn more about how tracking protection works in Firefox, you can find all of the technical details on the Mozilla wiki or you can ask questions on our mailing list.
Thanks to Tanvi Vyas for reviewing a draft of this post.
http://feeding.cloud.geek.nz/posts/how-tracking-protection-works-in-firefox/