Mozilla Privacy Blog: The new EU digital strategy: A good start, but more to be done |
In a strategy and two white papers published today, the Commission has laid out its vision for the next five years of EU tech policy: achieving trust by fostering technologies working for people, a fair and competitive digital economy, and a digital and sustainable society. This vision includes big ambitions for content regulation, digital competition, artificial intelligence, and cybersecurity. Here we give some recommendations on how the Commission should take it forward.
We welcome this vision the Commission sketches out and are eager to contribute, because the internet today is not what we want it to be. A rising tide of illegal and harmful content, the pervasiveness of the surveillance economy, and increased centralisation of market power have damaged the internet’s original vision of openness. We also believe that innovation and fundamental rights are complementary and should always go hand in hand – a vision we live out in the products we build and the projects we take on. If built on carefully, the strategy can provide a roadmap to address the many challenges we face, in a way that protects citizens’ rights and enhances internet openness.
However, it’s essential that the EU does not repeat the mistakes of the past, and avoids misguided, heavy-handed, and/or one-size-fits-all regulations. The Commission should look carefully at the problems we’re trying to solve, consider all actors impacted, and think innovatively about smart interventions to open up markets and protect fundamental rights. This is particularly important in the content regulation space, where the last EU mandate saw broad regulatory interventions (e.g. on copyright or terrorist content) that were crafted with only the big online platforms in mind, undermining individuals’ rights and competition. Yet, despite such interventions, big platforms are still not doing enough to tackle the spread of illegal and harmful content. To avoid such problematic outcomes, we encourage the European Commission to come up with a comprehensive framework for ensuring that tech companies really do act responsibly, with a focus on the companies’ practices and processes.
Elsewhere we are encouraged to see that the Commission intends to evaluate and review EU competition rules to ensure that they remain fit for purpose. The diminishing competition online and the accelerating trend towards web centralisation in the hands of a few powerful companies go against the open and diverse internet ecosystem we’ve always fought for. The nature of the networked platform ecosystem is giving rise to novel competition challenges, and it is clear that the regulatory toolbox for addressing them is not fit for purpose. We look forward to working with EU lawmakers on how EU competition policy can be modernised to take into account bundling, vertical integration, the role of data silos, and the potential of novel remedies.
We’re also happy to see the EU take up the mantle of AI accountability and seek to be a standard-setter for better regulation in this space. This is an area that will be of crucial importance in the coming years, and we are intent on shaping a progressive, human-centric approach in Europe and beyond.
The opportunity for EU lawmakers to truly lead and to set the future of tech regulation on the right path is theirs for the taking. We are eager and willing to help contribute and look forward to continuing our own work to take back the web.
|
Karl Dubost: Week notes - 2020 w07 - worklog - flask blueprint |
A string of secondary issues has been plaguing our restart of anonymous reporting on webcompat.com.
fixed! Dependencies upgrade
Simple! I had forgotten to handle the case of a private issue with the milestone accepted being closed. This erased a valid moderated issue. Not good. So we fixed it. This is now working.
There was a solution to the issue we had last week about our string which is not a boolean: strtobool. Thanks to rik for the implementation details. Values include on and off. Neat!
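For reference, here is roughly what strtobool from the standard library (distutils.util) accepts; these values come from the Python documentation, not from the webcompat code:

from distutils.util import strtobool

strtobool('on')     # 1
strtobool('off')    # 0
strtobool('True')   # 1 (case-insensitive)
strtobool('nope')   # raises ValueError for anything it does not recognize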
In the process of trying to improve the project, I looked at the results of coverage on the project. I was pleasantly surprised by some areas of the code, but I also decided to open a couple of issues related to other parts. The more and better tests we have, the more robust the project will be.
While running coverage, I also stumbled upon this sentence in the documentation:
Nose has been unmaintained for a long time. You should seriously consider adopting a different test runner.
So I decided to create an issue specifically about switching from nosetests to pytest.
And I started to work on that. It led to an interesting number of new breakages and warnings. First, pytest works better with an installable package:
pip install -e .
So I created a very simple and basic setup.py.
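Something along these lines; a minimal sketch rather than the exact file, with placeholder metadata:

# setup.py (sketch)
from setuptools import find_packages, setup

setup(
    name='webcompat',   # placeholder metadata
    version='0.0.0',
    packages=find_packages(),
)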
Then I ran into an issue that has bitten me in the past: flask blueprint.
Basically our code has this kind of construct (showing a simplified subtree):
-- webcompat
   |-- __init__.py
   |-- form.py
   |-- api
   |   |-- __init__.py
   |   |-- uploads.py
   |   |-- endpoints.py
   …
   |-- helpers.py
   |-- views.py
so in webcompat/__init__.py
from flask import Flask
from webcompat.api.endpoints import api

app = Flask(__name__, static_url_path='')
app.register_blueprint(api)
and in webcompat/api/endpoints/__init__.py
from flask import Blueprint

from webcompat.helpers import cool_feature

api = Blueprint('api', __name__, url_prefix='/api')

@api.route('blah')
def somewhere(foo):
    """blablah"""
    yeah = cool_feature()
So what is happening here? The module and the blueprint share the same name. So if in a test we need to mock cool_feature:
with patch('webcompat.api.endpoints.cool_feature') as mock_cool:
We need to remember that when mocking, we do not mock the feature where it has been defined (aka webcompat.helpers.cool_feature) but where it has been imported (aka webcompat.api.endpoints.cool_feature). We will not be able to mock in this case because there will be a conflict of names. The error will be:
E AttributeError: 'Blueprint' object has no attribute 'endpoints'
because the name webcompat.api refers to the blueprint, which has no attribute endpoints, while the module webcompat.api has one.
So I will need to fix this next week.
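One possible way out, sketched here as an assumption rather than the fix that will actually land, is to stop rebinding the name api on the webcompat package when importing the blueprint:

# webcompat/__init__.py (sketch of a possible fix)
from flask import Flask
# Import the blueprint under a different name so that webcompat.api
# keeps referring to the subpackage instead of the Blueprint object.
from webcompat.api.endpoints import api as api_blueprint

app = Flask(__name__, static_url_path='')
app.register_blueprint(api_blueprint)

With webcompat.api no longer shadowed, patch('webcompat.api.endpoints.cool_feature') should be able to resolve the module again.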
I also needed to change the CircleCI configuration to be able to run with pytest, even if it breaks for now.
Friday I did some diagnosis, and I'll do more next Monday and probably Tuesday too.
Otsukare!
|
The Mozilla Blog: Thank You, Ronaldo Lemos |
Ronaldo Lemos joined the Mozilla Foundation board almost six years ago. Today he is stepping down in order to turn his attention to the growing Agora! social movement in Brazil.
Over the past six years, Ronaldo has helped Mozilla and our allies advance the cause of a healthy internet in countless ways. Ronaldo played a particularly important role on policy issues including the approval of the Marco Civil in Brazil and shaping debates around net neutrality and data protection. More broadly, he brought his experience as an academic, lawyer and active commentator in the fields of intellectual property, technology and culture to Mozilla at a time when we needed to step up on these topics in an opinionated way.
As a board member, Ronaldo also played a critical role in the development of Mozilla Foundation’s movement building strategy. As the Foundation evolved its programs over the past few years, he brought to bear extensive experience with social movements in general — and with the open internet movement in particular. This was an invaluable contribution.
Ronaldo is the Director of the Institute for Technology & Society of Rio de Janeiro (ITSrio.org), Professor at the Rio de Janeiro State University’s Law School and Partner with the law firm Pereira Neto Macedo.
He recently co-founded a political and social movement in Brazil called Agora!. Agora! is a platform for leaders engaged in the discussion, formulation and implementation of public policies in Brazil. It is an independent, plural and non-profit movement that believes in a more humane, simple and sustainable Brazil — in an efficient and connected state, which reduces inequalities and guarantees the well-being of all citizens.
Ronaldo remains a close friend of Mozilla, and we’ll no doubt find ample opportunity to work together with him, ITS and Agora! in the future. Please join me in thanking Ronaldo for his tenure as a board member, and wishing him tremendous success in his new endeavors.
Mozilla is now seeking talented new board members to fill Ronaldo’s seat. More information can be found here: https://mzl.la/MoFoBoardJD
https://blog.mozilla.org/blog/2020/02/18/thank-you-ronaldo-lemos/
|
Mike Hoye: Dexterity In Depth |
I’m exactly one microphone and one ridiculous haircut away from turning into Management Shingy when I get rolling on stuff like this, because it’s just so clear to me how much this stuff matters and how little sense I might be making at the same time. Is your issue tracker automatically flagging your structural blind spots? Do your QA and UX team run your next reorg? Why not?
This all started life as a rant on Mastodon, so bear with me here. There are two empirically-established facts that organizations making software need to internalize.
The first is that by a wide margin the most significant predictive indicator that there will be a future bug in a piece of software is the relative orgchart distance of the people working on it. People who are working on a shared codebase in the same room but report to different VPs are wildly more likely to introduce errors into a codebase than two people who are on opposite sides of the planet and speak different first languages but report to the same manager.
The second is that the number one predictor that a bug will be resolved is if it is triaged correctly – filed in the right issue tracker, against the right component, assigned to the right people – on the first try.
It’s fascinating that neither of the strongest predictive indicators of the most important parts of a bug’s lifecycle – birth and death – actually take place on the developers’ desk, but it’s true. In terms of predictive power, nothing else in the software lifecycle comes close.
Taken together, these facts give you the tools to roughly predict the effectiveness of collaborating teams, and by analyzing trends among bugs that are frequently re-assigned or re-triaged, they can give you a lot of foresight into how, where and why a company needs to retrain or reorganize those teams. You might have read Agile As Trauma recently, in which Dorian Taylor describes agile development as an allergic reaction to previously bad management:
The Agile Manifesto is an immune response on the part of programmers to bad management. The document is an expression of trauma, and its intellectual descendants continue to carry this baggage. While the Agile era has brought about remarkable advancements in project management techniques and development tools, it remains a tactical, technical, and ultimately reactionary movement.
This description is strikingly similar to – and in obvious tension with – Clay Shirky’s description of bureaucracy as the extractive mechanism of complexity and an allergic reaction to previous institutional screwups.
Bureaucracies temporarily suspend the Second Law of Thermodynamics. In a bureaucracy, it’s easier to make a process more complex than to make it simpler, and easier to create a new burden than kill an old one.
… which sounds an awful lot like the orgchart version of “It’s harder to read code than to write it”, doesn’t it?
I believe both positions are correct. But that tension describes the way forward, I think, for an institutional philosophy that is responsive, flexible and empirically grounded, in which being deliberate about the scale, time, and importance of different feedback cycles gives an organization the freedom to treat scaling like a tool, so that the signals of different contexts can inform change as a continuum between the macro and micro levels of organizational structure and practice. Wow, that’s a lot of words in a strange order, but hear me out.
It’s not about agile, or even agility. Agility is just the innermost loops, the smallest manifestation of a wide possible set of tightly-coupled feedback mechanisms. And outside the agile team, adjacent to the team, those feedback loops may or may not exist however much they need to, up and down the orgchart (though there’s not often much “down” left in the orgchart, I’ve noticed, where most agile teams live…) but more importantly with the adjacent and complementary functions that agile teams rely on.
It is self-evident that how teams are managed profoundly affects how they deliver software. But agile development (and every other modern developer-cult I’m aware of) doesn’t close that loop, and in failing to do so agile teams are reduced to justifying their continued existence through work output rather than informing positive institutional change. And I don’t use “cult” lightly, there; the current state of empirical evaluation of agile as a practice amounts to “We agiled and it felt good and seemed to work!” And feeling good and kinda working is not nothing! But it’s a long way from being anything more than that.
If organizations make software, then starting from a holistic view of what “development” and “agility” mean and could be, and looking carefully at where feedback loops in an organization exist, where they don’t, and what information they circulate, all of that suggests that there is a reliable set of empirical, analytic tools for looking at not just developer practice, but the organizational processes around it. And for assessing, in some measurable, empirical way, the real and sustainable value of different software development schools and methodologies.
But honestly, if your UX and QA teams aren’t informing your next reorg, why not?
|
Mozilla GFX: Challenge: Snitch on the glitch! Help the Graphics team track down an interesting WebRender bug… |
For the past little while, we have been tracking some interesting WebRender bugs that people are reporting in release. Despite best efforts, we have been unable to determine clear steps to reproduce these issues and have been unable to find a fix for them. Today we are announcing a special challenge to the community – help us track down steps to reproduce (a.k.a STR) for this bug and you will win some special, limited edition Firefox Graphics team swag! Read on for more details if you are interested in participating.
Late last year we started seeing reports of random UI glitching bugs that people were seeing in release. You can check out some of the reports on Bugzilla. Here is what we know so far about this bug:
Without having a way to reliably reproduce this bug, we are at a loss on how to solve it. So we decided to hold a challenge to engage the community further to help us understand this bug better. If you are interested in helping us get to the root of this tricky bug, please do the following:
Even if you can’t easily find STR, we are still interested in hearing about whether you see this bug!
The winners of this challenge will be chosen based on the following criteria:
Update: we have created the channel #gfx-wr-glitch:mozilla.org on Matrix so you can ask questions/chat with us there. For more info about how to join Matrix, check out: https://wiki.mozilla.org/Matrix
|
Wladimir Palant: Insights from Avast/Jumpshot data: Pitfalls of data anonymization |
There has been a surprising development after my previous article on the topic, Avast having announced that they will terminate Jumpshot and stop selling users’ data. That’s not the end of the story however, with the Czech Office for Personal Data Protection starting an investigation into Avast’s practices. I’m very curious to see whether this investigation will confirm Avast’s claims that they were always fully compliant with the GDPR requirements. For my part, I now got a glimpse of what the Jumpshot data actually looks like. And I learned that I massively overestimated Avast’s success when anonymizing this data.
In reality, the data sold by Jumpshot contained plenty of user identifiers, names, email addresses, even home addresses. That’s partly due to Avast being incapable or unwilling to remove user-specific data as they planned to. Many issues are generic however and almost impossible to avoid. This once again underlines the central takeaway: anonymizing browser history data is very hard. That’s especially the case if you plan to sell it to advertisers. You can make data completely anonymous, but you will have to dumb it down so much in the process that advertisers won’t have any use for it any more.
Why did I decide to document Avast’s failure in so much detail? My goal is to spread appreciation for the task of data anonymization: it’s very hard to ensure that no conclusions about users’ identity are possible. So maybe whoever is toying with the idea of collecting anonymized data will think twice about whether they really want to go there. And maybe next time we see a vendor collecting data we’ll ask the right questions about how they ensure it’s a “completely anonymous” process.
The data I saw was an example that Jumpshot provided to potential customers: an excerpt of real data for one week of 2019. Each record included an exact timestamp (milliseconds precision), a persistent user identifier, the platform used (desktop or mobile, which browser), the approximate geographic location (country, city and ZIP code derived from the user’s IP address), and a guess for the user’s gender and age group.
What it didn’t contain was “every click, on every site.” This data sample didn’t belong to the “All Clicks Feed” which has received much media attention. Instead, it was the “Limited Insights Pro Feed” which is supposed to merely cover user’s shopping behavior: which products they looked at, what they added to the cart and whether they completed the order. All of that limited to shopping sites and grouped by country (Germany, UK and USA) as well as product category such as Shoes or Men’s Clothing.
This doesn’t sound like there would be all too much personal data? But there is, thanks to a “referrer” field being there. This one is supposed to indicate how the user came to the shopping site, e.g. from a Google search page or by clicking an ad on another website. Given the detailed information collected by Avast, determining this referrer website should have been easy – yet Avast somehow failed this task. And so the supposed referrer is typically a completely unrelated random web page that this user visited, and sometimes not even a page but an image or JSON data.
If you extract a list of these referrers (which I did), you see news that people read, their web mail sessions, search queries completely unrelated to shopping, and of course porn. You get a glimpse into what porn sites are most popular, what people watch there and even what they search for. For each user, the “limited insights” actually contain a tiny slice of their entire browsing behavior. Over the course of a week this exposed way too much information on some users however, and Jumpshot customers watching users over longer periods of time could learn a lot about each user even without the “All Clicks Feed.”
Some parameters and address parts have been censored in the data. For example, you will see an entry like the following:
http://example.com/email/edit-details/[PII_EMAIL_abcdef1234567890]
A heuristic is at work here and will replace anything looking like an email address with a placeholder. Other heuristics will produce placeholders like [PII_USER_abcdef1234567890] and [PII_NM_abcdef1234567890] – these seem to be more rudimentary, applying based on parameter names. This is particularly obvious in entries like this one:
https://www.ancestry.co.uk/name-origin?surname=[PII_NM_abcdef1234567890]
Obviously, the surname parameter here is merely a search query. Given that search queries aren’t being censored elsewhere, it doesn’t make much sense to censor them here. But this heuristic isn’t terribly clever and cannot detect whether the parameter refers to the user.
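To make the parameter-name approach concrete, here is a toy sketch of what such a heuristic might look like. This is my own illustration, not Avast’s actual code; the parameter list and placeholder format are guesses modelled on the examples above.

import hashlib
from urllib.parse import parse_qsl, urlsplit

SUSPECT_PARAMS = {'email', 'user', 'name', 'nm', 'surname'}  # guessed list

def censor(url):
    # Naive: assumes the raw value appears verbatim in the URL.
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key.lower() in SUSPECT_PARAMS and value:
            token = hashlib.md5(value.encode()).hexdigest()[:16]
            url = url.replace(value, '[PII_NM_' + token + ']')
    return url

# censor('https://www.ancestry.co.uk/name-origin?surname=Smith')
# -> 'https://www.ancestry.co.uk/name-origin?surname=[PII_NM_...]'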
Finally, the generic algorithm described in the previous article seems to apply; this one will produce placeholders like [PII_UNKWN_abcdef1234567890].
It isn’t a big surprise that heuristic approaches will miss some data. The generic algorithm seemed sane from its description in the patent however, and should be able to recognize most user-specific data. In reality, this algorithm appears misimplemented, censoring only a few of the relevant parameters and without any apparent system. So you will see addresses like the following without any censoring applied:
https://nolp.dhl.de/nextt-online-public/set_identcodes.do?zip=12345&idc=9876543210987654321
Residents of Germany will immediately recognize this as a DHL package tracking link. The idc parameter is the package identifier whereas the sometimes present zip parameter is the recipient’s ZIP code. And now you’d need to remember that DHL only requires you to know these two pieces of information to access the “detailed view,” the one that will show you the name of whoever received the package. Yes, now we have a name to associate the browsing history with. And even if the zip parameter isn’t in the tracking link – remember, the data contains a guess for it based on the user’s IP address, a fairly accurate one in fact.
Want more examples? Quite a few “referrers” are related to the authentication process of websites. A search for keywords like “oauth”, “openid” or “token” will produce lots of hits, usually without any of the parameters being censored. Worst-case scenario here: somebody with access to Jumpshot data could hijack an already authenticated session and impersonate this user, allowing them to access and modify user’s data. One has to hope that larger websites like Facebook and Google use short enough expiration intervals that such attacks would be impracticable for Jumpshot customers.
JWT tokens are problematic even under ideal conditions however. JWT is an authentication approach which works without server-side state, all the relevant information is encoded in the token itself. These tokens are easily found by searching for the “.ey” keyword. There are some issued by Facebook, AOL, Microsoft and other big names. And after reversing Base64 encoding you get something like:
{"instanceId":"abcd1234","uid":12345,"nonce":"dcba4321","sid":"1234567890"}
Most values here are implementation-specific and differ from site to site. But usually there is some user identifier, either a numerical one (can likely be converted into a user name somewhere on the website), occasionally an email or even an IP address. It also often contains tokens related to the user’s session and potentially allowing hijacking it: session identifier, nonce, sometimes even OAuth tokens.
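Decoding such a token takes only a couple of lines; a small sketch of the Base64 reversal mentioned above (the payload is the middle of the three dot-separated segments):

import base64
import json

def jwt_payload(token):
    # A JWT is header.payload.signature, each segment base64url-encoded.
    payload = token.split('.')[1]
    payload += '=' * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))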
Last but not least, there is this:
https://mail.yandex.ru/u1234/?uid=987654321&login=myyandexname#inbox
This address also wasn’t worth censoring for Avast. Now I never used Yandex Mail but I guess that this user’s email address is myyandexname@yandex.ru. There are quite a few addresses looking like this; most of them contain only the numerical user identifier however. I strongly suspect that some Yandex service or API allows translating these numerical IDs into user names, thus making it possible to deduce the user’s email address.
Now let’s have a look at the heuristic removing email addresses, the last line of defense. This one will reliably remove any URL-encoded email addresses, so you won’t find anything like me%40example.com in the data. But what about unusual encodings? Heuristics aren’t flexible, so these won’t be detected.
It starts with the obvious case of URL encoding applied twice: me%2540example.com. The Avast data contains plenty of email addresses encoded like this, for example:
https://m.facebook.com/login.php?next=https%3A%2F%2Fm.facebook.com%2Fn%2F
%3Fthread_fbid%3D123456789%26source%3Demail%26cp%3Dme%2540example.com
Did you notice what happened here? The email address isn’t a parameter to Facebook’s login.php. The only parameter here is next, which is the address to navigate to after a successful login. And that address just happens to contain the user’s email address as a parameter, for whatever reason. Hence URL encoding applied twice.
Another scenario:
https://www.google.com/url?q=http://example.com/
confirm?email%3dme%2540example.com&source=gmail
What’s that, a really weird Google query? The source=gmail parameter indicates that it isn’t; it’s rather a link that somebody clicked in Gmail. Apparently, Gmail will send such links as “queries” to the search engine before the user is redirected to their destination. And the destination address contains the email address here, since the link apparently originated from an address confirmation email. Links from newsletters will also frequently contain the user’s email address.
And then there is this unexpected scenario:
https://mail.yahoo.com/d/search/name=John%2520Smith&emailAddresses=me%2540example.com
I have no idea why search in Yahoo Mail will encode parameters twice but it does. And searches of Yahoo Mail users contain plenty of names and email addresses of the people they communicate with.
Note that I only mentioned the most obvious encoding approach here. Some websites encode their parameters using Base64 encoding for some reason, and these also contain email addresses quite frequently.
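As an illustration of how little these encodings actually hide, here is a rough sketch of my own (not part of any Jumpshot or Avast tooling) that recovers email addresses hidden behind double URL encoding or Base64-looking parameter values:

import base64
import re
from urllib.parse import unquote

EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+')

def hidden_emails(url):
    # Undo URL encoding twice to catch me%2540example.com style values.
    found = set(EMAIL_RE.findall(unquote(unquote(url))))
    # Also try to decode Base64-looking chunks and scan those.
    for chunk in re.findall(r'[A-Za-z0-9+/=]{16,}', url):
        try:
            decoded = base64.b64decode(chunk + '=' * (-len(chunk) % 4))
        except Exception:
            continue
        found.update(EMAIL_RE.findall(decoded.decode('utf-8', 'ignore')))
    return found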
So far we have names, email and IP addresses. That’s interesting of course but where do these users actually live? Jumpshot data provides only a rough approximation for that. Luckily (or unfortunately – for the users), Google Maps is a wildly popular service, and so it is very present in the data. For example:
https://www.google.de/maps/@52.518283,13.3735008,17z
That’s a set of very precise geographical coordinates, could it be the user’s home? It could be, but it also might be a place where they wanted to go, or just an area that they looked at. The following entry is actually way more telling:
https://www.google.de/maps/dir/Platz+der+Republik+1,+10557+Berlin/
Museum+für+Kommunikation,+Leipziger+Strasse,+Berlin/@52.5140286,13.3774848,16z
By Avast’s standards, a route planned on Google Maps isn’t personally identifiable information – any number of people could have planned the same route. However, if the start of the route is an address and the end a museum, a hotel or a restaurant, it’s a fairly safe bet that the address is actually the user’s home address. Even when it isn’t obvious which end of the route the user lives at, the ZIP code in the Jumpshot data helps one make an educated guess here.
And then you type “Platz der Republik 1, Berlin” into a search engine and in quite a few cases the address will immediately map to a name. So your formerly anonymous user is now called Angela Merkel.
In 2015 Avast’s then-CTO Ondrej Vlcek promised:
These aggregated results are the only thing that Avast makes available to Jumpshot customers and end users.
Aggregation would combine data from multiple users into a single record, an approach that would make conclusions about individual users much harder. Sounds quite privacy-friendly? Unfortunately, Jumpshot’s marketing already cast significant doubt on the claims that aggregation is being used consistently.
What was merely a suspicion in my previous blog post is now a fact. I don’t want to say anything about Jumpshot data in general, I haven’t seen all of it. But the data I saw wasn’t aggregated at all, each record was associated with exactly one user and there was a unique user identifier to tell records from different users apart. Also, I’ve seen marketing material for the “All Clicks Feed” suggesting that this data isn’t aggregated either.
The broken promises here aren’t terribly surprising, aggregated data is much harder to monetize. I already quoted Graham Cluley before with his prediction from 2015:
But let’s not kid ourselves. Advertisers aren’t interested in data which can’t help them target you. If they really didn’t feel it could help them identify potential customers then the data wouldn’t have any value, and they wouldn’t be interested in paying AVG to access it.
I looked into a week’s worth of data from a “limited insights” product sold by Jumpshot and I was already able to identify a large number of users, sometimes along with their porn watching habits. The way this data was anonymized by Avast is insufficient to say the least. Companies with full access to the “every click, on every site” product were likely able to identify and subsequently stalk the majority of the affected users. The process of identifying users was easy to automate, e.g. by looking for double encoded email addresses or planned Google Maps routes.
The only remaining question is: why is it that Avast was so vehemently denying selling any personally identifiable data? Merely a few days before deciding to shut down Jumpshot Avast’s CEO Ondrej Vlcek repeated in a blog post:
We want to reassure our users that at no time have we sold any personally identifiable information to a third party.
So far we only suspected; now we can all be certain that this statement isn’t true. To give them the benefit of the doubt, how could they have not known? The issues should have been pretty obvious to anybody who took a closer look at the data. The whole scandal took months to unwind. Does this mean that throughout all that time Avast kept repeating this statement, giving it to journalists and spreading it on social media, yet nobody bothered to verify it? If we follow this line of thought then the following statement from the same blog post is clearly a bold lie:
The security and privacy of our users worldwide is Avast’s priority
I for my part removed all the raw and processed Jumpshot data in the presence of witnesses after concluding this investigation. Given the nature of this data, this seems to be the only sensible course of action.
https://palant.de/2020/02/18/insights-from-avast/jumpshot-data-pitfalls-of-data-anonymization/
|
Mozilla Reps Community: Mozilla Reps in 2020 Berlin All Hands |
14 Reps were invited to participate in this year’s All Hands in Berlin.
At the All-Hands Reps learned some easy German words (Innovationsprozess-swischenstands-schreihungsskizze), did some art (see here X artistic endeavor during a group activity), and learned about cultural differences in communication.
Of course, it was not all going around, spray painting walls, and trying to understand the German sense of humor. The Reps also got down to serious Reps business.
To be sure that issues relevant to the Reps would be discussed in Berlin, all the invited Reps were asked to fill out a survey. The answers from the survey revealed a series of community issues that limit participation, as well as issues with the infrastructure of the Reps program. Putting these answers together with the themes the Reps Council worked on in their OKRs last year, these issues were prioritized for discussion:
So, to discuss how to tackle these issues and how to bring forward the program in 2020 the reps had meetings (lots and lots of meetings).
On Tuesday Reps were divided into small groups to discuss the above-mentioned issues. In every group, they were asked to discuss the current state and to imagine the ideal state for each issue by the end of 2020.
On Wednesday Reps were asked, “are we community coordinators yet?”. We agreed to prioritize the themes we discussed on Tuesday based on that question, thinking about which of them could strengthen the Reps’ role as community coordinators and, therefore, help to grow healthy communities.
Based on that prioritization, two themes emerged as the leading themes for 2020: communication between Mozilla and the Reps program, and campaigns and activities. That, of course, doesn’t mean that the rest of the themes are not equally important. The Reps Council decided to put the focus on the first two while at the same time opening up participation in the other themes amongst the Reps.
On Wednesday the Reps Council met to discuss how to turn the two major topics into objectives and key results for the program in 2020. This conversation is still ongoing and the Reps Council will publish this year’s OKRs soon.
On Thursday all reps discussed the campaigns pipeline and how volunteers can contribute more actively to it.
So, what was the conclusion from this tour-de-force?
Communications:
Regarding communication, the Reps established that there was a need for more consistent communication and transparency. So they set this goal to be reached by the end of 2020:
In more detail, this would mean, among other things, that there will be a clear channel of communication for Mozilla to reach Reps, consistent communication around Mozilla’s top goals, and the capability to update communities in their own languages. This will require establishing resources dedicated to improving communication.
Activities:
Reps established that there is a need to have more activities and campaigns in which communities can participate and that these campaigns need to be all in one place (the community portal). The objective is that:
Reps will also become more active in all the stages of a campaign pipeline, from the ideation to the implementation phase. To this end, communication and information between reps and projects/staff should become simpler.
Let us know what you think in the comments.
On behalf of the Community Development Team
Francesca and Konstantina
https://blog.mozilla.org/mozillareps/2020/02/17/mozilla-reps-in-2020-berlin-all-hands/
|
Daniel Stenberg: curl ootw: --mail-from |
(older options of the week)
--mail-from has no short version. This option was added to curl 7.20.0 in February 2010.
I know this surprises many curl users: yes curl can both send and receive emails.
curl can do “uploads” to an SMTP server, which usually has the effect that an email is delivered to somewhere onward from that server. Ie, curl can send an email.
When communicating with an SMTP server to send an email, curl needs to provide a certain set of data, and one of the mandatory fields is the email address of the sender.
It should be noted that this is not necessarily the same email as the one that is displayed in the From: field that the recipient’s email client will show. But it can be the same.
curl --mail-from from@example.com --mail-rcpt to@example.com smtp://mail.example.com -T body.txt
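To make the distinction between the envelope sender and the From: header concrete, here is a small Python sketch using the standard smtplib module; the host and addresses are the same placeholders as in the curl command above, so this is illustrative only:

import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg['From'] = 'Friendly Sender <someone-else@example.com>'  # what the recipient's client shows
msg['To'] = 'to@example.com'
msg['Subject'] = 'hello'
msg.set_content('body text')

with smtplib.SMTP('mail.example.com') as server:
    # from_addr is the envelope sender, the same thing --mail-from sets;
    # it does not have to match the From: header above.
    server.send_message(msg, from_addr='from@example.com', to_addrs=['to@example.com'])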
You can use curl to receive emails over POP3 or IMAP. SMTP is only for delivering email.
Related options: --mail-rcpt for the recipient(s), and --mail-auth for the authentication address.
|
Zibi Braniecki: JavaScript Internationalization in 2020 |
|
The Firefox Frontier: Resolve data breaches with Firefox Monitor |
Corporate data breaches are an all too common reality of modern life. At best, you get an email from a company alerting you that they have been hacked, and then …
|
Alessio Placitelli: Extending Glean: build re-usable types for new use-cases |
|
Mark Banner: ESLint now turned on for all of the Firefox/Gecko codebase |
About 4 years and 2 months ago, Dave Townsend and I landed a couple of patches on the Mozilla codebase that kick-started rolling out ESLint across our source code. Today, I’ve just landed the last bug in making it so that ESLint runs across our whole tree (where possible).
ESLint is a static analyser for JavaScript that helps find issues before you even run the code. It also helps to promote best practices and styling, reducing the need for comments in reviews.
Several Mozilla projects had started using ESLint in early 2015 – Firefox’s Developer Tools, Firefox for Android and Firefox Hello. It was clear to the Firefox desktop team that ESLint was useful and so we put together an initial set of rules covering the main desktop files.
Soon after, we were enabling ESLint over more of desktop’s files, and adding to the rules that we had enabled. Once we had the main directories covered, we slowly started enabling more directories and started running ESLint checks in CI allowing us to detect and back out any failures that were introduced. Finally, we made it to where we are today – covering the whole of the Firefox source tree, mozilla-central.
Along the way we’ve filed over 600 bugs for handling ESLint roll-out and related issues, many of these were promoted as mentored bugs and fixed by new and existing contributors – a big thank you to you all for your help.
We’ve also found and fixed many bugs as we’ve gone along. From small bugs in rarely used code, to finding issues in test suites where entire sections weren’t being run. With ESLint now enabled, it helps protect us against mistakes that can easily be detected but may be hard for humans to spot, and reduces the work required by both developer and reviewer during the code-review-fix cycle.
Although ESLint is now running on all the files we can, there’s still more to do. In a few places, we skipped enabling rules because it was easier to get ESLint just running and we also wanted to do some formatting on them. Our next steps will be to get more of those enabled, so expect some more mentored bugs coming soon.
Thank you again to all those involved with helping to roll out ESLint across the Firefox code-base; this has helped tremendously to make Firefox’s and Gecko’s source code more consistently formatted and contain less dead code and fewer bugs. ESLint has also been extremely helpful in switching away from older coding patterns and reducing the use of risky behaviour during tests.
|
The Firefox Frontier: Data detox: Four things you can do today to protect your computer |
From the abacus to the iPad, computers have been a part of the human experience for longer than we think. So much so that we forget the vast amounts of …
|
Daniel Stenberg: curl is 8000 days old |
Another pointless number that happens to be round and look nice so I feel a need to highlight it.
When curl was born WiFi didn’t exist yet. Smartphones and tablets weren’t invented. Other things that didn’t exist include YouTube, Facebook, Twitter, Instagram, Firefox, Chrome, Spotify, Google search, Wikipedia, Windows 98 or emojis.
curl was born in a different time, but also in the beginning of the explosion of the web and Internet Protocols. Just before the big growth wave.
In 1996 when I started working on the precursor to curl, there were around 250,000 web sites (sources vary slightly).
In 1998 when curl shipped, the number of sites was already around 2,400,000. That is roughly ten times as many in just those two years.
In early 2020, the number of web sites is around 1,700,000,000 to 2,000,000,000 (depending on who provides the stats). The number of web sites has thus grown at least 70,000% over curl’s 8000 days of life, and is perhaps as much as 8,000 times the amount from when I first started working with HTTP clients.
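For the curious, a quick check of the arithmetic behind those figures, using the numbers quoted above:

sites_1996 = 250_000
sites_1998 = 2_400_000
sites_2020_low = 1_700_000_000
sites_2020_high = 2_000_000_000

print(sites_1998 / sites_1996)            # ~9.6: the "ten times" between 1996 and 1998
print(sites_2020_low / sites_1998 * 100)  # ~70,833%: growth of at least 70,000% since 1998
print(sites_2020_high / sites_1996)       # 8000.0: 8,000 times the 1996 count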
One of the oldest still available snapshots of the curl web site is from the end of 1998, when curl was just a little over 6 months old. On that page we can read the following:
That “massive popularity” looks charming and possibly a bit naive today. The number of monthly curl downloads has also possibly grown by 8,000 times or so – by estimates only, as most users download curl from other places than our web site these days. Even more users get it installed as part of their OS or bundled with something else.
Thank you for flying curl.
(This day occurs only a little over a month before curl turns 22, there will be much more navel-gazing then, I promise.)
Image by Annie Spratt from Pixabay
https://daniel.haxx.se/blog/2020/02/13/curl-is-8000-days-old/
|
The Firefox Frontier: What watching “You” on Netflix taught us about privacy |
We’re not sure if we can consider “You” a guilty pleasure considering how many people have binged every episode (over 43 million), but it certainly ranks up there right next …
|
Karl Dubost: Week notes - 2020 w06 - worklog - Finishing anonymous reporting |
I came back home yesterday at noon from the Berlin All Hands. Today will probably be tough with the jetlag. Almost all Japanese from Narita Airport to home were wearing a mask (as I did). The coronavirus is on everyone's mind: "deaths at 361 and confirmed infections in China at 17,238".
Cleaning up emails. And let's restart coding for issue #3140 (PR #3167). Last week, I discussed with mike whether I should rebase the messy commits so we have a cleaner version. On one hand, the rebase would create a clean history with commits by specific sections, but the history of my commits also documents the thought process. For now I think I will keep the "messy informative" commits.
When unit tests are not failing locally but are failing on CircleCI, it smells like a dependency on the environment. Indeed, here the list of labels returned behaved differently on CircleCI because locally, in my dev environment, I had a populated data/topsites.db. This issue would not have been detected if my local topsites.db had been empty like on CircleCI. A unit test which depends on an external variability is bad. For now I decided to mock the priority in the test and I opened an issue to create a more controlled environment.
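The general pattern for such a fix, as a minimal self-contained illustration rather than the actual webcompat test code (the function and values below are placeholders): pin the environment-dependent value with unittest.mock so the test behaves the same locally and on CI.

import random
from unittest.mock import patch

def get_priority(domain):
    # Stand-in for a helper whose result depends on local data such as topsites.db.
    return random.choice(['priority-1', 'priority-2', 'priority-3'])

def test_priority_is_pinned():
    # Patch the name where it is looked up; when run as a script that is __main__.
    with patch('__main__.get_priority', return_value='priority-2'):
        assert get_priority('example.com') == 'priority-2'

if __name__ == '__main__':
    test_priority_is_pinned()
    print('ok')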
Finishing the big pull request for the new workflow. I usually don't like to create huge pull requests. I prefer a series of smaller ones, tied to specific issues. The circumstances pushed me to do that. But I think it's an interesting lesson: how much do we bend our guidelines and process rules in an emergency situation?
23:00 to midnight, we had a webcompat team video-meeting.
Code review today for kate's code on fetching the labels for the repos. She used GraphQL, which is cool. Small pieces of code create opportunities to explore new ways of doing things.
I wonder if there is a client with a pythonic api for GraphQL, the same way that SQLAlchemy does for SQL.
I need to
Big pull request… big bug. Big enough that it would create an error 500 on a certain path of the workflow. So more tests, and fixing the issue in a new pull request.
Restarting diagnosis too because the curve is going up. We are +50 above our minimum of January 2020. You can definitely help.
Such a big pull request obviously created more bugs. Anonymous reporting is activated with a flag and our flag didn't work as expected. So that was strange.
After investigating a bit, we found that we were using an environment variable for the activation, with the value True or False set in the bash environment.
ANONYMOUS_REPORTING = True
and importing it in python with
ANONYMOUS_REPORTING = os.environ.get('ANONYMOUS_REPORTING') or False
then later on in the code it would be simple enough to do:
if ANONYMOUS_REPORTING: # do something clever here
Two mistakes here. First, assuming that True in bash will carry the same meaning as True once read in Python. Second, that value is not a boolean (and never was) but a string, which means that ANONYMOUS_REPORTING will always be True (in the Python boolean sense), because any non-empty string is truthy while the absence of a string is not.
>>> not ''
True
>>> not 'foo'
False
So to make it more explicit, we switched to ON or OFF values:
ANONYMOUS_REPORTING = ON
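Read back in Python, the string value can then drive the boolean explicitly. A minimal sketch of what that check could look like, assuming the ON/OFF convention above (not necessarily the exact code we ended up with):

import os

# Only the literal string 'ON' enables the feature; anything else,
# including an unset variable, leaves it off.
ANONYMOUS_REPORTING = os.environ.get('ANONYMOUS_REPORTING', 'OFF') == 'ON'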
In the process of doing this, we inadvertently ran into a regression (bad timing) because some Flask modules had been adjusted to a different version of Werkzeug. So it is time for an upgrade.
So we are almost there but not yet.
By the end of this week, the epidemic had caused close to 1,000 deaths from the coronavirus. This is getting really serious.
Otsukare!
|
The Talospace Project: Firefox 73 on POWER |
|