Chris H-C: Firefox User Engagement |
I now know, which is to say that I can state with some degree of certainty, that Windows XP Firefox users are only a little less engaged with their Firefox installations than the Firefox user base as a whole.
To get here I needed to sanitize dates with more than four digits in their years (paging the Long Now Foundation: someone’s reporting Telemetry from the year 29634), with timezones not of this Earth (Wikipedia seems to think UTC+14:00 is the highest time zone, but I have seen data from +14:30), and with clones reporting the same session over and over again from different locations (this one might be because a user’s client_id is stored in their profile. If that profile is reused, then we will get data reported from all the locations that use that profile).
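For illustration only (this is a toy sketch of mine, not the actual cleaning code), the kind of sanity filter those three problems imply is tiny:

fn plausible(year: i32, tz_offset_minutes: i32) -> bool {
    // Reject far-future years (sorry, year 29634) and timezones
    // not of this Earth (UTC-12:00 through UTC+14:00 only).
    (2000..=2100).contains(&year) && (-12 * 60..=14 * 60).contains(&tz_offset_minutes)
}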
I also needed a rigorous definition of what it means for a user population to be “engaged”.
We chose to define an Engagement Ratio for a given day which is basically the number of people who ran Firefox that day divided by the number of people who ran Firefox the previous month. In other words: what proportion of users who could possibly be active actually were active on that day?
Of course, if you read my previous post, you know that the day of the week you choose is going to change your result dramatically. So instead of counting each user who used Firefox that exact day, we average it out over the previous week: if a user was active each day, they’re a full Daily Active User (DAU). If they’re active only one day, they’re 1/7 of a Daily Active User.
To see this in action, I chose to measure Engagement on March the 10th of this year, which was a Thursday. The number of users who reported data to the 1% Longitudinal Dataset and who were active that day was 1,119,335. If we instead use the average of daily active users for the week ending on March the 10th, we get 1,051,011.86, which is lower by over 6%. This is consistent with data we’ve collected in other studies showing that only 20% of Firefox users use it 7 days a week. Another 25% use it only on weekdays. So it makes sense that a weekday’s DAU count would be higher than a weekend day’s.
If you’ve ever complained about having to remember how many days there are in a month, you know that the choice of “month” is going to change things as well. So in this case, we just chose 28 days: four weeks. That there is no month that is always 28 days long (lookin’ at you, Leap Years) is irrelevant because we’re selecting values to make things easier for ourselves. So if a user was active on any of the previous 28 days, they are a Monthly Active User (MAU).
So you count your DAU and you divide it by your MAU count and you get your Engagement Ratio (ER) whose units are… unitless. Divide users by users and you get… something that’s almost a percent, in that it’s a value from 0 to 1 that represents a proportion of a population.
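If it helps to see the arithmetic as code, here’s a toy sketch (my illustration in Rust, not our actual analysis pipeline) of that calculation for a single day, where each user’s activity is a list of day indices:

fn engagement_ratio(active_days_per_user: &[Vec<u32>], day: u32) -> f64 {
    // Smoothed DAU: each user contributes the fraction of the trailing
    // 7 days (including `day`) on which they were active.
    let dau: f64 = active_days_per_user
        .iter()
        .map(|days| {
            let active = days.iter().filter(|&&d| d <= day && d + 7 > day).count();
            active as f64 / 7.0
        })
        .sum();

    // MAU: anyone active at least once in the trailing 28 days.
    let mau = active_days_per_user
        .iter()
        .filter(|days| days.iter().any(|&d| d <= day && d + 28 > day))
        .count();

    // ER: smoothed DAU over MAU, a unitless proportion from 0 to 1.
    dau / mau as f64
}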
This number we can track over time. We expect it to trend downwards when we ship something that negatively impacts users. We expect it to trend upwards when we ship something that users like.
So long as we can get this data quickly enough and reliably enough, we can start determining from the numbers (the silent majority), not noisy users (the vocal minority), what issues actually matter to the user base at large.
:chutten
https://chuttenblog.wordpress.com/2016/03/23/firefox-user-engagement/
|
Luis Villa: Free as in … ? My LibrePlanet 2016 talk |
Below is the talk I gave at LibrePlanet 2016. The tl;dr version:
I did not talk about it in the talk (given the audience), but I think this approach is broadly applicable to every software developer who wants to make the world a better place (including usability-inclined developers, open web/standards folks, etc.), not just FSF members.
I was not able to use my speaker notes during the talk itself, so these may not match terribly well with what I actually said on Saturday – hopefully they’re a bit more coherent. Video will be posted here when I have it.
Most of you will recognize this phrase as borrowed from the Wikimedia Foundation. Think on it for a few seconds, and how it differs from the Four Freedoms.
I’d like to talk today about code freedom, and what it can learn from modern political philosophy.
Last time I was at Libre Planet, I was talking with someone in a hallway, and I mentioned that Libre Office had crashed several times while I was on the plane, losing some data and making me redo some slides. He insisted that it was better to have code freedom, even when things crashed in a program that I could not fix without reading C++ comments in German. I pointed out, somewhat successfully, that software that was actually reliable freed me to work on my actual slides.
We were both talking about “freedom” but we clearly had different meanings for the word. This was obviously unsatisfying for both of us – our common language/vocabulary failed us.
This is sadly not a rare thing: probably many of us have had the same conversation with parents, friends, co-workers, etc.
So today I wanted to dig into “freedom” – what does it mean and what frameworks do we hang around it.
So why do we need to talk about Freedom and what it means? Ultimately, freedom is confusing. When card-carrying FSF members use it, we mean a very specific thing – the four freedoms. When lots of other people use it, they mean… well, other things. We’ll get into it in more detail soon, but suffice to say that many people find Apple and Google freeing. And if that’s how they feel, then we’ve got a very big communication gap.
I’m not a political philosopher anymore; to the extent I ever was one, it ended when I graduated from my polisci program and… immediately went to work at Ximian, here in Boston.
My goal here today is to show you that when political philosophers talk about freedom, they also have some of the same challenges we do, stemming from some of the same historical reasons. They’ve also gotten, in recent years, to some decent solutions – and we’ll discuss how those might apply to us.
Apologies if any of you are actually political philosophers: in trying to cram this into 30 minutes, we’re going to take some very, very serious shortcuts!
Let’s start with a very brief introduction to political philosophy.
Philosophers of all stripes tend to end up arguing about what is “good”; political philosophers, in particular, tend to argue about what is “just”. It turns out that this is a very slippery concept that has evolved over time. I’ll use it somewhat interchangeably with “freedom” in this talk, which is not accurate, but will do for our purposes.
Ultimately, what makes a philosopher a political philosopher is that once they’ve figured out what justice might be, they then argue about what human systems are the best ways to get us to justice.
In some sense, this is very much an engineering problem: given the state of the world we’ve got, what does a better world look like, and how do we get there? Unlike our engineering problems, of course, it deals with the messy aspects of human nature: we have no compilers, no test-driven-development, etc.
So before Richard Stallman, who were the modern political philosophers?
Your basic “intro to political philosophy” class can have a few starting points. You can do Plato, or you can do Hobbes (the philosopher, not the tiger), but today we’ll start with John Locke. He worked in the late 1600s.
Locke is perhaps most famous in the US for having been gloriously plagiarized by Thomas Jefferson’s “life, liberty, and pursuit of happiness”. Before that, though, he argued that to understand what justice is, you have to look at what people are missing when they don’t have government. Borrowing from earlier British philosophers (mostly Hobbes), he said (in essence) that when people have no government, everyone steals from – and kills – everyone else. So what is justice? Well, it’s not stealing and killing!
This is not just a source for Jefferson to steal from; it is perhaps the first articulation of the idea that every human being (at least, every white man) is entitled to certain inalienable rights – what are often called the natural rights.
This introduces the idea that individual freedom (to live, to have health, etc.) is a key part of justice.
Locke was forward-thinking enough that he was exiled to the Netherlands at one point. But he was also a creature of his time, and concluded that monarchy could be part of a just system of government, as long as the people “consented” by, well, not emigrating.
This is in some sense pretty backwards, since in 1600s Europe, emigration isn’t exactly easy. But it is also pretty forward looking – his most immediate British predecessor, Hobbes, basically argued that Kings were great. So Locke is one of the first to argue that what the people want (another aspect of what we now think of as individual freedom) is important.
It is important to point out that Locke’s approach is what we’d now call a negative approach to rights: the system (the state, in this case) is obligated to protect you, but it isn’t obliged to give you anything.
Coming from the late 1600s, this is not a crazy perspective – most governments don’t even do these things. For Locke to say “the King should not take your stuff” is pretty radical; to have said “and it should also give you health care” would have also made him the inventor of science fiction. And the landed aristocracy are typically fans!
(Also, apologies to my typographically-sensitive friends; kerning of italicized fonts in Libre Office is poor and I got lazy around here about manually fixing it.)
But this is where Locke starts to fall down to modern ears: if you’re not one of the landed aristocracy; if you’ve got no stuff for the King to take, Locke isn’t doing much for you. And it turns out there are a whole lot of people in 1600s England without much stuff to take.
So let’s fast forward 150+ years.
You all know who Marx is; probably many of you have even been called Marxists at one point or another!
Marx is complicated, and his historical legacy even more so. Let’s put most of that aside for today, and focus on one particular idea we’ve inherited from Marx.
For our purposes, out of all of Marx, we can focus on the key insight that people other than the propertied class can have needs. (This is not really his insight, but he popularized it.)
Having recognized that humans have needs, Marx then goes on to propose that, in a just society, the individual might not be the only one who has a responsibility to provide those needs – the state, at least when we reach a “higher phase” of economic and moral development, should also provide.
This sounds pretty great on paper, but it is important to grok that Marx argues that his perfect system will happen only when we’ve reached such a high level of economic development that no one will need to work, so everyone will work only on what they love. In other words, he ignores the scarcity we face in the real world. He also ignores inequality – since the revolution will have washed away all starting differences. Obviously, taken to this extreme, this has led to a lot of bad outcomes in the world – which is what gives “marxism” its bad name.
But it is also important to realize that this is better than Locke (who isn’t particularly concerned with inequality), and in practice the idea (properly moderated!) has led to the modern social welfare state. So it is a useful tool in the modern philosophical toolkit.
Fast forward again, another 100 years. Our scene moves down the street, to Harvard. Perhaps the two most important works of political philosophy of the 20th century are written and published within four years of each other, further up Mass Avenue from MIT.
John Rawls publishes his Theory of Justice in 1971; Robert Nozick follows up with his Anarchy, State, and Utopia in 1974.
Rawls and Nozick, and their most famous books, differ radically in what they think of as justice, and what systems they think lead to the greatest justice. (Nozick is the libertarian’s libertarian; Rawls more of a welfare-state type.) Their systems, and the differences between them, are out of our scope today (though both are fascinating!).
However, both agree, in their ways, that any theory of a just world must grapple with the core fact that modern societies have a variety of different people, with different skills, interests, backgrounds, etc. (This shouldn’t be surprising, given that both were writing in the aftermath of the 60s, which had made so clear to many that our societies were pretty deeply unjust to a lot of people.)
This marks the beginning of the modern age of political philosophy: Locke didn’t care much about differences between people; Marx assumed them away. Nozick and Rawls can be said, effectively, to mark the point when political philosophy starts taking difference seriously.
But that was 40 years ago – what has happened since then?
So that brings us to the 1990s, and also to 2016. (If you haven’t already figured it out, political philosophy tends to move pretty slowly.)
The new-ish hotness in political philosophy is something called capability theory. The first work is put forward by Amartya Sen, an Indian economist working with (among others) the United Nations on how to focus their development work. Martha Nussbaum then picked up the ball, putting in a great deal of work to systematize it.
When Sen starts working on what became capability theory, he’s a development economist trying to help understand how to help improve the lives of his fellow Indian citizens. And he’s worried that a huge focus on GDP is not leading to very good outcomes. He turns to political theory, and it doesn’t help him: it is focused on very abstract systems. John Locke saying “life, liberty, property” and “sometimes monarchs are OK” doesn’t help him target the UN’s investment dollars.
So his question becomes: how do I create a theory of What is Just that actually helps guide decisions in the real world? Capability theory, in other words, is ultimately pragmatic.
To put it another way, you can think of the capability approach as an attempt to figure out what effective freedom is: how do we take freedom out of textbooks and into something that really empowers people?
One of the key flaws for Sen of existing theories was that they talked about giving people at worst, negative rights (protecting their rights to retain property they didn’t have) and at best, giving them resources (giving them things or training they couldn’t take advantage of). He found this unconvincing, because in his experience India’s constitution gave all citizens those formal rights, but often denied them those rights in practice, through poverty, gender discrimination, caste discrimination, etc.
And so from this observation we have the name of the approach: it focuses on what, pragmatically, people need to be capable of acting freely.
Some examples may be helpful here to explain what Sen and Nussbaum are getting at.
For example, if all men and women have the same formal access to education, but women get fewer job callbacks after college than men with identical resumes, or men refuse to care for children and aging parents, then it seems unlikely that we can really claim to have a just society.
Somalia, circa 1995-2000, was, on the face of it, a libertarian paradise: it gave you a lot of freedom to start businesses! No minimum wage, no EPA.
But it turns out you need more than “freedom from government interference” to run a business: you have to have a lot of other infrastructure as well. (Remember, here, Locke’s “negative” rights: government not stopping you, v. government supporting you.)
These examples suggest that answering political philosopher question #1 (“what is justice?”) requires more than just measuring access to resources. To understand whether a system is just, you have to measure whether all people have the opportunity to reach the important goals.
In other words, do they have the capability to act?
This is the core insight that the capabilities approach is grounded in: it is helpful, but not enough, to say “someone has the natural rights” (Locke) or “some time in the future everyone will have the same opportunity” (Marx).
(Is any of this starting to ring a bell?)
Capability approach is, again, very pragmatic, and comes from a background of trying to allocate scarce development resources in the real world, rather than a philosopher’s cozy university office. So if you’re trying to answer the political philosopher’s question (“what system”), you need to pick and choose a few capabilities to focus on, and figure out what system will support those capabilities.
Again, an example might be helpful here to show how picking the right things to focus on can be important when you’re aiming to build a system that supports human capability.
If you focus on only one dimension, you’re going to get things confused. When Sen was beginning his work, the development community tended to focus exclusively on GDP. Comparing the Philippines and South Africa by this number would have told you to focus your efforts on the Philippines.
But one of the most basic requirements to effective freedom – to supporting people’s capability to act – is being alive! When we look at it through that lens, we pretty quickly see that South Africa is worth more energy. It’s critical to look through that broader lens to figure out whether your work is actually building human freedom.
This is, perhaps, the most contentious area of capability theory – it’s where writing is being done across a variety of disciplines, including economics, political philosophy, sociology, and development. This writing has split into two main areas: the pragmatists, who just want to figure out useful tools that help them improve the world, and the theorists, who want to ground the theory in philosophy (sometimes as far back as Aristotle).
This is a great place to raise Martha Nussbaum again: she’s done the most to bring theoretical rigor to the capability approach. (Some people call Sen’s work the “capability approach”, to show that it is just a way of thinking about the problem; and Nussbaum’s work “capability theory”, to show that it is a more rigorous approach.)
I have bad news: there is no one way of doing this. Some approaches can include:
Each of these can be seen as overlapping ways of identifying the best issues to focus on – all of them will be useful and valid in different domains.
Shared theme of that last slide? Thinking primarily about people. Things are always a means to an end in the capability approach – you might still want to measure them as an important stepping stone to helping people (like GDP!) but they’re never why you do something.
There is no one right way to pick which capabilities to focus on, which drives lots of philosophers mad. We’ll get into this in more detail soon – when I talk about applying this to software.
Probably the bottom line: if you want to know how to get to a more just system, you want to ask about the capabilities of the humans who are participating in that system. Freedom is likely to be one of the top things people want – but it’s a means, not the end.
So now we’ve come to the end of the philosophy lecture. What does this mean for those of us who care about software?
So, again, what do political philosophers care about?
The FSF’s four freedoms try to do the right thing and help build a more just world.
If you don’t have some combination of time, money, or programming skills, it isn’t entirely clear the four freedoms do a lot for you.
The four freedoms are negative rights: things no one can take away from you. And that has been terrific for our elites: Locke’s landed aristocracy is our Software as a Service provider, glad the King can’t take away his right to run MySQL. But maybe not so much for most human beings.
This brings us to our second question – what system?
Inspired by the capability approach, what I would argue that we need is a focus on effective freedom. And that will need not just a change to our focus, but to our systems as well – we need to be pragmatic and inclusive.
So let me offer four suggestions for free software inspired by the capability approach.
We need to start by having empathy for all our users, since our goal should be software that liberates all people.
Like the bureaucrat who increases GDP while his people die young, if we write billions of lines of code, but people are not empowered, we’ve failed. Empathy for others will help us remember that.
Sen, Nussbaum, and the capability approach also remind us that to effectively provide freedom to people we need to draw opinions and information from the broadest possible number of people. That can simply take the form of going and listening regularly to why your friends like the proprietary software they use, or ideally listening to people who aren’t like you about why they don’t use free software. Or it can take the form of surveys or even data-driven research. But it must start with listening to others. Scratching our own itch is not enough if we want to claim we’re providing freedom.
Or to put it another way: our communities need to be as empowering as our licenses. There are lots of great talks this weekend on how to do that – you should go to them, and we should treat that work as being as philosophically important as our licenses.
I think it is important to point out that I think the FSF is doing a lot of great work in this area – this is the most diversity I’ve seen at Libre Planet, and the new priorities list covers a lot of great ground here.
But it is also a bad sign that at the new “Open Source and Feelings” conference, which is specifically aimed at building a more diverse FOSS movement, they chose to use the apolitical “open” rather than “free”. That suggests the FSF and free software more generally still have a lot of work to do to shed their reputation as being dogmatic and unwelcoming.
Which brings me to #2: just as we have to listen to others, we have to be self-critical about our own shortcomings, in order to grapple with the broad range of interests those users might have.
At the beginning of this talk, I talked about my last visit to Libre Planet, and how hard it was to have a conversation about the disempowerment I felt when Libre Office crashed. The assumption of the very well-intentioned young man I was talking to was that of course I was more free when I had access to code. And in a very real way, that wasn’t actually true – proprietary software that didn’t crash was actually more empowering to me than libre software that did crash. And this isn’t just about crashing/not-crashing.
Ed Snowden reminded us this morning that Android is freely-licensed, but that doesn’t mean it gives its users the capability to live a secure life.
Again, here, FSF has always done some of the right thing! You all recognize this quote: it’s from freedom zero. We often take pride in this, and we should!
But while we often say “we care about users”, we only test the license. I’ve never seen someone say “this is not free, because it is impossible to use” – it is too easy, and too frequent, to say “well, the license says you can run the program as you wish, so it passes freedom zero”. We should treat that as a failure to be humble about.
Humility means admitting our current, unidimensional systems aren’t great at empowering people. The sooner we admit that freedom is complex, and goes beyond licensing, the quicker we can build better systems.
The third theme of advice I’d give is to think about impact. Again, this stems from the fundamental pragmatism of the capability approach. A philosophy that is internally consistent, but doesn’t make a difference for people, is not a useful philosophy. We need to take that message to heart.
Mako Hill’s quantitative research has shown us that libre code doesn’t necessarily mean quality code, or successful projects. If we want to impact users, we have to understand why our core development tools are no longer best-in-class, and fix them, or develop new models to replace them.
We built CVS, SVN, and git, and we used those tools to build some of the most widely-used pieces of software on earth. But it took the ease of use of github to make this accessible to millions of developers.
Netsplit.de is a search engine for IRC services. Even if both of these numbers are off by a factor of two (say, because of private networks missing from the IRC count, and if Slack is inflating user counts), it still suggests Slack will have more users than IRC this year. We need to think about why that is, and why free software like IRC hasn’t had the impact we’d like it to.
If we’re serious about spreading freedom, this sort of “post-mortem” of our successes and failures is not optional – it is a mandatory part of our commitment to freedom.
I’ve mentioned that democracy is one way of choosing what capabilities to focus on, and is typically presumed in serious analyses of the capability approach – the mix of human empowerment and (in Sen’s analysis) better pragmatic impact make it a no-brainer.
A free software focused on impact could make free licensing a similar no-brainer in the software world.
Dan Gillmor told us this morning that “I came for the technical excellence and stayed for the freedom”: as both he and Edward Snowden said this morning, we have to broaden our definition of technical excellence to include usability and pragmatic empowerment. When we do that, our system – the underlying technology of freedom – can lead to real change.
This is the last, and hardest, takeaway I’ll have for the day.
We’ve learned from the capability approach that freedom is nuanced, complex, and human-focused. The four freedoms are brief, straightforward, and easy to apply – but those may not be virtues if our goal is to increase user freedom.
As I’ve said a few times, the four freedoms are like telling you the king can’t take your property: it’s not a bad thing, but it also isn’t very helpful if you don’t have any property.
We need to re-interpret “run the program as you wish” in a more positive light, expanding our definitions to speak to the concerns about usability and security that users have.
The capability approach provides us with questions – where do we focus? – but not answers. So it suggests we need to go past licensing, but doesn’t say where those other areas of focus might be. Here are some suggestions for what directions we might evolve free software in.
Learning from Martha Nussbaum and usability researchers, we could work with the next generation of software users to understand what they want, need, and deserve from effective software freedom.
We could learn from other organizations, like UNICEF, who have built design and development principles. The graphic here is from UNICEF’s design principles, where they talk about how they will build software that improves freedom for their audience.
It includes talk about source code – as part of a coherent whole of ten principles, not an end in and of itself.
Many parts of our community (including FSF!) have adopted codes of conduct or similar policies. We could draw on the consistent themes in these documents to identify key values that should take their place alongside the four freedoms.
Finally, we can vote with our code: we should be contributing where we feel we can have the most impact on user freedom, not just code freedom. That is a way of voting with our impact: giving our time only to projects that empower all users. In my ideal world, you come away determined to focus on projects that empower all people, not just programmers.
Ultimately, this is my vision, and why I remain involved in free software – I want to see people who are liberated. I hope after this talk you all understand why, and are motivated to help it happen.
Thanks for listening.
(The deck itself is CC BY-SA 4.0.)
http://lu.is/blog/2016/03/23/free-as-in-my-libreplanet-2016-talk/
|
Adam Roach: An Open Letter to Tim Cook: Apple and the Environment |
http://sporadicdispatches.blogspot.com/2016/03/an-open-letter-to-tim-cook-apple-and.html
|
Air Mozilla: Objets connectés et BlockChain |
Part one: Mozilla and Connected Objects. About fifteen years ago, the web was occupied by an army of protocols and tools...
|
Air Mozilla: The Joy of Coding - Episode 50 |
mconley livehacks on real Firefox bugs while thinking aloud.
|
Air Mozilla: SuMo Community Call 23rd March 2016 |
This is the SUMO weekly call. We meet as a community every Wednesday, 17:00 - 17:30 UTC. The etherpad is here: https://public.etherpad-mozilla.org/p/sumo-2016-03-23
https://air.mozilla.org/sumo-community-call-23rd-march-2016/
|
Byron Jones: mozreview and inline comments on the diff view |
when a comment is left on a review in review board/mozreview it is currently displayed as a small square in the left column.
our top reviewers have strongly indicated that this is suboptimal and would prefer to match what most other code review systems do in displaying comments as an inline block on the diff. i agree — review comments are important and deserve more attention in the user interface; they should be impossible to miss.
while the upstream review board team have long said that the current display is in need of fixing, there is minimal love for the inline comments approach.
recently we worked on a plan of attack to satisfy both our reviewers and upstream’s requirements. :smacleod, :mconley and i talked through a wide assortment of design goals and potential issues. we have a design document and i’ve started mocking up the approach in html:
we also kicked off discussions with the upstream review board development team, and are hopeful that this is a feature that will be accepted upstream.
https://globau.wordpress.com/2016/03/23/mozreview-and-inline-comments-on-the-diff-view/
|
Emily Dunham: Reducing SaltStack log verbosity for TravisCI |
Servo has some Salt configs, hosted on GitHub, for which changes are smoke-tested on TravisCI before they’re deployed. Travis only shows the first 10k lines of log output, so I want to minimize the amount of extraneous information that the states print.
My salt state looks like::
    android-sdk:
      archive.extracted:
        - name: {{ common.homedir }}/android/sdk/{{ android.sdk.version }}
        - source: https://dl.google.com/android/android-sdk_{{ android.sdk.version }}-linux.tgz
        - source_hash: sha512={{ android.sdk.sha512 }}
        - archive_format: tar
        - archive_user: user
        - if_missing: {{ common.homedir }}/android/sdk/{{ android.sdk.version }}/android-sdk-linux
        - require:
          - user: user
The output in TravisCI is::
          ID: android-sdk
    Function: archive.extracted
        Name: /home/user/android/sdk/r24.4.1
      Result: True
     Comment: https://dl.google.com/android/android-sdk_r24.4.1-linux.tgz extracted in /home/user/android/sdk/r24.4.1/
     Started: 17:46:25.900436
    Duration: 19540.846 ms
     Changes:
              ----------
              directories_created:
                  - /home/user/android/sdk/r24.4.1/
                  - /home/user/android/sdk/r24.4.1/android-sdk-linux
              extracted_files:
                  ... 2755 lines listing one file per line that I don't want to see in the log
https://docs.saltstack.com/en/latest/ref/states/all/salt.states.archive.html has useful guidance on how to increase the tar state’s verbosity, but not to decrease it. This is because the extra 2755 lines aren’t coming from tar itself, but from Salt assuming that we want to know.
The state_output setting takes several options that control how the outputter displays each state’s result. The terse option summarizes the result of each state into a single line.
There are a couple places you can set this:
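For example, the one-line minion config version looks like this::

    # /etc/salt/minion
    state_output: terse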
Setting the terse option in /etc/salt/minion dropped the output of a highstate from over 10,000 lines to about 2500.
http://edunham.net/2016/03/23/reducing_saltstack_log_verbosity_for_travisci.html
|
Daniel Pocock: GSoC 2016 opportunities for Voice, Video and Chat Communication |
I've advertised a GSoC project under Debian for improving voice, video and chat communication with free software.
Replacing Skype, Viber and WhatsApp is a big task; however, it is quite achievable by breaking it down into small chunks of work. I've been cataloguing many of the key improvements needed to make Free RTC products work together. Many of these chunks are within the scope of a GSoC project.
If you can refer any students, if you would like to help as a mentor or if you are a student, please come and introduce yourself on the FreeRTC mailing list. If additional mentors volunteer, there is a good chance we can have more than one student funded to work on this topic.
The student application deadline is 25 March 2016 19:00 UTC. This is a hard deadline for students. Mentors can still join after the deadline, during the phase where student applications are evaluated.
The Google site can be very busy in the hours before the deadline so it is recommended to try and complete the application at least 8 hours before the final deadline.
Action items for students:
When completing the application form for Google, the wiki page and writing the email to introduce yourself, consider including the following details:
Please also see my other project idea, for ham radio / SDR projects, and my blog post Want to be selected for Google Summer of Code 2016?
We try to make contact with all students who apply and give some feedback, in particular, we will try to let you know what to do to increase your chances of selection in the next year, 2017. Applying for GSoC and being interviewed by mentors is a great way to practice for applying for other internships and jobs.
|
Hub Figuière: Exempi 2.3.0 and Rust... |
Last week I released Exempi 2.3.0. It adds a couple more APIs and fixes a few bugs.
Also, I have now released my first Rust crate, which provides a Rust API to Exempi: exempi-rs. Short of rewriting the whole parsing in Rust for safety — the core of Exempi is Adobe's official XMP SDK, written in C++ — this will do.
https://www.figuiere.net/hub/blog/?2016/03/22/859-exempi-230-and-rust
|
Air Mozilla: Bay Area Rust Meetup March 2016 |
It's time for another meetup. This meetup is dedicated to some of the data science projects being developed in Rust. We have three speakers lined...
|
About:Community: Help Firefox Go Faster |
A wave of new contributors has been asking how they can help Firefox without (necessarily) having strong coding skills – and they’ve arrived just at the right time. I’m writing this because Firefox engineering needs exactly that kind of help, right now.
There are a lot of ways you can help Firefox and the Mozilla project, and most of them don’t involve writing code at all. Living in our nightly builds of Firefox is a big deal, and using the Beta release of Firefox on Android, if you happen to be an Android user, helps improve the mobile experience as well.
There’s a lot that needs doing. But one thing we’re really looking for – that Firefox engineering could use your help with today – is bug triage and component ownership.
Developing Firefox in the open with a user base of our size means we get a lot of bug reports. A lot. Hundreds every day, and they come in all shapes and sizes. Some of them are clear, tightly-scoped reports that come with clear steps to reproduce, and those are great. Others are a vaguely-described, hard-to-interpret mess. Most of them are somewhere in between, real problems that are difficult to act on.
Whatever condition these bugs arrive in there’s always a passionate Firefox user behind them with a real problem they care about solving. We know that Bugzilla is not an easy mountain to climb; to respect those users’ efforts we want to give all our incoming bugs enough care and attention to get them from “a user has a problem with our product” to “an engineer has the information they need to make the right decision.”
This is where you come in.
We know what makes a bug a capital-G, capital-B Good Bug from an engineering standpoint – it’s assigned to the right component, it has steps to reproduce or a regression range if it needs them, and it’s clear what the next steps are and who needs to take them. For the most part getting bugs from “new” to “good” doesn’t mean writing code – it’s all about organization, asking questions, following up and making sure things don’t get lost.
This kind of work – de-duplicating, cleaning up and clarifying where these bugs are and what’s next for them – is incredibly valuable. We’ve had a handful of people take up this work in the past, often growing into critical leadership roles at Mozilla in the process, and it’s hard to overstate how much their work has mattered to Mozilla and driven forward the Open Web.
We need people – we need you – to help us keep an eye on the bugs coming to different components, sort them out and ask the right questions to get them into actionable shape. This may seem like a big job, but it’s mostly about organization and persistence.
In the beginning, just setting the right flags and asking some questions will help. As you gain experience you’ll learn how to turn unclear, ambiguous bug reports into something an engineer will be excited to fix. In the long term someone who really knows their component, who can flag high-priority issues, clean them up and get them to the right engineers, will have a dramatic impact on the product, helping Mozilla make Firefox and the Web better for hundreds of millions of users.
You don’t need much more than a bugzilla.mozilla.org account and a computer that can run Firefox Nightly to get started. If you’ve got that, and you’re interested in taking this up, please email me so that we can match you up with one of the components that needs your help.
Thank you; I’m looking forward to hearing from you.
http://blog.mozilla.org/community/2016/03/22/help-firefox-go-faster/
|
L. David Baron: Security and Inequality |
It is sometimes easy for technology experts to think about computer security in terms of building technology that can allow them (the experts) to be secure. However, we need to think about the question of how secure all of the users of the technology will be, not the question of how secure the most skilled users can possibly be. (The same applies to usability as well, but I think it's more uniformly understood there.) We need to design systems where everybody (not just technology experts) can be secure. Designing software that is secure only when used by experts risks increasing inequality between an elite who are able to use software securely and a larger population who cannot.
We don't, for example, want to switch to using a cryptocurrency where only the 1% of most technically sophisticated people are capable of securing their wealth from theft (and where the other 99% have no recourse when their money is stolen).
Likewise, we don't want to create new and avoidable differences in employability of individuals based on their ability to use the technology we create to maintain confidentiality or integrity of data, or based on their ability to avoid having their lives disrupted by security vulnerabilities in connected (IoT) devices.
If we can build software that is usable and secure for as many of its users as possible, we can avoid creating a new driver for inequality. It's a form of inequality that would favor us, the designers of the technology, but it's still better for society as a whole if we avoid it. This would also be avoiding inequality in the best way possible: by improving the productivity of the bulk of the population to bring them up to the level of the technological elite.
|
Sean McArthur: async hyper |
It’s been a steady, if somewhat slow, march toward asynchronous IO in hyper. Over the past several months, I’ve tried out different strategies for implementing async IO. Eventually, after some useful conversations at the last Mozilla All-Hands, I settled on a performant approach using state machines. This reduces heap allocations, and eliminates all dynamic function calls inside hyper. It allows those who want the best possible performance to still be able to utilize hyper, while allowing various forms of asynchronous (and synchronous) programming styles to be built on top with minimal overhead.
The good news: it’s nearly complete! Of course, there’s bugs. And the API likely has rough edges. But hopefully framework developers can start working with it now, and help us have an excellent proper release real soon.
Here’s how it all works. A state machine type should be created, and should implement the Handler trait, altering its internal state as events from the socket occur, and responding to hyper with what it should do next.
trait Handler {
fn on_request(&mut self, req: Request) -> Next;
fn on_request_readable(&mut self, decoder: &mut Decoder) -> Next;
fn on_response(&mut self, res: &mut Response) -> Next;
fn on_response_writable(&mut self, encoder: &mut Encoder) -> Next;
}
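To make the flow concrete, here’s a minimal sketch of a handler that ignores the request body and writes a fixed response. (This is my illustration, not a definitive implementation; it assumes Encoder implements std::io::Write and that a StatusCode::Ok variant exists, neither of which is shown above.)

use std::io::Write; // assumption: Encoder implements io::Write

struct Hello;

impl Handler for Hello {
    fn on_request(&mut self, _req: Request) -> Next {
        // nothing we need from the request body; head straight for the response
        Next::write()
    }
    fn on_request_readable(&mut self, _decoder: &mut Decoder) -> Next {
        // we never asked to read, but if woken anyway, keep moving toward the write
        Next::write()
    }
    fn on_response(&mut self, res: &mut Response) -> Next {
        res.set_status(hyper::StatusCode::Ok);
        Next::write()
    }
    fn on_response_writable(&mut self, encoder: &mut Encoder) -> Next {
        // write the body, then tell hyper this transaction is finished
        let _ = encoder.write(b"hello");
        Next::end()
    }
}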
There is a similar trait for the Client, but with the names and types adjusted.
The HTTP state is managed internally by hyper. The type that implements Handler should keep track of its business-related state, such as the request route, related headers, if data should be read, and when and if data should be written. The Handler conveys its desired intent by making use of the Next type that is returned after acting on events.
The Next type is used to declare what should be done with the current socket. It has variants for read(), write(), end(), and less used forms.
The Handler can declare how long the server should wait for an event before triggering a timeout error. This is asynchronous waiting, not actual blocking on IO. Declaring a timeout is done with a method on the Next type:
Next::read().timeout(Duration::from_secs(30))
The server would wait for a readable event on the socket for up to 30 seconds. If the event never occurs, the Handler is notified with an Error::Timeout. Timeout errors can be handled in the on_error method of the Handler; the default implementation of this method is to abort the request.
fn on_error(&mut self, err: Error) -> Next {
match err {
Error::Timeout => {
// we could try to be good and respond with a 408
self.code = hyper::StatusCode::TimedOut;
Next::write()
},
_ => {
// oh noes, just blow up
Next::remove()
}
}
}
So far, the described API works well when the server can respond immediately to each event on a socket. But eventually a server starts adding other parts of the puzzle that aren’t available right away. There could be database connections, reading or writing to a file, proxying to another URL, or any other thing that would block the thread and put our event loop in stasis. In these cases, a Handler can receive events from the server, trigger other asynchronous actions, and notify the server when it is ready to finally do something. This is done using Next::wait() and the Control that is passed to every Handler upon construction.
fn on_request(&mut self, req: Request) -> Next {
match route(req.uri()) {
Route::GetUser(id) => {
let ctrl = self.ctrl.clone();
User::find(id, move |user| {
// probably do something with `user`
ctrl.ready(Next::write());
});
Next::wait()
}
}
}
In this example, after parsing a route such as /user/33, the Handler tells the server to wait (indefinitely) while we ask the database to look up the user by ID. Once the database returns, the Control is alerted that the Handler is ready to write, and the server will continue processing the request.
Due to the fundamental shift that exists when using non-blocking IO instead of blocking, there are several breaking changes. The most obvious is the change to the Handler trait, and having to deal with io::ErrorKind::WouldBlock errors when reading and writing.
Besides those biggies, several other parts of the API have been cleaned up for the better.
- *res.status_mut() = NotFound has become res.set_status(NotFound).
- Request now has private fields, with only immutable accessors. This prevents mistakes, since mutating the Request head would otherwise have no effect.
- The HeaderFormat trait has been merged back into the Header trait, with the help of where Self: Sized.

Everyone hates breaking changes. As necessary as they are, they will still inhibit some people from upgrading. To reduce some of the pain, I’ve worked on a “synchronous API” that is built on top of the new async hyper.
It mimics blocking IO on read and write, and allows the original Handler trait with the method handle(&self, req: Request, res: Response).
A hypersync Server should be better protected from slowloris and DOS attacks, but it’s not going to be as performant as hyper itself, since it uses threads to provide the blocking IO feel.
Depending on the use case, many Clients don’t gain much benefit from async IO, and may wish to use the synchronous programming style provided by hypersync.
It currently exists just as an example in the hyper repo, but I hope to break it out into a separate crate that mimics most of the API of pre-async hyper.
|
John O'Duinn: Human etiquette for more effective group chat |
Group chat is a tool that helps people communicate, just like email, phone calls, and meetings. Used correctly, these tools help people work more effectively. Used incorrectly, they hamper work. Jason Fried’s post about group chat as yet another interrupt generator started a lively discussion — some interesting posts are here and here, but there are many others. This is clearly a topic people care about.
Group chat, in various forms, has been used by specific groups for literally decades. However, as this technology goes more mainstream, the human etiquette is still evolving. Here are five guidelines on group chat etiquette that I have found helpful over the years, and which I hope help others:
1. Ensure everyone is using group chat. Email and phone calls are successful because they are ubiquitous and interoperable technologies. For group chat to work, everyone should be using the same shared group chat. After all, the idea is to reduce barriers to cross-organizational communications. Find a chat system that most people in your organization like and standardize on that. Once this is done, you can easily and quickly message someone in a totally different part of the organization, be confident they will see your chat message and be able to reply.
2. Carefully create a minimal viable set of channels. Having too many channels, or too few channels, encourages interruptions. Too few channels means each channel has a lot of unrelated cross-chatter noise. Too many channels make it hard to figure out where to post about a particular topic, leading people to use whatever channel feels close enough — which in turn means others cannot tell which channels to monitor.
Here is a “Goldilocks” arrangement (not too many, or not enough, but just the right number of channels) that has worked well for me:
3. Moderate the channels. Once these channels are created, they need active moderating. Small amounts of social banter are normal in any work environment (including meetings, conference calls or group chat) and help us to remember we are all human, build a sense of community, and defuse tensions in high-pressure situations.
However, if social chatter starts to get in the way of doing work, politely and firmly move the off-topic chatter to another channel, so work can happen in the appropriate channel. Consistently moderating like this raises everyone’s overall awareness of group chat etiquette. Once the social norms are well understood, most people will do right by default, reducing the need for future moderation.
4. Remember that group chat is transient. If a discussion in a group chat channel reaches a decision that others may care about, that decision needs to be clearly communicated to all stakeholders. You’d do the same thing with an important decision reached over lunch, on a phone call while driving, or chatting at the coffee machine. Summarize the decision into a group-wide email, a project tracking system, or whatever is the single-source-of-truth for your organization. Sometimes, a quick copy-paste of the group chat discussion is good enough, but sometimes the act of summarizing a brilliant impromptu chat will uncover crucial missed assumptions. The important point is to keep everyone informed, without requiring everyone to continuously read every group chat channel for possible decisions that might be of interest later.
5. Mention people by name when appropriate. If you have a topic you want a specific human to read soon, then mention them by name. This will ensure they get notified and can quickly find where they are needed. However, be careful when you do this. This is almost like calling someone’s cellphone — you are choosing to interrupt without knowing the importance of what you are interrupting. Consider the urgency of your discussion, and consider if using another, less intrusive, medium might be best.
If you aren’t careful, group chat can become yet another endless stream of interruptions that people struggle to keep up with. However, with a careful combination of good technical organization and good human etiquette, group chat can speed up internal discussions, reduce email churn, and reduce the need for some meetings. A recurring daily gain for everyone, especially people in distributed organizations!
There are, of course, other things you can do to make group chat more effective… If you have suggestions for other ways to improve group chat, please let me know – in the comments below or by email. I’d be very eager to hear about them.
John.
(Modified from my post on medium.com, and my forthcoming book “Distributed”, published by O’Reilly later this year.)
http://oduinn.com/blog/2016/03/22/human-etiquette-for-more-effective-group-chat/
|
Air Mozilla: Connected Devices Weekly Program Review, 22 Mar 2016 |
Weekly project updates from the Mozilla Connected Devices team.
https://air.mozilla.org/connected-devices-weekly-program-review/
|
David Lawrence: Happy BMO Push Day! |
the following changes have been pushed to bugzilla.mozilla.org:
discuss these changes on mozilla.tools.bmo.
https://dlawrence.wordpress.com/2016/03/22/happy-bmo-push-day-10/
|
QMO: Firefox 46 Beta 2 Testday Results |
Hello Mozillians!
As you may already know, last Friday – March the 18th – we held a new Testday for Firefox 46 Beta 2, and it was another successful event!
We’d like to take this opportunity to thank Iryna Thompson, Chandrakant Dhutadmal, Vuyisile Ndlovu, Ilse Macías, Bolaram Paul, Ángel Antonio and the people from our Bangladesh Community: Hossain Al Ikram, Amlan Biswas, Azmina Akter Papeya, Md. Rahimul Islam, Raihan Ali, Tabassum Binte Azad, Khalid Syfullah Zaman, John Sujoy, Kazi Nuzhat Tasnem, Sadia Chowdhury Ria, Saddam Hossain, Md. Ehsanul Hassan, Zannatul Ferdous, Aminul Islam Alvi, Mohammad Maruf Islam, Nazir Ahmed Sabbir, Jobayer Ahmed Mickey, Rakibul Islam, Maruf Rahman, Tanvir Rahman, Tovfikur Rahman, Farhadur Reja Fahim, Fazle Rabbi, Saddam Fci, Niaz Bhuiyan Asif, Mohammed Jawad Ibne Ishaque, Fariha Chowdhury, Tahsan Chowdhury Akash, Kazi Sakib Ahmad, Sauradeep Dutta, Sayed Ibn Masud, Sajedul Islam, Md. Majedul Islam, Meraj kazi, Asif Mahmud Shuvo, Wahiduzzaman Hridoy, and Tazin Ahmed for getting involved in this event and making Firefox as best as it could be (sorry if I misspelled your names).
Results:
Also a big thank you goes to all our active moderators.
Keep an eye on QMO for upcoming events!
https://quality.mozilla.org/2016/03/firefox-46-beta-2-testday-results/
|
Chris H-C: Data Science is Hard – Part 1: Data |
You’d think that categorizing and measuring populations would be pretty simple. You count them all up, divide them into groups… simple, arithmetic-sounding stuff.
To a certain extent, that’s all it is. You want to know how many people contribute to Firefox in a day? Add ’em up. Want to know what fraction of them are from Europe? Compare the subset from Europe against the entire population.
But that’s where it gets squishy:
So that leads us to Part 1 of “Data Science is Hard”: Data is Hard.
In a recent 1-on-1, my manager :bsmedberg and I thought that it could be interesting to look into Firefox users whose Telemetry reports come from different parts of the world at different times. Maybe we could identify users who travel (Firefox Users Who Travel: Where do they travel to/from?). Maybe they can help us understand the differing needs of Firefox users who are on vacation as opposed to being at home. Maybe they’ll show us Tor Browser users, or users using other anonymizing techniques and technologies: and maybe we should see if there’s some special handling we could provide for them and their data.
I used this topic as a way to learn how to use our new re:dash dashboard on top of the prestodb instance of the Longitudinal Dataset (which lets me run SQL queries against a 1% random sample of Firefox users’ Telemetry data from the past 180 days).
Immediately I ran into problems. First, with remembering all the SQL I had forgotten in the *mumblesomething* years since I last had to write interesting queries.
But then I quickly ran into problems with the data. I ran a query to boil down how many (and which) unique countries each client had reported Telemetry from:
SELECT cardinality(array_distinct(geo_country)) AS country_count,
       array_distinct(geo_country) AS countries
FROM longitudinal_v20160314
ORDER BY country_count DESC
LIMIT 5
Country_count | Countries
---|---
35 | ["CN","MX","GB","HU","JP","US","RU","IN","HK","??","CA","KR","TW","CM","DK","CH","ZA","PH","DE","VN","NL","CO","KZ","MA","TR","FR","AU","GR","IE","AR","BY","AT","TN","BR","AM"]
34 | ["DE","RU","LT","UA","MA","GB","GI","AE","FR","CN","AM","NG","NL","PT","TH","PL","ES","NO","CH","IL","ZA","BY","US","UZ","HK","TW","JP","PK","LU","SG","FI","EU","IN","ID"]
34 | ["US","BR","KR","NZ","RO","JP","ES","GB","TW","CN","UA","AU","NL","FR","FI","??","NO","CA","ZA","CL","IT","SE","SG","CH","RU","DE","MY","IN","ID","VN","PL","PH","KE","EG"]
34 | ["GB","CN","??","DE","US","RU","AL","ES","NL","FR","KR","FI","IR","CA","JP","HK","AU","CH","RO","CO","IE","BR","SE","GR","IN","MX","RS","AR","TW","IT","SA","ID","VN","TN"]
34 | ["US","GI","??","GB","DE","SA","KR","AR","ZA","CN","IN","AT","CA","KE","IQ","VN","TR","KZ","JP","BR","FR","TW","IT","ID","SG","RU","CL","BA","NL","AU","BE","LT","PT","ES"]
35 unique countries visited? Wow.
The “Countries” column is in order of when they first appeared in the data, so we know that the first user was reporting from China then Mexico then Great Britain then Hungary then Japan then the US then Russia…
Either this is a globetrotting super spy, or we’re looking at some sort of VPN/Tor/anonymizing framework at play here.
(Either way I think it best to say, “Thank you for using Firefox, Ms. Super Spy!”)
Or maybe this is a sign that the geolocation service is unreliable, or that the data intake services are buggy, or something else that would be less than awesome.
Regardless: this data is hugely messy. But, 35 countries over 180 days? That’s just about doable in real life… except that it wasn’t over 180 days, but 2:
SELECT cardinality(array_distinct(geo_country)) AS country_count,
       cardinality(geo_country) AS subsession_count,
       cardinality(geo_country) / (date_diff('DAY',
           from_iso8601_timestamp(array_min(subsession_start_date)),
           from_iso8601_timestamp(array_max(subsession_start_date))) + 1) AS subsessions_per_day,
       date_diff('DAY',
           from_iso8601_timestamp(array_min(subsession_start_date)),
           from_iso8601_timestamp(array_max(subsession_start_date))) + 1 AS duration
FROM longitudinal_v20160314
ORDER BY country_count DESC
LIMIT 1
Country_count | Subsession_count | Subsessions_per_day | Duration
---|---|---|---
35 | 169 | 84 | 2
This client reported from 35 unique countries over 2 days: more than 17 countries per day, even skipping duplicates.
Also of note to Telemetry devs, this client was reporting 84 subsessions per day.
(Subsessions happen at a user’s local midnight and whenever some aspect of the Environment block of Telemetry changes (your locale, your multiprocess setting, how many addons you have installed). If your Firefox is registering that many subsession edges per day, there might be something wrong with your install. Or there might be something wrong with our data intake or aggregation.)
I still plan on poking around this idea of Firefox Users Who Travel. As I do so I need to remember that the data we collect is only useful for looking at Populations. Knowing that there’s one user visiting 35 countries in 2 days doesn’t help us decide whether or not we should release a special Globetrotter Edition of Firefox… since that’s just 1 of 4 million clients of a dataset representing only 1% of Firefox users.
Knowing that about a dozen users reported days with over 250 subsessions might result in some evaluation of that code, but without something linking these high-subsession-rate users together into a Population (maybe they’re machines running automated testing?), there’s nothing much we can do about it.
Instead I should focus on how, in a 4M user dataset, 112k (2.7%) users report from exactly 2 countries over the duration of the dataset. There are only 44k that report from more than 2, and the other 3.9M or so report exactly 1.
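Out of curiosity, here’s roughly how that breakdown can be computed (a sketch in the same Presto dialect, against the same longitudinal table as the query above; this is my reconstruction, not necessarily the exact query used):

SELECT
  cardinality(array_distinct(geo_country)) AS country_count,
  count(*) AS client_count
FROM longitudinal_v20160314
GROUP BY 1
ORDER BY country_count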
2.7% is a sliver of 1% of the Firefox population, but it is a Population. A Population is something we can analyse and speak meaningfully about, as the noise and mess of individual points of data has been smoothed out by the sheer weight of the Firefox user base.
It’s nice having a user base large enough to speak meaningfully about.
:chutten
https://chuttenblog.wordpress.com/2016/03/21/data-science-is-hard-part-1-data/
|
Botond Ballo: Trip Report: C++ Standards Meeting in Jacksonville, February 2016 |
Project | What’s in it? | Status |
---|---|---|
C++17 | Filesystem TS, Parallelism TS, Library Fundamentals TS I, if constexpr, and various other enhancements are in. Concepts, coroutines, and unified call syntax are out. Default comparisons and operator. to be decided at next meeting. | On track for 2017 |
Filesystems TS | Standard filesystem interface | Published! Merging into C++17 |
Library Fundamentals TS I | optional, any, string_view and more | Published! Merging into C++17 |
Library Fundamentals TS II | source code information capture and various utilities | Resolution of comments from national standards bodies in progress |
Concepts (“Lite”) TS | Constrained templates | Published! NOT merging into C++17 |
Parallelism TS I | Parallel versions of STL algorithms | Published! Merging into C++17 |
Parallelism TS II | TBD. Exploring task blocks, progress guarantees, SIMD. | Under active development |
Transactional Memory TS | Transaction support | Published! NOT merging into C++17 |
Concurrency TS I | future::then(), latches and barriers, atomic smart pointers | Published! NOT merging into C++17 |
Concurrency TS II | TBD. Exploring executors, synchronic types, atomic views, concurrent data structures | Under active development |
Networking TS | Sockets library based on Boost.ASIO | Wording review of the spec in progress |
Ranges TS | Range-based algorithms and views | Wording review of the spec in progress |
Numerics TS | Various numerical facilities | Under active development |
Array Extensions TS | Stack arrays whose size is not known at compile time | Withdrawn; any future proposals will target a different vehicle |
Modules TS | A component system to supersede the textual header file inclusion model | Initial TS wording reflects Microsoft’s design; changes proposed by Clang implementers expected. Not targeting C++17. |
Graphics TS | 2D drawing API | Wording review of the spec in progress |
Coroutines TS | Resumable functions | Initial TS wording will reflect Microsoft’s await design; changes proposed by others expected. Not targeting C++17. |
Reflection | Code introspection and (later) reification mechanisms | Design direction for introspection chosen; likely to target a future TS |
Contracts | Preconditions, postconditions, etc. | Unified proposal reviewed favourably. Not targeting C++17. |
Last week I attended a meeting of the ISO C++ Standards Committee in Jacksonville, Florida. This was the first committee meeting in 2016; you can find my reports on the 2015 meetings here (May 2015, Lenexa) and here (October 2015, Kona). These reports, particularly the Kona one, provide useful context for this post.
This meeting was pointedly focused on C++17. With the window for getting new features into C++17 rapidly closing, it was (and continues to be) crunch time for all the committee subgroups. Many important decisions about what will make C++17 and what won’t were made this week, as I’ll relate below.
Work continued on Technical Specifications (TS’s) as well. As a reminder, TS’s are a mechanism to publish specifications for features that are not quite ready for full standardization; they give implementers and users a chance to try out features while leaving the door open for breaking changes if necessary. Progress was made on both TS’s already in flight (like Networking and Ranges), and new TS’s started for features that didn’t make C++17 (like Coroutines and Modules).
Recall that, since C++11, the committee has been operating under a release train model, where a new revision of the C++ International Standard (IS) ships every 3 years. For any given revision, whatever features are ready when “the train leaves” are in, and whatever is not ready is out.
For C++17, the train is in the process of leaving. According to the intended schedule, the end of this meeting was the deadline for completing design review for new features, and the end of the next meeting (this June, in Oulu, Finland) will be the deadline for completing wording-level review for the same. This means that, coming out of this meeting, C++17 is essentially feature-complete (although reality is a bit more nuanced than that, so read on).
The set of candidate features vying to make C++17 was very large – too large to complete them all and still ship C++17 on time. As a result, hard decisions had to be made about what is ready and what is not. I will describe these decisions and the deliberations that went into them below. My personal opinion is that, while C++17 may not contain everything everyone hoped it might, it will still be an important language revision with a respectable feature set.
So, let’s see what will be in C++17:
I’ve listed these in my Kona report – see the list of features in C++17 coming into Kona, and the list of features voted into C++17 at Kona; I won’t repeat them here.
Technical Specifications that have been merged into C++17 at this meeting:

- The Parallelism TS. The parallel algorithms move into namespace std (in the TS they were in std::experimental::parallel), and overload with the existing standard algorithms.
- The Library Fundamentals TS I. Its components move into namespace std (in the TS they were in std::experimental).
- The Filesystem TS. Its components move into namespace std::filesystem (in the TS they were in std::experimental::filesystem).
- The mathematical special functions IS. These move into namespace std (in the IS, since it was targeted at both C and C++, they were in the global namespace).

Other features that have been voted into C++17 at this meeting:
- The [[fallthrough]] attribute, indicating that one switch case intentionally falls through to the next
- The [[nodiscard]] attribute, indicating that the return value of a function should not be ignored
- The [[maybe_unused]] attribute, indicating that a variable may be intentionally unused
- Lambdas allowed in constexpr contexts
- Unary folds of empty parameter packs, permitted over the &&, ||, and , (comma) operators
- Allowing a range-based for loop’s begin and end iterator to have different types. This is necessary to accommodate the design of the Ranges TS, where a range is modelled as an (iterator, sentinel) pair rather than a pair of two iterators.
- Lambda capture of *this by value
- Direct-list-initialization of enum types. This underwent additional revisions since Kona to ensure that no existing code changes meaning, and that the intent of existing language constructs, such as explicit, is preserved under the new rules.
- Re-enabling shared_from_this
- not_fn, cherry-picked into C++17 from the second Library Fundamentals TS
- constexpr atomic<T>::is_always_lock_free
- constexpr std::hardware_{constructive,destructive}_interference_size
- An extended std::hypot (option #2 from the paper)
- constexpr modifiers to reverse_iterator, move_iterator, array, and range access
- Giving std::string a non-const data() member function
- is_callable, the missing INVOKE-related trait

As I mentioned above, the last chance to vote new features into C++17 will be at the next meeting, in Oulu. Here I list the features that are expected to come up for such a vote.
It’s important to note that, even after a feature receives design approval from the relevant working group (Evolution for language features, Library Evolution for library features), as all features listed here have, it doesn’t get into C++17 until it’s voted in at a plenary meeting, and sometimes that vote fails (this happened to some features at this meeting). So, while all these features are targeted at C++17, there are no guarantees.
The majority of these features passed design review, or re-review, at this meeting. For the language ones, I discuss them in more detail in the Evolution Working Group section below.
- if constexpr (formerly known as constexpr_if, and before that, static_if)
- template <auto>, for non-type template parameters with a deduced type
- is_contiguous_layout (really a library feature, but it needs compiler support)
- joining_thread, the name chosen for the proposed wrapper around std::thread whose destructor joins the thread if it’s still running, rather than terminating the program
- make_from_tuple() (like apply(), but for constructors)
- Defining order<> without defining std::less<>
- swap() that accepts unequal allocators
- shared_ptr::weak_type
- gcd()
- Deprecating std::iterator, redundant members of std::allocator, and is_literal
- std::variant<>, which was put forward for potential inclusion into C++17. As a large feature whose design was completed only recently, it faces a higher hurdle than the others, but there’s a chance it might make it.

Of the language features listed above, default comparisons and operator dot are slightly controversial; we’ll have to see how they fare at the vote in Oulu.
Unified call syntax failed to gain consensus, due to concerns that it has the potential to make future code brittle. Specifically, it was pointed out that, since member functions are only considered for a call with non-member syntax if there are no non-member matches, calls with non-member syntax written to rely on this feature to resolve to member functions can break if a matching non-member function is added later, even if the added function is a worse match than the member function. The feature may come up for a vote again at the next meeting if new arguments addressing these concerns are made, but for now it is not going into C++17.
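A hypothetical illustration of that concern (my example; the non-member call is only legal under the rejected proposal):

struct Widget {
    void draw() const;    // member function
};

void render(const Widget& w) {
    draw(w);   // under the proposal: no non-member draw() is in scope,
               // so this would resolve to w.draw()
}

// Later, someone adds a non-member overload somewhere visible:
//
//     void draw(const Widget& w, int scale = 1);
//
// Because non-member matches are preferred for calls written with non-member
// syntax, draw(w) in render() would now resolve to this function instead,
// even though it is a worse match than the member -- silently changing the
// meaning of the call.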
The Concepts Technical Specification (also called “Concepts Lite” to reflect that, unlike C++0x concepts, it includes only interface checking, not both interface and definition checking) was published in October 2015.
A proposal to merge the Concepts TS into C++17 was brought forward at this meeting, but it proved fairly controversial and failed to gain consensus. I was going to summarize the reasons why, but then I discovered fellow meeting attendee Tom Honermann’s excellent summary; I can’t possibly put it better or in more detail than he did, so I’ll just refer you to his post.
I’ll share a personal view on this topic: given how important a feature Concepts is for C++, I think it’s critical that we get it right, and thus I think the caution displayed in not merging into C++17 is appropriate. I also don’t view Concepts not merging into C++17 as “not having Concepts yet” or “having to wait another N years for Concepts”. We have Concepts, in the form of the Concepts TS. That’s published, and implemented (and is on track to be supported by multiple implementations in the near future), and I don’t see much reason not to use it, including in large production codebases. Yes, it being in a TS means that it can change in backward-incompatible ways, but I don’t expect such changes to be made lightly; to the extent that such changes are made, I would expect them to fall into a small number of well-motivated categories.
Recapping the status of coroutines coming out of the last meeting: one proposal (Microsoft’s await proposal) was already undergoing wording review, but with its ship vehicle (C++17 or a TS) uncertain.

Since then, there have been two interesting developments:
- A feeling emerged in EWG that the await proposal should target a TS, because of a lack of sufficient implementation and deployment experience, and because the design space for alternative approaches (such as the unified approach being sought) is still fairly open.
- A unified stackless/stackful design was fleshed out, in which functions participating in coroutine execution carry annotations. To avoid these annotations being viral like await, a mechanism for inferring the annotations within a TU is proposed; this is then extended to handle cross-TU calls by having the compiler generate two versions of a function (one for coroutine execution, and one for normal execution), and having the linker hook up cross-TU calls among these (the Transactional Memory TS uses a similar mechanism to deal with transaction-safe and transaction-unsafe versions of functions).
The “unified” aspect of this approach comes from the fact that a compiler can use a stackful implementation for part or all of a coroutine execution, without any change to the syntax. While this proposal is obviously still in an early design stage, I got the impression that this was the first time a “unified coroutines” proposal was sufficiently fleshed out to make the committee view it as a serious contender for being the C++ coroutines design.
In light of these developments, a discussion was held on whether the await
proposal should target a TS or C++17. The outcome was that a TS had a much stronger consensus, and this was the direction chosen. I would imagine that the unified proposal will then be pursued as a set of changes to this TS, or as a separate proposal in the same TS (or in a different TS).
In Kona, the committee decided that Modules would target a TS rather than C++17. I said in my Kona report that there’s a chance the committee might change its mind at this meeting, but that I thought it was unlikely. Indeed, the committee did not change its mind; a discussion of a proposal from the authors of the Clang modules implementation to make changes to the design being proposed by Microsoft made it clear that, while significant convergence has occurred between the two implementations, there still remain gaps to be bridged, and targeting the feature for C++17 would be premature. As a result, Modules will continue to target a TS.
The Modules TS was officially created at this meeting, with its initial working draft consisting of Microsoft’s proposal. (The Clang implementers will formulate updated versions of their proposal as a set of changes to this draft.) The creation of an initial working draft is usually just a procedural detail, but in this case it may be significant, as the GCC and EDG implementers may view it as unblocking their own implementation efforts (as they now have some sort of spec to work with).
I summarize the technical discussion about Modules below.
Due to its relatively recent publication (and thus not yet having a lot of time to gain implementation and use experience), the Concurrency TS was not proposed for merging into C++17.
Some people expressed disappointment over this, because the TS contains the very popular future::then() extension.
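For context, a small sketch of that extension (using the Concurrency TS’s std::experimental::future; compute_async() is a hypothetical producer, declared here only for illustration):

#include <experimental/future>

// Hypothetical async producer (declaration only).
std::experimental::future<int> compute_async();

void demo() {
    auto f = compute_async();
    // Attach a continuation instead of blocking on f.get():
    auto g = f.then([](std::experimental::future<int> ready) {
        return ready.get() * 2;   // runs once the value is available
    });
}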
For similar reasons, the Transactional Memory TS was not proposed for merging into C++17.
Significant design progress has been made on contracts at this meeting. What started out as two very different competing proposals had, by the beginning of the meeting, converged (#1, #2) to the point where the remaining differences were largely superficial. During the meeting, these remaining differences were bridged, resulting in a truly unified proposal (paper forthcoming). The proposal is very simple, leaving a lot of features as potential future extensions. However, it’s still too early in the process to realistically target C++17.
Reflection was never targeted for C++17; I just mention it because of its popularity. The Reflection Study Group did make significant progress at this meeting, as I describe below.
Having given an overview of the state of C++17, I will now give my usual “fly on the wall” account of the deliberations that went on in the Evolution Working Group (EWG), which reviews the design of proposed language features. I wish I could give a comparably detailed summary of goings-on in the other subgroups but, well, I can only be in one place at a time :)
EWG categorizes incoming proposals into three rough categories: accepted, encouraged to return with further work, and rejected.
Accepted proposals:
The implementation status of Concepts also came up: Clang’s implementation is in progress, and MSVC and EDG both have it on their shortlist of features to implement in the near future; moreover, they have both indicated that Concepts not making it into C++17 will not affect their implementation plans.
The capture of *this by value proposal was accepted as well. This version only proposes *this as a new kind of thing that can appear in the capture list; that results in the lambda capturing the enclosing object by value, and this inside the lambda referring to the copied object. (The previous version also proposed * to mean a combination of = and *this, but people didn’t like that.) A couple of extension ideas were floated around: allowing moving *this into the lambda, and capturing an arbitrary object to become the this of the lambda; those are left for future exploration.
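A minimal sketch of the accepted capture form (my example):

#include <cstdio>

struct Counter {
    int n = 0;

    auto snapshot() {
        // [*this] captures a copy of the enclosing object; inside the
        // lambda, 'this' refers to that copy, not the original Counter.
        return [*this] { std::printf("%d\n", n); };
    }
};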
EWG approved the constexpr_if proposal. The proposal has undergone two syntax changes since the last meeting. The original proposed syntax was:

constexpr_if (...) {
...
} constexpr_else constexpr_if (...) {
...
} constexpr_else {
...
}
This was first changed to:

constexpr if (...) {
...
} constexpr else constexpr if (...) {
...
} constexpr else {
...
}
and finally to the approved form:

if constexpr (...) {
...
} else if constexpr (...) {
...
} else {
...
}
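A minimal usage sketch of the final form (my example, not from the proposal):

#include <string>
#include <type_traits>

template <typename T>
auto describe(T value) {
    if constexpr (std::is_integral<T>::value) {
        return value + 1;                     // discarded unless T is integral
    } else {
        return std::string("not integral");   // discarded when T is integral
    }
}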
One important property: the branch not taken is discarded, so return statements contained in it do not contribute to return type deduction.

Default comparisons were also discussed. Under the proposal, an implicitly generated comparison operator behaves as if it were declared as a friend of the class just before the closing brace of the class declaration. Among other things, this allows libraries to leave unspecified whether their classes use explicit or implicit comparison operators, because users cannot tell the difference. The generation of an implicit operator is triggered the first time the operator is used without an explicit operator being in scope; a subsequent explicit declaration makes the program ill-formed.
A notable change since the last version of the proposal is that comparing a base object to a derived object is now ill-formed. To keep comparison consistent with copy construction and assignment, slicing during copy construction and assignment will also become ill-formed, unless the derived class doesn’t add any new non-static data members or non-empty bases. This is a breaking change, but the overwhelming majority of cases that would break are likely to be actual bugs, so EWG felt this was acceptable.
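To make the generation rule concrete, a small hypothetical sketch (the proposal was still under discussion at this point, so treat the exact semantics as illustrative):

struct Point {
    int x;
    int y;
    // No comparison operators declared: under the proposal, the first use of
    // == or < triggers implicit generation, as if friend operators had been
    // declared just before this closing brace.
};

// Point{1, 2} == Point{1, 2} would then be well-formed with no user-declared
// operator at all.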
The semantics of <= and >= also came up: should a <= b be implemented as a < b || a == b, or as !(a > b)? EWG confirmed its previous consensus that it should be the former, but current standard library facilities such as std::tuple use the latter. EWG felt that the library should change to become consistent with the language, while recognizing that it’s not realistic for this to happen before STL2 (the upcoming non-backwards-compatible overhaul of the STL). There was also some disagreement about the behaviour of wrapper types like std::optional, which wrap exactly one underlying object; some felt strongly that for such types a <= b should be implemented as neither of the above choices, but as a.wrapped_object <= b.wrapped_object.
A std::byte type was proposed for the standard library, representing a byte of data without being treated as a character or integer type (the way char flavours are). It’s defined as enum class byte : unsigned char {};. EWG was asked to weigh in on extending C++’s aliasing rules to allow accessing the object representation of any object via a byte*, the way that’s currently allowed via a char*; this extension was approved. An alternative suggestion of allowing this for any scoped enumeration with a character type as its underlying type (as opposed to singling out std::byte) was rejected.

Template argument deduction for class template constructors was also accepted. It allows deduction guides such as

template <typename Iter>
vector(Iter b, Iter e) -> vector<typename std::iterator_traits<Iter>::value_type>;

to describe how a class template’s arguments are deduced from a constructor call. auto can be used instead of omitting the argument to trigger deduction for that argument only:

std::vector<int> v{myAlloc};        // uses std::allocator<int> (oops)
std::vector<int, auto> w{myAlloc};  // allocator type deduced from 'myAlloc'
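For illustration, the iterator guide above would make the following deduce the element type (hypothetical usage, assuming a compiler implementing the proposal):

#include <list>
#include <vector>

void demo() {
    std::list<int> src{1, 2, 3};
    std::vector v(src.begin(), src.end());   // deduced as std::vector<int>
}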
An attribute namespace shorthand was approved: [[using ns: attr1, attr2, attr3]] is a shorthand for [[ns::attr1, ns::attr2, ns::attr3]]. In an attribute specifier using this shorthand, all attribute names are looked up in the specified attribute namespace, without fallback to the global attribute namespace. If you want attributes from different namespaces on the same entity, you can either not use the shorthand, or use multiple specifiers as in [[using ns1: a, b, c]] [[using ns2: d, e]].

An is_contiguous_layout
type trait, which returns true for types for which every bit in the object representation participates in the value representation. Such a trait can be used to implement a higher-level is_uniquely_represented
trait, which is true if two objects of the type are equal if and only if their object representations are equal; this is in turn used for hashing. (The difference between the two traits is that is_uniquely_represented
can be specialized by the author of a class type, since the author gets to determine whether two objects with the same value representation are considered equal.) is_contiguous_layout
requires compiler support to implement, so it was brought in front of EWG, which approved it.

EWG also looked at template <auto V>, where V is a non-type template parameter whose type is deduced. In Kona it was discovered that this clashes a bit with Concepts, because the natural extension template <ConceptName V> (which, given the relationship between auto and concept names in other contexts, ought to mean a non-type template parameter whose type is deduced but must satisfy ConceptName) already has a meaning in the Concepts TS (namely, a type parameter which must satisfy ConceptName). The proposal author discussed this with the editor of the Concepts TS, and decided this isn’t an issue: template <ConceptName V> can retain its existing meaning in the Concepts TS, while “a non-type template parameter whose type is constrained by a concept” can be expressed a bit more verbosely as template <auto V> requires ConceptName. The feature is targeting C++17, time permitting.

Proposals for which further work is encouraged:
A proposal for for loop exit strategies, which uses catch break and catch default to introduce the blocks that run after an early exit and after a normal exit, respectively. EWG liked the direction, but didn’t like the use of the catch keyword, which is strongly associated with exceptions. More generally, there was a reluctance to reuse existing keywords (thus disqualifying the alternative of case break and case default). Other ideas such as context-sensitive keywords (allowing on break and on default, with on being a context-sensitive keyword), and ugly keywords (like on_break and on_default) were proposed, but neither gained consensus. The proposal author said he would think offline about a better spelling.

A structured binding syntax, auto {x, y, z} = expr;, which declares new names x, y, and z, which bind to the “components” of the object expr evaluates to (a sketch appears at the end of this item). The notion of “components” is defined for three categories of types:

1. Array types, for which the components are the array elements.
2. Tuple-like types, which support get<N>(o), which returns the Nth component of o, for which the components are whatever these get calls return. Notable examples are std::pair and std::tuple.
3. Class types whose non-static data members are all public, for which the components are the data members; for a type that also provides a get<>(), that takes precedence.
Several notable points came up during this discussion:
- The behaviour is as if an unnamed variable were declared as auto __o = expr;. Mandatory copy elision ensures that no actual copying is done. Alternatively, the user can use a reference syntax, such as auto& {x, y, z} = expr;, resulting in the unnamed variable correspondingly having reference type, as in auto& __o = expr;.
- Where possible, x, y, and z are not variables, they are just aliases to components of the unnamed object. The mental model here is to imagine that the compiler re-writes uses of x to be uses of __o.field1 instead. (The alternative would be to make x, y, and z references to the components of the unnamed object, but this wouldn’t support bit-field components, because you can’t have a reference to a bit-field.) I say “where possible”, because this can only be reasonably done for aggregates in categories (1) and (3) above.
- For tuple-like types in category (2), x, y, and z are variables whose type is deduced as if declared with decltype(auto). (They can’t be aliases in this case, because there’s no knowing what get<>() returns; for example, it could return an object constructed on the fly, so there would be nothing persistent to alias.)
- For such types, besides get<>() they need to support something like tuple_size<T>::value. Some people also felt that re-using get<>() as the customization point is a mistake, and a new function meant specifically for use with this feature should be introduced instead.
- An alternative syntax, auto [x, y, z] = expr;, was proposed by some who argued that it looks more natural, because in a declaration context, having comma-separated things inside braces is unexpected.
- Some expressed a desire to allow specifying explicit types, as in auto { X x, Y y, Z z } = expr;, where X, Y, and Z are type names (or, with the Concepts TS, concept names). This desire sort of clashes with the semantics of having x, y, and z be aliases to the components, because we don’t have the freedom to give them types other than the types of the components. (One can imagine a different semantics, where x, y, and z are copies of the components, where with explicit types we can ask for a component to be converted to some other type; the current semantics do not admit this sort of conversion.) However, it might still make sense to support explicit types and require that they match the types of the components exactly.

The proposal authors intend to iterate on the proposal, taking the above feedback into account. It’s unlikely to make C++17.
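The promised sketch, using the alternative bracket spelling (hypothetical example, assuming a compiler implementing the proposal):

#include <utility>

struct Point { int x; int y; };

void demo() {
    Point p{1, 2};
    auto [px, py] = p;                 // category (3): aliases for the public members

    std::pair<int, double> pr{3, 4.5};
    auto [i, d] = pr;                  // category (2): components come from get<>()
}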
A proposal for a short float type. EWG liked the direction. Some details remain to be worked out, such as the promotion rules (the proposal had short float promote to float, but it was pointed out that because of varargs, they need to promote to double instead), and accompanying standard library additions.

A proposal for a type like std::optional
, except that it also allows storing a separate status object, whether or not a value object is present. This is useful for environments where exceptions cannot be used, or where immediately throwing an exception is undesirable for some other reason (for example, if we want an exception to be thrown in a delayed fashion, after a return value has made it to another thread or something and we actually try to access it). This was brought in front of EWG, because it’s an attempt to shape the future of error handling in C++. EWG liked the approach, and provided two specific pieces of guidance: first, the class should be marked [[nodiscard]], to indicate that it should not be ignored when used as a return value; and second, that the copyability/movability of the type should reflect that of the underlying value type. (A sketch of the general shape appears after this list.)

A proposal for a new prefix, 0o
, for octal literals. The existing prefix of just 0
makes it too easy to accidentally write an octal literal when you meant a decimal literal. The proposal author would like to eventually deprecate using just 0
as a prefix, though people pointed out this will be difficult due to the heavy use of octal literals in POSIX APIs (for example, for file permission modes).

A proposal to allow expressions like X::A*
to appear in a header without requiring a definition for X
to also appear in the header (forward-declarations of X
and X::A
will be sufficient). EWG found the use case compelling, because currently a lot of class definitions appear in headers only because interfaces defined in the header use pointers or references to nested classes of the type. Several details still need to be worked out. (For example, what happens if a definition of X
does not appear in any other translation unit (TU)? What happens if a definition of X
appears in another TU, but does not define a nested class A
? What happens if it does define a nested class A
, but it’s private? The answer to some or all of these may have to be “ill-formed, no diagnostic required”, because diagnosing errors of this sort would require significant linker support.)
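The sketch promised above, of the optional-value-plus-status shape (my approximation; the proposal’s actual interface and names may differ):

template <typename Status, typename T>
class status_value {
public:
    explicit status_value(Status s) : status_(s) {}                  // status only
    status_value(Status s, T v) : status_(s), value_(v), has_(true) {}

    bool has_value() const { return has_; }
    const Status& status() const { return status_; }
    const T& value() const { return value_; }   // a real proposal would guard access

private:
    Status status_;
    T value_{};          // simplified: a real implementation would avoid requiring
                         // default-constructibility (e.g. via aligned storage)
    bool has_ = false;
};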
Rejected proposals:

A proposal concerning the case where an explicit constructor has one parameter with a default argument. Such a constructor serves as both a converting constructor and a default constructor, and the explicit applies to both, disqualifying the class from certain uses that require a non-explicit default constructor. EWG felt that this is appropriate, and that if the class author intends for only the conversion constructor to be explicit, they can write a separate default constructor instead of using a default argument.

A proposal to allow the enumerators of an enum class to be used
without qualification in a switch
statement was rejected, mostly because people felt that complicating C++’s name lookup rules in this way was not worth the win.

A proposal under which void foo(int...);
would declare a template with a variadic number of int
parameters. The problem is, right now void foo(int...)
is accepted as an alternative spelling for void foo(int, ...)
which is a C-style varargs function. This proposal would have disallowed the first spelling, leaving that syntax open for eventually denoting a variadic template. Notably, C already disallows the first spelling, which means this proposal would have improved compatibility with C; it also means any breakage would be limited to C++-only code using C-style varargs functions. Nonetheless, EWG members expressed concerns about code breakage, and the proposal didn’t gain consensus. It’s possible this may be revisited if the author comes back with data about the amount of real-world code breakage (or lack thereof) this would cause.

A proposal that null pointer constants (0
or NULL
) should prefer to convert to std::nullptr_t
over void*
failed to achieve consensus. The motivation was to support a proposed library change. Some people didn’t find this compelling enough, and felt we shouldn’t be catering to legacy uses of 0
and NULL
as null pointer constants.

The Array Extensions TS was formally withdrawn. The sentiment was that if runtime-sized stack arrays return, we don’t want a dynarray class to wrap them, we want language syntax for what is really a smarter class type. Withdrawing the TS is just a procedural move that implies that any future proposals will target a different ship vehicle (a new TS, or the IS).

A proposal for a range literal, [a..b)
, which would denote a range of values from a
up to (but not including) b
. The natural use case is integer ranges, although the syntax would have worked with any type that supported operator++
. The [a..b)
notation was inspired by the mathematical notation for half-open ranges, but implementers were strongly against this use of unbalanced brackets, pointing out that even if they didn’t make the grammar ambiguous strictly speaking, they would break tools’ ability to rely on bracket balancing for heuristics and performance. Alternative syntaxes were brought up, such as a..b, but overall EWG felt that inventing a syntax for this is unnecessary, when a simple library interface (like range(a, b)
) would do. The proposal author was encouraged to work with Eric Niebler, editor of the Ranges TS, to collaborate on such a library facility.
A proposal for iterating over multiple ranges in parallel: for (auto val1 : range1; auto val2 : range2) { ... }
. Rejected because structured bindings in combination with a library facility for “zipping” ranges into a single range of tuple-like objects will allow this to be accomplished with syntax like for (auto {val1, val2} : zip(range1, range2))
. Concerns were raised about the ability of such a library facility to achieve the same performance as a first-class language feature could, but optimizer developers in the room said that this can be done.

A proposal concerning declarations of the form auto operator=(const A&) { ... }
, that is, an assignment operator whose return type is deduced using the auto
placeholder. Since the auto
placeholder always deduces a by-value return type, this is usually a bug (since assignment operators generally return by reference). EWG didn’t feel this was a problem worth solving.
EWG spent most of an entire day discussing Modules. The developers of Clang’s modules implementation described their experience deploying this implementation across large parts of Google’s codebase, with impressive results (for example, an average 40% reduction of compile times as a result of modularizing the most heavily used 10% of libraries). They then presented the most salient differences between their Modules design and Microsoft’s, formulated as a set of proposed changes to Microsoft’s design.
EWG discussed these proposed changes in depth. The following changes were reviewed favourably:
A syntax for placing declarations into the global module: module { /* declarations here live in the global module */ } is proposed. The main motivations are to allow tools to associate files with modules more easily (by not having to parse too far from the beginning of the file), and to avoid imposing a requirement that declarations in the global module must appear before everything else.

A distinct syntax, module implementation ModuleName;, to begin a module implementation unit (while module interface units use module ModuleName;). The compiler already needs to know whether a translation unit is a module interface unit or a module implementation unit. It currently gets this information from metadata, such as compiler flags or the filename extension. This proposal places that information into the file contents. (A sketch of both spellings follows.)
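A sketch of the two spellings side by side (module name, file names, and contents are hypothetical; this follows the syntax described above, not a final standard):

// math_interface.cpp -- module interface unit
module math;
export int add(int a, int b);

// math_impl.cpp -- module implementation unit, using the proposed spelling
module implementation math;
int add(int a, int b) { return a + b; }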
The following changes did not yet gain a clear consensus in terms of direction. They will need to be iterated on and discussed further at future meetings:
A proposal to change import ModuleName; to import <ModuleName> (or import “ModuleName”, or import <path/to/ModuleName>, or any other variation permitted by the #include
syntax), with the intention being that it would allow the compiler to find the module based only on its name (together with some per-translation implementation-defined metadata, such as the “#include search path” or an analogue), rather than requiring a separate mapping from module names to their locations. EWG found the motivation to be reasonable, but expressed caution about using strings in this context, and suggested instead using a hierarchical notation that can correspond to a filesystem path rather than being one, such as import x.y.z;
standing for the relative path x/y/z
(much as with Java imports).

A change to the handling of macros. In the design as proposed, modules do not export macros; a component that needs to expose macros places them in a separate “side header”, so that users import the module and #include
the side header. The Clang implementers argued that this was too heavy a burden for module authors and users, particularly in cases where a macro is added to a component’s interface, necessitating a change in how the component is imported. After a lengthy discussion, a basic consensus emerged that there was a lot of demand for both (1) a way to import a module that imported any macros exported by it, and (2) a way to import a module that did not import any macros, and that a Modules proposal would need to support both. The possibility of standardizing these as separate TS’s was brought up, but discarded because it would be confusing for users. No consensus was reached on which of (1) or (2) would be the default (that is, which simply writing import ModuleName; would do), nor on what semantics (2) might have if the module does in fact export macros (the options were to give an error at the import site, and to import the module anyways, silently stripping macros from the set of imported entities).
do), nor on what semantics (2) might have if the module does in fact export macros (the options were to given an error at the import site, and to import the module anyways, silently stripping macros from the set of imported entities).import header "header.h";
), which would cause the module to export all macros defined at the end of the header. The motivation for this is that, when transitioning a large codebase to use modules in some order other than strictly bottom-up, the contents of a legacy header can be brought in both through a direct #include
, and via a module. Direct inclusion relies, as ever, on an include guard to determine whether the file has been previously included. An inclusion via a module must, therefore, cause that include guard to become defined, to avoid a subsequent direct inclusion bringing in a second copy of the declared entities. The Clang implementers originally tried having this mechanism export just the header guard, but that requires heuristically detecting which macro is the header guard; this can be done in most cases, but is practically impossible for certain headers, such as standard C library headers, which have more complicated inclusion semantics than simple include guards. Another alternative that was tried was allowing multiple inclusion, and performing equivalence merging (a deep structural comparison of two entities) to weed out duplicate definitions, but this was found to be slow (undoing much of the compile-time performance benefit that Modules bring) and error-prone (slight differences in inclusion contexts could lead to two instances of a definition being just different enough not to be merged).
EWG has a pretty good track record of getting through most if not all of the proposals on its plate in a given meeting, but at this meeting, due to the amount of time spent discussing Concepts, Coroutines, and Modules, there were 17 papers it did not get a chance to look at. I won’t list them all (see the list of papers on the committee’s website if you’re interested), but I’ll call out three that I was particularly excited about and would have liked to see presented:
A proposal to allow the tuple access primitives get<N>(t), tuple_size<T>::value, and tuple_element<N, T>::type to be used on structures whose data members are all public (roughly; the exact conditions are spelled out in the paper). I was particularly excited about this, because it would have unlocked a simple form of reflection (iteration over members) for such structures. However, the proposal probably would have encountered opposition for the same reason
|