Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows cloning, pulling and pushing from/to mercurial remote repositories, using git.
The Rust team is happy to announce a new version of Rust, 1.43.0. Rust is a
programming language that is empowering everyone to build reliable and
efficient software.
If you have a previous version of Rust installed via rustup, getting Rust
1.43.0 is as easy as:
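rustup update stable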
This release is fairly minor. There are no new major features. We have some
new stabilized APIs, some compiler performance improvements, and a small
macro-related feature. See the detailed release notes to learn about
other changes not covered by this post.
item fragments
In macros, you can use item fragments to interpolate items into the body of traits,
impls, and extern blocks. For example:
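A minimal sketch (my own illustration of the feature, not necessarily the exact snippet from the announcement): a macro that accepts an item fragment and splices it into a trait body.

macro_rules! mac_trait {
    ($i:item) => {
        trait T { $i }
    };
}

mac_trait! {
    fn foo() {}
}

Expanding the macro produces trait T { fn foo() {} }; interpolating an item fragment into a trait body like this is what Rust 1.43 newly allows.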
The type inference around primitives, references, and binary operations was
improved. A code sample makes this easier to understand: this code fails to
compile on Rust 1.42, but compiles in Rust 1.43.
let n: f32 = 0.0 + &0.0;
In Rust 1.42, you would get an error that would say "hey, I don't know how to add
an f64 and an &f64 with a result of f32." The algorithm now correctly decides
that both 0.0 and &0.0 should be f32s instead.
This is easiest to explain by example: let's say we're working on a command
line project, simply named "cli". If we're writing an integration test, we want
to invoke that cli binary and see what it does. When running tests and
benchmarks, Cargo will set an environment variable named CARGO_BIN_EXE_cli,
and I can use it inside my test:
let exe = env!("CARGO_BIN_EXE_cli");
This makes it easier to invoke cli, as we now have a path to it directly.
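As a hedged sketch (the binary name cli and the --help flag are just this post's running example, not anything Cargo mandates), an integration test in tests/ might look like:

use std::process::Command;

#[test]
fn cli_prints_help() {
    // Cargo sets CARGO_BIN_EXE_cli while compiling tests and benchmarks.
    let exe = env!("CARGO_BIN_EXE_cli");
    let output = Command::new(exe)
        .arg("--help")
        .output()
        .expect("failed to run the cli binary");
    assert!(output.status.success());
}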
There is a new primitive
module that re-exports Rust's
primitive types. This can be useful when you're writing a macro and want to make
sure that the types aren't shadowed.
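A small sketch of the shadowing problem this solves (my own illustration):

#[allow(non_camel_case_types, dead_code)]
struct u8; // a local type that shadows the primitive's name

fn main() {
    // Plain `u8` here would refer to the struct above, but the new
    // path always names the built-in integer type:
    let x: std::primitive::u8 = 255;
    println!("{}", x);
}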
Will things like digital contact tracing leave a legacy of better privacy norms, or worse ones?
The conversation about privacy and the pandemic — and about the idea of digital contact tracing in particular — has shifted a great deal in the last few weeks. It’s moved from an understandable ‘we don’t want to live in this dystopian science fiction novel’ gut response (my initial gut reaction) to a vigorous debate about whether privacy-by-design and good data governance make it possible to trace COVID contacts in a way that we can all trust (I’m still trying to sort through all this). A recent tweet from @hackylawyER summed up the state of the conversation nicely:
I’m not happy things are headed this way but if they are, we’d damn well better have safeguards.
Watching all of this, I’ve tried to step back and ask myself: what exactly are we worried about? And, amidst the rush to tech solutions, is this all downside for privacy and data governance, or is there an upside? Could we actually use this moment to set new and better norms for privacy?
What are we worried about?
The gut reaction worries are obvious: tracking everyone (or the bulk of people) with a smartphone to monitor COVID exposure could go wrong in myriad ways if done poorly or if the data falls into the wrong hands. And, on top of it, many have called into question whether this kind of tracking is even effective at stopping the spread of the virus. There is a legitimate worry that we could quickly find ourselves inside a huge mass surveillance experiment that has limited or no return in terms of public health and safety. Questions about efficacy and privacy are step one in considering whether or not to roll out contact tracing. While it’s far from universal, a fair number of governments are digging into these questions in earnest.
There is also a longer term, potentially more serious worry that seems to be getting less consideration: that governments, tech platforms and telcos working together to track citizens at scale becomes normalized in democracies. And, that this kind of surveillance gets used for reasons other than tackling the pandemic. Earlier this week, an open letter on digital contact tracing by scientists and researchers from 26 countries noted that:
… some “solutions” to the crisis may, via mission creep, result in systems which would allow unprecedented surveillance of society at large.
A few weeks ago, I was worried that this was where we were headed: increased government and tech company surveillance as the new norm. The rallying of the privacy and data governance communities has me cautiously hopeful that we could go in the opposite direction. There may be a chance to use this moment to set new and better norms for privacy.
Embracing privacy-by-design
One source of this hope has been the rapid momentum that has grown behind decentralized, Bluetooth-based approaches to contact tracing. This general approach was initially proposed by academic groups like DP3T in Europe and PACT in the US, and was picked up by Apple and Google as something they could roll out across all their smartphones. The idea is to use Bluetooth to collect contact data locally on phones, leaving it there (and private) unless a person tests positive for COVID. In that case, a set of ‘beacons’ informs possible contacts that they may want to self isolate and get tested. Governments don’t get access to any of the contact data, striking a balance between public health and privacy. This comic explains the concept better than I can.
The hopeful piece here is not just the decentralized approach itself — it has pros and cons — but more importantly the quick embrace of privacy-by-design by governments, tech platforms and academics. As a Chaos Computer Club blog post notes:
In principle, the concept of a “Corona App” involves an enormous risk due to the contact and health data that may be collected. At the same time, there is a chance for “privacy-by-design” concepts and technologies that have been developed by the crypto and privacy community over the last decades. With the help of these technologies, it is possible to unfold the epidemiological potential of contact tracing without creating a privacy disaster.
As the post notes, the idea that privacy should be a foundational part of any digital product and service design has been around for decades — but it has been an uphill battle to make this way of thinking mainstream. The current setting may offer a chance for this way of thinking to make a leap forward, and to nudge governments and tech companies towards the idea that privacy-by-design should be the norm.
Good ideas for governing tech and data
While receiving less attention, there has also been a wave of constructive work on how to govern contact tracing efforts. As a leading proponent of decentralized contact tracing said in a recent tweet:
Contact tracing apps, even private ones like #DP3T, need more than technical safeguards. @lilianedwards has been leading on our effort to draft a Coronavirus (Safeguards) Bill for the UK Parliament — which limits what these apps can be used for in practice.
Ensuring digital privacy is not only a matter of technology, but also a matter of rules, policy, oversight and stewardship. Getting privacy right — and making sure we don’t slide into ‘unprecedented surveillance of society at large’ that the recent letter from scientists warns of — will require us to develop smart approaches to governing any technology we put in place to tackle the pandemic. Unfortunately, smart data governance is even less commonplace than privacy-by-design.
The good news is that thoughtful and practical data governance proposals are quickly emerging. For example, the draft Coronavirus (Safeguards) Bill mentioned above would place strict purpose, access and time constraints on any technology that was rolled out to manage the pandemic. It also addresses topics related to inclusion, ensuring that no one is penalized for not having a phone. Others have called for the creation of independent ‘data trusts’ or trust-like mechanisms to ensure the interests of citizens are represented in the design of tracking technology and the handling of data. Like the decentralized technology approaches outlined above, these data governance proposals would allow us to meet both privacy and public health goals if governments are motivated to listen.
We can make good decisions now, or bad ones
It’s heartening to see how engineers, lawyers and activists who have long championed privacy have stepped up in creative and constructive ways to answer the question: if we end up building this stuff, how do we make sure it has the right guardrails?
Much of the thinking and evidence from this work has been summarized in the Exit Through the App Store report that the Ada Lovelace Institute released earlier this week. The report shows that we have the technical and policy tools that we need to make good decisions about technologies like digital contact tracing (e.g. use a decentralized tech approach). It also points out that we could easily rush into this and make bad decisions (e.g. use a centralized approach).
The design decisions we make will have a huge impact on whether we move into an era where privacy-by-design and good data governance are the norm, or end up laying the data gathering foundations for the dystopian science fiction future that many of us imagined when we first heard the term ‘contact tracing’ a few short weeks back.
It’s up to us — and, in particular, our governments — to decide which way we go.
About the Firefox User Research (UR) Team at Mozilla
Firefox User Research is a distributed team within Mozilla dedicated to conducting mixed methods research to define and support work related to Firefox products and services, present and future. Currently, the team consists of 11 people across North America: a director, a research operations specialist, and nine researchers with different backgrounds, training, and experiences.
Some members of the Firefox User Research team in Berlin, Germany in January 2020 (Thank you to Gemma Petrie for sharing the photo.)
In early 2019, the Firefox UR team members at the time decided to develop a team charter, a living document containing information about team members, individual and team learning goals, and team operating principles and guidelines. At that time, the team and Mozilla were undergoing some big changes, and both the act of composing a charter and the charter itself were meant to help us:
Take care of one another
Individually and collectively reflect on what we want to learn and the kind of team we want to be
Define and discuss values and norms
Make goals and aspirations explicit
How we made our first team charter
Back in 2019, the team held a team charter kick-off meeting over video conference to develop a shared understanding of what team charters are and to discuss an approach for creating a charter.
Title slide from the Firefox User Research team’s team charter kick-off (Photo by Randy Fath on Unsplash)
The charter format we used includes the following key areas:
Individual learning goals: What each team member wants to learn from being a member of the team
Team learning goals: What the team collectively wants to accomplish
Team operating principles, expectations, and ways of working: Specific team values with related expectations and example behaviors
Personal profiles: An overview of each member of the team including background, key personality traits, preferred modes of communication, work style and habits, work time availability, and anything else the team member believes teammates should know
Charter governance: How the team expects to develop and maintain the team charter
The original team charter template also included two other sections, which the team has foregone for the time being. There was a section for defining team roles, which the team decided does not apply to us given how we work with different product teams. There was also a section on how we give developmental feedback, which the team decided to delay since we were in the middle of hiring a director, who we assumed may want to define a feedback process.
Team culture considerations
The Firefox User Research team charter is a document private to the people currently on the team since we discuss very specific aspects of our working styles and professional goals in the charter. For this post, the team has given me permission to share a few examples from our charter to illustrate what a UR team charter might include.
Team charter goals are not used to measure performance. One of the questions I got when introducing the team charter structure to the team in 2019 was: “Should the individual and team goals to be listed in the charter be the same as our individual and team OKRs that we use bi-annually as a Mozilla-wide practice?” For the time being, the team decided that charter goals and OKRs should be distinct: OKRs are time-bound and have the qualities of SMART goals, while charter goals are longer-term and perhaps more aspirational. An individual OKR may revolve around socializing user research in new ways and publishing a blog post about a past study, whereas an individual learning goal in the charter might be to gain experience with research methods that are less familiar.
Sharing and celebrating each other’s work. The section of the charter we re-visit, discuss, and revise most often is what is called Team Operating Principles, Expectations, and Ways of Working. This section is meant to make explicit the team’s core values and behavioral norms. One team value or principle with nuances specific to user research is: Share and celebrate each other’s work.
We define this principle as “recognizing when team members hit milestones important to them (e.g., presentations, publications, breakthroughs with project teams, research proposal for new kind of study, etc.).” A few of the concrete examples of this principle in action, included in the charter, are:
Take the time in 1:1s, team meetings, Slack, etc. to understand which milestones are meaningful to teammates.
If you hear something good being said about a team member’s work, tell that team member what you heard.
If you have information or past experience that could support a teammate’s success, share that information with the teammate in a timely manner.
The Firefox User Research Team Charter contains three other principles that follow a similar structure with examples.
What we’ve learned from having a team charter
Nine current members of the Firefox User Research team in our weekly team meeting this week
Earlier this year, I reached out to some of my teammates to get their thoughts on the charter, including its value and limitations. One person who joined the team after the first version of the charter was drafted explained:
“When I was brand new to the team, it was helpful to learn about each team member’s…communication styles, schedules, life challenges, and accessibility issues. It was a digital document, but it felt incredibly intimate…and it helped me feel closer to my distributed teammates.”
Another person, who has been on the team for several years, noted that the personal profiles can help with empathy in the moment, saying that the personal profile,
“gives me a better ‘read’ on why my remote team members might be acting the way they are. It helps me understand them and give them the benefit of the doubt when they respond to something in a way that I would never do, ha!”
We have also felt some limitations to the charter, especially as the team has grown in size. One team member noted that having someone champion the idea of a charter and explain it is important. Who that person should be is an open question. At the time our team started its charter, I happened to be a team charter enthusiast who wanted to pilot the use of a charter for our team. Since then, we have gained a director for our team, and our team structure has changed, making for other potential charter champions. Additionally, 11 team members makes for a lot of content, and the document itself can feel lost in the shuffle of Google documents we work in everyday. We have also felt the challenge of updating the charter. For governance, our charter dictates that we revisit the charter:
Whenever a new person joins the team
Whenever a team member proposes changes to collectively-authored sections
Whenever a team member requests team review of the charter
At the start of UR team work weeks
At minimum two times per year
That frequency may not sound particularly difficult, but staff turnover and big organizational changes, including asking team members who have been shuffled around in “re-orgs” to compose lengthy sections of the charter, can necessitate more flexibility with and even brief distance from charter work.
Another lesson I have learned about using team charters over the years and was reminded of recently — perhaps the most important lesson — is that charters can be valuable for fostering psychological safety, team cohesion, and operational effectiveness. However, no matter how rich your team charter is, no matter how frequently you update it, the charter’s benefits will be limited if there are other elements on the team or in the broader organization that endanger psychological safety and/or are otherwise exclusionary. The team charter can be a powerful tool, but it is only one tool.
Thanks to a suggestion of a researcher on the team, Firefox User Research is in the process of revisiting our charter to discuss whether we need to adjust any of our operating principles and norms given the COVID-19 pandemic. This opportunity to review the charter is a reminder that the charter is meant to be dynamic and ultimately serve the team’s needs.
Give it a try
If our experience with a team charter has piqued your interest, give it a try with your team. You can use the template we use.
As we’re all online more these days, the Firefox browser privacy features have never been more important. For instance, as you’re hopping through different sites searching for pantry recipes or …
This option is -4 as the short option and --ipv4 as the long, added in curl 7.10.8.
IP version
So why would anyone ever need this option?
Remember that when you ask curl to do a transfer with a host, that host name will typically be resolved to a list of IP addresses and that list will contain both IPv4 and IPv6 addresses. When curl then connects to the host, it will iterate over that list and it will attempt to connect to both IPv6 and IPv4 addresses, even at the same time in the style we call happy eyeballs.
The first connect attempt to succeed will be the one curl sticks to and will perform the transfer over.
When IPv6 acts up
On rare occasions, or even in some setups, you may find yourself in a situation where the name resolve gives back problematic IPv6 addresses, or the server’s IPv6 behavior seems erratic, or your local config simply makes IPv6 flaky, or things like that. These are reasons you may want to ask curl to stick to IPv4 only and avoid the headache.
Then --ipv4 is here to help you.
--ipv4
First, this option will make the name resolving only ask for IPv4 addresses so there will be no IPv6 addresses returned to curl to try to connect to.
Then, due to the disabled IPv6, there won’t be any happy eyeballs procedure when connecting since there are now only addresses from a single family in the list.
It is perhaps worth stressing that if the host name you target doesn’t have any IPv4 addresses associated with it, the operation will instead fail when this option is used.
Example
curl --ipv4 https://example.org/
Related options
The reversed request, asking for IPv6 only, is done with the --ipv6 option.
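Mirroring the example above, an IPv6-only transfer would look like:

curl --ipv6 https://example.org/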
There are also options for specifying other protocol versions, for example HTTP with --http1.0, --http1.1, --http2 and --http3, or TLS with --tlsv1, --tlsv1.2, --tlsv1.3 and more.
TenFourFox Feature Parity Release 22 beta 1 is now available (downloads, hashes, release notes). I have abandoned trying to write that AltiVec GCM routine because it really needs 64-bit elements, something that 32-bit AltiVec obviously doesn't have, and the 32-bit version I attempted to throw together ended up being not much faster (if at all) than a scalar approach. Since this seemed like a lot of risk for no gain, I just threw in the towel. Instead, this release has a syntactic update to JavaScript and also improves the performance of H.264 streams using the MP4 Enabler, especially on multiprocessor systems. This now by default uses the lower "fast" quality mode of ffmpeg and because it is not spec-compliant may cause odd behaviour on a few videos. If you notice this, advise which URL, and then set tenfourfox.mp4.high_quality to true (you may need to add this preference). TenFourFox FPR22 final will come out parallel with Firefox 76/68.8 on or around May 5.
Neither a visa nor a rejection yet, exactly two years since I completed my US visa application. Not a lot more to say that I haven’t already said before on this subject.
Of course I’m not surprised that I won’t get an approval in these travel-restricted Covid-19 times – as it would be a fine irony to get a visa and then not be allowed to travel anyway due to a general travel ban – but it also seems like the US immigration authorities haven’t yet used the pandemic as an excuse to (finally) just deny my application.
I was first prevented from traveling to the US on June 26 2017 (on ESTA) but it wasn’t until the following spring that I applied for a visa in an attempt to rectify the situation.
We are happy to present the results of our fourth annual survey of our Rust community. Before we dig into the analysis, we want to give a big "thank you!" to all of the people who took the time to respond. You are vital to Rust continuing to improve year after year!
Let's start by looking at the survey audience.
Survey Audience
The survey was available in 14 different languages and we received 3997 responses.
Here is the language distribution of the responses we received.
English: 69.6%
Chinese: 10.8%
German: 4.3%
French: 3.3%
Japanese: 3.0%
Polish: 1.2%
Portuguese: 1.2%
Spanish: 0.9%
Korean: 0.8%
Italian: 0.6%
Swedish: 0.5%
Vietnamese: 0.2%
In the 2019 survey, 82.8% of responders indicated they used Rust, 7.1% indicated they did not currently use Rust but had used it in the past, and 10.1% indicated that they had never used Rust.
If we compare this to the 2018 survey (where 75% of responders indicated they used Rust, 8% indicated they did not currently use Rust but had used it in the past, and 8% indicated they had never used Rust), more responders were using Rust in 2019.
Looking Back on Rust 2018
In December 2018 we released the Rust 2018 edition - Rust 1.31.0. In the 2019 survey, 92% of Rust users indicated they were using the new edition. 85% said that upgrading to the Rust 2018 edition was easy.
Next, we asked users to rate the improvement of key aspects of the Rust language.
Overall, many aspects of the Rust language were perceived as "somewhat better" in the 2018 edition.
Conferences and Community
We noticed some differences between English language and other language results. Within the non-English language survey subset, the majority of the issues and concerns identified are the same as those within the English language. However, one concern/trend stands out among the non-English speaking subset - a desire for Rust documentation in their native language, or the language they took the survey in. This was particularly notable within the Chinese-language group, though that is likely due to the higher representation.
We received a lot of feedback on how we can improve Rust and make it feel more welcoming to more people. We can't include all of it here, so here is a summary of some of the feedback that stood out to us.
People are in general asking for more learning material about Rust. In terms of expertise it's mainly beginner and intermediate level material being requested. A lot of these requests also asked for video content specifically.
The common blockers to participating that people mention are social anxiety and accessibility. One of the common issues mentioned was that some resources are hard to read for people with dyslexia.
Here are some specific responses to the question "What action could we take to make you feel more welcome?"
I feel too inexperienced and under skilled to participate in the Rust community
Advertise more ways for newcomers to contribute/participate
More organized mentorship, online classes
Do video tutorials on how to contribute to the compiler. I'd love to contribute but I feel intimidated
It's not easy to find resources for newcomers to see how Rust is being used in open source projects, so that they see the action as they're learning the language.
More tutorials/blogs that explain simple rust & coding concepts like the reader is a complete beginner
More intermediate level tutorials. We already have a million "Introductions to Rust".
Smaller groups of helping people - social anxiety is making it hard to talk in the Discord, for example
Don't have synchronous meetings at late EU hours. Have fewer synchronous meetings and/or more consistently publish and aggregate meeting notes for team meetings.
These issues are definitely ones we want to address in 2020 and beyond.
Who is using Rust and what for?
Rust daily usage has trended slightly upward at 27.63% (it was just under 25% last year and 17.5% in 2017). Daily or weekly usage has also continued to trend slightly upward. This year it was 68.52%, last year it was 66.4%, and in 2017 it was 60.8%.
We also asked users how they would rate their Rust expertise - there is a clear peak around "7".
To dig deeper into this, we correlated users' self-rated Rust expertise with how long they had been using Rust.
For some larger context, we examined what titles users working with Rust full time tend to have in their organization (survey respondents could select more than one).
By far the most common title for a Rust user is, unsurprisingly, Programmer/Software Engineer.
To get even more context, we asked Rust survey respondents to identify what industry they work in.
For users who use Rust full time, the most common industry by far is backend web applications.
The majority of Rust projects (43%) are 1,000-10,000 lines of code. Rust projects of medium to large size (those totaling over 10k lines of code) continue to trend higher. They have grown from 8.9% in 2016, to 16% in 2017, to 23% in 2018, to 34% in 2019.
Why not use Rust?
A big part of a welcoming Rust community is reaching out to non-users as well.
When we asked why someone had stopped using Rust, the most common response was "My company doesn't use Rust" - which indicates that company adoption is still the biggest hurdle. After that, learning curve, lack of needed libraries, being slowed down by switching to Rust, and lack of IDE support were the most common reasons a user stopped using Rust.
For users who indicated they had never used Rust before, most indicated either "I haven't learned Rust yet, but I want to" or "My company doesn't use Rust" - again pointing to adoption as the main hurdle.
For more context, we also examined what title non-Rust users feel best matches their role.
Like with Rust users, by far the most common title is Programmer/Software Engineer.
Also like with Rust users, the most common industry by far is backend web applications.
We also asked users what would lead them to use Rust more often. Most indicated they would use Rust more if their company adopted it, if Rust had more libraries that they need, and if IDE support was better. The most common reasons after those pointed to a need to improve the learning curve and interoperability.
As adoption seemed to be the biggest problem preventing some respondents from using Rust, let's dive deeper into it.
Rust Adoption - a Closer Look
First, we asked what we could do to improve adoption of Rust.
Several users gave specific examples:
"Smoothest learning curve as possible, as a small business even 4-6 weeks to become productive is a lot to ask"
"Higher market penetration"
"More stable libraries"
"A full-stack web framework like Rails, Django and Phoenix"
"Better documentation, more examples, recommendation on what crates to use"
"More emphasis on how it is a safer alternative to C or C++ (and really should be the default usually).”
"Improve compile times. Compiling development builds at least as fast as Go would be table stakes for us to consider Rust. (Release builds can be slow."
"Better platform support"
"Security and performance, cost efficient and "green" (low carbon footprint) language"
"Embedded development targeting ARM"
"Better GUI framework, similar to Qt or directly using Qt via bindings."
Most indicated that Rust maturity - such as more libraries and complete learning resources and more mature production capabilities - would make Rust more appealing.
Let's take a closer look at each of these, starting with the need for more mature libraries.
Libraries - a Closer Look
When we asked users what libraries they consider critical to the Rust ecosystem, these were the top ten responses:
serde
rand
tokio
async
clap
regex
log
futures
hyper
lazy_static
We also asked how many of the dependencies users were using were at version 1.0 or above.
0.8% indicated "All"
6.7% indicated "Most"
65.9% indicated "Some"
5.2% indicated "None"
21.4% indicated "I don't know"
IDEs and Tooling - a Closer Look
IDE support for Rust was also cited as a barrier to adoption.
When we asked users what editors they use, Vim and VSCode were the most popular by far, followed by Intellij.
We also asked what IDE setups users used:
43.3% indicated RLS
21.7% indicated Intellij
15.2% indicated Rust-analyzer
12.4% indicated No (or CTAGS)
4.2% indicated Only Racer
As for platforms that users develop on - Linux and Windows continue to dominate.
55% of Rust users develop on Linux
24% develop on Windows
23% develop on macOS
We found that the vast majority of all users use the current stable version of Rust (63%). It should be noted that the survey allowed respondents to select more than one option for what Rust version they use.
30.5% use the nightly version
2.5% use the Beta release
63% use the current stable version
3.1% use a previous stable release
0.6% use a custom fork
0.3% don't know
Surprisingly, the number of users using the Nightly compiler in their workflow is down at 20%. Last year it was at over 56%.
Learning Curve - a Closer Look
Rust is well known for its significant learning curve.
About 37% of Rust users felt productive in Rust in less than a month of use - this is not too different from the percentage last year (40%). Over 70% felt productive in their first year. Unfortunately, like last year, there is still a struggle among users - 21% indicated they did not yet feel productive.
As a point of interest, we took the subset of users who don't feel productive yet and plotted their ratings of their Rust expertise. This indicates that people who don't feel productive had low to intermediate levels of expertise - which are the groups that need the most support from our learning materials, documentation, and more.
Interoperability - a Closer Look
Over the years some users have expressed a desire for Rust to be more interoperable with other languages.
When we asked users what languages they would want to be interoperable with Rust, there was a wide spread of answers, but C dominates, followed (somewhat surprisingly) by R, which in turn is followed very closely by C++. It should be noted that respondents were able to select more than one language in response to this question - these percentages are based on total responses.
When it comes to what platforms users are targeting for their applications, Linux remains the first choice with 36.9%, with Windows second at 16.3%. Following close behind Windows are macOS and Web Assembly at 14% each. We are also seeing more users targeting Android and Apple iOS.
Conclusions
Overall our users indicated that productivity is still an important goal for their work (with or without using Rust). The results show the overriding problem hindering use of Rust is adoption. The learning curve continues to be a challenge - we appear to most need to improve our follow-through for intermediate users - and so are libraries and tooling.
Thank you to all who participated in this survey - these results are immensely informative to us - especially how we can improve both Rust the language and the entire Rust ecosystem. We look forward to continuing working for and with you for 2020 and beyond!
Like so many other events in these mysterious times, the foss-north conference went online-only, and on March 30, 2020 I was honored to be included among the champion speakers at this lovely conference and talked about how to “curl better” there.
The talk is a condensed run-through of how curl works and why, and then a look into how some of the more important HTTP oriented command line options work and how they’re supposed to be used.
As someone pointed out: I don’t do a lot of presentations about the curl tool. Maybe I should do more of these.
curl is widely used but still most users only use a very small subset of options or even just copy their command line from somewhere else. I think more users could learn to curl better. Below is the video of this talk.
Doing a talk to a potentially large audience in front of your laptop in complete silence and not seeing a single audience member is a challenge. No “contact” with the audience and no feel for whether they’re all going to sleep or seem interested, etc. Still I have the feeling that this is the year we all are going to do this many times and hopefully get better at it over time…
A design sprint focused on privacy and browsing on mobile devices
Eight people from the Firefox for iOS team spent four days last week in a Google Ventures-style, remote design sprint. The team was inspired to gather for a sprint by existing Firefox user research about privacy and mobile devices and some business challenges that Firefox for iOS is facing.
In many ways, the sprint was traditional in its format. The two-year goal we set for the sprint was for Firefox to be the iOS browser people choose first for privacy. Related to that goal, we surfaced the following key questions:
Can we convert people into privacy-conscious browser users?
Can we solve the problems facing an unsavvy majority, rather than those of a savvy minority?
Can we connect this feature to revenue, conversion, and retention?
Using Miro as our primary tool (building on JustMad’s Miro template), our sprint team, comprised of engineering, design, content strategy, marketing, product, program management, and user research expertise, devised a solution that was a combination of sketches made by sprint team members. We initially named the final solution Private by Default and then re-named it to PandaBrowser for the purposes of user research. (We chose “panda” for our made-up product because pandas’ faces remind us of the mask used to signify Private Browsing Mode in Firefox today.)
Launch screen for our design sprint concept, the PandaBrowser
The PandaBrowser was meant to be an iOS browser with private browsing mode on by default that included direct value-driven messages and other signals of privacy, including a straightforward way to clear one’s browser activity.
On the last day of the sprint, we ran five unmoderated sessions on the usertesting.com platform to evaluate a basic InVision prototype of the concept and learned the following:
Participants were familiar with the concepts of private browsing mode and incognito and confidently expressed what they thought those modes achieved, whether or not those perceptions were accurate for existing private browsing mode offerings by Mozilla and other companies.
Participants also mentioned privacy-related concepts like cookies, geo-location, and ad tracking, but may or may not have accurate mental models of those items.
Participants found value in their phone browser history, tabs, and/or bookmarks for task continuity and returning to previously visited sites, which for some participants meant that a browser defaulting to a private browsing mode that clears activity often would be inconvenient.
Participants assumed that our Private by Default concept was similar to Firefox Focus in that there was not a “regular” browsing mode to which one could switch. While the prototype did not include any UI elements for switching to a “non-private” mode, the original concept did envision a toggle for switching out of the default private mode.
Participants described specific use cases for a private mode which included watching pornography, searching for niche products like fake Juul pods, location-based searches, and monitoring prices for things like flights.
The team is now in the process of identifying which sprint learnings we want to pursue with further design iterations, experimentation, and user research in 2020.
Also a design sprint against the backdrop of a global pandemic
Mozilla has a long history of having a distributed workforce with close to half of employees working remotely. While this design sprint was originally planned to take place in-person in our Toronto office, the COVID-19 pandemic forced everyone to stay home and participate in the sprint over Zoom. Remote design sprints are not uncommon, but we did the following beyond typical remote working to be sensitive to the trying times in which everyone is living and working.
Rewrote the sprint guidelines to prioritize “Be kind to yourself”
We emphasized at the start of each day that “be kind to yourself” was the most important guideline for the sprint. This included making explicit that people could take breaks as they needed without questions from the team, determine any other accommodations they needed to practice self-care during the sprint, and communicate with the facilitator privately via Slack at any point. We also added the caveat to the “no devices” guideline that we expected sprint team members to focus as much as possible on their primary desktop/laptop device, where we used the Miro board filled with activities that requested active participation from across the team, but trusted that team members would monitor any other devices they needed in order to take care of loved ones.
Sprint Guidelines: 1. Be kind to yourself 2. Full attention 3. No [other] devices 4. Turn off alerts (set “away” status) 5. Everything is time-boxed w/ breaks 6. Help us stay on time 7. Share pertinent feedback as we go
Made an effort to be more generous with breaks
Given that all teammates were sheltering in place across North America, spanning three time zones, we built in extra time for wellness and rest. We made sure we did not work more than two hours without a break and made sure our twice daily breaks (30 minutes and then 90 minutes) were considered sufficient for team members’ needs. The team was most crunched for time on Thursday given the prototype testing work, but the team worked together on each day of the sprint to ensure we got our sprint activities done in order to adjourn for our breaks on time.
Checked-in with teammates after each break
The world is changing so quickly, and major news is transpiring throughout each day. Additionally, team members were experiencing major life events, some very sudden and emotional, during the sprint week not directly related to the pandemic. Each time the team convened, at the start of the day and after each break, we made sure to check on what if anything notable transpired during the break time that might merit discussion or at least team acknowledgement. These check-ins took 5-10 minutes.
One of our sprint team members moved across the United States just before our sprint began. He had to supervise movers who arrived with his furniture while the sprint was taking place.
Collected suggestions for improvement at the end of each sprint day
Gathering team feedback on the sprint at the end of each sprint day is not particularly unusual, but emphasis for daily feedback during this sprint was placed on any accommodations that we might need to add before the following sprint day. One change we made as a result of the daily feedback was to allow more time, even with our expedited sprint schedule, to hear team members’ rationale for their voting during activities.
Some other changes we will consider for next time:
If five consecutive days are available for the sprint, taking five instead of four days would lessen the workload toward the end of the week, especially for major sprint activities such as the prototype-making, user research planning, and group watching of the usertesting sessions. (Our team was limited to four days due to the Easter holiday.)
Perhaps lengthen some activities and shorten other activities. The team noted that more time with the Monday experts and for team discussions about important sprint decisions would be valuable.
Set higher standards for the accessibility of our sprint tools. For example, while we never used color as the only signifier for activities on our Miro board, there were some instances where color was privileged over other signifiers, even with one of our sprint team members having color-blindness. A more accessible sprint would not prioritize a single type of affordance (e.g., dependent on vision).
Long-lasting lessons
A Mozilla researcher who was not on the sprint team mentioned recently that perhaps one upside of the current pandemic is that we may all learn to be more empathetic to the people around us, perhaps more fully aware that we can never know what or the magnitude of what someone is experiencing in the current moment or even due to long-standing circumstances, some even systemic. While current world events inspired some changes we made to the way we work in the case of our recent design sprint, one could argue that these types of accommodations should be the norm. Our experience last week was a reminder that remote design sprints can be a valuable approach for bringing together a diverse group of people to do focused problem-understanding.
One of the sprint team members added: “…while we cannot physically be present with our friends, families, and coworkers, it felt really good to work together towards a common goal. Even if this goal isn’t going to change the state of our world right now, the togetherness and team-building warm fuzzies are ever more important during these times.” The idea of working toward an inclusive common goal is what we try to do at Mozilla on good days and, if recent times are any indication, on the toughest, too.
Thank you to the sprint team for your hard work last week and to the other Mozillians who provided feedback on an early draft of this post.
(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)
I’ve written before about data, but never tackled the business perspective. To a business, what is data? It could be considered an asset, I suppose: a tool, like a printer, to make your business more efficient.
But like that printer and other assets, data has a cost. We can quite easily look up how much it costs to store arbitrary data on AWS (less than 2.3 cents USD per GB per month) but that only provides the cost of the data at rest. It doesn’t consider what it took for the data to get there or how much it costs to be useful once it’s stored.
So let’s imagine that you come across a decision that can only be made with data. You’ve tried your best to do without it, but you really do need to know how many Live Bookmarks there are per Firefox profile… maybe it’s in wide use and we should assign someone to spruce it up. Maybe almost no one uses it and so Live Bookmarks should be removed and instead become a feature provided by extensions.
This should be easy, right? Slap the number into an HTTP payload and send it to a Mozilla-controlled server. Then just count them all up!
As one of the Data Organization’s unofficial mottos puts it: Counting is Harder Than It Looks.
Let’s look at the full lifecycle of the metric from ideation and instrumentation to expiry and deletion. I’ll measure money and time costs, being clear about the assumptions guiding my estimates and linking to sources where available.
For a rule of thumb, time costs are $50 per hour. Developers and Managers and PMs cost more than $100k per year in total compensation in many jurisdictions, and less in many others. Let’s go with this because why not. I considered ignoring labour costs altogether because these people are doing their jobs whether they’re performing their part in this collection or not… but that’s assuming they have the spare capacity and would otherwise be doing nothing. Everyone I talk to is busy, so everyone’s doing this data collection work instead of something else they could be doing: so there is an opportunity cost.
Fixed costs, like the cost of building and maintaining a data collection library, data collection pipeline, bug trackers, code review tooling, dev computers are all ignored. We could amortize that per data collection… but it’d probably work out to $0 anyway.
Also, for the purposes of measuring data we’re counting only the size of the data itself (the count of the number of Live Bookmarks). To be more complete we’d need to amortize the cost of sending the data point (HTTP headers, payload metadata, the data point’s identifier, etc.) and factor in additional complexity (transfer encoding, compression, etc.). This would require a lot of words, and in the present Firefox Telemetry system this amortizes to 0 because the “main” ping has many data points in it and gzip compression is pretty good.
Also, this is a Best Case Estimate. I make many assumptions small in order to make this a lower-bound cost if everything goes according to plan and everyone acts the way they should.
Ideation – Time: 30min, Cost: $25
How long does it take you to figure out how to measure something? You need to know the feature you’re measuring, the capabilities of the data collection library you’re using to do the measuring, and some idea of how you’ll analyse it at the other end. If you’re trying to send something clever like the state of a customizable UI element or do something that requires custom analysis, this will take longer and take more people which will cost more money.
But for our example we know what we’re collecting: numbers of things. The data collection library is old and well understood. The analysis is straightforward. This takes one person a half hour to think through.
Instrumentation – Time: 60min, Cost: $50
Knowing the feature is not the same as knowing the code. You need a subject matter expert (developer who knows the feature and the code as well as the data collection library’s API) to figure out on exactly which line of code we should call exactly what method with exactly which count. If it’s complicated, several people may need to meet in order to figure out what to do here: are the input event timestamps the same format on Windows and Mac? Does time when the computer is asleep count or not?
For our example we have questions: Should we count the number of Live Bookmarks in the database? The number in the Bookmark Menu? The Bookmark Toolbar? What if the user deletes one, should we count before or after the delete?
This is small enough that we can find a single subject matter expert who knows it all. They read some documentation, make some decisions, write some code, and take an hour to do this themselves.
Review – Time: 30min, Cost $25
Both the code and the data collection need review. The simplicity of the data collection and the code make this quick. Mozilla’s code review tooling helps a lot here, too. Though it takes a day or two for the Module Peer and the Data Steward to find time to get to the reviews, it only takes a combination of a half hour for them to okay it to ship.
Storage (user) – Cost: $0
Data takes up space. Its definition takes up some bytes in the Firefox binary that you installed. It takes up bytes in your computer’s memory. It takes up bytes on disk while it waits to be sent and afterwards so you can look at it if you type about:telemetry into your address bar. (Try it yourself!)
The marginal cost to the user of the tens of bytes of memory and disk from our single number of Live Bookmarks is most accurately represented as a zero not only because memory and disk are excitingly cheap these days but also because there was likely some spare capacity in those systems.
Bandwidth (user) – Cost: $0.00 (but not zero)
Data requires network bandwidth to be reported, and network bandwidth costs money. Many consumer plans are flat-rate and so the marginal cost of the extra bytes is not felt at all (we’re using a little of the slack), so we can flatten this to zero.
But hey, let’s do some recreational math for fun! (We all do this in our spare time, right? It’s not just me?)
The data collection is a number, which is about 4 bytes of data. We send it about three times per day and individual profiles are in use by Firefox on average 12 days a month (engagement ratio of 0.4). (If you’re interested, this is due to a bunch of factors including users having multiple profiles at school, work, and home… but that’s another blog post).
4 bytes x 3 per day x 12 days in a month ~= 144 bytes per month
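Assuming a consumer data price on the order of $30 per GB (a made-up but plausible figure, in the spirit of famously-pricey Canadian mobile data):

144 bytes per month / 1,000,000,000 bytes per GB x 30 $/GB ≈ $0.0000043 per month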
Thus a more accurate cost estimate of user bandwidth for this data would be 4 ten-thousandths of a cent (in Canadian dollars). It would take over 200 years of reporting this figure to cost the user a single penny. So let’s call it 0 for our purposes here.
Though… however close the cost is to 0, It isn’t 0. This means that, over time and over enough data points and over our full Firefox population, there is a measurable cost. Though its weight is light when it is but a single data point sent infrequently by each of our users, put together it is still hefty enough that we shouldn’t ignore it.
Bandwidth (Mozilla) – Cost: $0
Internet Service Providers have a nice gig: they can charge the user when the bytes leave their machine and charge Mozilla when the bytes enter their machine. However, cloud data platform providers (Amazon’s AWS, Google’s GCP, Microsoft’s Azure, etc) don’t charge for bandwidth for the data coming into their services.
You do get charged for bandwidth _leaving_ their systems. And for anything you do _on_ their systems. If I were feeling uncharitable I guess I’d call this a vendor lock-in data roach motel.
At any rate, the cost for this step is 0.
Pipeline Processing – Cost: $15.12
Once our Live Bookmarks data reaches the pipeline, there’s a few steps the data needs to go through. It needs to be decompressed, examined for adherence to the data schema (malformed data gets thrown out), and a response written to the client to tell it that we received it all okay. It needs to be processed, examined, and funneled to the correct storage locations while also being made available for realtime analysis if we need it.
For our little 4-byte number that shouldn’t be too bad, right?
Well, now that we’re on Mozilla’s side of the operation we need to consider the scale. Just how many Firefox profiles are sending how many of these numbers at us? About 250M of them each month. (At time of writing this isn’t up-to-date beyond EOY2019. Sorry about that. We’re working on it). With an engagement ratio of about 0.4, data being sent about thrice a day, and each count of Live Bookmarks taking up 4 bytes of space, we’re looking at 12GB of data per month.
At our present levels, ingestion and processing costs about $90 per TB. This comes out to $1.08 of cost for this step, each month. Multiplied by 14 “months”, that’s $15.12.
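Spelled out in the same style as the other line items:

12 GB/month / 1000 GB/TB x 90 $/TB = $1.08 per month
$1.08/month x 14 “months” = $15.12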
About Months
In saying “14 months” for how long the pipeline needs to put up with the collection coming from the entire Firefox population I glossed over quite a lot of detail. The main piece of information is that the default expiry for new data collections in Firefox is five or six Firefox versions (which should come out to about six months).
To calculate 14 months I looked at the total data collection volumes for five versions of Firefox: Firefox 69-73 (inclusive). This avoids Firefox ESR 68 gumming up the works (its support lifetime is much longer than a normal release, and we’re aiming for a best-case cost estimate) and is both far enough in the past that Firefox 69 ought to be winding down around now _and_ is recent enough that we’ll not have thrown out the data yet (more on retention periods later) and it is closer in behaviour to releases we’re doing this year.
Here’s what that looks like:
So I said this was far enough in the past that Firefox 69 ought to be winding down around now? Well, if you look really closely at the bottom-right you might be able to see that we’re still receiving data from users still on that Firefox version. Lots of them.
But this is where we are in history, and I’m not running this query again (it only cost 15 cents, but it took half an hour), so let’s do the math. The total amount of data received from these five releases so far divided by the amount of data I said above that the user population would be sending each month (12GB) comes out to about 13.7 months.
To account for the seriously-annoying number of pings from those five versions that we presumably will continue receiving into the future, I rounded up to 14.
Storage (Mozilla) – Cost: $84
Once the data has been processed it needs to live somewhere. This costs us 2 cents per gigabyte stored, per month we decide to store it. 12GB per month means $0.24, right?
Well, no. We don’t have a way to only store this data for a period of time, so we need to store it for as long as the other stuff we store. For year-over-year forecasting we retain data for two years plus one month: 25 months. (Well, we presently retain data a bit longer than that, but we’re getting there.) So we need to take the 12GB we get each month and store it for 25 months. When we do that for each of the 14 “months” of data we get:
12GB/”month” x 14 “months” x $0.02 per GB per month x 25 months retention = $84
Now if you think this “2 cents per GB” figure is a little high: it is! We should be able to take advantage of lower storage costs for data we don’t write to any more. Unfortunately, we do write to it all the time servicing Deletion Requests (which I’ll get to in a little bit).
Analysis (Mozilla) – Time: 30min, Cost: $25.55
Data stored on some server someplace is of no use. Its value is derived through interrogating it, faceting its aggregations across interesting dimensions, picking it apart and putting it back together.
If this sounds like processing time Mozilla needs to pay for, you are correct!
On-demand analyses in Google’s BigQuery cost $5 per TB of data scanned. Mozilla’s spent some decent time thinking about query patterns to arrange data in a way that minimizes the amount of data we need to look at in any given analysis… but it isn’t perfect. To deliver us a count of the number of Live Bookmarks across our user base we’re going to have to scan more than the 12GB per month.
But this is a Best Case Estimate so let’s figure out how much a perfect query (one that only had to scan the data we wanted to get out of it) would cost:
12GB / 1000GB/TB * 5 $/TB = $0.06
That gives you back a sum of all the Live Bookmarks reported from all the Firefox profiles in a month. The number might be 5, or 5 million, or 5 trillion.
In other words, the number is useless. The real question you want to answer is “How much is this feature used?” which is less about the number of Live Bookmarks reported than it is Live Bookmarks stored per Firefox profile. If the 5 million Live Bookmarks are five thousand reports of 1000 Live Bookmarks all from one fellow named Nick, then we shouldn’t be investing in a feature used by one person, however much that one person uses it.
If the 5 million Live Bookmarks are one hundred thousand profiles reporting various handfuls of times a moderate number of bookmarks, then Live Bookmarks is more likely a broadly-used feature and might just need a little kick to be used even more.
So we need to aggregate the counts per-client and then look at the distribution. We can ask, over all the reports of Live Bookmarks from this one Firefox profile, give us the maximum number reported. Then show us a graph (like this). A perfect query of a month’s data will not only need to look at the 12GB of the month’s Live Bookmarks count, but also the profile identifier (client_id) so we can deduplicate reports. That id is a UUID and is represented as a 36-byte string. This adds another 8x data to scan compared to the 4B Live Bookmarks count we were previously looking at, ballooning our query to 108GB and our cost to $0.54.
But wait! We’re doing two steps: one to crunch these down to the 250M profiles that reported data that month and then a second to count the counts (to make our graph). That second step needs to scan the 250M 4B “maximum counts”, which adds another half a cent.
So our Best Case Estimate for querying the data to get the answer to our question is $0.55 (I rounded up the half cent).
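To make the shape of that two-step analysis concrete, here is a rough sketch in Rust. It is purely illustrative: the real analysis is a BigQuery query over the ping tables, and the field names, sample data, and in-memory approach here are stand-ins, not the actual query.

```rust
use std::collections::HashMap;

// Illustrative only: in practice this is a query over partitioned tables,
// not an in-memory pass, and the identifiers here are made up for the sketch.
fn main() {
    // (client_id, live_bookmarks_count) pairs, one per report.
    let reports = vec![
        ("uuid-aaaa", 3u32),
        ("uuid-aaaa", 5),
        ("uuid-bbbb", 1000),
        ("uuid-cccc", 0),
    ];

    // Step 1: deduplicate reports down to one value per profile (the maximum reported).
    let mut max_per_client: HashMap<&str, u32> = HashMap::new();
    for &(client_id, count) in &reports {
        let entry = max_per_client.entry(client_id).or_insert(0);
        *entry = (*entry).max(count);
    }

    // Step 2: count the counts, i.e. build the distribution we'd graph.
    let mut distribution: HashMap<u32, u32> = HashMap::new();
    for &max_count in max_per_client.values() {
        *distribution.entry(max_count).or_insert(0) += 1;
    }
    println!("{:?}", distribution);

    // Back-of-envelope scan cost, using the figures from the post:
    let step1_gb = 108.0; // a month of counts plus the 36-byte client_id per row
    let step2_gb = 1.0;   // 250M 4-byte "maximum counts"
    let cost = (step1_gb + step2_gb) / 1000.0 * 5.0; // $5 per TB scanned
    println!("~${:.2}", cost); // about $0.55
}
```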
But don’t forget you need an analyst to perform this analysis! Assuming you have a mature suite of data analysis tooling, some rigorous documentation, and a well-entrenched culture of everyone helping everyone, this shouldn’t take longer than a half-hour of a single person’s time. Which is another $25, coming to a grand total of $25.55.
Deletion – Cost: $21
The data’s journey is not complete because any time a user opts their Firefox profile out of data collection we receive an order to delete what data we’ve previously received from that profile. To delete we need to copy out all the not-deleted data into new partitions and drop the old ones. This is a processing cost that is currently using the ad hoc $5/TB rate every time we process a batch of deletions (monthly).
Our Live Bookmarks count is adding 4 bytes of data per row that needs to be copied over. Each of those counts (excepting the ones that are deleted) needs to be copied over 25 times (retention period of 25 months). The amount of deleted data is small (Firefox’s data collection is very specifically designed to only collect what is necessary, so you shouldn’t ever feel as though you need to opt out and trigger deletion) so we’ll ignore its effect on the numbers for the purposes of making this easier to calculate.
12 GB/”month” x 14 “months” x 25 deletions / 1000GB/TB x 5 $/TB = $21
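And a matching sketch of the deletion arithmetic, again with the post's figures hard-coded for illustration (one deletion batch per month of the retention period):

```rust
// Processing cost of deletion batches over the collection's lifetime.
// Illustrative arithmetic only, using the figures quoted above.
fn main() {
    let gb_per_month = 12.0;     // GB of counts per "month"
    let months_of_data = 14.0;   // "months" of data
    let deletion_batches = 25.0; // monthly batches over the 25-month retention period
    let cost_per_tb = 5.0;       // ad hoc $/TB processing rate

    let deletion_cost = gb_per_month * months_of_data * deletion_batches / 1000.0 * cost_per_tb;
    println!("Deletion: ${:.2}", deletion_cost); // $21.00
}
```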
The total lifetime cost of all the deletion batches we process for the Live Bookmarks counts we record is $21. We’re hoping to knock this down a few pegs in cost, but it’ll probably remain in the “some dollars” order of magnitude.
The bigger share of this cost is actually in Storage, above. If we didn’t have to delete our data then, after 90 days, storage costs drop by half per month. This means that, if you want to assign the dollars a little more like blame, Storage costs are “only” $52.08 (full price for 3 months, half for 22) and Deletion costs are $52.92.
Grand Total: $245.67
In the best case, a collection of a single number from the wide Firefox user base will cost Mozilla almost $246 over the collection’s lifetime, split about 50% between labour and data computing platform costs.
So that’s it? Call it a wrap? Well… no. There are some cautionary tales to be learned here.
Lessons
0) Lean Data Practices save money. Our Data Collection Review Request form ensures that we aren’t adding these costs to Mozilla and our users without justifying that the collection is necessary. These practices were put into place to protect our users’ privacy, but they do an equally good job of reducing costs.
1) The simplest permanent data collection costs $228 its first year and $103 every year afterwards even if you never look at it again. It costs $25 (30min) to expire a collection, which pays for itself in a maximum of 2.9 months (the payback period is much shorter if the data collection is bigger than 4B (like a Histogram) because the yearly costs are higher). The best time to have expired that collection was ages ago: the second-best time is now.
2) Spending extra time thinking about a data collection saves you time and money. Even if you uplift a quick expiry patch for a mis-measured collection, the nature of Firefox releases is such that you would still end up paying nearly all of the same $245.67 for a useless collection as you would for a correct one. Spend the time ahead of time to save the expense. Especially for permanent collections.
3) Even small improvements in documentation, process, and tooling will result in large savings. Half of this cost is labour, and lesson #2 is recommending you spend more time on it. Good documentation enables good decisions to be made confidently. Process protects you from collecting the wrong thing. Tooling catches mistakes before they make their way out into the wild. Even small things like consistent naming and language will save time and protect you from mistakes. These are your force multipliers.
4) To reduce costs, efficient data representations matter, and quickly-expiring data collections matter more.
5) Retention periods should be set as short as possible. You shouldn’t have to store Live Bookmarks counts from 2+ years ago.
Where Does Glean Fit In
Glean‘s focus on high-level metric types, end-to-end-testable data collections, and consistent naming makes mistakes in instrumentation easier to find during development. Rather than waiting for instrumentation code to reach release before realizing it isn’t correct, Glean is designed to help you catch those errors earlier.
Also, Glean’s use of per-application identifiers and its emphasis on custom pings enable data segregation, which permits different retention periods per application or per feature (e.g. the “metrics” ping might not need to be retained for 25 months even if the “baseline” ping does, and Firefox Desktop’s retention periods could be configured to be a different length than Firefox Lockwise‘s) and reduces the amount of data scanned per analysis. And a consistent ping format, plus the continued involvement of Data Science through design and development, reduces analyst labour costs.
Basically the only thing we didn’t address was efficient data transfer encodings, and since Glean controls its ping format as an internal detail (unlike Telemetry) we could decide to address that later on without troubling Product Developers or Data Science.
There’s no doubt more we could do (and if you come up with something, do let us know!), but already I’m confident Glean will be worth its weight in Canadian Dollars.
:chutten
(( Special thanks to :jason and :mreid for helping me nail down costs for the pipeline pieces and for the broader audience of Data Engineers, Data Scientists, Telemetry Engineers, and other folks who reviewed the draft. ))
Both browsers have paused or conditioned their efforts so as not to take the final steps during the Covid-19 outbreak, but they will continue, and the outcome is a given: FTP support in browsers is going away. Soon.
curl
curl supported both uploads and downloads with FTP already in its first release in March 1998, many years before either of the browsers mentioned above even existed!
In the curl project, we work super hard and tirelessly to maintain backwards compatibility and not break existing scripts and behaviors.
For these reasons, curl will not drop FTP support. If you have legacy systems running FTP, curl will continue to have your back and perform as snappily and as reliably as ever.
FTP the protocol
FTP is a protocol that is quirky to use over the modern Internet mostly due to its use of two separate TCP connections. It is unencrypted in its default version and the secured version, FTPS, was never supported by browsers. Not to mention that the encrypted version has its own slew of issues when used through NATs etc.
To put it short: FTP has its issues and quirks.
FTP use in general is decreasing, and that is also why the browsers feel they can make this move: it will only negatively affect a minuscule portion of their users.
Legacy
FTP is, however, still used in places. In the 2019 curl user survey, more than 29% of users said they had used curl for FTP transfers within the last two years. There’s clearly a long tail of legacy FTP systems out there. Maybe not so much on the public Internet anymore, but in use nevertheless.
Alternative protocols?
SFTP could have become a viable replacement for FTP in these cases, but in practice we’ve moved into a world where HTTPS replaces everything where browsers are used.
I recently undertook a project to improve the stack fixing tools used for Firefox. This has resulted in some large performance wins (e.g. 10x-100x) and a significant improvement in code quality. The story involves Rust, Python, executable and debug info formats, Taskcluster, and many unexpected complications.
What is a stack fixer?
Within the Firefox code base, a stack fixer is a program that post-processes (“fixes”) the stack frames produced by MozFormatCodeAddress(), which often lack one or more of: function name, file name, or line number. It reads debug info from binaries (libraries and executables) to do so. It reads from standard input and writes to standard output. Lines matching the special stack frame format are modified appropriately. For example, a line like this in the input that names an executable or library:
#01: ???[tests/example +0x43a0]
is changed to a line in the output that names a function, source file, and line number:
#01: main (/home/njn/moz/fix-stacks/tests/example.c:24)
Lines that do not match the special stack frame format are passed through unchanged.
This process is sometimes called “symbolication”, though I will use “stack fixing” in this post because that’s the term used within the Firefox code base.
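To make the input/output behaviour concrete, here is a rough sketch of the outer loop of such a tool. This is not fix-stacks’ actual code: the regular expression and the lookup stub are illustrative stand-ins for the real frame format handling and the real debug-info lookup.

```rust
use regex::Regex;
use std::io::{self, BufRead, Write};

// Stand-in for the real work: map (binary, offset) to (function, source file, line)
// by consulting debug info. Always returns None in this sketch.
fn lookup(binary: &str, offset: u64) -> Option<(String, String, u32)> {
    let _ = (binary, offset);
    None
}

fn main() -> io::Result<()> {
    // Matches lines like: "#01: ???[tests/example +0x43a0]" (illustrative pattern).
    let frame_re = Regex::new(r"^(\s*#\d+: )(.+)\[(.+) \+0x([0-9a-fA-F]+)\]$").unwrap();

    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut out = stdout.lock();
    for line in stdin.lock().lines() {
        let line = line?;
        if let Some(caps) = frame_re.captures(&line) {
            let prefix = &caps[1];
            let binary = &caps[3];
            let offset = u64::from_str_radix(&caps[4], 16).unwrap_or(0);
            if let Some((func, src, lineno)) = lookup(binary, offset) {
                // Rewrite the frame with function name, source file, and line number.
                writeln!(out, "{}{} ({}:{})", prefix, func, src, lineno)?;
                continue;
            }
        }
        // Lines that don't match (or can't be fixed) pass through unchanged.
        writeln!(out, "{}", line)?;
    }
    Ok(())
}
```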
Stack fixing is used in two main ways for Firefox.
When tests are run on debug builds, a stack trace is produced if a crash or assertion failure happens.
The heap profiling tool DMD records many stack traces at heap allocation points. The stack frames from these stack traces are written to an output file.
A developer needs high-quality stack fixing for the stack traces to be useful in either case.
That doesn’t sound that complicated
The idea is simple, but the reality isn’t.
The debug info format is different on each of the major platforms: Windows (PE/PDB), Mac (Mach-O), and Linux (ELF/DWARF).
We also support Breakpad symbols, a cross-platform debug info format that we use on automation. (Using it on local builds is something of a pain.)
Each debug info format is complicated.
Firefox is built from a number of libraries, but most of its code is in a single library called libxul, whose size ranges from 100 MiB to 2 GiB, depending on the platform and the particular kind of build. This stresses stack fixers.
Before I started this work, we had three different Python scripts for stack fixing.
fix_linux_stack.py: This script does native stack fixing on Linux. It farms out most of the work to addr2line, readelf, and objdump.
fix_macosx_stack.py: This script does native stack fixing on Mac. It farms out most of the work to atos, otool, and c++filt.
fix_stack_using_bpsyms.py: This script does stack fixing using Breakpad symbols. It does the work itself.
Note that there is no fix_windows_stack.py script. We did not have a native stack-fixing option for Windows.
This was an inelegant mishmash. More importantly, the speed of these scripts was poor and highly variable. Stack fixing could take anywhere from tens of seconds to tens of minutes, depending on the platform, build configuration, and number of stack frames that needed fixing. For example, on my fast 28-core Linux box I would often have to wait 20 minutes or more to post-process the files from a DMD run.
One tool to rule them all
It would be nice to have a single program that could handle all the necessary formats. It would also be nice if it was much faster than the existing scripts.
Fortunately, the Symbolic Rust crate written by Sentry provided the perfect foundation for such a tool. It provides the multi-platform debug info functionality needed for stack fixing, and also has high performance. In November last year I started a project to implement a new stack fixer in Rust, called fix-stacks.
Implementing the tool
First I got it working on Linux. I find Linux is often the easiest platform to get new code working on, at least partly because it’s the platform I’m most familiar with. In this case it was also helped by the fact that on Linux debug info is most commonly stored within the binary (library or executable) that it describes, which avoids the need to find a separate debug info file. The code was straightforward. The Symbolic crate did the hard part of reading the debug info, and my code just had to use the APIs provided to iterate over the parsed data and build up some data structures that could then be searched.
Then I got it working on Windows. I find Windows is often the hardest platform to get new code working on, but that wasn’t the case here. The only complication was that Windows debug info is stored in a PDB file that is separate from the binary, but Symbolic has a function for getting the name of that file from the binary, so it wasn’t hard to add code to look in that separate file.
Then I got it working on Mac. This was by far the hardest platform, for two reasons. First, the code had to handle fat binaries, which contain code for multiple architectures. Fortunately, Symbolic has direct support for fat binaries so that wasn’t too bad.
Second, the normal approach on Mac is to read debug info from the files produced by dsymutil, in which the debug info is neatly packaged. Unfortunately, dsymutil is very slow and we try to avoid running it in the Firefox build system if possible. So I took an alternative approach: read the binary’s symbol table and then read debug info from the object files and archive files it mentions. I knew that atos used this approach, but unfortunately its source code isn’t available, so I couldn’t see exactly what it did. If I couldn’t get the approach working myself the whole project was at risk; a one-tool-to-rule-them-all strategy falls short if it doesn’t work on one platform.
I spent quite some time reading about the Mach-O file format and using the MachOView utility to inspect Mach-O binaries. Symbolic doesn’t provide an API for reading symbol tables, so I had to use the lower-level goblin crate for that part. (Symbolic uses goblin itself, which means that fix-stacks is using goblin both directly and indirectly.) First I got it working on some very small test files, then on some smaller libraries within Firefox, and finally (to great relief!) on libxul. At each step I had to deal with new complications in the file format that I hadn’t known about in advance. I also had to modify Symbolic itself to handle some edge cases in .o files.
After that, I got fix-stacks working on Breakpad symbols. This was more straightforward; the only tricky part was navigating the directory structure that Firefox uses for storing the Breakpad symbols files. (I found out the hard way that the directory structure is different on Windows.)
One final complication is that DMD’s output, which gets run through the stack fixer, is in JSON format. So fix-stacks has a JSON mode (enabled with --json) that does the appropriate things with JSON escape characters on both input and output. This took three attempts to get completely right.
The end result is a single program that can fix stacks on all four of the formats we need. The stack traces produced by fix-stacks are sometimes different to those produced by the old stack fixing scripts. In my experience these differences are minor and you won’t notice them if you aren’t looking for them.
Code size
The source code for the first version of fix-stacks, which only supported Linux, was 275 lines (excluding tests). The current version, with support for Windows, Mac, Breakpad symbols, and JSON handling, is 891 lines (excluding tests).
In comparison, the Symbolic crate is about 20,000 lines of Rust code in total (including tests), and the three sub-crates that fix-stacks uses (debuginfo, demangle, and common) are 11,400 lines of Rust code. goblin is another 18,000 lines of code. (That’s what I call “leveraging the ecosystem”!)
Beyond Symbolic and goblin, the only other external crates that fix-stacks uses are fxhash, regex, and serde_json.
Testing
Testing is important for a tool like this. It’s hard to write test inputs manually in formats like ELF/DWARF, PE/PDB, and Mach-O, so I used clang to generate inputs from some simple C programs. Both the C programs and the binary files generated from them are in the repository.
Some of the generated inputs needed additional changes after they were generated by clang. This is explained by the testing README file:
The stack frames produced by `MozFormatCodeAddress()` contain absolute paths
and refer to build files, which means that `fix-stacks` can only be sensibly
run on the same machine that produced the stack frames.
However, the test inputs must work on any machine, not just the machine that
produced those inputs. Furthermore, it is convenient when developing if all the
tests works on all platforms, e.g. the tests involving ELF/DWARF files should
work on Windows, and the tests involving PE/PDB files should work on Linux.
To allow this requires the following.
- All paths in inputs must be relative, rather than absolute.
- All paths must use forward slashes rather than backslashes as directory
separators. (This is because Windows allows both forward slashes and
backslashes, but Linux and Mac only allow forward slashes.) This includes the
paths in text inputs, and also some paths within executables (such as a PE
file's reference to a PDB file).
To satisfy these constraints required some hex-editing of the generated input files. Quoting the README again:
`example-windows.exe` and `example-windows.pdb` were produced on a Windows 10
laptop by clang 9.0 with this command within `tests/`:
```
clang -g example.c -o example-windows.exe
```
`example-windows.exe` was then hex-edited to change the PDB reference from the
absolute path `c:\Users\njn\moz\fix-stacks\tests\example-windows.pdb` to the
relative path `tests/////////////////////////////example-windows.pdb`. (The use
of many redundant forward slashes is a hack to keep the path the same length,
which avoids the need for more complex changes to that file.)
A hack, to be sure, but an effective one.
The steps required to produce the Mac test inputs were even more complicated because they involve fat binaries. I was careful to make that README file clearly describe the steps I took to generate all the test inputs. The effort has paid off multiple times when modifying the tests.
Integrating the tool
Once I had fix-stacks working well, I thought that most of the work was done and integrating it into the Firefox build and test system would be straightforward. I was mistaken! The integration ended up being a similar amount of work.
First, I added three new jobs to Mozilla’s Taskcluster instance to build fix-stacks and make it available for downloading on Windows, Mac, and Linux; this is called a “toolchain”. This required making changes to various Taskcluster configuration files, and writing a shell script containing the build instructions. All of this was new to me, and it isn’t documented, so I had to cargo-cult from similar existing toolchains while asking lots of questions of the relevant experts. You can’t test jobs like these on your own machine so it took me dozens of “try” pushes to Mozilla’s test machines to get it working, with each push taking roughly 10 minutes to complete.
Then I added a wrapper script (fix_stacks.py) and changed the native stack fixing path in DMD to use it instead of fix_linux_stack.py or fix_macosx_stack.py. This took some care, with numerous try pushes to manually check that the stacks produced by fix_stacks.py were as good as or better than the ones produced by the old scripts. To do this manual checking I first had to deliberately break the DMD test, because the stacks produced are not printed in the test log when the test passes. I also had to update mach bootstrap so it would install a pre-built fix-stacks executable in the user’s .mozbuild directory, which was another unfamiliar part of the code for me. Plus I fixed a problem with the fix-stacks toolchain for Mac: the fix-stacks executable was being cross-compiled on a Linux machine, but some errors meant it was not actually cross-compiling, but simply building a Linux executable. Plus I fixed a problem with the fix-stacks toolchain for Windows: it was building a 64-bit executable, but that wouldn’t work on our 32-bit test jobs; cross-compiling a 32-bit Windows executable on Linux turned out to be the easiest way to fix it. Again, these toolchain fixes took numerous trial-and-error try pushes to get things working. Once it was all working, native stack fixing on Windows was available for DMD for the first time.
Then I changed the native stack fixing path in tests to use fix_stacks.py. This required some minor changes to fix_stacks.py‘s output, to make it more closely match that of the old scripts, to satisfy some tests. I also had to modify the Taskcluster configuration to install the fix-stacks executable in a few extra places; again this required some trial-and-error with try pushes. (Some of those modifications I added after my initial landing attempt was backed out due to causing failures in a tier 2 job that doesn’t run by default on try, *cough*.) At this point, native stack fixing on Windows was available for test output for the first time.
Then I re-enabled stack-fixing for local test runs on Mac. It had been disabled in December 2019 because fixing a single stack typically took at least 15 minutes. With fix_stacks.py it takes about 30 seconds, and it also now prints out a “this may take a while” message to prepare the user for their 30 second wait.
Along the way, I noticed that one use point of the old stack fixing scripts, in automation.py.in, was dead code. Geoff Brown kindly removed this dead code.
And then Henrik Skupin noticed that the fix-stacks executable wasn’t installed when you ran mach bootstrap for artifact builds, so I fixed that.
And then I was told that I had broken the AWSY-DMD test jobs on Windows. This wasn’t noticed for weeks because those jobs don’t run by default, and to run them on try you must opt into the “full” job list, which is unusual. The problem was some gnarly file locking caused by the way file descriptors are inherited when a child process is spawned on Windows in Python 2; working this out took some time. (It wouldn’t be a problem on Python 3, but unfortunately this code is Python 2 and that cannot be easily changed.) I thought I had a fix, but it caused other problems, and so I ended up disabling stack fixing on Windows for this job, which was a shame, but put us back where we started, with no stack fixing on Windows for that particular job.
And then I changed the Breakpad symbols stack fixing path in tests to use fix_stacks.py, which seemed simple. But it turns out that tests on Android partly run using code from the current Firefox repository, and partly using code from the “host utils”, which is a snapshot of the Firefox repository from… the last time someone updated the snapshot. (This has something to do with part of the tests actually taking place on Linux machines; I don’t understand the details and have probably mis-described the setup.) The host utils in use at the time was several months old and lacked the fix_stacks.py script. So Andrew Erickson kindly updated the host utils for me. And then I fixed a few more Taskcluster configuration issues, and then the “simple” fix could land. And then I fixed another configuration issue that showed up later, in a follow-up bug.
And then I found a better solution to the Windows + Python 2 file descriptor issue, allowing me to re-enable stack fixing for the Windows AWSY-DMD job. (With another host utils update, to keep the Android tests working.)
And then I updated all the online documentation I could find that referred to the old scripts, all of it on MDN.
And then I closed the meta-bug that had been tracking all of this work. Hooray!
And then I was told of another obscure test output issue relating to web platform tests, which I have not yet landed a fix for. One lesson here is that changing code that potentially affects the output of every test suite is a fraught endeavour, with the possibility of a long tail of problems showing up intermittently.
Performance
I did some speed and peak memory measurements on the two common use cases: fixing many stack frames in a DMD file, and fixing a single stack trace from an assertion failure in a test. The machines I used are: a fast 28-core Linux desktop machine, a 2019 16-inch 8-core MacBook Pro, and an old Lenovo ThinkPad Windows laptop. The fix-stacks executable is compiled with LTO, because I found it gives speed-ups of up to 30%.
First, the following measurements are for fixing a DMD output file produced by an optimized Firefox build, old vs. new.
Linux native: 4m44s / 4.8 GB vs. 21s / 2.4 GB
Mac native: 15m47s / 1.0 GB vs. 31s / 2.4 GB
Windows native: N/A vs. 29s / 2.6 GB
Linux Breakpad symbols: 25s / 2.1 GB vs. 13s / 0.6 GB
(Each platform had a different input file, with some variations in the sizes, so cross-platform comparisons aren’t meaningful.)
On Linux we see a 13x speed-up, and I have seen up to 100x improvements on larger inputs. This is because the old script started quickly, but then each additional stack frame fixed was relatively slow. In comparison, the new script has a slightly higher overhead at start-up but then each additional stack frame fixed is very fast. Memory usage is halved, but still high, because libxul is so large.
On Mac the new script is 30x faster than the old script, but memory usage is more than doubled, interestingly. atos must have a particularly compact representation of the data.
On Windows we couldn’t natively fix stacks before.
For Breakpad symbols we see a 2x speed-up and peak memory usage is less than one-third.
Second, the following measurements are for fixing a single stack trace produced by a debug Firefox build, old vs. new.
Linux native: 9s / 1.5 GB vs. 13s / 2.5 GB
Mac native: 15m01s / 1.1 GB vs. 27s / 2.6 GB
Win native: N/A vs. 30s / 3.2 GB
Linux Breakpad symbols: 27s / 3.5 GB vs. 13s / 1.1 GB
On Linux, both speed and peak memory usage are somewhat worse. Perhaps addr2line is optimized for doing a small number of lookups.
On Mac the new script is again drastically faster, 33x this time, but memory usage is again more than doubled.
On Windows, again, we couldn’t natively fix stacks before.
For Breakpad symbols we again see a 2x speed-up and peak memory usage of less than one-third.
You might have noticed that the memory usage for the single stack trace was generally higher than for the DMD output. I think this is because the former is an optimized build, while the latter is a debug build.
In summary:
The speed of native stack fixing is massively improved in many cases, with 10x-100x improvements typical, and slightly slower in only one case. This represents some drastic time savings for Firefox developers.
The peak memory usage of native stack fixing is sometimes lower, sometimes higher, and still quite high in general. But the amount of memory needed is still much less than that required to compile Firefox, so it shouldn’t be a problem for Firefox developers.
Native stack fixing is now possible on Windows, which makes things easier for Firefox developers on Windows.
For Breakpad symbols stack fixing is 2x faster and takes 3x less memory. This represents some significant savings in machine time on automation, and will also reduce the chance of failures caused by running out of memory, which can be a problem in practice.
My experience with Rust
Much of my work using Rust has been on the Rust compiler itself, but that mostly involves making small edits to existing code. fix-stacks is the third production-quality Rust project I have written from scratch, the others being Firefox’s new prefs parser (just under 1000 lines of code) and counts (just under 100 lines of code).
My experience in all cases has been excellent.
I have high confidence in the code’s correctness, and that I’m not missing edge cases that could occur in either C++ (due to lack of safety checks) or Python (due to dynamic typing).
The deployed code has been reliable.
Rust is a very pleasant language to write code in: expressive, powerful, and many things just feel “right”.
I have been writing C++ a lot longer than Rust but I feel more competent and effective in Rust, due to its safety and expressiveness.
Performance is excellent.
As mentioned above, the entire fix-stacks project wouldn’t have happened without the third-party Symbolic crate.
Rust gives me a feeling of “no compromises” that other languages don’t.
Conclusion
Stack fixing is much better now, and it took more work than I expected!
Many thanks to Mike Hommey, Eric Rahm, and Gabriele Svelto for answering lots of questions and reviewing many patches along the way.
Starting in version 75, Firefox can be configured to use client certificates provided by the operating system on Windows and macOS.
Background
When Firefox negotiates a secure connection with a website, the web server sends a certificate to the browser for verification. In some cases, such as corporate authentication systems, the server requests that the browser send a certificate back to it as well. This client certificate, combined with a signature from the private key corresponding to that certificate, allows the user to authenticate to the website.
These client certificates and private keys are often stored in hardware tokens or in storage provided by the operating system.
Using Firefox to access a client certificate stored on a hardware token typically involves loading a shared library written by either the vendor of the token or another third party into Firefox’s process. These third party libraries can cause stability issues with Firefox and are concerning from a security perspective. For instance, a vulnerability in one of these libraries can potentially put Firefox users at risk.
Alternatively, Firefox can use client certificates that have exportable keys if they are manually saved to a file and then imported into a Firefox profile. Though this storage mechanism can be protected by a password, this option increases the potential for a private key to be compromised. Additionally, this method does not work at all for unexportable keys.
A New Approach
To address these issues, we have developed a library that allows Firefox to interface with certificate storage provided by the operating system. Rather than loading third-party libraries to communicate with hardware tokens, Firefox can delegate this task to the operating system. Also, instead of forcing the user to export client certificates and re-import them into their Firefox profile, Firefox can look for these certificates directly. In addition to protecting private keys, this new mechanism allows Firefox to make use of client certificates with unexportable keys.
Because this library is entirely new, we took the opportunity to select an implementation language that would allow us to access the low-level operating system APIs we needed while enforcing strong safety properties. Rust was the obvious choice to fill those needs.
Availability
This library is shipping as part of Firefox Desktop on Windows and macOS, starting with version 75. To enable it, set the about:config preference “security.osclientcerts.autoload” to true.
We expect this feature to be of great benefit to our enterprise users who have previously gone to great lengths to configure Firefox to work in their environment.
Back in February, we announced support for the first extension for Firefox Preview, the new and rebuilt mobile browser for Android that is set to replace Firefox for Android later this year.
We’ve since expanded support for more add-ons from the Recommended Extensions program that we’d like to introduce to you. These add-ons will be available in Firefox Preview within the next 2 weeks.
With Dark Reader, websites on mobile will be easy to read when the lights are dim. The extension automatically inverts bright colors on web pages to offer an eye-pleasing dark mode. There are a number of configuration options allowing you to customize your experience.
When you are on the go, you don’t want people eavesdropping on your browsing behavior. HTTPS Everywhere automatically enables website encryption for pages that default to unencrypted communications. This is especially helpful if you are surfing via a shared wifi connection.
If you are worried about potentially malicious web content, NoScript protects against a number of web security exploits by disabling potentially malicious scripts from running on websites. You can fine-tune the configuration of NoScript and permit scripts to run only on sites you trust.
Concerned about advertisers and other third-party trackers following you around the web? Privacy Badger nicely complements Firefox’s built-in tracking protection. The extension automatically learns when websites start tracking you and will put an end to the privacy invasion. It also includes additional privacy protections like blocking link tracking.
If you’ve said “now where did I see that picture before” once too often, then Search by Image is the right extension for you. With the help of this extension you can select images and feed them into reverse image searches from more than 20 search engines.
We’d like to thank the developers of these add-ons for supporting Firefox Preview. The developers have made some great adjustments to optimize their extensions for mobile and have been a pleasure to talk to.
While we’re pleased to offer these six highly recommended add-ons as a starting point, it’s clear that add-on developers have more great ideas for extensions that can enhance the mobile browsing experience. We intend to enable more add-ons from the Recommended Extensions program within the next few months and will be reaching out to developers soon.
One of the questions that the Hubs team is often asked is about the benefits of shared virtual environments compared to traditional video conferencing. While Hubs was built to support virtual reality devices, and there are a number of benefits that a VR headset can provide for meeting with people online, we’ve been interested in understanding the different ways that people connect in Hubs even when they’re on a desktop or mobile device. As we think about the future of mixed reality, it’s important to recognize that the device form factors that people will use will vary from handheld and standalone devices as well as headsets. In this post, we’ll share a few thoughts about how meeting in shared 3D environments (even without a virtual reality device) can provide an alternative to video conferencing, and when it does - or doesn’t - work effectively.
Creating a shared context for conversation
We, as humans, are spatial creatures. As members of different societies throughout history, we have learned to remember and react to the different physical locations we find ourselves in, and to use cues from our surroundings to provide context and guidance about what is expected of us. Architectural decisions are made to enforce or guide specific expectations when we enter a space, and provide cues around whether the expectation is for us to be serious, playful, formal or informal, and anywhere in between - and that applies to virtual spaces, too.
When we meet in groups digitally, we always bring our own context to the virtual meeting spaces. With video conferencing, we are each grounded, spatially, to our own physical location. Video calls lack a sense of shared place, where we are bringing ourselves to a single, shared environment. In video calls, we get a small sense of the other environments that others are part of, but we do not place ourselves cognitively into those spaces, for the most part. Having a shared 3D place that we can connect in allows us to be grounded in a similar place, and cognitively places each meeting participant into the same virtual location as everyone else. This effect can be felt even when the 3D place is experienced through a 2D window - you get the benefits even without a VR headset! This is especially beneficial when the meeting itself involves discussing a form of spatial content. If you need to collectively view and discuss a model of a building, for example, being able to gather around and point at, annotate, and collectively experience that model can result in a more productive conversation.
3D environments can be especially beneficial for working with models
In addition to the environmental cues, having a shared context for conversation in a virtual space means that people may be able to better understand hierarchies of conversations. It also allows for more natural groupings, since people can easily break off from a larger group, organize in smaller groups to have separate conversations (without leaving the shared place), and convene.
By being able to have a shared awareness of which participant is speaking, or indicate via simulated gaze who should react to something, some users may find avatar-based communication more equitable and conversational than video conversations. The tradeoff - at least at this stage of virtual chat apps - is that video is still superior in contexts where eye contact or facial expressions are critical to the content of the conversation being had, but those calls have their own cognitive load.
Three avatars in Hubs stand in front of a virtual whiteboard with a graph on it
Preserving identity and anonymity
That said, there are meeting contexts where having anonymity and not showing facial expressions (or the inside of your apartment) may be desirable. With video conferencing, you’re showing up as your full self - albeit, maybe with a filter or a green screen behind you. This tends to work best in virtual meetings where you know the other participants, and are comfortable with how you’re appearing on camera. However, being on video camera may not be the desirable option when there are other people in your physical location who you may not want to expose. Turning the camera off is always an option, but it removes a degree of presence for both you and other participants on the call. Virtual spaces, where you’re represented digitally, can offer a spectrum of privacy for your physical location and identity, which is especially important for virtual meetings that involve a wider range of people who may not all know one another so closely that they’re willing to share that degree of information about themselves.
Encouraging shared spontaneity
Finally, one of the key areas that 3D social applications can provide a new framing for meetings and events is in their ability to encourage spontaneity and delight. When you can create objects, bring in shared content, change your appearance and your environment on a whim, draw socially, and communicate via voice, you open up opportunities for spontaneous, creative thought to occur naturally within the conversation. While this might not be appropriate for all kinds of meetings or events, there are many types of collaboration, social, and brainstorming scenarios that benefit from the less-structured opportunities that can arise from meeting in a shared virtual environment.
We’ve tried to keep these principles in mind as we’ve built Hubs. In our current times, we know that people are seeking alternative ways to stay connected and we encourage them to experiment and explore the value that shared 3D spaces can bring to their lives. If you’re curious about exploring shared 3D spaces, try it out today.
This is the 25th transfer protocol added to curl. The first new addition since we added SMB and SMBS back in November 2014.
Background
Back in early 2019, my brother Björn Stenberg brought a pull request to the curl project that added support for MQTT. I tweeted about it and it seemed people were interested in seeing this happen.
Time passed, and Björn unfortunately didn’t manage to push his work forward; the PR grew stale and was eventually closed later the same year due to that inactivity.
Roadmap 2020
In my work trying to go over and figure out what I want to see in curl the coming year and what we (wolfSSL) as a company would like to see being done, MQTT qualified as a contender for the list. See my curl roadmap 2020 video.
It’s happening again
I grabbed Björn’s old pull request, rebased it onto git master, fixed a few minor conflicts, did the small cleanups necessary, and then brought it further. I documented two of my early sessions on this, live-streamed on Twitch. See MQTT in curl and MQTT part two below:
Polish
Bj"orn’s code was an excellent start but didn’t take us all the way.
I wrote an MQTT test server, created a set of test cases, made sure the code worked for those test cases, made it more solid and more. It is still early days and the MQTT support is basic and comes with several caveats, but it’s slowly getting there.
MQTT – really?
When I say that MQTT almost fits the curl concepts and paradigms, I mean that you can consider what an MQTT client does to be “sending” and “receiving” and you can specify that with a URL.
Fetching an MQTT URL with curl means doing a SUBSCRIBE on a topic, waiting for a message to arrive, and writing its payload to the output.
Doing the equivalent of an HTTP POST with curl, as with the command line’s -d option, makes an MQTT PUBLISH that sends a payload to a topic.
Rough corners and wrong assumptions
I’m an MQTT rookie. I’m sure there will be mistakes and I will have misunderstood things. The MQTT support will be considered experimental for some time so that people get a chance to verify the functionality and we get a chance to change and correct the worst decisions and fatal mistakes. Remember that for experimental features in curl, we reserve ourselves the right to change behavior, API and ABI, so nobody should ship such features enabled anywhere without first thinking it through very carefully!
If you think MQTT in curl would be useful, good, or just fun, and you have use cases or ideas where you’d want to use this, please join in, try it out, and let us know how it works and what you think we should polish or fix to make it truly stellar!
The code is landed in the master branch since PR 5173 was merged. The code will be present in the coming 7.70.0 release, due to ship on April 29 2020.
TODO
As I write this, the MQTT support is still very basic. I want a first version out to users as early as possible as I want to get feedback and comments to help verify that we’re in the right direction and then work on making the support of the protocol more complete. TLS, authentication, QoS and more will come as we proceed. Of course, if you let me know what we must support for MQTT to make it interesting for you, I’ll listen! Preferably, you do the discussions on the curl-library mailing list.
We’ve only just started.
Credits
The initial MQTT patch that kicked us off was written by Björn Stenberg. I brought it forward from there, bug-fixed it, extended it, added a test server and test cases and landed the lot in the master branch.