Planet Mozilla





Planet Mozilla - https://planet.mozilla.org/



Source information - http://planet.mozilla.org/.
This journal is generated from the public RSS feed at http://planet.mozilla.org/rss20.xml and is updated as that feed is updated. It may not match the content of the original page. The mirror was created automatically at the request of readers of this RSS feed.

Emily Dunham: Are we 'are we' yet?

Friday, 26 February 2016, 11:00

Are we ‘are we’ yet?

The Rust community, being founded and enjoyed by a variety of Mozillians, seems to have inherited the tradition of tracking top-level progress metrics using ‘are we’ sites.

  • Are we concurrent yet? tracks the progress of Rust’s concurrency ecosystem
  • Are we web yet? tracks the status of Rust’s HTTP stack, web frameworks, and related libraries
  • Are we IDE yet? provides a list of what features are supported for Rust per IDE, and links to the relevant tracking issues and RFCs

If this blog post were an ‘are we’ page itself, the big text at the top would probably say “Getting There”.

http://edunham.net/2016/02/26/are_we_building_are_we_sites_yet.html


James Long: Moving Breakpoints Intelligently

Friday, 26 February 2016, 03:00

It all started with a tweet.

Please someone tell me they know how to fix this issue with breakpoints in @FirefoxDevTools pic.twitter.com/4OxaLXzI08

— gregwhitworth (@gregwhitworth) December 3, 2015

In most debuggers, a breakpoint will "slide" if the clicked line doesn't have any code. This is supposed to be a helpful feature, but it becomes infuriating if it behaves wrongly, as seen in the image.

There is no excuse for that happening. When we saw that tweet, we tried to explain it but we also knew that we had to fix it. We've known about this problem and mitigated it with various solutions, but this time I was determined to make it go away completely.

Our initial reaction was just to completely remove breakpoint sliding. It was far too infuriating to justify the feature and maintenance cost. But I felt like this would be too much of a regression; even if it's not that big of a feature, it's something nice that should be done if we know it's safe to do.

Luckily, I figured out a way to safely implement breakpoint sliding so that it only happens when you'd expect it to. This new algorithm will be available in Firefox 46. This post recounts my research from bug 1230345 and explains what's so hard about it. (Read the bug title for how frustrated we were getting.)

How Breakpoints Work

Breakpoints are way more complicated than you think. This is rooted in the fact that, well, executing programs is complicated.

Imagine that you wrote the following code:

1 for(var i=0; i<10; i++) {
2   // Log the value
3   console.log(i); 
4 }  

Now you want to set a breakpoint on line 1. Where does the JavaScript engine set the breakpoint? There are multiple "entry points" on that line: the initial entry, the i<10 check, and the i++ expression. The line can be "re-entered" at various times in your program.

First you must realize that at the lowest of levels, the JavaScript engine is executing this as bytecode (ignoring optimized JIT-ed modes). Bytecodes are mapped to a line and column in the original source (although sometimes it's not even clear where it maps back to). We can tell the engine to notify us whenever any bytecode is run by setting "breakpoints" on bytecode (our handler will do the engine pausing). So we need to insert multiple breakpoints in all the places that a line can be "entered".

The SpiderMonkey debugger API has a nice function called getLineOffsets that returns all the bytecode instruction offsets that represent entry points for a specific line. Using this, we can map over all these offsets and call setBreakpoint with each offset and we will be notified whenever that line is hit, no matter which part of it.

That's all well and good. What if I set a breakpoint on line 2 instead? It's just a comment and there is no actual code on that line, so we won't get any bytecode offsets. This is when we want to try to slide the breakpoint to "help" the user (and potentially infuriate them).

Here's a simple algorithm for doing that. Assume L is the line we are trying to set a breakpoint on:

  1. If L is greater than the number of lines in the script, stop
  2. Try to set a breakpoint in line L
  3. If bytecode offsets exist, set breakpoints on all of them and stop
  4. Otherwise, set L = L + 1 and go to step 1

We simply walk forward through the script until we find a line with bytecodes to actually set breakpoints on.
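The four steps above can be modelled in a few lines. This is only a sketch in Rust rather than the debugger's own JavaScript: the `LineOffsets` map is a hypothetical stand-in for what `getLineOffsets` would report per line, `slide_naive` is an invented name, and the offset values are arbitrary.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the engine's view of a source: for each line
/// that currently has bytecode, the entry-point offsets on that line.
type LineOffsets = BTreeMap<u32, Vec<usize>>;

/// Naive sliding: walk forward from `line` until some line has offsets.
/// Returns the line actually used and its offsets, or `None` if we run
/// past the last line.
fn slide_naive(
    offsets: &LineOffsets,
    line_count: u32,
    mut line: u32,
) -> Option<(u32, Vec<usize>)> {
    while line <= line_count {
        if let Some(found) = offsets.get(&line) {
            if !found.is_empty() {
                return Some((line, found.clone()));
            }
        }
        line += 1;
    }
    None
}

fn main() {
    // The four-line `for` loop example: line 1 has three entry points,
    // line 2 is a comment, line 3 has one. Offset values are made up.
    let mut offsets = LineOffsets::new();
    offsets.insert(1, vec![0, 8, 20]);
    offsets.insert(3, vec![12]);

    // A breakpoint requested on the comment line slides to line 3.
    println!("{:?}", slide_naive(&offsets, 4, 2));
}
```

Note that nothing in this model knows where one function ends and the next begins, which is exactly the weakness discussed next.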

Script Lifetimes

Take a moment and think: are there any problems with the above algorithm? I will be very impressed if you guessed it right, because it's quite subtle.

There's a very important thing at play here: script lifetimes. To explain this, we need to explain the difference between a SpiderMonkey "script" and "source". A "source" represents an entire JavaScript unit (a file, eval-ed code, etc), while all functions within it are represented as "scripts".

1 var x = 1;
2
3 setTimeout(function() {
4  console.log("hi");
5 }, 1000);
6
7 function foo() {
8  return 5;
9 }

The above code is 1 source, but has 2 scripts: foo and the anonymous callback. There is actually a 3rd script that represents the top-level code (which is everything in the file), but don't worry about that. A script is not a function instance, it literally represents the set of bytecodes to run the code. Multiple function instances may exist from a single script.

Now here's the important part: scripts can be garbage collected. The anonymous function above? It's gonna be gone after a few GCs because once it executes, nothing holds a reference to it.

And guess what! Once a script is GCed, it's as if it never existed. If we try to set a breakpoint on line 4, we won't get any bytecode offsets! And since our sliding algorithm is so naïve, it'll walk forward through the script until it finds some. Guess where that is? Line 8, in a completely different function!

We've always known this, but we've tried various heuristics that failed under certain circumstances. We need a way to make sure that we only slide if there is not and never will be a function on a line. (It's useful to set a "pending" breakpoint on a line with no code because when you refresh it will hit the breakpoint. I won't go into pending breakpoints here.)

It gets more complex when you consider nested scripts. Scripts can be arbitrarily nested and we don't want to slide across nested scripts either.

It turns out there's a simple property of our script objects that we can use to determine when to do breakpoint sliding. Let's take a look at code with nested scripts:

1 function foo() {
2   setTimeout(function() {
3     // Say hi!
4     console.log("hi");
5   }, 1000);
6  
7   return 5;
8 }
9
10 (function() {
11   var x = 10;
12  
13   window.bar = function () {
14     // Do somethin'
15     return x;
16   }
17 })()

There are several potential pitfalls here: we don't want to slide from line 3 to 7 (because the function passed to setTimeout is GCed), and the same for 11 to 15 (because the self-executing function was GCed, but bar is not because it's attached to window).

These scripts are nested, and SpiderMonkey's script objects have properties which represent this nesting. For example, script.parent will return the parent script. Fortunately for us, this has a very important property: parent scripts always keep their child scripts alive. (Conversely, child scripts do not keep their parents alive.)

If a parent script has not been GCed, we know that all of its child scripts are alive as well. Let's take a look at what the above code looks like after everything has been GCed:

* 1 function foo() {
* 2   setTimeout(function() {
* 3     // Say hi!
* 4     console.log("hi");
* 5   }, 1000);
* 6
* 7   return 5;
* 8 }
  9
  10 (function() {
  11   var x = 10;
  12
* 13   window.bar = function () {
* 14     // Do somethin'
* 15     return x;
* 16   }
  17 })()

The lines marked with * (shown as red blocks in the original post) represent live code, and everything else has been GCed. All the lines in the red blocks have scripts associated with them; even if we can't find code on a specific line, we can check if there are live scripts for that line. Note that the anonymous function passed to setTimeout is still alive! Although it looks like nothing holds a reference to it, the parent function is keeping it alive.

This means that we can modify our breakpoint sliding algorithm with a simple step: only slide if at least 1 script exists on the line, and only consider the lines that the script covers. We will only slide within the red blocks above, and nowhere else. That means we will never slide in the global scope [1], no matter if it truly is global scope or if it's a function that previously existed but has been GCed.
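A rough model of the modified rule, again sketched in Rust and not taken from the Firefox source: `Script` and `slide_within_script` are hypothetical names, and choosing the innermost live script that covers the line is an assumption of this sketch. The data mirrors the nested example after GC, where `foo`, its `setTimeout` callback, and `bar` are alive.

```rust
/// Hypothetical model of a live script: the lines it covers and the
/// (line, bytecode offset) entry points it owns. Not SpiderMonkey's API.
struct Script {
    start_line: u32,
    end_line: u32,
    entry_points: Vec<(u32, usize)>,
}

impl Script {
    fn covers(&self, line: u32) -> bool {
        self.start_line <= line && line <= self.end_line
    }
}

/// Slide only inside the innermost live script covering `line`.
/// Returns `None` when no live script covers the line; a caller would
/// then keep a pending breakpoint instead of sliding.
fn slide_within_script(live: &[Script], line: u32) -> Option<u32> {
    let innermost = live
        .iter()
        .filter(|s| s.covers(line))
        .min_by_key(|s| s.end_line - s.start_line)?;

    // Only lines up to the end of that script are candidates.
    (line..=innermost.end_line).find(|&candidate| {
        live.iter()
            .flat_map(|s| s.entry_points.iter())
            .any(|&(l, _)| l == candidate)
    })
}

fn main() {
    // Live scripts after GC; offsets are made up.
    let live = vec![
        Script { start_line: 1, end_line: 8, entry_points: vec![(2, 0), (7, 9)] },
        Script { start_line: 2, end_line: 5, entry_points: vec![(4, 0)] },
        Script { start_line: 13, end_line: 16, entry_points: vec![(15, 0)] },
    ];
    for line in [3, 6, 11, 14] {
        println!("line {} -> {:?}", line, slide_within_script(&live, line));
    }
}
```

Line 11 yields `None` because the self-executing function has been collected, so no live script covers it and no sliding happens.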

Note in the original gif that the code is executing in the global scope.

Sourcemaps and Columns

That's not the only case to consider. A considerably more complex case is sourcemapping.

Sourcemapping breakpoints is very complicated. Remember how we needed to set breakpoints on all entry points for a single line? With sourcemaps, a single line can map to several lines in the generated code, so now we also need to set breakpoints across all those lines.

For example, let's take some basic JavaScript code:

for(var i=0; i<10; i++) {
  console.log("hi!");
}

And assume that, for some reason, you really wanted to write a babel plugin to break this up. I don't know, maybe you really like code on multiple lines:

1 var _initial_i = 0;
2 for(var i = _initial_i;
3         i<10;
4         i++) {
5   console.log("hi!");
6 }

If you are debugging the original code and set a breakpoint on the for loop, we need to set breakpoints on all of lines 1-4 of the generated code, because that single original line actually runs across all of them.
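To make the one-to-many mapping concrete, here is a small Rust sketch. Both maps are hypothetical stand-ins (one for the source map, one for the engine's per-line entry points in the generated code), `offsets_for_original_line` is an invented name, and the offsets are arbitrary.

```rust
use std::collections::BTreeMap;

/// Collect every (generated line, bytecode offset) pair that needs a
/// breakpoint when the user clicks `original_line`.
fn offsets_for_original_line(
    source_map: &BTreeMap<u32, Vec<u32>>,
    generated_offsets: &BTreeMap<u32, Vec<usize>>,
    original_line: u32,
) -> Vec<(u32, usize)> {
    let mut result = Vec::new();
    // A missing map entry simply contributes nothing.
    for generated_line in source_map.get(&original_line).into_iter().flatten() {
        for offset in generated_offsets.get(generated_line).into_iter().flatten() {
            result.push((*generated_line, *offset));
        }
    }
    result
}

fn main() {
    // Original line 1 (the `for` loop) maps to generated lines 1-4.
    let mut source_map = BTreeMap::new();
    source_map.insert(1, vec![1, 2, 3, 4]);
    source_map.insert(2, vec![5]);

    let mut generated_offsets = BTreeMap::new();
    generated_offsets.insert(1, vec![0]);
    generated_offsets.insert(2, vec![4]);
    generated_offsets.insert(3, vec![9]);
    generated_offsets.insert(4, vec![15]);
    generated_offsets.insert(5, vec![21]);

    println!("{:?}", offsets_for_original_line(&source_map, &generated_offsets, 1));
}
```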

Because of this, it becomes more ambiguous when and where to slide breakpoints. We decided that the additional complexity for sliding with sourcemaps is not worth the potential infuriation, so we completely removed sliding breakpoints when sourcemapping. We feel this is the right decision because we made multiple attempts at it and it never worked well enough.

Column breakpoints are yet another use case, and previously we attempted to even slide column breakpoints, but due to various ambiguities we removed column breakpoint sliding as well.

Setting breakpoints in the Firefox debugger should be a lot more stable these days, particularly because of these changes which will be available in Firefox 46. Hopefully it does what you expect it to, and if not, please file a bug!

Thank you Greg for complaining! Even if it's critical, we need to know what pains you. You can get things fixed by complaining about them! As long as you keep it respectful, you should be vocal about how projects can improve.

[1] This isn't entirely true; there is a script that represents top-level code which gets GCed after a source runs, so we will slide at the top-level if that script has not been GCed yet, but it's still safe because of the parent-keeping-child-scripts alive property. Unfortunately, that means that sometimes it will slide and sometimes it won't, but that's the nature of working with a GC.

http://tracking.feedpress.it/link/9494/2699845


The Mozilla Blog: Mozilla Introduces Surveillance Principles for a Secure, Trusted Internet

Thursday, 25 February 2016, 22:33

#encryption

Security is paramount to a trusted Internet. Encryption is a critical part of how that trust is made real. The recent events around Apple and the FBI set a dangerous precedent. Our position on these issues is simple: the FBI should not be able to require a technology company to create code that “undoes” years of security enhancements by creating additional vulnerabilities.

Even when legitimate, government surveillance can cause massive harm to user security and the Internet. Governments don’t always take this harm into account when conducting their surveillance activities. The Apple case is just the latest example. We propose that governments adopt basic principles that guide the scope of their surveillance activities, balancing their legitimate needs with the broader good:

  • User Security: Governments need to strengthen user security, including the best encryption, not weaken it.
  • Minimal Impact: Government surveillance should minimize impact on user trust and security.
  • Accountability: Surveillance activities need empowered, independent, and transparent oversight.

These principles were not proposed in a vacuum. At Mozilla, we believe that user privacy and security are fundamental, that the Internet is a global public resource, and that transparent processes promote trust and accountability. Those ideas shouldn’t just apply to the way Mozilla builds its products. They can help all of us, including governments, create a safer, more trusted Internet.

So what can you do? Help advocate by being a voice for these principles. As a member of the public, talk about these issues (#encryption), share the principles and encourage your policymakers and governments to get serious about protecting users from the harms of surveillance. If you are a policymaker, you can go even further by implementing basic principles that help us all create a more secure and trusted Internet.

https://blog.mozilla.org/blog/2016/02/25/mozilla-introduces-surveillance-principles-for-a-secure-trusted-internet-2/


Support.Mozilla.Org: What’s up with SUMO – 25th February

Thursday, 25 February 2016, 21:54

Hello, SUMO Nation!

February is almost over, get ready for March! Spring is coming… and so are the latest updates from the world of SUMO.

Welcome, new contributors!

If you just joined us, don’t hesitate – come over and say “hi” in the forums!

Contributors of the week

Don’t forget that if you are new to SUMO and someone helped you get started in a nice way you can nominate them for the Buddy of the Month!

Most recent SUMO Community meeting

The next SUMO Community meeting…

  • …is happening on WEDNESDAY the 2nd of March – join us!
  • Reminder: if you want to add a discussion topic to the upcoming meeting agenda:
    • Start a thread in the Community Forums, so that everyone in the community can see what will be discussed and voice their opinion here before Monday (this will make it easier to have an efficient meeting).
    • Please do so as soon as you can before the meeting, so that people have time to read, think, and reply (and also add it to the agenda).
    • If you can, please attend the meeting in person (or via IRC), so we can follow up on your discussion topic during the meeting with your feedback.

Developers

Community

  • Brazilian SUMO members are getting ready to rock the spring – more details soon, thanks to Marco Aurélio!
  • …so are our Tunisian friends, under the expert guidance of Synergy
  • Ongoing reminder: if you want to write a guest blog post for the SUMO blog, let us know in the comments! Two posts are in the pipeline, but we do have a lot of space for yours, trust me.
  • Ongoing reminder: if you think you can benefit from getting a second-hand device to help you with contributing to SUMO, you know where to find us.

Social

Support Forum

  • *a few more crickets*

Knowledge Base

Localization

Firefox

And that’s it for this week! So… Any plans for March? Any suggestions for a great way to get out of the winter slowness and get a good start into spring freshness? Let us know in the comments! SEE YOU ON WEDNESDAY AT OUR COMMUNITY MEETING!

https://blog.mozilla.org/sumo/2016/02/25/whats-up-with-sumo-25th-february/


Air Mozilla: Building Products with Partners: Interview with Slack's April Underwood

Thursday, 25 February 2016, 21:00

April Underwood, head of all product & partnerships at Slack, will draw from her experiences at Google, Twitter, Travelocity and more to help us navigate...

https://air.mozilla.org/building-products-with-partners-interview-with-slacks-april-underwood/


Air Mozilla: Web QA Weekly Meeting, 25 Feb 2016

Thursday, 25 February 2016, 20:00

This is our weekly gathering of Mozilla's Web QA team, filled with discussion on our current and future projects, ideas, demos, and fun facts.

https://air.mozilla.org/web-qa-weekly-meeting-20160225/


Andrew Truong: Experience, Learn, Revitalize and Share: Elementary School Bullying

Thursday, 25 February 2016, 19:50

Bullying, as we know, takes place in multiple forms. However, the way bullying takes place changes from elementary through to high school, as our brains and bodies develop. The level of understanding, processing, and experience of bullying is different throughout the multiple stages of schooling. The way that bullying comes about and originates depends on how the victim is able to absorb or defend against it. As children, the way we saw the world was completely different from how we see the world now. As we progress, we develop many different ways to think and react, and also to cope.

Keep this in mind as I discuss my journey over the next couple of blog posts. In Alberta, Canada, Elementary is defined as K-6, Junior high as 7-9 and High School as 10-12 in terms of grade levels.

When bullying occurs, we are encouraged to tell our teachers and/or parents. The result is a series of discussions between students, parents and teachers, in which we are told that bullying isn't nice, that it's not something we should do, and that if it does occur, we should report it. However, one thing that they didn't teach us or inform us about is retaliation. It is possible that at a young age we may not fully understand what retaliation is, but I believe that discussing this concept with children is crucial. In elementary, we move up every grade with the same classmates; at least, that's how it worked at my school. By about grade 2 to 3, I was picked on, so the opportunity to be bullied again by the same individuals was present. Believe me, I was not the only victim. They also resorted to name calling, assigning those names to certain classmates. For example, 'tattle tale' was overused in the class by so many, yet the teachers didn't do anything to stop it, as they told themselves that we were just being kids. It didn't break me, nor did it prompt me to report it this time, as I found my classmates to actually be accepting: they worked with me and, last but not least, they agreed with me at times. That feeling of acceptance and acknowledgement was all I needed. It really was all I needed because, by about grade 4, I learned to ignore the hateful things that certain individuals spewed not only towards me but towards others as well. Teachers were present and nearby in many cases where such acts took place; they obviously were not deaf to what some students had said, as they would even look over at times. In the end, they chose to let it happen naturally. I felt at times that if a teacher had stepped in and actually said something, it would've shown that they cared for their students and that they were there to help us if something similar were to occur again in the future.
In the end, by ignoring all the nasty things that these individuals had said, I regained my control and composure, but only for a short time.

Moving forward to grade 5 and 6, they changed. Could this be a form of retaliation? The way that I was treated by other classmates was evidently troublesome: I wasn't someone that they would come find to talk to, I wasn't someone that they could look up to, and I wasn't somebody they treated nicely like the rest (yes, life isn't fair). I was actually being used, I was abused for how I chose to be myself, and, later on, I was excluded from the group I was closest to. This may have happened naturally, but at the same time, they showed signs of hate. However, there is one day that I will never forget. There was a huge traffic jam on the north side of the city that day and I arrived late at school. I walked into the room and the unexpected, yet expected, happened. I was asked by a classmate why I was late, and I told him the reason. He then went on to say: "Why did you even come/show up?" I was intrigued by it, but as I had built a tolerance to bullying, I let it go. An important thing to note is that the teacher was in the same room and nearby. Somehow, they let it slide as if they hadn't heard it. I had a feeling that something similar would occur as I was being driven to school. How did I have that feeling? Simply because I was abused and mistreated, and from their past actions of bullying, I was able to predict what could happen in future scenarios. The takeaway from this is: why do teachers not step in to say something? Why treat it as 'they are just kids'?

So, that's that. That was my elementary life. I graduated elementary school with people I was abused and bullied by. 

While this did take place years ago, I am fully aware that teachers have gone through professional development and have attended conventions, where more emphasis is placed on the topic of bullying. In my opinion, what's being done still isn't enough. To this day, bullying still occurs, reported or unreported. The discussions that take place between students may not be in-depth or thorough. However, I am delighted that junior highs and high schools have taken on the role of informing and talking to students about bullying, especially on days like Pink Shirt Day. For now, take notice of how bullying is handled in elementary, take notice of how easily it is brushed aside, and take notice of how little the teachers actually care, as they aren't able to decipher when a student is being bullied. As always, there are two sides to a story, or there is information that is left out, but in retrospect, the pieces that I left out don't affect what happened at the end of that day.

Author's note: I know I did not discuss the idea of retaliation in depth. This will take place in my future blog posts.

http://feer56.blogspot.com/2016/02/experience-learn-revitalize-and-share.html


Niko Matsakis: Parallel Iterators Part 2: Producers

Thursday, 25 February 2016, 19:02

This post is the second post in my series on Rayon’s parallel iterators. The goal of this series is to explain how parallel iterators are implemented internally, so I’m going to be going over a lot of details and giving a lot of little code examples in Rust. If all you want to do is use parallel iterators, you don’t really have to understand any of this stuff.

I’ve had a lot of fun designing this system, and I learned a few lessons about how best to use Rust (some of which I cover in the conclusions). I hope you enjoy reading about it!

This post is part 2 of a series. In the initial post I covered sequential iterators, using this dot-product as my running example:

vec1.iter()
    .zip(vec2.iter())
    .map(|(i, j)| i * j)
    .sum()
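For reference, the same chain can be wrapped into a complete program. This is just a sketch: the function name `dot_product` and the `i32` element type are choices made here, not something the series prescribes.

```rust
/// Sequential dot product using only the standard iterator combinators.
fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 {
    vec1.iter()
        .zip(vec2.iter())
        .map(|(i, j)| i * j)
        .sum()
}

fn main() {
    let a = vec![1, 2, 3];
    let b = vec![4, 5, 6];
    // 1*4 + 2*5 + 3*6 = 32
    println!("{}", dot_product(&a, &b));
}
```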

In this post, we are going to take a first stab at extending sequential iterators to parallel computation, using something I call parallel producers. At the end of the post, we’ll have a system that can execute that same dot-product computation, but in parallel:

vec1.par_iter()
    .zip(vec2.par_iter())
    .map(|(i, j)| i * j)
    .sum()

Parallel producers are very cool, but they are not the end of the story! In the next post, we’ll cover parallel consumers, which build on parallel producers and add support for combinators which produce a variable number of items, like filter or flat_map.

Parallel Iteration

When I explained sequential iterators in the previous post, I sort of did it bottom-up: I started with how to get an iterator from a slice, then showed each combinator we were going to use in turn (zip, map), and finally showed how the sum operation at the end works.

To explain parallel iterators, I’m going to work in the opposite direction. I’ll start with the high-level view, explaining the ParallelIterator trait and how sum works, and then go look at how we implement the combinators. This is because the biggest difference in parallel iterators is actually the end operations, like sum, and not as much the combinators (or at least that is true for the combinators we’ll cover in this post).

In Rayon, the ParallelIterator traits are divided into a hierarchy:

  • ParallelIterator: any sort of parallel iterator.
  • BoundedParallelIterator: ParallelIterator: a parallel iterator that can give an upper-bound on how many items it will produce, such as filter.
  • ExactParallelIterator: BoundedParallelIterator: a parallel iterator that knows precisely how many items will be produced.
  • IndexedParallelIterator: ExactParallelIterator: a parallel iterator that can produce the item for a given index without producing all the previous items. A parallel iterator over a vector has this property, since you can just index into the vector.
    • (In this post, we’ll be focusing on parallel iterators in this category. The next post will discuss how to handle things like filter and flat_map, where the number of items being iterated over cannot be known in advance.)

Like sequential iterators, parallel iterators represent a set of operations to be performed (but in parallel). You can use combinators like map and filter to build them up – doing so does not trigger any computation, but simply produces a new, extended parallel iterator. Finally, once you have constructed a parallel iterator that produces the values you want, you can use various operation methods like sum, reduce, and for_each to actually kick off execution.

This is roughly how the parallel iterator traits are defined:

trait ParallelIterator {
    type Item;

    // Combinators that produce new iterators:
    fn map(self, ...);
    fn filter(self, ...);   // we'll be discussing these...
    fn flat_map(self, ...); // ...in the next blog post

    // Operations that process the items being iterated over:
    fn sum(self, ...);
    fn reduce(self, ...);
    fn for_each(self, ...);
}

trait BoundedParallelIterator: ParallelIterator {
}

trait ExactParallelIterator: BoundedParallelIterator {
    fn len(&self) -> usize; // how many items will be produced
}

trait IndexedParallelIterator: ExactParallelIterator {
    // Combinators:
    fn zip(self, ...);
    fn enumerate(self, ...);

    // Operations:
    fn collect(self, ...);
    fn collect_into(self, ...);

    // I'll come to this one shortly :)
    fn with_producer<CB>(self, callback: CB)
        where CB: ProducerCallback<Self::Item>;
}

These look superficially similar to the sequential iterator traits, but you’ll notice some differences:

  • Perhaps most importantly, there is no next method! If you think about it, drawing the next item from an iterator is an inherently sequential notion. Instead, parallel iterators emphasize high-level operations like sum, reduce, collect, and for_each, which are then automatically distributed to worker threads.
  • Parallel iterators are much more sensitive to being indexable than sequential ones, so some combinators like zip and enumerate are only possible when the underlying iterator is indexed. We’ll discuss this in detail when covering the zip combinator.

Implementing sum with producers

One thing you may have noticed with the ParallelIterator traits is that, lacking a next method, there is no way to get data out of them! That is, we can build up a nice parallel iterator, and we can call sum (or some other high-level method), but how do we implement sum?

The answer lies in the with_producer method, which provides a way to convert the iterator into a producer. A producer is kind of like a splittable iterator: it is something that you can divide up into little pieces and, eventually, convert into a sequential iterator to get the data out. The trait definition looks like this:

trait Producer: IntoIterator {
    // Divide into two producers, one of which produces data
    // with indices `0..index` and the other with indices `index..`.
    fn split_at(self, index: usize) -> (Self, Self);
}

Using producers, we can implement a parallel version of sum based on a divide-and-conquer strategy. The idea is that we start out with some producer P and a count len indicating how many items it will produce. If that count is too big, then we divide P into two producers by calling split_at and then recursively sum those up (in parallel). Otherwise, if the count is small, then we convert P into an iterator and sum it up sequentially. We can convert to an iterator by using the into_iter method from the IntoIterator trait, which Producer extends. Here is a parallel version of sum that works for any producer (as with the sequential sum we saw, we simplify things by making it only work for i32 values):

fn sum_producer<P>(producer: P, len: usize) -> i32
    where P: Producer<Item=i32>
{
    if len > THRESHOLD {
        // Input too large: divide it up
        let mid = len / 2;
        let (left_producer, right_producer) =
            producer.split_at(mid);
        let (left_sum, right_sum) =
            rayon::join(
                || sum_producer(left_producer, mid),
                || sum_producer(right_producer, len - mid));
        left_sum + right_sum
    } else {
        // Input small enough: sum sequentially
        let mut sum = 0;
        for value in producer {
            sum += value;
        }
        sum
    }
}

(The actual code in Rayon most comparable to this is called bridge_producer_consumer; it uses the same basic divide-and-conquer strategy, but it’s generic with respect to the operation being performed.)
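To try the divide-and-conquer idea without pulling in Rayon, here is a self-contained sketch. It makes several assumptions that are not from the post: `std::thread::scope` stands in for `rayon::join`, the `SliceProducer` here yields `i32` values by copy, the extra `Sized` and `Send` bounds are added so that it compiles on its own, and `THRESHOLD` is an arbitrary tiny value so that the example actually splits.

```rust
use std::thread;

/// A splittable source of items, following the trait described above.
trait Producer: IntoIterator + Sized {
    fn split_at(self, index: usize) -> (Self, Self);
}

/// Simplified slice producer for this sketch (i32 only, items by value).
struct SliceProducer<'a> {
    slice: &'a [i32],
}

impl<'a> IntoIterator for SliceProducer<'a> {
    type Item = i32;
    type IntoIter = std::iter::Copied<std::slice::Iter<'a, i32>>;
    fn into_iter(self) -> Self::IntoIter {
        self.slice.iter().copied()
    }
}

impl<'a> Producer for SliceProducer<'a> {
    fn split_at(self, index: usize) -> (Self, Self) {
        let (left, right) = self.slice.split_at(index);
        (SliceProducer { slice: left }, SliceProducer { slice: right })
    }
}

const THRESHOLD: usize = 4; // arbitrary; real code would pick something larger

fn sum_producer<P>(producer: P, len: usize) -> i32
where
    P: Producer<Item = i32> + Send,
{
    if len > THRESHOLD {
        let mid = len / 2;
        let (left, right) = producer.split_at(mid);
        // Stand-in for `rayon::join`: sum the right half on a scoped
        // thread while this thread sums the left half.
        thread::scope(|s| {
            let right_handle = s.spawn(move || sum_producer(right, len - mid));
            let left_sum = sum_producer(left, mid);
            left_sum + right_handle.join().unwrap()
        })
    } else {
        producer.into_iter().sum()
    }
}

fn main() {
    let data: Vec<i32> = (1..=10).collect();
    let total = sum_producer(SliceProducer { slice: &data }, data.len());
    println!("{}", total); // 1 + 2 + ... + 10 = 55
}
```

Spawning an OS thread per split is far more expensive than Rayon's work stealing, so this only illustrates the shape of the recursion, not its performance.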

Ownership, producers, and iterators

You may be wondering why I introduced a separate Producer trait rather than just adding split_at directly to one of the ParallelIterator traits? After all, with a sequential iterator, you just have one trait, Iterator, which has both composition methods like map and filter as well as next.

The reason has to do with ownership. It is very common to have shared resources that will be used by many threads at once during the parallel computation and which, after the computation is done, can be freed. We can model this easily by having those resources be owned by the parallel iterator but borrowed by the producers, since the producers only exist for the duration of the parallel computation. We’ll see an example of this later with the closure in the map combinator.
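One way to picture this split of responsibilities is the sketch below. It is not Rayon's actual code: `MapOverSlice`, `MapProducer`, and the `producer` method are hypothetical names, and the element type is fixed to `i32` to keep it short. The iterator owns the closure, while every producer carries only a shared reference to it, so splitting a producer just copies that reference.

```rust
trait Producer: IntoIterator + Sized {
    fn split_at(self, index: usize) -> (Self, Self);
}

/// Hypothetical parallel iterator: owns the closure for the whole computation.
struct MapOverSlice<'data, F> {
    data: &'data [i32],
    map_op: F,
}

/// Producer handed out during the computation: only borrows the closure.
struct MapProducer<'iter, 'data, F> {
    data: &'data [i32],
    map_op: &'iter F,
}

impl<'data, F: Fn(&i32) -> i32> MapOverSlice<'data, F> {
    fn producer(&self) -> MapProducer<'_, 'data, F> {
        MapProducer { data: self.data, map_op: &self.map_op }
    }
}

impl<'iter, 'data, F: Fn(&i32) -> i32> Producer for MapProducer<'iter, 'data, F> {
    fn split_at(self, index: usize) -> (Self, Self) {
        let (left, right) = self.data.split_at(index);
        // Both halves share the same borrowed closure.
        (
            MapProducer { data: left, map_op: self.map_op },
            MapProducer { data: right, map_op: self.map_op },
        )
    }
}

impl<'iter, 'data, F: Fn(&i32) -> i32> IntoIterator for MapProducer<'iter, 'data, F> {
    type Item = i32;
    type IntoIter = std::iter::Map<std::slice::Iter<'data, i32>, &'iter F>;
    fn into_iter(self) -> Self::IntoIter {
        self.data.iter().map(self.map_op)
    }
}

fn main() {
    let data = vec![1, 2, 3, 4];
    let iter = MapOverSlice { data: &data, map_op: |x: &i32| x * 10 };
    let (left, right) = iter.producer().split_at(2);
    let left: Vec<i32> = left.into_iter().collect();
    let right: Vec<i32> = right.into_iter().collect();
    println!("{:?} {:?}", left, right);
}
```

Once `iter` goes out of scope, the closure is dropped; the borrow checker guarantees that no producer outlives it.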

Implementing producers

When we looked at sequential iterators, we saw three impls: one for slices, one for zip, and one for map. Now we’ll look at how to implement the Producer trait for each of those same three cases.

Slice producers

Here is the code to implement Producer for slices. Since slices already support the split_at method, it is really very simple.

pub struct SliceProducer<'iter, T: 'iter> {
    slice: &'iter [T],
}

impl<'iter, T> Producer for SliceProducer<'iter, T> {
    // Split-at can just piggy-back on the existing `split_at`
    // method for slices.
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (left, right) = self.slice.split_at(mid);
        (SliceProducer { slice: left },
         SliceProducer { slice: right })
    }
}

We also have to implement IntoIterator for SliceProducer, so that we can convert to sequential execution. This just builds on the slice iterator type SliceIter that we saw in the initial post (in fact, for the next two examples, I’ll just skip over the IntoIterator implementations, because they’re really quite straightforward):

impl<'iter, T> IntoIterator for SliceProducer<'iter, T> {
    type Item = &'iter T;
    type IntoIter = SliceIter<'iter, T>;
    fn into_iter(self) -> SliceIter<'iter, T> {
        self.slice.iter()
    }
}

Zip producers

Here is the code to implement the zip producer:

pub struct ZipProducer<A: Producer, B: Producer> {
    a: A,
    b: B
}

impl<A, B> Producer for ZipProducer<A, B>
    where A: Producer, B: Producer,
{
    type Item = (A::Item, B::Item);

    fn split_at(self, mid: usize) -> (Self, Self) {
        let (a_left, a_right) = self.a.split_at(mid);
        let (b_left, b_right) = self.b.split_at(mid);
        (ZipProducer { a: a_left, b: b_left },
         ZipProducer { a: a_right, b: b_right })
    }
}

What makes zip interesting is split_at – and I don’t mean the code itself, which is kind of obvious, but rather the implications of it. In particular, if we’re going to walk two iterators in lock-step and we want to be able to split them into two parts, then those two parts need to split at the same point, so that the items we’re walking stay lined up. This is exactly why the split_at method in the Producer takes a precise point where to perform the split.

If it weren’t for zip, you might imagine that instead of split_at you would just have a function like split, where the producer gets to pick the mid point:

fn split(self) -> (Self, Self);

But if we did this, then the two producers we are zipping might pick different points to split, and we wouldn’t get the right result.
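A small runnable sketch may help show why both sides must be cut at the same index. It uses a simplified `Producer` trait (with `IntoIterator` as a supertrait) and `std::slice::Iter` in place of the `SliceIter` from the earlier post. Both simplifications are assumptions made for this example, and the `dot_halves` helper is hypothetical and exists only to exercise the split.

```rust
/// Simplified stand-in for the `Producer` trait discussed in the post.
trait Producer: IntoIterator + Sized {
    fn split_at(self, mid: usize) -> (Self, Self);
}

struct SliceProducer<'a, T> {
    slice: &'a [T],
}

impl<'a, T> Producer for SliceProducer<'a, T> {
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (left, right) = self.slice.split_at(mid);
        (SliceProducer { slice: left }, SliceProducer { slice: right })
    }
}

impl<'a, T> IntoIterator for SliceProducer<'a, T> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.slice.iter()
    }
}

struct ZipProducer<A, B> {
    a: A,
    b: B,
}

impl<A: Producer, B: Producer> Producer for ZipProducer<A, B> {
    // Both sides are cut at the same index, so the pairs stay lined up.
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (a_left, a_right) = self.a.split_at(mid);
        let (b_left, b_right) = self.b.split_at(mid);
        (
            ZipProducer { a: a_left, b: b_left },
            ZipProducer { a: a_right, b: b_right },
        )
    }
}

impl<A: Producer, B: Producer> IntoIterator for ZipProducer<A, B> {
    type Item = (A::Item, B::Item);
    type IntoIter = std::iter::Zip<A::IntoIter, B::IntoIter>;
    fn into_iter(self) -> Self::IntoIter {
        self.a.into_iter().zip(self.b)
    }
}

/// Hypothetical helper: dot product of each half after splitting at `mid`.
fn dot_halves(xs: &[i32], ys: &[i32], mid: usize) -> (i32, i32) {
    let zipped = ZipProducer {
        a: SliceProducer { slice: xs },
        b: SliceProducer { slice: ys },
    };
    let (left, right) = zipped.split_at(mid);
    let left_dot: i32 = left.into_iter().map(|(x, y)| x * y).sum();
    let right_dot: i32 = right.into_iter().map(|(x, y)| x * y).sum();
    (left_dot, right_dot)
}
```

Because both slices are split at `mid`, the two partial dot products always add up to the dot product of the whole input. Like the slice method it builds on, this sketch panics if `mid` is larger than either slice.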

The requirement that a producer be able to split itself at an arbitrary point means that some iterator combinators cannot be accommodated. For example, you can’t make a producer that implements the filter operation. After all, to produce the next item from a filtered iterator, we may have to consume any number of items from the base iterator before the filter function returns true – we just can’t know in advance. So we can’t expect to split a filter into two independent halves at any precise point. But don’t worry: we’ll get to filter (as well as the more interesting case of flat_map) later on in this blog post series.
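The problem with filter can be seen with plain sequential iterators. In the sketch below (the function names are made up for this example), the base slice is split at a known index, but the number of items that survive the filter on each side depends entirely on the data, so no index into the filtered output can be computed up front.

```rust
/// Example predicate; `filter` on a slice iterator passes `&&u32`.
fn is_even(x: &&u32) -> bool {
    **x % 2 == 0
}

/// Splits `base` at `mid` and reports how many items survive the filter
/// on each side of the split.
fn filtered_len_per_half(base: &[u32], mid: usize) -> (usize, usize) {
    let (left, right) = base.split_at(mid);
    (
        left.iter().filter(is_even).count(),
        right.iter().filter(is_even).count(),
    )
}
```

Two inputs of the same length, split at the same base index, can yield completely different splits of the filtered sequence.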

Map producers

Here is the type for map producers.

pub struct MapProducer<'m, PROD, MAP_OP, RET>
    where PROD: Producer,
          MAP_OP: Fn(PROD::Item) -> RET + Sync + 'm,
{
    base: PROD,
    map_op: &'m MAP_OP
}

This type definition is pretty close to the sequential case, but there are a few crucial differences. Let’s look at the sequential case again for reference:

// Review: the sequential map iterator
pub struct MapIter<ITER, MAP_OP, RET>
    where ITER: Iterator,
          MAP_OP: FnMut(ITER::Item) -> RET,
{
    base: ITER,
    map_op: MAP_OP
}

All of the differences between the (parallel) producer and the (sequential) iterator are due to the fact that the map closure is now something that we plan to share between threads, rather than using it only on a single thread. Let’s go over the differences one by one to see what I mean:

  • MAP_OP implements Fn, not FnMut:
    • The FnMut trait indicates a closure that receives unique, mutable access to its environment. That makes sense in a sequential setting, but in a parallel setting there could be many threads executing map at once. So we switch to the Fn trait, which only gives shared access to the environment. This is part of the way that Rayon can statically prevent data races; I’ll show some examples of that later on.
  • MAP_OP must be Sync:
    • The Sync trait indicates data that can be safely shared between threads. Since we plan to be sharing the map closure across many threads, it must be Sync.
  • the field map_op contains a reference &MAP_OP:
    • The sequential map iterator owned the closure MAP_OP, but the producer only has a shared reference. The reason for this is that the producer needs to be something we can split into two – and those two copies can’t both own the map_op, they need to share it.
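The three requirements above can be seen at work in a small standalone sketch. It uses `std::thread::scope` rather than Rayon, and the function name `map_in_two_halves` is invented for this example. The closure is passed as `&F`, both threads call it through that shared reference, and the `Fn + Sync` bounds are what make that legal.

```rust
use std::thread;

/// Applies `map_op` to both halves of `data` on separate threads, sharing
/// the closure by reference instead of giving each thread its own copy.
fn map_in_two_halves<T, R, F>(data: &[T], map_op: &F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // The spawned thread gets copies of the two shared references.
        let right_handle = s.spawn(move || right.iter().map(map_op).collect::<Vec<R>>());
        let mut out: Vec<R> = left.iter().map(map_op).collect();
        out.extend(right_handle.join().expect("worker thread panicked"));
        out
    })
}
```

A closure that mutates captured state, such as one incrementing a counter, only implements FnMut and would be rejected by the `F: Fn` bound at compile time.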

Actually implementing the Producer trait is pretty straightforward. It looks like this:

impl<'m, PROD, MAP_OP, RET> Producer for MapProducer<'m, PROD, MAP_OP, RET>
    where PROD: Producer,
          MAP_OP: Fn(PROD::Item) -> RET + Sync + 'm,
{
    type Item = RET;

    fn split_at(self, mid: usize) -> (Self, Self) {
        let (left, right) = self.base.split_at(mid);
        (MapProducer { base: left, map_op: self.map_op },
         MapProducer { base: right, map_op: self.map_op })
    }
}
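For readers who want to run this, here is a compact sketch of a map producer layered on a slice producer. It departs from the listings above in a few ways, all of them assumptions made to keep the example short: the `Producer` trait is a simplified stand-in with `IntoIterator` as a supertrait, the struct drops the RET type parameter, and `mapped_halves` is a hypothetical helper.

```rust
/// Simplified stand-in for the `Producer` trait discussed in the post.
trait Producer: IntoIterator + Sized {
    fn split_at(self, mid: usize) -> (Self, Self);
}

struct SliceProducer<'a, T> {
    slice: &'a [T],
}

impl<'a, T> Producer for SliceProducer<'a, T> {
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (left, right) = self.slice.split_at(mid);
        (SliceProducer { slice: left }, SliceProducer { slice: right })
    }
}

impl<'a, T> IntoIterator for SliceProducer<'a, T> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.slice.iter()
    }
}

/// The producer only borrows the closure, so both halves can share it.
struct MapProducer<'m, P, F> {
    base: P,
    map_op: &'m F,
}

impl<'m, P, F, R> Producer for MapProducer<'m, P, F>
where
    P: Producer,
    F: Fn(P::Item) -> R + Sync,
{
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (left, right) = self.base.split_at(mid);
        (
            MapProducer { base: left, map_op: self.map_op },
            MapProducer { base: right, map_op: self.map_op },
        )
    }
}

impl<'m, P, F, R> IntoIterator for MapProducer<'m, P, F>
where
    P: Producer,
    F: Fn(P::Item) -> R + Sync,
{
    type Item = R;
    type IntoIter = std::iter::Map<P::IntoIter, &'m F>;
    fn into_iter(self) -> Self::IntoIter {
        // A shared reference to an `Fn` closure is itself callable.
        self.base.into_iter().map(self.map_op)
    }
}

/// Hypothetical helper: doubles every element, one half at a time.
fn mapped_halves(data: &[i32], mid: usize) -> (Vec<i32>, Vec<i32>) {
    let double = |x: &i32| x * 2;
    let producer = MapProducer { base: SliceProducer { slice: data }, map_op: &double };
    let (left, right) = producer.split_at(mid);
    (left.into_iter().collect(), right.into_iter().collect())
}
```

The closure `double` lives in the caller's stack frame for the whole operation, and each half carries only a `&` to it.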

Whence it all comes

At this point we’ve seen most of how parallel iterators work:

  1. You create a parallel iterator by using the various combinator methods and so forth.
  2. When you invoke a high-level method like sum, sum will convert the parallel iterator into a producer.
  3. sum then recursively splits this producer into sub-producers until they represent a reasonably small (but not too small) unit of work. Each sub-producer is processed in parallel using rayon::join.
  4. Eventually, sum converts the producer into an iterator and performs that work sequentially.

In particular, we’ve looked in detail at the last two steps. But we’ve only given the first two a cursory glance. Before I finish, I want to cover how one constructs a parallel iterator and converts it to a producer – it seems simple, but the setup here is something that took me a long time to get right. Let’s look at the map combinator in detail, because it exposes the most interesting issues.

Defining the parallel iterator type for map

Let’s start by looking at how we define and create the parallel iterator type for map, MapParIter. The next section will dive into how we convert this type into the MapProducer we saw before.

Instances of the map combinator are created when you call map on some other, pre-existing parallel iterator. The map method itself simply creates an instance of MapParIter, which wraps up the base iterator self along with the mapping operation map_op:

trait ParallelIterator {
    type Item;

    fn map<MAP_OP, RET>(self, map_op: MAP_OP)
                       -> MapParIter<Self, MAP_OP, RET>
        where MAP_OP: Fn(Self::Item) -> RET + Sync,
    {
        MapParIter { base: self, map_op: map_op }
    }
}

The MapParIter struct is defined like so:

pub struct MapParIter<ITER, MAP_OP, RET>
    where ITER: ParallelIterator,
          MAP_OP: Fn(ITER::Item) -> RET + Sync,
{
    base: ITER,
    map_op: MAP_OP
}

The parallel iterator struct bears a strong resemblance to the producer struct (MapProducer) that we saw earlier, but there are some important differences:

  1. The base is another parallel iterator of type ITER, not a producer.
  2. The closure map_op is owned by the parallel iterator.

During the time when the producer is active, the parallel iterator will be the one that owns the shared resources (in this case, the closure) that the various threads need to make use of. Therefore, the iterator must outlive the entire high-level parallel operation, so that the data that those threads are sharing remains valid.

Of course, we must also implement the various ParallelIterator traits for MapParIter. For the basic ParallelIterator this is straight-forward:

impl<ITER, MAP_OP, RET> ParallelIterator for MapParIter<ITER, MAP_OP, RET>
    where ITER: ParallelIterator,
          MAP_OP: Fn(ITER::Item) -> RET + Sync,
{
    ...
}

When it comes to the more advanced classifications, such as BoundedParallelIterator or IndexedParallelIterator, we can’t say unilaterally whether maps qualify or not. Since maps produce one item for each item of the base iterator, they inherit their bounds from the base producer. If the base iterator is bounded, then a mapped version is also bounded, and so forth. We can reflect this by tweaking the where-clauses so that instead of requiring that ITER: ParallelIterator, we require that ITER: BoundedParallelIterator and so forth:

impl<ITER, MAP_OP, RET> BoundedParallelIterator for MapParIter<ITER, MAP_OP, RET>
    where ITER: BoundedParallelIterator,
          MAP_OP: Fn(ITER::Item) -> RET + Sync,
{
    ...
}

impl<ITER, MAP_OP, RET> IndexedParallelIterator for MapParIter<ITER, MAP_OP, RET>
    where ITER: IndexedParallelIterator,
          MAP_OP: Fn(ITER::Item) -> RET + Sync,
{
    ...
}

Converting a parallel iterator into a producer

So this brings us to the question: how do we convert a MapParIter into a MapProducer? My first thought was to have a method like into_producer as part of the IndexedParallelIterator trait:

// Initial, incorrect approach:
pub trait IndexedParallelIterator {
    type Producer;

    fn into_producer(self) -> Self::Producer;
}

This would then be called by the sum method to get a producer, which we could pass to the sum_producer method we wrote earlier. Unfortunately, while this setup is nice and simple, it doesn’t actually get the ownership structure right. What happens is that ownership of the iterator passes to the into_producer method, which then returns a producer – so all the resources owned by the iterator must either be transferred to the producer, or else they will be freed when into_producer returns. But it often happens that we have shared resources that the producer just wants to borrow, so that it can cheaply split itself without having to track ref counts or otherwise figure out when those resources can be freed.

Really the problem here is that into_producer puts the caller in charge of deciding how long the producer lives. What we want is a way to get a producer that can only be used for a limited duration. The best way to do that is with a callback. The idea is that instead of calling into_producer, and then having a producer returned to us, we will call with_producer and pass in a closure as argument. This closure will then get called with the producer. This producer may have borrowed references into shared state. Once the closure returns, the parallel operation is done, and so that shared state can be freed.

The signature looks like this:

trait IndexedParallelIterator {
    ...
    fn with_producer<CB>(self, callback: CB)
        where CB: ProducerCallback<Self::Item>;
}

Now, if you know Rust well, you might be surprised here. I said that with_producer takes a closure as argument, but typically in Rust a closure is some type that implements one of the closure traits (probably FnOnce, in this case, since we only plan to do a single callback). Instead, I have chosen to use a custom trait, ProducerCallback, defined as follows:

trait ProducerCallback<ITEM> {
    type Output;
    fn callback<P>(self, producer: P) -> Self::Output
        where P: Producer<Item=ITEM>;
}

Before I get into the reason to use a custom trait, let me just show you how one would implement with_producer for our map iterator type (actually, this is a simplified version, I’ll revisit this example in a bit to show the gory details):

impl<ITER, MAP_OP, RET> IndexedParallelIterator for MapParIter<ITER, MAP_OP, RET>
    where ITER: IndexedParallelIterator,
          MAP_OP: Fn(ITER::Item) -> RET + Sync
{
    fn with_producer<CB>(self, callback: CB)
        where CB: ProducerCallback<Self::Item>
    {
        let base_producer = /* convert base iterator into a
                               producer; more on this below */;
        let map_producer = MapProducer {
            base: base_producer,
            map_op: &self.map_op, // borrow the map op!
        };
        callback.callback(map_producer);
    }
}
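To see the scoping behaviour of the callback approach in isolation, here is a deliberately non-generic sketch. All names in it (`OwnedNumbers`, `BorrowingProducer`, `sum_owned`) are invented for the example. It uses an ordinary FnOnce closure, which is only possible here because there is a single, concrete producer type.

```rust
/// Plays the role of the parallel iterator: it owns the shared data.
struct OwnedNumbers {
    values: Vec<i32>,
}

/// Plays the role of the producer: it only borrows that data.
struct BorrowingProducer<'a> {
    values: &'a [i32],
}

impl OwnedNumbers {
    /// The producer exists only while `callback` runs. Because `R` cannot
    /// mention the lifetime `'a`, the producer cannot escape the callback.
    fn with_producer<CB, R>(self, callback: CB) -> R
    where
        CB: for<'a> FnOnce(BorrowingProducer<'a>) -> R,
    {
        let producer = BorrowingProducer { values: &self.values };
        let result = callback(producer);
        // `self.values` is freed when this function returns, which is
        // necessarily after the callback has finished with the producer.
        result
    }
}

/// Hypothetical caller: sums the numbers through a borrowed producer.
fn sum_owned(values: Vec<i32>) -> i32 {
    OwnedNumbers { values }.with_producer(|producer| producer.values.iter().sum::<i32>())
}
```

An `into_producer(self) -> BorrowingProducer<'?>` method could not be written for this type, since there would be nothing left alive for the returned producer to borrow from.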

So why did I choose to define a ProducerCallback trait instead of using FnOnce? The reason is that, by using a custom trait, we can make the callback method generic over the kind of producer that will be provided. As you can see below, the callback method just says it takes some producer type P, but it doesn’t get more specific than that:

fn callback<P>(self, producer: P) -> Self::Output
    where P: Producer<Item=ITEM>;
    //    ^~~~~~~~~~~~~~~~~~~~~~
    //
    // It can be called back with *any* producer type `P`.

In contrast, if I were to use a FnOnce trait, I would have to write a bound that specifies the producer’s type (even if it does so through an associated type). For example, to use FnOnce, we might change the IndexedParallelIterator trait as follows:

trait IndexedParallelIteratorUsingFnOnce {
    type Producer: Producer<Item=Self::Item>;
    //   ^~~~~~~~
    //
    // The type of producer this iterator creates.

    fn with_producer<CB>(self, callback: CB)
        where CB: FnOnce(Self::Producer);
        //               ^~~~~~~~~~~~~~
        //
        // The callback can expect a producer of this type.
}

(As an aside, it’s conceivable that we could add the ability to write where clauses like CB: for<P: Producer> FnOnce(P), which would be the equivalent of the custom trait, but we don’t have that. If you’re not familiar with that for notation, that’s fine.)

You may be wondering what is so bad about adding a Producer associated type. The answer is that, in order for the Producer to be able to contain borrowed references into the iterator, its type will have to name lifetimes that are internal to the with_producer method. This is because the iterator is owned by the with_producer method. But you can’t write those lifetime names as the value for an associated type. To see what I mean, imagine how we would write an impl for our modified IndexedParallelIteratorUsingFnOnce trait:

impl<ITER, MAP_OP, RET> IndexedParallelIteratorUsingFnOnce
    for MapParIter<ITER, MAP_OP, RET>
    where ITER: IndexedParallelIteratorUsingFnOnce,
          MAP_OP: Fn(ITER::Item) -> RET + Sync,
{
    type Producer = MapProducer<'m, ITER::Producer, MAP_OP, RET>;
    //                          ^~
    //
    // Wait, what is this lifetime `'m`? This is the lifetime for
    // which the `map_op` is borrowed -- but that is some lifetime
    // internal to `with_producer` (depicted below). We can't
    // name lifetimes from inside of a method from outside of that
    // method, since those names are not in scope here (and for good
    // reason: the method hasn't "been called" here, so it's not
    // clear what we are naming).

    fn with_producer<CB>(self, callback: CB)
        where CB: FnOnce(Self::Producer)
    {
        self.base.with_producer(|base_producer| {
            let map_producer = MapProducer { // +----+ 'm
                base: base_producer,         //      |
                map_op: &self.map_op,        //      |
            };                               //      |
            callback(map_producer);          //      |
        })                                   // <----+

    }
}

Using the generic ProducerCallback trait totally solves this problem, but it does mean that writing code which calls with_producer is kind of awkward. This is because we can’t take advantage of Rust’s builtin closure notation, as I was able to do in the previous, incorrect example. This means we have to desugar the closure manually, creating a struct that will store our environment. So if we want to see the full gory details, implementing with_producer for the map combinator looks like this (btw, here is the actual code from Rayon):

impl<ITER, MAP_OP, RET> IndexedParallelIterator for MapParIter<ITER, MAP_OP, RET>
    where ITER: IndexedParallelIterator,
          MAP_OP: Fn(ITER::Item) -> RET + Sync
{
    fn with_producer<CB>(self, callback: CB)
        where CB: ProducerCallback<RET>
    {
        let my_callback = MyCallback { // defined below
            callback: callback,
            map_op: &self.map_op,
        };

        self.base.with_producer(my_callback);

        struct MyCallback<'m, MAP_OP, CB> {
            //          ^~
            //
            // This is that same lifetime `'m` we had trouble with
            // in the previous example: but now it only has to be
            // named from *inside* `with_producer`, so we have no
            // problems.

            callback: CB,
            map_op: &'m MAP_OP
        }

        impl<'m, ITEM, MAP_OP, CB> ProducerCallback<ITEM> for MyCallback<'m, MAP_OP, CB>
            where /* omitted for "brevity" :) */
        {
            type Output = (); // return type of `callback`

            // The method that `self.base` will call with the
            // base producer:
            fn callback<P>(self, base_producer: P)
                where P: Producer<Item=ITEM>
            {
                // Wrap the base producer in a MapProducer.
                let map_producer = MapProducer {
                   base: base_producer,
                   map_op: self.map_op,
                };

                // Finally, callback the original callback,
                // giving them our `map_producer`.
                self.callback.callback(map_producer);
            }
        }
    }
}
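The listing above omits its where clauses and the surrounding traits, so here is a condensed sketch of the whole callback chain that should compile on its own. It is not the Rayon source. The traits are simplified stand-ins, `with_producer` returns the callback's output (an assumption made so that the result can be observed), the structs drop the RET parameter, and `SliceParIter`, `SumCallback` and `sum_times_ten` are invented for the example.

```rust
trait Producer: IntoIterator + Sized {
    fn split_at(self, mid: usize) -> (Self, Self);
}
trait ProducerCallback<ITEM> {
    type Output;
    fn callback<P: Producer<Item = ITEM>>(self, producer: P) -> Self::Output;
}
trait IndexedParallelIterator: Sized {
    type Item;
    fn with_producer<CB: ProducerCallback<Self::Item>>(self, callback: CB) -> CB::Output;
}

// Base case: a borrowed slice and the producer it hands out.
struct SliceParIter<'a> { slice: &'a [i32] }
struct SliceProducer<'a> { slice: &'a [i32] }
impl<'a> Producer for SliceProducer<'a> {
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (l, r) = self.slice.split_at(mid);
        (SliceProducer { slice: l }, SliceProducer { slice: r })
    }
}
impl<'a> IntoIterator for SliceProducer<'a> {
    type Item = &'a i32;
    type IntoIter = std::slice::Iter<'a, i32>;
    fn into_iter(self) -> Self::IntoIter { self.slice.iter() }
}
impl<'a> IndexedParallelIterator for SliceParIter<'a> {
    type Item = &'a i32;
    fn with_producer<CB: ProducerCallback<&'a i32>>(self, callback: CB) -> CB::Output {
        callback.callback(SliceProducer { slice: self.slice })
    }
}

// The map producer borrows the closure ...
struct MapProducer<'m, P, F> { base: P, map_op: &'m F }
impl<'m, P, F, R> Producer for MapProducer<'m, P, F>
where P: Producer, F: Fn(P::Item) -> R + Sync,
{
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (l, r) = self.base.split_at(mid);
        (MapProducer { base: l, map_op: self.map_op },
         MapProducer { base: r, map_op: self.map_op })
    }
}
impl<'m, P, F, R> IntoIterator for MapProducer<'m, P, F>
where P: Producer, F: Fn(P::Item) -> R + Sync,
{
    type Item = R;
    type IntoIter = std::iter::Map<P::IntoIter, &'m F>;
    fn into_iter(self) -> Self::IntoIter { self.base.into_iter().map(self.map_op) }
}

// ... while the parallel iterator owns it.
struct MapParIter<I, F> { base: I, map_op: F }
impl<I, F, R> IndexedParallelIterator for MapParIter<I, F>
where I: IndexedParallelIterator, F: Fn(I::Item) -> R + Sync,
{
    type Item = R;
    fn with_producer<CB: ProducerCallback<R>>(self, callback: CB) -> CB::Output {
        // Hand-desugared closure: captures the outer callback and a borrow of `map_op`.
        struct MyCallback<'m, F, CB> { callback: CB, map_op: &'m F }
        impl<'m, ITEM, R, F, CB> ProducerCallback<ITEM> for MyCallback<'m, F, CB>
        where F: Fn(ITEM) -> R + Sync, CB: ProducerCallback<R>,
        {
            type Output = CB::Output;
            fn callback<P: Producer<Item = ITEM>>(self, base: P) -> CB::Output {
                self.callback.callback(MapProducer { base, map_op: self.map_op })
            }
        }
        let MapParIter { base, map_op } = self;
        let output = base.with_producer(MyCallback { callback, map_op: &map_op });
        output // `map_op` is dropped only after the whole callback chain has finished
    }
}

// A consumer that works with whatever producer it is handed.
struct SumCallback;
impl ProducerCallback<i32> for SumCallback {
    type Output = i32;
    fn callback<P: Producer<Item = i32>>(self, producer: P) -> i32 {
        producer.into_iter().sum()
    }
}

fn sum_times_ten(data: &[i32]) -> i32 {
    let par_iter = MapParIter { base: SliceParIter { slice: data }, map_op: |x: &i32| x * 10 };
    par_iter.with_producer(SumCallback)
}
```

The lifetime `'m` is named only inside `with_producer`, where the borrow of `map_op` actually exists, so nothing outside the method ever has to spell it.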

Conclusions

OK, whew! We’ve now covered parallel producers from start to finish. The design you see here did not emerge fully formed: it is the result of a lot of iteration. This design has some nice features, many of which are shared with sequential iterators:

  • Efficient fallback to sequential processing. If you are processing a small amount of data, we will never bother with splitting the producer, and we’ll just fall back to using the same old sequential iterators you were using before, so you should have very little performance loss. When processing larger amounts of data, we will divide into threads – which you want – but when the chunks get small enough, we’ll use the same sequential processing to handle the leaves.
  • Lazy, no allocation, etc. You’ll note that nowhere in any of the above code did we do any allocation or eager computation.
  • Straightforward, no unsafe code. Something else that you didn’t see in this blog post: unsafe code. All the unsafety is packaged up in Rayon’s join method, and most of the parallel iterator code just leverages that. Overall, apart from the manual closure desugaring in the last section, writing producers is really pretty straightforward.

Things I learned

My last point above – that writing producers is fairly straightforward – was certainly not always the case: the initial designs required a lot more stuff – phantom types, crazy lifetimes, etc. But I found that these are often signs that your traits could be adjusted to make things go more smoothly. Some of the primary lessons follow.

Align input/output type parameters on traits to go with dataflow. One of the biggest sources of problems for me was that I was overusing associated types, which wound up requiring a lot of phantom types and other things. At least in these cases, what worked well as a rule of thumb was this: if data is flowing in to the trait, it should be an input type parameter. If data is flowing out, it should be an associated type. So, for example, producers have an associated type Item, which indicates the kind of data a Producer or iterator will produce. But the ProducerCallback trait is parameterized over ITEM, the type of item that the base producer will create.

Choose RAII vs callbacks based on who needs control. When designing APIs, we often tend to prefer RAII over callbacks. The immediate reason is often superficial: callbacks lead to rightward drift. But there is also a deeper reason: RAII can be more flexible.

Effectively, whether you use the RAII pattern or a callback, there is always some kind of dynamic scope associated with the thing you are doing. If you are using a callback, that scope is quite explicit: you will invoke the callback, and the scope corresponds to the time while that callback is executing. Once the callback returns, the scope is over, and you are back in control.

With RAII, the scope is open-ended. You are returning a value to your caller that has a destructor – this means that the scope lasts until your caller chooses to dispose of that value, which may well be never (particularly since they could leak it). That is why I say RAII is more flexible: it gives the caller control over the scope of the operation. Concretely, this means that the caller can return the RAII value up to their caller, store it in a hashmap, whatever.
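The two styles can be put side by side using a mutex from the standard library. In this sketch the `Counter` type and its methods are invented for the example: `lock` hands back an RAII guard whose lifetime the caller controls, while `with_locked` keeps the scope under the callee's control.

```rust
use std::sync::{Mutex, MutexGuard};

struct Counter {
    value: Mutex<u32>,
}

impl Counter {
    /// RAII style: the caller decides how long the lock is held, simply by
    /// choosing when to drop (or whether to store) the returned guard.
    fn lock(&self) -> MutexGuard<'_, u32> {
        self.value.lock().expect("mutex poisoned")
    }

    /// Callback style: the lock is held exactly while `f` runs, and it is
    /// released before this method returns.
    fn with_locked<R>(&self, f: impl FnOnce(&mut u32) -> R) -> R {
        let mut guard = self.value.lock().expect("mutex poisoned");
        f(&mut *guard)
    }
}

/// Hypothetical caller using both styles in turn.
fn bump_twice(counter: &Counter) -> u32 {
    {
        let mut guard = counter.lock(); // scope ends when the caller drops the guard
        *guard += 1;
    }
    counter.with_locked(|v| {
        *v += 1; // scope ends when this closure returns
        *v
    })
}
```

With `with_locked`, any state the operation needs can live on the method's own stack frame, which is the property `with_producer` relies on.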

But that control also comes at a cost to you. For example, if you have resources that have to live for the entire scope of the operation you are performing, and you are using a callback, you can easily leverage the stack to achieve this. Those resources just live on your stack frame – and so naturally they are live when you call the callback, and remain live until the callback returns. But if you are using RAII, you have to push ownership of those resources into the value that you


Air Mozilla: Reps weekly, 25 Feb 2016

Thursday, 25 February 2016, 19:00

Reps weekly: This is a weekly call with some of the Reps to discuss all matters about or affecting Reps, and to invite Reps to share their work with everyone.

https://air.mozilla.org/reps-weekly-20160225/


Mozilla Addons Blog: Friend of Add-ons: Johann Hofmann

Thursday, 25 February 2016, 13:22

Our newest Friend of Add-ons is Johann Hofmann! Johann is active in the Rust ecosystem, and has been contributing to WebExtensions in the past few months. He explains, “I like that the WebExtensions project enables me to have such a big impact on the API that I and many other developers will use in the future.”

Johann has also contributed code to JPM, and is a volunteer AMO reviewer. He enjoys open source projects and explains how he began contributing to add-ons:

“I got into contributing to add-ons when I started to write my own Firefox extensions and noticed small bugs in the tooling, which I managed to fix. I love contributing to open source. I think Mozilla is the perfect place to do open source because it focuses on the people behind the code. Everyone is really welcoming and trying to help you make an impact, in real-life events as much as on IRC.”

Thanks to Johann for making an impact with add-ons.

We encourage you to document your contributions on the Recognition page, which also serves as the nomination vehicle for the Friends of Add-ons features.

https://blog.mozilla.org/addons/2016/02/25/friend-of-add-ons-johann-hofmann/


Karl Dubost: `border-image`? fill-me in, please

Thursday, 25 February 2016, 11:43

Background Story: Gmail

Firefox on Android is receiving a simplified version of Gmail.

gmail screenshot

In fact, we somehow chose to do this. When we added the Android version number to the User Agent string to be more compatible with the Web (a lot of code breaks if it detects Android without the version number that goes with it), we decided to remove the Android version number for Gmail. For a couple of Web properties we send a different User Agent string. In this case:

 // bug 1184320, gmail.com
  "mail.google.com": "Android\\s\\d.+?;#Android;",

Google was sending Firefox on Android the version meant for Chrome, which was really broken.

gmail screenshot

Google is fixing Gmail. The current version sent to Firefox with an Android version number looks much better. We are almost there but not yet.

gmail screenshot

border-image Fill Me

Your expert eyes have noticed that the contrast of the edit button on the top right is a bit… low. There's something missing.

gmail toolbar in firefox

Let's check what a blink rendering engine is receiving.

gmail toolbar in blink

Ah, much better. Let's find out what is happening. We open Firefox Developer Tools to inspect the code. A red image is indeed specified for the edit button.

inspector - firefox developer tools

Let's check the CSS code:

.Ke {
    -moz-border-image: url(…5CYII=) 4 4 5 4;
    border-width: 4px 4px 5px;
}

Nothing seems wrong at first sight. What is happening? Chrome basically receives the same thing but with its own vendor prefix:

.Ke {
    -webkit-border-image: url(…5CYII=) 4 4 5 4;
    border-width: 4px 4px 5px;
}

Something is missing.

border-image is defined in [CSS Backgrounds and Borders Module Level 3]. Quoting the Editor's Draft of 14 August 2015:

Authors can specify an image to be used in place of the border styles. In this case, the border's design is taken from the sides and corners of an image specified with ‘border-image-source’, whose pieces may be sliced, scaled and stretched in various ways to fit the size of the border image area.

There is a mention of the fill keyword.

This property specifies inward offsets from the top, right, bottom, and left edges of the image, dividing it into nine regions: four corners, four edges and a middle. The middle image part is discarded (treated as fully transparent) unless the ‘fill’ keyword is present. (It is drawn over the background; see Drawing the Border Image.)

So it seems that the fill keyword is missing. You may want to read also CSS border-image changes and unprefixing by David.

Let's add it to the property.

.Ke {
    -moz-border-image: url(…5CYII=) 4 4 5 4 fill;
    border-width: 4px 4px 5px;
}

gmail toolbar in firefox

OK, we got the red! Yoohoo. I asked Google to add the fill keyword last week. Hmm, well… it is still not as beautiful as in a Blink or WebKit rendering engine.

border-image The Cosmetics Details

What's happening? Let's go back to the drawing board and create a test we can play with:

  1. border-image
  2. -moz-border-image (What Google sends to Firefox)
  3. -webkit-border-image (What Google sends to Chrome)
  4. -moz-border-image + fill
  5. border-image + fill

Results in Firefox

testing border-image in firefox

Results in Blink

testing border-image in blink

As we can see, the default border-image rendering looks nicer in Blink engines. When there is a difference like this, it is always good to check the computed values in the Blink inspector and the Firefox inspector, to see if there are any differences.

Computed values difference

There's a difference of interpretation for the border-width parameters, but modifying them doesn't seem to change things. Let's fork our test.

I noticed that adding a border-image-width in Firefox gave the sense of a button. Once I had done this and slightly modified the height and line-height, I had something that makes sense, and it works in Blink!

a working button

Basically it seems that in the Blink world the border-width takes over the border-image-width. This should be verified. It is definitely a difference of implementation. I need to check with David Baron.

Otsukare!

http://www.otsukare.info/2016/02/25/border-image-css


Mozilla Security Blog: Payment Processors Still Using Weak Crypto

Thursday, 25 February 2016, 03:20

Part of how Mozilla protects the Web is by participating in the governance of the Web PKI, the system of security certificates that allows websites to authenticate themselves to browsers. Together with the other browsers and stakeholders in the Web, we agree on standards for how such certificates are issued.  We then require that these standards, plus a few additional ones specific to Mozilla, be applied to all certificates which are issued, directly or indirectly, by the “roots” that Firefox trusts.

We have been notified that some payment providers are using Web PKI certificates (i.e. certificates which chain up to roots trusted by Firefox) to secure the connection between central servers and payment terminals, for the purpose of transmitting payment data over the public Internet. Unfortunately, some of those non-browser users of the Web PKI have not kept up with the advances in security that the Web is achieving. The SHA-1 hash algorithm (used to validate the integrity of a certificate) has been declared obsolete in the Web PKI, but these providers have failed to upgrade these devices to support its replacement, SHA-2, despite the SHA-1 deadlines having been set years ago. As a result, many payment-related devices continue to require their servers to have certificates which use SHA-1 in order to be able to operate.

In particular, Worldpay PLC approached Mozilla through their Certificate Authority, Symantec, to request authorization to issue, in violation of standard policy, a limited number of SHA-1 certificates needed to support a large number of outdated devices. They made this request less than two weeks before the authorization needed to be effective. To avoid disruption for users of these devices, after a discussion on the dev.security.policy mailing list, in this particular case we have decided to allow these certificates to be issued, but only under a set of conditions that ensure that the issuance of SHA-1 certificates is fully transparent and allowed only for purposes of transition to SHA-2.

This authorization means that Symantec can issue SHA-1 certificates that will enable Worldpay’s devices to keep operating a while longer, and that issuance will not be regarded by Mozilla as a defect. This decision only affects the Mozilla root program; other root programs may still consider the issuance of these certificates to be a mis-issuance.

We understand that there are payment processing organizations other than Worldpay that continue to have similar requirements for SHA-1 — either within the Web PKI or outside it. It is disappointing that these organizations are putting the public’s data at risk by using a weak, outdated security technology.  We encourage organizations with a continuing need for SHA-1 in the Web PKI to come forward as soon as possible and provide as much detail as possible about their plans for a transition to SHA-2.

https://blog.mozilla.org/security/2016/02/24/payment-processors-still-using-weak-crypto/


John O'Duinn: Finding the tone and structure of “Distributed”

Thursday, 25 February 2016, 03:08

As many of you know, I’m writing my first book. While I just jumped in and started typing, people with a lot of experience in this area started asking me questions like: What is your writing tone? What is the book structure? Who is your intended audience?

Figuring out the answers to these questions felt even more daunting than the idea of “just” writing a book!

The book has evolved as I write and as I learned what was important to me in answering those original questions. Now that I’m into the swing of this, I thought it would be interesting to describe how I figured out the tone and structure of this book, and the logic behind those choices.

Over the years, I’ve bought many weighty management books that only get half-read before I give up, leaving them to gather dust with a bookmark somewhere in the middle. I wanted my book to be a book that I would make time to read. Not just a book by John for John – most people I worked with had similar time constraints. With a busy work life, an immediately helpful, practical book felt important.

I kept these realities in mind and with a fresh set of eyes, went back looking at books I did and didn’t like. I studied how those authors structured their books and how their use of English changed the tone and feel of the book. I discovered some common patterns in terms of structure and language, which became important criteria for this book:

  • I use very casual, readable English throughout. No formal management or textbook English. Yes, I know enough pointy-haired-boss management words to play Dilbert Bingo, but I felt that language would only get in the way. I want this book to be something easy to read after a long tiring day at work, not a management-speak IQ test.
  • Each chapter is very short, typically 10-20 pages. Intentionally short enough to be read in one sitting in one evening after a busy day, over lunch or during a long commute!
  • Almost every chapter is self-contained, so if a reader has a specific pressing need, they can jump to that chapter for immediate practical help. If this happens often enough, hopefully they’ll keep the book close to hand.
  • Each chapter has simple, concrete, practical “takeaways” to put into use immediately to help make your life better today.

This book is designed so you should be able to just open a page on a specific topic you are dealing with today and just dive in. Having said that, if you are not sure where to start, the carefully chosen sequence of these chapters, and the way they are arranged in sections is a good default path:

  • Section One: Why are geo-distributed teams and organizations good for business, good for the economy and yes, good for humans? This section should help anyone justify building a distributed team/organization using cold, hard, financial business justifications – not just a touchy-feely “trust me, it makes people happy”.
  • Section Two: Most companies have day-to-day organizational inefficiencies that are so commonplace they are considered “normal” – or even worse “just the way it is”. This section details mechanical tips and tricks which make organizations more efficient. Becoming very, very crisply organized on those basic everyday mechanics improves the efficiency of an all-in-the-office team and makes effective distributed teams possible.
  • Section Three: How do you handle the humanity in your distributed team? This section covers hirings, firings, one-on-ones, reviews, cultural issues and conflict. Also, some advice for long term “remote” workers on staying sane, healthy and planning a career path.

My first management job came with no training, so I had to make it up, learning as I went along. Same for each management role I’ve held since. This is true for most leaders I know. The lucky few got formal on-the-job training or mentors, but most don’t get any training until they’ve already been doing the job for a few years. Those initial years as a manager are formative and can shape what you perceive as possible in future management jobs.

It so happened that my first job as a manager was in a company with people in multiple locations. Since then I’ve worked, as an engineer and leader, in many different geographically distributed companies. This book is coming together and I can honestly say I wish I’d had a book like this when I started.

John.
=====
As of now, the list of chapters is as follows:
Section1
* Chapter 1: Distributed Teams are Not New – AVAILABLE
* Chapter 2: The Real Cost of an Office – AVAILABLE
* Chapter 3: Disaster Planning – AVAILABLE
* Chapter 4: Mindset – AVAILABLE
Section2
* Chapter 5: Physical Setup – AVAILABLE
* Chapter 6: Video Etiquette – AVAILABLE
* Chapter 7: Own your calendar – AVAILABLE
* Chapter 8: Meetings – AVAILABLE
* Chapter 9: Meeting Moderator – AVAILABLE
* Chapter 10: Single Source of Truth
* Chapter 11: Email Etiquette – AVAILABLE
* Chapter 12: Group Chat Etiquette – AVAILABLE
Section3
* Chapter 13: Culture, Trust and Conflict
* Chapter 14: One-on-Ones and Reviews – AVAILABLE
* Chapter 15: Hiring, Firing, Reorgs and Layoffs – AVAILABLE
* Chapter 16: Bring Humans Together – AVAILABLE
* Chapter 17: Career path – AVAILABLE
* Chapter 18: Feed your soul – AVAILABLE
* Chapter 19: Final Chapter
Appendices
* Appendix A: The Bathroom Mirror Test – AVAILABLE
* Appendix B: How NOT to Work – AVAILABLE
* Appendix C: Further Reading – AVAILABLE
=====

http://oduinn.com/blog/2016/02/24/finding-the-tone-and-structure-of-distributed/


Mozilla Addons Blog: How to get a quicker review for add-ons with source code attached

Thursday, 25 February 2016, 02:28

When submitting an add-on to addons.mozilla.org (AMO), it is sometimes necessary to attach source code in addition to the xpi file. This usually applies to add-ons with obfuscated code, because a reviewer wouldn’t be able to approve an add-on without reviewing what was obfuscated. Since these types of add-ons are more complex to review, I’ve written some tips on what you can do to help them get through the queue faster.

When to attach sources

When the add-on xpi file you upload to addons.mozilla.org (AMO) contains code that is not completely readable by a human, it is probably a good idea to attach sources.

For example, if you used tools like uglifyJS, Google Closure Compiler, browserify or a custom pre-processor, you will have to upload sources. The same goes if you are using js-ctypes or including other binary components.

There are also a few cases where it is actually NOT required and in fact not recommended. If your add-on only contains third-party minified libraries (like jQuery or Angular), or if the libraries you are calling via js-ctypes are system libraries or open source, please do not upload sources. Instead, provide links to the repositories of the respective libraries.

What happens during reviews

When you attach sources, the add-on is marked for “admin-review”. This means that your sources are only accessible to a small group of admins. We do this to protect your sources.

A very important aspect of reviewing sources is reproducing the obfuscation. As we need to treat every extension developer the same, we must verify that the source code we reviewed matches the uploaded xpi. If we skip this step, a malware author could provide us with legitimate-looking sources and add a backdoor to the previously minified xpi file.

Here are the steps we take:

  1. Download the sources and extract them.
  2. Run the build steps, including minifiers, obfuscators, compilers, or code generators.
  3. Take the output directory from the previous step and compare it with the add-on xpi that has been uploaded.
  4. Review the source code files as we would review any other add-on that does not have sources.

In step 3, we use a diff tool to compare the generated sources to the add-on xpi file. There must be no differences at all. To save time, it is very important to provide us with all the build instructions. If you don’t include them, we will have to get in touch with you, and that adds time to the review process.
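
The comparison in step 3 could be sketched roughly as follows. This is only an illustration, not the reviewers' actual tooling: the function names are hypothetical, and it assumes the build output lives in a directory and that the xpi is read as an ordinary zip archive.

```python
import hashlib
import zipfile
from pathlib import Path


def sha256(data: bytes) -> str:
    """Hex digest used to compare file contents."""
    return hashlib.sha256(data).hexdigest()


def hashes_from_dir(root: Path) -> dict:
    """Map each file's relative POSIX path to the SHA-256 of its contents."""
    return {
        p.relative_to(root).as_posix(): sha256(p.read_bytes())
        for p in root.rglob("*")
        if p.is_file()
    }


def hashes_from_xpi(xpi: Path) -> dict:
    """Build the same mapping for the files stored inside the xpi."""
    with zipfile.ZipFile(xpi) as archive:
        return {
            info.filename: sha256(archive.read(info))
            for info in archive.infolist()
            if not info.is_dir()
        }


def compare_build_to_xpi(build_dir: Path, xpi: Path) -> list:
    """Return readable differences; an empty list means the two match."""
    built = hashes_from_dir(build_dir)
    shipped = hashes_from_xpi(xpi)
    problems = []
    for name in sorted(built.keys() | shipped.keys()):
        if name not in shipped:
            problems.append(f"only in build output: {name}")
        elif name not in built:
            problems.append(f"only in xpi: {name}")
        elif built[name] != shipped[name]:
            problems.append(f"content differs: {name}")
    return problems
```

Running a check like this against your own build output before uploading is one way to catch a stray or stale file early, since any entry in the returned list would count as a difference.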

Providing instructions

The easiest way for you to provide the magic steps is to include a README file in the uploaded sources. If it is just one or two files that are obfuscated, the instructions can be something like “run uglifyjs data/mycoolstuff.js”. If the extension is any more complex, please provide a script that we can run that takes care of everything at once. Things you should mention in your README include:

  • Prerequisites that need to be downloaded separately, for example yuicompressor.
  • For less popular or custom-written build tools, provide links where they can be downloaded, as well as installation steps.
  • If a specific version of supplemental software needs to be used, please let us know. But avoid doing so if the latest version would work just as well.
  • All commands we should run to go from sources to a generated xpi file that matches the one you’ve uploaded, for example npm install or a grunt target.

Please assume the reviewer has a vanilla operating system set up. You don’t need to describe how to install common tools such as npm, node, or the add-on SDK, but please make sure the reviewer can figure out how to install everything needed to generate the xpi file.

Aside from the README file, you also need to package everything required to build. If your add-on depends on a private repository or frameworks not commonly available, please include them as well.

Desired outcome

Common build tools used are make, grunt, gulp and ant. If you don’t already have a build target that runs the above steps, please add a target that does so. For example, allow us to run grunt firefox-dist to create the generated xpi. Here is an example of what the sources could look like. The dist directory is initially empty, until your build script (in this case grunt) generates the directory contents. You could then zip them up and upload them to AMO.

sources
+-- README.md                 
+-- Gruntfile.js              
+-- package.json               
+-------------------------------- dist
|                                 +-- bootstrap.js
|                                 +-- install.rdf
|                                 +-- package.json
+-- data                          +-- data
|   +-- js                        |   +-- js
|   |   +-- dialog.js             |   |   +-- dialog.min.js
|   |   +-- popup.js              |   |   +-- popup.min.js
|   +-- scss                      |   +-- css
|   |   +-- popup.scss            |   |   +-- popup.css
|   |   +-- dialog.scss           |   |   +-- dialog.css
|   |   +-- common.scss           |   |
|   +-- html                      |   +-- html
|   |   +-- dialog.html           |   |   +-- dialog.html
|   |   +-- popup.html            |   |   +-- popup.html
|   +-- vendor                    |   +-- vendor
|   |   +-- jquery-2.1.4.min.js   |   |   +-- jquery-2.1.4.min.js
|   +-- images                    |   +-- images
|       +-- logo.png              |       +-- logo.png
+-- lib                           +-- lib
|   +-- main.js                   |   +-- main.js
+-- locale                        +-- locale
    +-- en-US.properties              +-- en-US.properties
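
The final "zip them up" step could look something like the sketch below. It is an assumption-laden illustration rather than a recommended build tool: `package_xpi` is a hypothetical helper, and the fixed timestamp is just one way to keep repeated runs over the same files from producing different archives.

```python
import zipfile
from pathlib import Path

# Hypothetical choice: a constant timestamp (the earliest the zip format
# allows) so the archive does not change from one run to the next.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def package_xpi(dist_dir: Path, xpi_path: Path) -> list:
    """Zip every file under dist_dir into xpi_path; return the archived names."""
    files = sorted(
        (p.relative_to(dist_dir).as_posix(), p)
        for p in dist_dir.rglob("*")
        if p.is_file()
    )
    with zipfile.ZipFile(xpi_path, "w") as archive:
        for name, path in files:
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            archive.writestr(info, path.read_bytes())
    return [name for name, _ in files]
```

Write the xpi outside the dist directory, otherwise a second run would pick up the previous archive as an input file.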

 

A few more tips

If you can avoid obfuscating or minifying code, your review can be done by any reviewer. Should you still need to attach sources, make sure you provide clear instructions so that our admin reviewers can handle your add-on quickly. Responding to questions quickly and using well-known obfuscation tools also improve review time.

Some last words

While our reviewers, both volunteer and staff, review add-ons around the clock, there may be times when it just takes longer. This can be due to anything from Firefox releases and major product changes, to holiday seasons.

I hope you’ve found this post helpful. There’s a lot to remember, but after you’ve done this once or twice you should get the hang of it. If you’d like a page to bookmark that contains this information and some more details on the topic, please head over to our new article on MDN.

If you also have tips to share, or questions on this topic, please post in our forums. Also, if you ever want to sit with developers on the other side of the table, perhaps consider applying to become an add-on reviewer?

https://blog.mozilla.org/addons/2016/02/24/quicker-reviews-with-sources-attached/


Air Mozilla: Bugzilla Development Meeting, 24 Feb 2016

Thursday, 25 February 2016, 00:00

Bugzilla Development Meeting Join the core team to discuss the on-going development of the Bugzilla product. Anyone interested in contributing to Bugzilla, in planning, development, testing, documentation, or...

https://air.mozilla.org/bugzilla-development-meeting-20160224/


Yunier José Sosa Vázquez: Firefox will change its update cadence

Wednesday, 24 February 2016, 21:58

A few years ago, Mozilla introduced a fixed release model for Firefox, with an update every six weeks, to deliver new features to end users sooner and keep pace with the Web. After carefully analyzing the release process, Mozilla has decided to switch to a more flexible model.

The new release cycle spaces updates 6 to 8 weeks apart, so Mozilla will ship the same number of Firefox versions in a year while gaining some significant advantages over the previous model. For example, Mozilla will now be able to adjust release dates to respond to users' needs and allow at least six weeks to get all the new features ready.

Mozilla's Future Releases blog also reports that the plan for the rest of the year is as follows:

2016-01-26 – Firefox 44
2016-03-08 – Firefox 45, ESR 45 (6-week cycle)
2016-04-19 – Firefox 46 (6-week cycle)
2016-06-07 – Firefox 47 (7-week cycle)
2016-08-02 – Firefox 48 (8-week cycle)
2016-09-13 – Firefox 49 (6-week cycle)
2016-11-08 – Firefox 50 (8-week cycle)
2016-12-13 – Firefox 50.0.1 (5-week cycle, critical bug fixes if needed)
2017-01-24 – Firefox 51 (6 weeks after the previous release)
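
The cycle lengths in the list can be checked with a little date arithmetic. The snippet below is just a sketch: the dates are copied from the schedule above, and `cycle_lengths` is a hypothetical helper name.

```python
from datetime import date

# Release dates copied from the schedule above.
RELEASES = [
    ("Firefox 44", date(2016, 1, 26)),
    ("Firefox 45", date(2016, 3, 8)),
    ("Firefox 46", date(2016, 4, 19)),
    ("Firefox 47", date(2016, 6, 7)),
    ("Firefox 48", date(2016, 8, 2)),
    ("Firefox 49", date(2016, 9, 13)),
    ("Firefox 50", date(2016, 11, 8)),
    ("Firefox 50.0.1", date(2016, 12, 13)),
    ("Firefox 51", date(2017, 1, 24)),
]


def cycle_lengths(releases):
    """Return (version, whole weeks, leftover days) since the previous release."""
    result = []
    for (_, previous), (name, current) in zip(releases, releases[1:]):
        weeks, extra_days = divmod((current - previous).days, 7)
        result.append((name, weeks, extra_days))
    return result


for name, weeks, extra_days in cycle_lengths(RELEASES):
    print(f"{name}: {weeks} weeks, {extra_days} extra days")
```

Every gap comes out as a whole number of weeks, which fits releases always landing on the same weekday.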

In addition, the full calendar is available on the Mozilla Wiki, and any changes will be announced in advance. Extended support releases will continue to be delivered when Firefox is released.

For our part, we will keep informing our readers about each version and what is new in it.

http://firefoxmania.uci.cu/firefox-cambiara-su-ritmo-de-actualizaciones/


Mozilla WebDev Community: Beer and Tell – February 2016

Wednesday, 24 February 2016, 21:32

Once a month, web developers from across the Mozilla Project get together to talk about our side projects and drink, an occurrence we like to call "Beer and Tell".

There's a wiki page available with a list of the presenters, as well as links to their presentation materials. There's also a recording available courtesy of Air Mozilla.

Bruce Banner: Web Developer

shobson was up first with Bruce Banner: Web Developer, a small webcomic generator. It provides sweet relief from those workplace stressors via the violent justice of The Incredible Hulk. The idea came from willkg, the code from shobson, and the art from craigcook. Excelsior!

Dokku + Let's Encrypt

Next up was pmac, who showed off dokku, a small PaaS implementation similar to Heroku, that uses Docker containers. Not only is it convenient for running several apps on a single server, but there is also a plugin called dokku-letsencrypt that lets you automatically retrieve and install TLS certificates from letsencrypt.org. Easy peasy!

RPG Maker MV

Next was Osmose (that's me!) who talked about RPG Maker MV, the latest entry in the RPG Maker series of game-making tools. Interestingly, RPGMV uses HTML and JavaScript to implement the engine used to run games made with it. The application itself edits JSON files that are loaded by the web-based engine. The engine itself uses pixi.js for rendering, and can be extended via plugins written in JavaScript.

Battleshits

peterbe stopped by to share Battleshits, a mobile-friendly web app and a fairly gross version of the popular boardgame Battleship. The game connects you with other players via WebSockets and Fanout, and most of the interface is implemented using React.

Chava

Last up was jpetto with a small personal project memorializing his local coffeeshop, Chava, which closed earlier this year. The page uses Hammer.js for touch events and LazyLoad to lazily load the images, but the lightbox implementation is custom-made from scratch. Neato!


If you're interested in attending the next Beer and Tell, sign up for the dev-webdev@lists.mozilla.org mailing list. An email is sent out a week beforehand with connection details. You could even add yourself to the wiki and show off your side-project!

See you next month!

https://blog.mozilla.org/webdev/2016/02/24/beer-and-tell-february-2016/


Chris Lord: The case for an embeddable Gecko

Wednesday, 24 February 2016, 21:10

Strap yourself in, this is a long post. It should be easy to skim, but the history may be interesting to some. I would like to make the point that, for a web rendering engine, being embeddable is a huge opportunity, how Gecko not being easily embeddable has meant we’ve missed several opportunities over the last few years, and how it would still be advantageous to make Gecko embeddable.

What?

Embedding Gecko means making it easy to use Gecko as a rendering engine in an arbitrary 3rd party application on any supported platform, and maintaining that support. An embeddable Gecko should make very few constraints on the embedding application and should not include unnecessary resources.

Examples

  • A 3rd party browser with a native UI
  • A game’s embedded user manual
  • OAuth authentication UI
  • A web application
  • ???

Why?

It’s hard to predict what the next technology trend will be, but there’s a strong likelihood it’ll involve the web, and there’s a possibility it may not come from a company/group/individual with an existing web rendering engine or particular allegiance. It’s important for the health of the web and for Mozilla’s continued existence that there be multiple implementations of web standards, and that there be real competition and a balanced share of users of the various available engines.

Many technologies have emerged over the last decade or so that have incorporated web rendering or web technologies that could have leveraged Gecko:

(2007) iPhone: Instead of using an existing engine, Apple forked KHTML in 2002 and eventually created WebKit. They did investigate Gecko as an alternative, but forking another engine with a cleaner code-base ended up being a more viable route. Several rival companies were also interested in and investing in embeddable Gecko (primarily Nokia and Intel). WebKit would go on to be one of the core pieces of the first iPhone release, which included a better mobile browser than had ever been seen previously.

(2008) Chrome: Google released a WebKit-based browser that would eventually go on to eat a large part of Firefox’s user base. Chrome was initially praised for its speed and light-weightedness, but much of that was down to its multi-process architecture, something made possible by WebKit having a well thought-out embedding capability and API.

(2008) Android: Android used WebKit for its built-in browser and later for its built-in web-view. In recent times, it has switched to Chromium, showing they aren’t averse to switching the platform to a different/better technology, and that a better embedding story can benefit a platform (Android’s built in web view can now be updated outside of the main OS, and this may well partly be thanks to Chromium’s embedding architecture). Given the quality of Android’s initial WebKit browser and WebView (which was, frankly, awful until later revisions of Android Honeycomb, and arguably remained awful until they switched to Chromium), it’s not much of a leap to think they may have considered Gecko were it easily available.

(2009) WebOS: Nothing came of this in the end, but it perhaps signalled the direction of things to come. WebOS survived and went on to be the core of LG’s Smart TV, one of the very few real competitors in that market. Perhaps if Gecko was readily available at this point, we would have had a large head start on FirefoxOS?

(2009) Samsung Smart TV: Also available in various other guises since 2007, Samsung’s Smart TV is certainly the most popular smart TV platform currently available. It appears Samsung built this from scratch in-house, but it includes many open-source projects. It’s highly likely that they would have considered a Gecko-based browser if it were possible and available.

(2011) PhantomJS: PhantomJS is a headless, scriptable browser, useful for testing site behaviour and performance. It’s used by several large companies, including Twitter, LinkedIn and Netflix. Had Gecko been more easily embeddable, such a product may well have been based on Gecko, with the benefit that many sites that use PhantomJS for testing would perhaps have better rendering and performance characteristics on Gecko-based browsers. The demand for a Gecko-based alternative is high enough that a similar project, SlimerJS, based on Gecko was developed and released in 2013. Due to Gecko’s embedding deficiencies though, SlimerJS is not truly headless.

(2011) WIMM One: The first truly capable smart-watch, which generated a large buzz when initially released. WIMM was based on a highly-customised version of Android, and ran software that was compatible with Android, iOS and BlackBerryOS. Although it never progressed past the development kit stage, WIMM was bought by Google in 2012. It is highly likely that WIMM’s work forms the base of the Android Wear platform, released in 2014. Had something like WebOS been open, available and based on Gecko, it’s not outside the realm of possibility that this could have been Gecko based.

(2013) Blink: Google decide to fork WebKit to better build for their own uses. Blink/Chromium quickly becomes the favoured rendering engine for embedding. Google were not afraid to introduce possible incompatibility with WebKit, but also realised that embedding is an important feature to maintain.

(2014) Android Wear: Android specialised to run on watch hardware. Smart watches have yet to take off, and possibly never will (though Pebble seem to be doing alright, and every major consumer tech product company has launched one), but this is yet another area where Gecko/Mozilla have no presence. FirefoxOS may have led us to have an easy presence in this area, but has now been largely discontinued.

(2014) Atom/Electron: Github open-sources and makes available its web-based text editor, which it built on a home-grown platform of Node.JS and Chromium, which it later called Electron. Since then, several large and very successful projects have been built on top of it, including Slack and Visual Studio Code. It’s highly likely that such diverse use of Chromium feeds back into its testing and development, making it a more robust and performant engine, and importantly, more widely used.

(2016) Brave: Former Mozilla co-founder and CTO heads a company that makes a new browser with the selling point of blocking ads and tracking by default, and doing as much as possible to protect user privacy and agency without breaking the web. Said browser is based off of Chromium, and on iOS, is a fork of Mozilla’s own WebKit-based Firefox browser. Brendan says they started based off of Gecko, but switched because it wasn’t capable of doing what they needed (due to an immature embedding API).

Current state of affairs

Chromium and V8 represent the state-of-the-art embeddable web rendering engine and JavaScript engine and have wide and varied use across many platforms. This helps reinforce Chrome’s behaviour as the de-facto standard and gradually eats away at the market share of competing engines.

WebKit is the only viable alternative for an embeddable web rendering engine and is still quite commonly used, but is generally viewed as a less up-to-date and less performant engine vs. Chromium/Blink.

Spidermonkey is generally considered to be a very nice JavaScript engine with great support for new EcmaScript features and generally great performance, but due to a rapidly changing API/ABI, doesn’t challenge V8 in terms of its use in embedded environments. Node.js is likely the largest user of embeddable V8, and is favoured even by Mozilla employees for JavaScript-based systems development.

Gecko has limited embedding capability that is not well-documented, not well-maintained and not heavily invested in. I say this with the utmost respect for those who are working on it; this is an observation and a criticism of Mozilla’s priorities as an organisation. We have at various points in history had embedding APIs/capabilities, but we have either dropped them (gtkmozembed) or let them bit-rot (IPCLite). We do currently have an embedding widget for Android that is very limited in capability when compared to the default system WebView.

Plea

It’s not too late. It’s incredibly hard to predict where technology is going, year-to-year. It was hard to predict, prior to the iPhone, that Nokia would so spectacularly fall from the top of the market. It was hard to predict when Android was released that it would ever overtake iOS, or even more surprisingly, rival it in quality (hard, but not impossible). It was hard to predict that WebOS would form the basis of a major competing Smart TV several years later. I think the examples of our missed opportunities are also good evidence that opening yourself up to as much opportunity as possible is a good indicator of future success.

If we want to form the basis of the next big thing, it’s not enough to be experimenting in new areas. We need to enable other people to experiment in new areas using our technology. Even the largest of companies have difficulty predicting the future, or taking charge of it. This is why it’s important that we make easily-embeddable Gecko a reality, and I plead with the powers that be that we make this higher priority than it has been in the past.

http://chrislord.net/index.php/2016/02/24/the-case-for-an-embeddable-gecko/


Air Mozilla: The Joy of Coding - Episode 46, 24 Feb 2016

Wednesday, 24 February 2016, 21:00

The Joy of Coding - Episode 46 mconley livehacks on real Firefox bugs while thinking aloud.

https://air.mozilla.org/the-joy-of-coding-episode-46/


John Ford: cloud-mirror – Platform Engineering Operations Project of the Month

Wednesday, 24 February 2016, 18:13
Hello from Platform Engineering Operations! Once a month we highlight one of our projects to help the Mozilla community discover a useful tool or an interesting contribution opportunity. This month's project is our cloud-mirror.

The cloud-mirror is something that we've written to reduce the cost and time of inter-region S3 transfers. Cloud-mirror was designed for use in the Taskcluster system, but it can also be run independently. Taskcluster, which is the new automation environment for Mozilla, can support passing artifacts between dependent tasks. An example of this is that when we do a build, we want to make the binaries available to the test machines. We originally hosted all of our artifacts in a single AWS region. This meant that every time a test was done in a region outside of the main region, we would incur an inter-region transfer for each test run. This is expensive and slow compared to in-region transfers.

We decided that a better idea would be to transfer the data from the main region to the other regions the first time it was requested in that region and then have all subsequent requests be inside of the region. This means that for the small overhead of an extra in-region copy of the file, we lose the cost and time overhead of doing inter-region transfers every single time.

Here's an example. We use us-west-2 as our main region for storing artifacts. A test machine in eu-central-1 requires "firefox-50.tar.bz2" for use in a test. The test machine in eu-central-1 will ask cloud mirror for this file. Since this is the first test to request this artifact in eu-central-1, cloud mirror will first copy "firefox-50.tar.bz2" into eu-central-1 then redirect to the copy of that file in eu-central-1. The second test machine in eu-central-1 will then ask for a copy of "firefox-50.tar.bz2" and because it's already in the region, the cloud mirror will immediately redirect to the eu-central-1 copy.
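
The copy-on-first-request behaviour in this example can be modelled in a few lines. The sketch below is an in-memory illustration only, not cloud-mirror's code (which is written in Node): the class name, the counter and the `example.invalid` URLs are all made up for the purpose.

```python
class CloudMirrorSketch:
    """In-memory stand-in for 'copy on first request, then redirect'."""

    def __init__(self, source_region):
        self.source_region = source_region
        self.copies = set()           # (region, artifact) pairs already mirrored
        self.inter_region_copies = 0  # how many slow, costly transfers happened

    def request(self, region, artifact):
        """Return the (hypothetical) URL a client in `region` is redirected to."""
        needs_copy = (
            region != self.source_region
            and (region, artifact) not in self.copies
        )
        if needs_copy:
            # First request from this region: pay for one inter-region copy.
            self.inter_region_copies += 1
            self.copies.add((region, artifact))
        # Every request ends with a redirect to a copy inside the caller's region.
        return f"https://{region}.example.invalid/{artifact}"


mirror = CloudMirrorSketch("us-west-2")
mirror.request("eu-central-1", "firefox-50.tar.bz2")  # triggers the copy
mirror.request("eu-central-1", "firefox-50.tar.bz2")  # served from the region
print(mirror.inter_region_copies)
```

However many test machines in eu-central-1 ask for the same artifact, the counter only increases once per region.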

We expire artifacts from the destination regions so that we don't incur too high storage costs. We also use a Redis cache configured to expire the least recently used keys first. Cloud mirror is written with Node 5 and uses Redis for storage. We use the upstream aws-sdk library for doing our S3 operations.
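
To illustrate what "least recently used first" means, here is a small, exact LRU cache. It is a teaching sketch with hypothetical names, not how cloud-mirror or Redis is implemented; Redis itself only approximates this policy.

```python
from collections import OrderedDict


class LruCache:
    """Keeps at most `capacity` keys, evicting the least recently used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        """Return the stored value (or None) and mark the key as just used."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used keys if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
```

With a capacity of two, storing keys "a" and "b", reading "a", and then storing "c" evicts "b", because "a" was touched more recently.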

We're in the process of deploying this system to replace our original implementation, called 's3-copy-proxy'. That earlier version was a much simpler take on this idea, which we've been using in production. One of the main reasons for the rewrite was to abstract the core concepts so that anyone can write a backend for their storage type, as well as to support more AWS regions and move towards a completely HTTPS-based chain.

If this is a project that's interesting to you, we have lots of ways that you could contribute! Here are some:
  • Switch polling for pending copy operations to use Redis's pub/sub features
  • Write an Azure or GCE storage backend
  • Modify the API to determine which cloud storage pool a request should be redirected to instead of having to encode that into the route
  • Write a localhost storage backend for testing that serves content on 127.0.0.1
If you have any ideas or find some bugs in this system, please open an issue at https://github.com/taskcluster/cloud-mirror/issues. For the time being, you will need to have an AWS account to run our integration tests (`npm test`). We would love to have a storage backend that allows running the non-service specific portions of the system without any extra permissions.
If you're interested in contributing, please ping me (jhford) in #taskcluster on irc.mozilla.org.

For more information about all Platform Ops projects, visit our wiki. If you're interested in helping out, http://ateam-bootcamp.readthedocs.org/en/latest/guide/index.html has resources for getting started.

http://blog.johnford.org/2016/02/cloud-mirror-platform-engineering.html


