‘Open source development at Google is both very diverse and distributed’

July 1, 2017

Everyone knows that Google is a leader in the open source world. But what is it that makes it a distinct player in the fast-growing developer community? Will Norris, the engineering manager at Google’s Open Source Programs Office, reveals some secrets in an exclusive talk with Jagmeet Singh of OSFY. Edited excerpts…

Q What is the development model for open source technologies at Google?

Open source development at Google is both very diverse and distributed. The larger projects that we release generally have dedicated teams developing and supporting the project, working with their external developer communities and providing internal support to other Googlers. Many of the smaller projects involve just one or two engineers working on something experimental or just a fun side project. While we do have a central Open Source Programs Office (the group I manage), it is relatively small compared to the size of the company. Instead, the actual development happens throughout the company, with hundreds of teams and thousands of engineers, tech writers, designers and product managers contributing to open source in some way.

Q What is the process that Google follows to identify a project as an open source release?

Most often, the decision to open source a project at Google comes from the product team members themselves. There are a lot of different reasons why they might do this. TensorFlow is a good example: we had worked on machine learning within Google for many years, and then saw an opportunity to push the entire industry forward by releasing the work that we had done. Because open source has been a part of Google’s culture for so long, its benefits are generally well understood throughout the company and so releasing a given project is very natural.

Q Apart from releasing projects for developers, how does Google deploy open source internally to develop innovations?

Internally, most of the source code for all Google products is stored in one monolithic repository. So in many ways, it is like a big open source project where just about everyone can see everyone else’s code. This allows Googlers to read others’ code to learn from it. It also means that they can submit patches to other teams’ projects to help improve them. The new buzzword for this is ‘inner source’, but for us, it is just how we have always operated.

Beyond that, we also make it really easy for Googlers to bring outside open source code into the company. Measured purely by lines of code, we have nearly as much external open source code as we have code that we’ve authored ourselves.

Q What are all the leading open source projects that Google leverages to build new solutions?

There is probably not a single Google product that is not touched by open source in some way, whether it is a library used directly by the product, our build and testing tools, or the data centres that our products run in. Big projects like Linux can be found in our data centres and at the heart of products like Android and Chrome OS. Likewise, LLVM is used extensively as part of the build toolchain, and most of the development languages we use are open source.

Q How do open source engineers at Google plan project life cycles?

We have never had a ‘one size fits all’ approach to open source. Instead, we are comfortable with project heads making the decisions. Most of our projects are developed completely in the open, with very little internal-only discussion or planning. There are certainly cases where development or planning happens internally, and there are different reasons for that. There is a real cost to that style of project management, particularly in terms of community development, so we do our best to help educate teams on the trade-offs and then let them make the decision that is right for their project.

Q What is the average time frame at Google for moving a project from its alpha stage to beta?

Some projects are public right from when the very first line of code is written, while others have been developed internally for quite some time before being released publicly. So projects may be in very different stages of development when they are announced. One thing we are trying to do better is making it clearer what stage a project is in.

Q How do Google engineers test an open source project before its public release?

Some of the bigger releases we have done in the last few years were based on technologies that have existed within Google for many years. For example, TensorFlow was based on our internal machine learning platform called Brain; gRPC is based on our internal RPC system called Stubby; Bazel was called Blaze internally before it was released online; and Kubernetes was the result of our years of experience with our internal job scheduler called Borg. So, in many respects, these projects were tested in the most rigorous manner by first powering Google products and infrastructure.

Now, that is not true of all (or maybe even most) of our open source projects. The small- to medium-sized projects go through the same ‘dogfooding’ trials as Google products, where they are tested internally by fellow Googlers prior to public release.

Q How does Google enable community engagements around its various open source projects?

We generally use the same tools as the rest of the open source community, which includes GitHub, IRC or Slack, Stack Overflow, mailing lists and meetups, among others. Also, we have a strong presence at many open source conferences, and so engage with the community members there or wherever else they happen to be.

We have an extensive network of Google Developer Groups around the world, and these are also a great place for community engagements.

Q How does the Open Source Programs Office develop new policies related to open source projects?

Google’s Open Source Programs Office has been around for almost 13 years. So our policies have developed somewhat organically over the years as the company has grown and new challenges have emerged. Last month, we actually published all of our policies and documentation on our new open source-centred site (opensource.google.com). These docs are based on all the lessons we have learned from working with open source at a large technology company for many years. We know that the way we do things may not be right for everyone else, but we published these docs under an open licence so others can adapt them to whatever makes sense at their company.

Q Is it the commercialised open source or FOSS (free and open source software) that gets more interest from Google?

Many successful free and open source projects have corporate sponsorship in varying forms, even if that simply means that they have employees whose full-time jobs are to work on the project. This is probably the biggest way that Google supports many open source projects. For instance, we do give money to the foundations behind Linux, LLVM, Git and Samba. But, more importantly, we employ engineers whose sole job is to contribute to and help maintain those projects because they’re very important to us.

Q What do you think are the major challenges in releasing an open source solution nowadays?

Once you get past whatever technical problem a given project is trying to address, one of the biggest challenges almost inevitably ends up being community management. But this is not unique to open source; it is also true inside any engineering organisation. People are complicated and messy. They can have very strong opinions, they often do not care about the same things you care about, and they have lives outside of your project that will at some point bleed into their work in unexpected and inexplicable ways.

Managing people, whether that means a formal employee relationship or just as a part of an open source community, is incredibly hard. But it can be so incredibly rewarding.

I believe in helping a diverse group of people with a common goal work together to build something that is greater than the sum of their individual contributions. That is at least why I love what I do.

Q How does Google resolve those challenges?

That is really the key question! It starts with trying to create an environment where people feel welcome to participate, and where they really can contribute on an equal footing. It also means being honest about the fact that conflict will arise and being prepared to handle it when it does. All this means fostering an environment where grace can be extended when people mess up, but also not being afraid to pull out the weeds when they are choking out the flowers.

We know we do not always hit that mark, but that is what I aim for.

Q In addition to your current role at Google, you are apparently a huge supporter of WordPress. What do you feel makes WordPress a platform that has democratised publishing?

Particularly as WordPress is self-hosted on a personal domain name, anyone can download the software, load it up on their server or any hosting platform and publish on the Internet without having to ask anyone for permission. There are no use policies that need to be ‘accepted’ specifying what you can and cannot post, and there are no terms and conditions to agree to. Today, countless platforms enable you to do this, but that was not as true 10 to 15 years ago.

Q What do you think are the major common factors between Google and WordPress?

Both believe, at the most fundamental level, that the Internet must be an open platform for innovation.

Q Finally, how do you see the future of open source, moving beyond Google and WordPress?

There is far more open source code yet to be written than everything that exists today. So really, we are still only at the beginning of the open source growth curve. We all have a lot to look forward to.