LINUX For You caught up with Mahshad Koohgoli, CEO, Protecode, to gauge the complexity in the open source license landscape, and learn the best practices that both software development teams and independent developers must adopt, if they want to have clean code at the end of the product development cycle.
To begin with, could you tell us how many types of open source licenses there are, and what their related obligations are?
Broadly speaking, we divide open source licenses into two categories: permissive licenses and restrictive licenses. Permissive licenses place very few restrictions on how you modify the code and use the modified version. Good examples are BSD (Berkeley Software Distribution), or New BSD, Apache version 2, and Mozilla Public License (MPL). These licenses don’t put onerous restrictions on users, except that the license generally has to be cited, and a copy of the text of the license needs to be distributed along with the code. BSD, or the modified version of BSD, is also very simple, and any company that’s not looking for copyleft licenses can usually opt for this one.
The other type of license is the restrictive open source license. We call these copyleft licenses, as opposed to copyright licenses. These are also sometimes called viral licenses. A good example is the GNU GPL (General Public License). Copyleft licenses try to keep the code open source, along with any modification of the code, or any product based on it. So if you use code under licenses such as the GPL, you are bound by the terms of the license to also release your own code under the GPL as open source code.
This license can potentially be unsuitable for some businesses. For example, if you are operating in an environment where you want to keep your code proprietary, the use of this license may not be in line with the business objectives of your firm. However, sometimes it is fine to use such licenses, when your business is not based on selling the product — when your real value-add comes from the services that you give to support the product, facilitate its installation, or effect changes to the code. That’s a legitimate business too. Most open source companies practice this model. SugarCRM is a good example. There are companies that have built a whole business around supporting SugarCRM.
So, there is no question of what is good or bad when it comes to licenses. The choice depends entirely on the kind of requirements a business has.
You have once said that, “The Open Source Initiative has an approved list of nearly 70 open source licenses. However, Protecode has catalogued more than 3,000 variations of these licenses.” How come there are so many licenses and versions of open source licenses? Could you explain this statement of yours?
As regards open source licenses, there is a formal open source organisation called the Open Source Initiative (OSI), which has a formal definition of what constitutes an open source license. Based on this definition, it has recognised certain licenses as legitimate open source licenses. OSI has an approved list of nearly 70 open source licenses. However, Protecode has catalogued more than 3,000 variations of these licenses. People have created these versions in order to suit the specific requirements of their open source projects. However, when you look at the actual usage, there are mainly 6 to 10 licenses that cover 90-95 per cent of open source software.
People who work on software development projects usually adopt an open source license and then modify it. This creates a proliferation of different licenses, and the landscape becomes more and more complex, to the point where one needs legal advice to understand the implications of the changes. So, as much as possible, I would recommend that development teams adopt one of the OSI-approved licenses.
Many “pundits” like to claim that open source licenses put restrictions on making money from products based on open source software. What’s your take on this statement?
There is a misconception that using open source to build your products can curtail your chances of making money from them. This is absolutely untrue. There is absolutely no restriction imposed by any of the OSI-approved open source licenses regarding what to charge for your software. You can choose to charge, and people can choose to pay or not to pay for your product, whether you are using open source or not.
As I mentioned, based on the open source licenses governing the software product, there will be certain licensing obligations, but none of them are monetary. The obligations are generally very mundane. For example, in some cases you may be required to just cite the license you are using. In addition, you may have to display the license, if your product has a user interface. For example, on your smartphone, you would find a tab where all the software components and their corresponding licenses are listed.
However, copyleft licenses, depending on how you are using them, will require you to open the source code of your software, as in the case of GPL or GPLv3.
One of the things that I have observed is that the licensing terms are difficult to understand for developers, or even their managers. Hence, it is always best to have access to a licensing expert or IP lawyer. While these days we have solutions that do indicate the licensing obligations associated with software, and generate an action-item report to be considered by the project team, this is not meant to replace legal experts.
With the existence of so many versions of open source licenses, doesn’t it make the whole process of tracking the obligations related to these versions an unmanageable task? How can one discover license violations?
I will give you an example to illustrate the level of complexity that is inherent in any software product development cycle. Consider that you have to create a presentation. To do so, you may refer to your previous presentations. You may have a folder of your favourite icons, you may go to the Web, where you may find a useful piece of text and a picture, and you add your own creativity around it. Once you have added various elements and finished making the presentation, if somebody asks you what’s yours and what’s not, you may not know, as you have not been keeping track of the origin of each element incorporated into your presentation.
Similarly, when it comes to software development, it is practically impossible to keep track of the licenses and related obligations manually, as in almost any software development environment, there is more than one person involved. Even when one person is developing a project, this task is difficult to accomplish.
However, this doesn’t mean that you shouldn’t leverage open source software code. The best developers in the world don’t write code from scratch — that is just so very inefficient. There is so much software code freely available that you can use, and add your own creativity to it. There is nothing wrong in doing this, as long as you are doing all this in a managed way.
While manually keeping track of different pieces of code is extremely difficult, if not impossible, there are license-management solutions available these days that can help organisations in this task. These solutions work automatically in the background, without disturbing the existing development processes, keeping track of the software components that go into the project, and identifying if they are open source or not. If some pieces of open source code are identified, the solution indicates the associated licenses and related obligations.
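To make the idea concrete, here is a minimal sketch in Python of the simplest thing such a tool might do: look for tell-tale license phrases in source files. It is not how Protecode or any particular product works. The marker table, the short license labels, and the function names are all assumptions made for illustration; real scanners rely on much richer techniques than phrase matching.

```python
import re

# Hypothetical marker table: tell-tale phrase -> short license label.
# Both the phrases chosen and the labels are assumptions for this sketch.
LICENSE_MARKERS = {
    "GNU General Public License": "GPL",
    "Apache License, Version 2.0": "Apache-2.0",
    "Mozilla Public License": "MPL",
    "Redistribution and use in source and binary forms": "BSD",
}


def detect_licenses(text):
    """Return the sorted labels of all markers found in the given text."""
    # Collapse whitespace and ignore case so line wrapping does not hide a match.
    normalised = re.sub(r"\s+", " ", text).lower()
    found = {
        label
        for phrase, label in LICENSE_MARKERS.items()
        if phrase.lower() in normalised
    }
    return sorted(found)


def scan_sources(sources):
    """Map each file name to the license labels detected in its content.

    `sources` is a dict of {file name: file content}.
    """
    return {name: detect_licenses(content) for name, content in sources.items()}
```

Calling `scan_sources` on a dictionary of file contents returns, per file, a list of detected labels; an empty list means no known marker was found, which is not the same as proof that the file is the team's own work.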
Could you give a brief synopsis of your whitepaper on The 8 Step Open Source Software Adoption Process (OSSAP) Guide, which is aimed at making the process of open source software adoption foolproof and transparent?
Based on interactions with over 100 companies about the best practices followed by them during the process of software development, we have devised an eight-step software adoption process, which can make the process of software development free of legal hassles. We call it the open source software adoption process (OSSAP). We have also captured these practices in the form of a whitepaper.
It contains eight steps or practices, some of which we found were being followed by all companies. We have termed these as necessary processes. Some practices were followed only by a few companies, but were good, and we have termed these as optional practices. Organisations engaged in software development must follow these practices to ensure they are not violating any of the open source software licenses. [For more, refer to the information given in the box item at the end of the interview.]
What are the hazards of not adopting any license management process?
License management is like any other quality management process. Many companies scan the product just before it is to be shipped out. In case there are issues found with licensing at this stage, the company may have to recall the product, which can waste a lot of productive time — and may even prolong the time to market for a software product. Apart from this, it could lead to legal hassles if the product is shipped without complying with the necessary licensing terms.
Through its Open Compliance Program, the Linux Foundation, along with some other companies, is currently working on a standard for software packages and licenses named Software Package Data Exchange standard (SPDX). Since Protecode, too, is a member in this initiative, could you give a few insights related to this standard, and its relevance for software development companies and teams?
SPDX is aimed at formalising, in a standard way, the information about the components of a software package, including details such as the description of what’s included in the software package, what third-party content is included (if any), the licenses and copyright ownership attributes of the components, and so on.
The SPDX file, which holds this information, is always meant to travel with the software package. It is almost like a bill of materials or components, the ingredients of something that you buy off-the-shelf. Having this as a standard is a significant boon to managing inventory and compliance in an automated manner.
There are solutions (like the ones that Protecode offers) that can detect the presence of the SPDX file, read it and augment it based on the scanning result, update it if needed, and regenerate an updated bill of materials that can then be distributed with your software.
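The read, augment and regenerate cycle described above can be sketched in a few lines of Python. This is only an illustration of the bill-of-materials idea, not a conformant SPDX implementation: the four tag names are modelled loosely on SPDX's tag-value style, and the `Component` class and helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    version: str
    license: str
    copyright: str


# Tag names modelled loosely on SPDX tag-value fields (an assumption of this
# sketch; consult the SPDX specification for the real, complete format).
FIELDS = [
    ("PackageName", "name"),
    ("PackageVersion", "version"),
    ("PackageLicenseDeclared", "license"),
    ("PackageCopyrightText", "copyright"),
]


def render_bom(components):
    """Render components as blank-line-separated blocks of 'Tag: value' lines."""
    blocks = [
        "\n".join(f"{tag}: {getattr(c, attr)}" for tag, attr in FIELDS)
        for c in components
    ]
    return "\n\n".join(blocks) + "\n"


def parse_bom(text):
    """Parse text produced by render_bom back into Component objects."""
    tag_to_attr = dict(FIELDS)
    components = []
    for block in text.strip().split("\n\n"):
        if not block.strip():
            continue  # skip empty input
        values = {}
        for line in block.splitlines():
            tag, _, value = line.partition(": ")
            if tag in tag_to_attr:
                values[tag_to_attr[tag]] = value
        components.append(Component(**values))
    return components


def augment_bom(text, scanned):
    """Add scanned components missing from the existing BOM and re-render it."""
    known = parse_bom(text)
    seen = {(c.name, c.version) for c in known}
    known.extend(c for c in scanned if (c.name, c.version) not in seen)
    return render_bom(known)
```

Given an existing file and a list of components found by a scan, `augment_bom` returns the updated text, leaving entries that were already listed untouched and appending only the new ones.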
The first version of SPDX was released in August 2011. A new version with additional capabilities is on the slate already.
Summing up, what would your advice be to software developers and development firms?
To anyone involved in software development, it is important to accept the fact that the occurrence of third-party open source content in the code cannot be ruled out, as developers don’t write code from scratch. Hence, it is important to make sure you have a policy in your company, in terms of what is acceptable and what is not acceptable, with regard to licenses. This policy should be known by everyone in the organisation.
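A policy of this kind can also be written down in a form that tools can check. The following Python sketch assumes a hypothetical firm that wants to keep its code proprietary; the license labels, the two policy sets, and the function names are illustrative assumptions, not a recommendation for any particular business.

```python
from enum import Enum


class Verdict(Enum):
    ALLOWED = "allowed"
    NEEDS_REVIEW = "needs review"
    NOT_ALLOWED = "not allowed"


# Hypothetical policy for a firm that keeps its own code proprietary.
ALLOWED = {"BSD", "Apache-2.0", "MPL"}
NOT_ALLOWED = {"GPL"}


def check_license(license_name):
    """Classify one license label against the policy sets above."""
    if license_name in ALLOWED:
        return Verdict.ALLOWED
    if license_name in NOT_ALLOWED:
        return Verdict.NOT_ALLOWED
    # Unknown or modified licenses go to a person (ideally a legal expert).
    return Verdict.NEEDS_REVIEW


def find_issues(components):
    """Return {component: verdict} for every component that is not ALLOWED.

    `components` is a dict of {component name: license label}.
    """
    verdicts = {name: check_license(lic) for name, lic in components.items()}
    return {n: v for n, v in verdicts.items() if v is not Verdict.ALLOWED}
```

Sending anything unrecognised to review, instead of silently accepting or rejecting it, reflects the point made earlier in the interview: with thousands of license variations in circulation, a tool can flag a problem but a person still has to interpret it.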
Apart from this, try to make sure that you have a good and proven software adoption process in place. Code scanning tools should also be made available to developers. They are affordable, and significantly reduce the effort required to identify the different licenses and related obligations. They create certificates that indicate what components you have in your software. However, these certificates only serve as an indication that the company follows a quality policy for scanning and identifying the licenses used in its projects.
The eight-step open source software adoption process
Mahshad recommends an eight-step process for effective open source license management.
This is what we call a structured open software adoption process.