Open source technologies are evolving fast with the continuous development of innovative solutions. This article mainly focuses on describing a few such breakthrough technologies and tools under the open source umbrella.
Open source in virtual reality and augmented reality
Virtual reality is one of the key technologies that goes way beyond gaming into fields like nuclear reaction simulation, flight and battlefield simulation, and phobia treatment through the use of virtual models. A few noteworthy open source tools are OSVR, ARToolKit and OpenSimulator.
- OSVR or Open Source Virtual Reality (http://www.osvr.org/) is an open source project initiated and sponsored by Sensics and Razer. It is provided under the Apache 2 licence. The main components include open source hardware (HMD or head mounted display) and an OSVR framework, which is open source software that provides a standardised way of discovering and configuring various VR and AR peripherals.
- ARToolKit (http://artoolkit.org/) is an open source C library for developing augmented reality applications. The project's source code is hosted on GitHub. It is a cross-platform tool with support for Windows, Mac, Linux, iOS and Android, and hence is easily portable. The software makes use of various computer vision techniques for robust feature tracking.
- OpenSimulator, or OpenSim (http://opensimulator.org/), is an open source multi-platform application server (mainly written in C#), which is primarily used for simulating virtual environments. The source code is released under a BSD licence, so it can be easily used in other projects. Still under development, this .NET based framework offers many features, and its inheritance based design makes it highly extensible.
Open source in machine learning
Among the most used open source tools/libraries for building efficient ML applications are Apache Mahout, Weka, Scikit-Learn, etc. Two of the most recent ML engines that have been made open source by the developer organisations are TensorFlow from Google and DMTK (Distributed Machine Learning Toolkit) from Microsoft.
- TensorFlow (https://www.tensorflow.org/) is a numerical library mostly used for deep neural network based computation, where all the computations are modelled as data flow graphs. Recently open sourced by Google, the main advantages of this tool are its Python based API and easy-to-use C++ interface, and its flexibility to run on a variety of hardware, ranging from cheap commodity hardware to heterogeneous GPU based systems.
- DMTK (http://www.dmtk.io/) is an open source machine learning toolkit developed by Microsoft researchers. The toolkit provides a framework that can be used to train machine learning models on Big Data. Following a client-server based model, this toolkit also supports two inter-process communication libraries, MPI and ZMQ, to handle a variety of cluster environments used for model training.
- Scikit-learn (http://scikit-learn.org/) is a Python based tool with very active community support. It builds on libraries such as NumPy, SciPy and Matplotlib, and the compiled numerical code underneath makes the tool fast and efficient. It is used for both academic and commercial ML development due to its versatile and simple APIs.
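The data flow graph idea mentioned for TensorFlow above can be illustrated with a very small sketch in plain Python. This is not TensorFlow's API; the names Node, constant, add, multiply and evaluate are hypothetical and exist only for this illustration. It shows the general pattern, assuming a graph is first described and only later evaluated.

```python
# Minimal pure-Python sketch of a data flow graph (NOT TensorFlow's API).
# Node, constant, add, multiply and evaluate are hypothetical names.

class Node:
    """One vertex in the graph: an operation plus the nodes that feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value


def constant(value):
    """Create a leaf node that simply holds a value."""
    return Node("const", value=value)


def add(a, b):
    """Create a node that will add the results of two other nodes."""
    return Node("add", (a, b))


def multiply(a, b):
    """Create a node that will multiply the results of two other nodes."""
    return Node("mul", (a, b))


def evaluate(node, cache=None):
    """Evaluate a node by first evaluating everything it depends on."""
    if cache is None:
        cache = {}
    if id(node) in cache:
        return cache[id(node)]
    if node.op == "const":
        result = node.value
    else:
        args = [evaluate(n, cache) for n in node.inputs]
        if node.op == "add":
            result = args[0] + args[1]
        elif node.op == "mul":
            result = args[0] * args[1]
        else:
            raise ValueError("unknown operation: " + node.op)
    cache[id(node)] = result
    return result


# Building the graph performs no arithmetic; it only records the structure.
x = constant(3.0)
w = constant(2.0)
b = constant(1.0)
y = add(multiply(w, x), b)  # describes y = w * x + b

# The computation happens only when the graph is evaluated.
result = evaluate(y)
print(result)  # 7.0
```

Separating the description of a computation from its execution is what lets a graph based engine decide where each operation runs, for example on a CPU or a GPU.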
Open source in IoT
The IoT (Internet of Things) is about how smart devices can collect and exchange data that is transmitted over the Internet via other connected devices and sensors. IoT is one of the most talked-about technologies around the world. Due to its advantages and popularity, many companies are moving towards solutions based on it. Businesses looking to take advantage of open source IoT can check out the following options.
- Arduino is one of the most preferred tools for IoT since it provides both a hardware specification and software, including an IDE and a programming language. Refer to https://www.arduino.cc/ for more information on getting started with Arduino.
- Node-RED (http://nodered.org/), built on Node.js, provides visual tools for developers to connect together hardware devices, APIs and services for the Internet of Things.
- Middleware: KAA (http://www.kaaproject.org/#2) and OpenRemote (http://www.openremote.com/) are open source projects, which provide an end point SDK that can be embedded into connected devices. They support both full OSs and OS-less microchips alike, and work with virtually any hardware, including low-powered MCUs.
- Operating system: Snappy Ubuntu Core (http://developer.ubuntu.com/en/snappy/), Contiki (http://www.contiki-os.org/) and RIOT (http://www.riot-os.org/) are the most preferred OSs. Snappy Ubuntu Core offers a lightweight and more reliable OS for IoT based applications. Contiki provides a full-fledged OS dedicated to wireless sensor networks (WSN), with underlying support for various hardware devices. RIOT is another similar IoT-friendly OS designed to minimise the memory footprint and enhance energy efficiency. Extensive community support leads to continuous evolution of these tools.
Open source in Big Data
Big Data has become a reality now, and it is important for business continuity. The volume, variety, velocity and veracity with which data is accumulated by organisations has led to phenomenal growth not just in the understanding of Big Data but also in better and faster analytics. Due to its popularity, a lot of open source projects have evolved around Big Data, and some of the most popular ones, listed below, serve different purposes.
- Druid: This is an open source distributed data store originally developed to analyse online events for ad markets and to work on data streams.
- Apache Spark: This is a newer data processing engine used to run analysis faster on large datasets. It has support for applications written in Java, Scala, Python and R.
- Apache Flume: This is a service that gathers information from distributed sources that is later stored in HDFS.
- Apache Hive: This tool allows people to use an SQL-like language to analyse petabytes of data.
- Apache Samza: Originally developed by LinkedIn, this tool is responsible for distributed stream processing.
- HBase: This open source NoSQL database has been designed as a backend for search engines and has the ability to return search results faster.
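To give a feel for the kind of SQL-like analysis mentioned for Apache Hive above, here is a minimal sketch that uses Python's built-in sqlite3 module as a small stand-in. It is not Hive and does not run on a cluster; the table name page_views, its columns and the sample rows are all hypothetical. A similar SELECT with GROUP BY could be written in Hive's query language, with the engine spreading the work across many machines.

```python
import sqlite3

# In-memory SQLite database used purely as a small stand-in; this is NOT Hive.
# The table name, column names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("IN", 120), ("US", 80), ("IN", 30), ("DE", 50)],
)

# An aggregate query of the kind an analyst might run over event data.
query = """
    SELECT country, SUM(views) AS total_views
    FROM page_views
    GROUP BY country
    ORDER BY total_views DESC, country ASC
"""
rows = conn.execute(query).fetchall()
for country, total in rows:
    print(country, total)

conn.close()
```

The point of an SQL-like layer is that the analyst describes what result is wanted, while the engine decides how to scan and combine the underlying data.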
Open source in containers
Container technology has changed the way in which applications are deployed and managed. Containers provide the capability to host different isolated applications using resource isolation features of the OS.
Docker is the most popular open source container technology due to its light weight and its ability to package applications with all their dependencies into standardised units for software development. With Docker, the developer doesn't need to waste hours building a dev environment, nor make copies of production code and spin up new instances. Docker eliminates environment inconsistency by packaging an application in such a way that it can run on any machine, whether for test or production. This avoids the overhead of installing and configuring software on different systems. Docker creates a common framework for developers and sysadmins to work together on distributed applications.
The evolution of open source is a continuous process, as many developers are continually working on enhancing its features. Sooner or later, open source solutions might become the de facto standard for applications in many fields, as they are being adopted by academia and enterprise alike.
The author is a software development engineer at Dell R&D, Bengaluru, and is interested in network security and cryptography.