Working with Apache SINGA, the Deep Learning Library

Here’s a short tutorial on how to install and configure Apache SINGA, the deep learning library that has been built for training machine learning models.

Apache SINGA is an open source deep learning library developed for training large-scale machine learning models efficiently across distributed systems. It is a project of the Apache Software Foundation and focuses on scalability, flexibility, and ease of use for both research and production environments. Getting started involves setting up dependencies, installing via a package manager or building from source, and configuring SINGA for your specific use case (e.g., single-node or distributed training).

Figure 1: Comparison with PyTorch and TensorFlow

Why Apache SINGA?

Distributed training capabilities

SINGA excels in distributed training across multiple GPUs/nodes, making it ideal for large-scale datasets or models (e.g., deep neural networks, transformers). It supports both data parallelism and model parallelism, and integrates with communication backends like MPI, gRPC, and NCCL for efficient inter-node coordination.

Fault tolerance

It automatically recovers from node failures during distributed training, ensuring robustness in production environments.

Its key features are listed below.

Horizontal scaling

Efficiently distributes training across multiple nodes using both data parallelism (splitting data) and model parallelism (splitting model layers).
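The idea behind data parallelism can be shown with a toy sketch in plain Python. It does not use any SINGA API. The model y = w * x, the helper names and the two-worker split are all assumptions made for illustration. Each 'worker' computes a gradient on its own shard of the batch, and averaging those gradients (the job of an all-reduce step) gives the same result as the gradient of the full batch.

```python
# Toy data parallelism: shard the batch, compute per-shard gradients, average.
# Plain Python only; no SINGA calls are used here.

def gradient(w, xs, ys):
    """Gradient of the mean squared error for the model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def split(items, num_workers):
    """Split a list into equally sized contiguous shards."""
    size = len(items) // num_workers
    return [items[i * size:(i + 1) * size] for i in range(num_workers)]

def data_parallel_gradient(w, xs, ys, num_workers):
    """Average the per-worker gradients, as an all-reduce step would."""
    shards = zip(split(xs, num_workers), split(ys, num_workers))
    grads = [gradient(w, shard_x, shard_y) for shard_x, shard_y in shards]
    return sum(grads) / num_workers

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = gradient(w, xs, ys)
parallel = data_parallel_gradient(w, xs, ys, num_workers=2)
print(full, parallel)  # both are -22.5
```

The equality holds because the shards are the same size; with uneven shards, a weighted average would be needed.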

Synchronous/asynchronous training

Supports flexible synchronisation strategies for distributed environments.

Fault tolerance

Checkpointing and recovery mechanisms to handle node failures during long-running tasks.
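As a rough illustration of checkpoint-and-resume, here is a minimal sketch in plain Python. It is not SINGA's own mechanism: the JSON file format, the train function and the simulated failure are all hypothetical. The loop saves its state after every step, so a restarted job continues from the last saved step instead of starting over.

```python
import json
import os
import tempfile

def train(total_steps, ckpt_path, fail_at=None):
    """Toy training loop that checkpoints after every step."""
    state = {"step": 0, "weight": 0.0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)          # resume from the last saved step
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["weight"] += 0.1            # stand-in for one optimiser update
        state["step"] += 1
        # Write to a temporary file and rename it, so a crash mid-write
        # cannot leave a corrupt checkpoint behind.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, ckpt_path)
    return state

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "ckpt.json")
    try:
        train(5, path, fail_at=3)         # first run stops at step 3
    except RuntimeError:
        pass
    final = train(5, path)                # second run resumes from step 3

print(final["step"])  # 5
```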

Diverse neural networks

Built-in support for CNNs, RNNs, GANs, and reinforcement learning models.

Customisable layers

Low-level APIs (C++/Python) for fine-grained control, plus high-level APIs (like Singa-Easy) for rapid prototyping.

Dynamic and static graphs

Hybrid computation graph support for both ease of use and optimisation.

Edge and cloud deployment

Lightweight export options for mobile/IoT devices and cloud platforms.

Model compression

Techniques like quantisation to reduce model size for resource-constrained environments.
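To make the idea of quantisation concrete, the sketch below maps floating-point weights to 8-bit integers with a simple affine (scale plus offset) scheme and then restores them. It is a generic illustration in plain Python, not SINGA's implementation; the function names and the sample weights are made up.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a minimum offset."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + x * scale for x in q]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, lo = quantize(weights)
restored = dequantize(q, scale, lo)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)
print(max_error)  # never more than half a quantisation step (scale / 2)
```

Each weight now fits in one byte instead of four, at the cost of a rounding error bounded by half the step size.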

Installing Apache SINGA

First, ensure your system meets the following requirements.

Operating system

Linux (Ubuntu 20.04/22.04 recommended) or macOS.

Dependencies

  • Python 3.6+ and pip
  • CMake 3.1+
  • GCC 7+ or Clang 5.0+
  • OpenMPI 4.0+ (for distributed training)
  • CUDA 10.2/11.x and cuDNN 8.x (for GPU support)
  • OpenCV (optional, for image processing)
  • Protocol Buffers (protobuf)
  • BLAS (e.g., OpenBLAS, Intel MKL)

Install dependencies on Ubuntu with the following commands:

sudo apt update
sudo apt install -y build-essential cmake python3-dev python3-pip \
    libopenblas-dev libopencv-dev protobuf-compiler libprotobuf-dev \
    openmpi-bin libopenmpi-dev
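Once the packages are installed, a short script can confirm that the build tools are on the PATH. This is a convenience sketch using only the Python standard library; the command names in TOOLS are assumptions about a typical Linux setup (nvcc, for instance, is only present when the CUDA toolkit is installed).

```python
import shutil
import sys

# Command names assumed for a typical Linux install; adjust as needed.
TOOLS = ["cmake", "gcc", "mpirun", "protoc", "nvcc"]

def check_tools(names):
    """Return a dict mapping each command name to its path, or None if missing."""
    return {name: shutil.which(name) for name in names}

report = check_tools(TOOLS)
python_ok = sys.version_info >= (3, 6)

print("Python 3.6+:", "ok" if python_ok else "too old")
for name, path in report.items():
    print(f"{name:8s}", path or "not found")
```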

Option 1 to install SINGA

Install via pip (CPU-only).

For a quick CPU-only installation, type:

pip install singa

Option 2 to install SINGA

Build from source (GPU support).

Clone the repository:

git clone https://github.com/apache/singa.git
cd singa

Configure the build (enable CUDA, distributed training, etc):

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
      -DUSE_CUDA=ON \
      -DUSE_DIST=ON \
      -DENABLE_TEST=ON \
      ..

Build and install using the following commands:

make -j$(nproc) # Use all CPU cores
sudo make install

Install the Python bindings from within the build directory:

cd python
pip install .

Configure the environment variables by adding the following to your ~/.bashrc or ~/.zshrc:

export SINGA_HOME=/path/to/singa # Path to SINGA source directory (if built from source)
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export PATH=$PATH:/usr/local/bin

Reload the shell:

source ~/.bashrc
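To confirm that the new shell picked up the settings, a small helper can report which required directories are still missing from a colon-separated variable. This is a sketch using only the standard library; the function name and the example environment are hypothetical, and in practice you would omit the third argument so the real environment is checked.

```python
import os

def missing_entries(var_name, required, environ=os.environ):
    """Return the required directories absent from a colon-separated variable."""
    current = [p for p in environ.get(var_name, "").split(":") if p]
    return [p for p in required if p not in current]

# Example environment used for illustration only.
example_env = {"LD_LIBRARY_PATH": "/opt/lib:/usr/local/lib", "PATH": "/usr/bin"}

print(missing_entries("LD_LIBRARY_PATH", ["/usr/local/lib"], example_env))  # []
print(missing_entries("PATH", ["/usr/local/bin"], example_env))  # ['/usr/local/bin']
```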

Verifying the installation

Test SINGA in Python as follows:

import singa
print(singa.__version__) # Should output the installed version

Test GPU support (this requires a CUDA-enabled build):

from singa import device
print(device.get_num_gpus()) # Number of GPUs SINGA can see
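If the import itself fails, it helps to know whether the package is visible to the current interpreter at all. The sketch below uses only the standard library and does not call any SINGA API; the singa_status helper is a hypothetical name.

```python
import importlib.util

def singa_status():
    """Report whether the singa package can be located, without importing it."""
    spec = importlib.util.find_spec("singa")
    if spec is None:
        return "singa is not installed in this Python environment"
    return f"singa found at {spec.origin}"

print(singa_status())
```

A 'not installed' result usually means that pip installed the package into a different Python environment from the one running the script.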

Configuring distributed training

For multi-node training, set up MPI. Ensure OpenMPI is installed and nodes can communicate via SSH without passwords.

Create a hostfile (e.g., named hostfile) listing all worker nodes:

worker1 slots=2 # 2 GPUs on worker1
worker2 slots=1 # 1 GPU on worker2

Run a distributed job, setting -np to the total number of slots in the hostfile (3 here):

mpirun -np 3 -hostfile hostfile python train.py
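The process count passed to -np should match the slots declared in the hostfile. As a sketch, the helper below adds up the slots= entries of a hostfile written in the format shown above. The function name is hypothetical, and treating a host without a slots= field as one slot is a simplifying assumption.

```python
def total_slots(hostfile_text):
    """Sum the slots=N entries of a hostfile, ignoring comments and blank lines."""
    total = 0
    for line in hostfile_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line:
            continue
        slots = 1                              # assumption: no slots= means one slot
        for field in line.split()[1:]:
            if field.startswith("slots="):
                slots = int(field.split("=", 1)[1])
        total += slots
    return total

hostfile = """
worker1 slots=2 # 2 GPUs on worker1
worker2 slots=1 # 1 GPU on worker2
"""

print(total_slots(hostfile))  # 3
```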

Example configuration for training

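Create a simple neural network using SINGA’s APIs:

from singa import autograd, module, opt, tensor

# Written against the SINGA 3.0 Python API (module.Module); later releases
# move to model.Model, so check the documentation for your version.

# Define a model
class MLP(module.Module):

    def __init__(self):
        super().__init__()
        # Weights: 784 inputs -> 512 hidden units -> 10 classes
        self.w0 = tensor.Tensor((784, 512), requires_grad=True, stores_grad=True)
        self.w1 = tensor.Tensor((512, 10), requires_grad=True, stores_grad=True)
        # Initialise the weights from a Gaussian (mean 0, std 0.01)
        self.w0.gaussian(0, 0.01)
        self.w1.gaussian(0, 0.01)

    def forward(self, x):
        x = autograd.matmul(x, self.w0)
        x = autograd.relu(x)
        x = autograd.matmul(x, self.w1)
        return x

# Initialize model and optimizer
model = MLP()
sgd = opt.SGD(lr=0.01)
autograd.training = True  # Record operations so that gradients can be computed

# Training loop
for epoch in range(10):
    for x, y in dataloader:  # Replace with your data loader (NumPy batches)
        x = tensor.from_numpy(x)
        y = tensor.from_numpy(y)
        out = model(x)
        loss = autograd.softmax_cross_entropy(out, y)
        sgd.backward_and_update(loss)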
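The training loop leaves the data loader to the reader. One possible shape for it is sketched below in plain Python: a generator that shuffles the sample order and yields mini-batches of (features, labels). The minibatches name and the toy data are hypothetical, and the resulting lists would still need converting to NumPy arrays before SINGA tensors are created from them.

```python
import random

def minibatches(features, labels, batch_size, seed=0):
    """Yield shuffled (x, y) batches; the last batch may be smaller."""
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [features[i] for i in idx], [labels[i] for i in idx]

# Toy data: 10 samples with 784 features each, and one class label per sample.
features = [[float(i)] * 784 for i in range(10)]
labels = list(range(10))

batches = list(minibatches(features, labels, batch_size=4))
print([len(x) for x, _ in batches])  # [4, 4, 2]
```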

Apache SINGA is now configured for your machine! You can adjust the configurations based on your hardware (CPU/GPU) and use case (single-node/distributed).

Apache SINGA is ideal for developers and researchers who need scalable, efficient, and flexible deep learning with strong support for distributed environments. Its unique blend of performance optimisations, multi-modal data handling, and fault tolerance makes it a powerful alternative to mainstream frameworks like TensorFlow and PyTorch, especially in large-scale or resource-constrained scenarios.