MongoDB: The Most Popular NoSQL Database

Here are the basics of MongoDB, the database used by thousands of organisations, including 30 of the top 100 companies in the world. Many of the world’s most innovative Web companies use MongoDB.

MongoDB, widely considered the most popular NoSQL database, is an open source document database that provides high performance, high availability and automatic scaling.
It has the following key features.

High performance: MongoDB provides high performance data persistence. In particular, its support for embedded data models reduces I/O activity on database systems. Its indexes support faster queries and can include keys from embedded documents and arrays.
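
To make the embedded data model concrete, here is a minimal Python sketch. The document contents and the helper name get_path are hypothetical; the helper only imitates, in plain Python, how a dotted key such as customer.address.city addresses a field inside an embedded document. It is not MongoDB's own implementation.

```python
# A single order document with the customer's address and the line items
# embedded, so one read fetches everything (no joins across tables).
order = {
    "_id": 1,
    "customer": {"name": "Asha", "address": {"city": "Pune", "zip": "411001"}},
    "items": [
        {"sku": "A100", "qty": 2},
        {"sku": "B200", "qty": 1},
    ],
}


def get_path(doc, dotted_key):
    """Follow a dotted key (e.g. 'customer.address.city') into nested dicts.

    When a list is met, the rest of the key is applied to every element,
    loosely mirroring how a key can reach into an array of embedded documents.
    """
    current = doc
    parts = dotted_key.split(".")
    for pos, part in enumerate(parts):
        if isinstance(current, list):
            rest = ".".join(parts[pos:])
            return [get_path(item, rest) for item in current]
        current = current[part]
    return current


print(get_path(order, "customer.address.city"))  # Pune
print(get_path(order, "items.sku"))              # ['A100', 'B200']
```

An index on a key such as customer.address.city or items.sku would cover exactly the values this helper reaches.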

High availability: To ensure high availability, MongoDB provides a replication facility called replica sets, which offers the following:

  • Automatic failover
  • Data redundancy

A replica set is a group of MongoDB servers that maintain the same data, providing redundancy and increasing data availability.

Automatic scaling: MongoDB provides horizontal scalability as part of its core functionality. Automatic sharding distributes data across a cluster of machines. Replica sets can provide eventually-consistent reads for low-latency, high-throughput deployments.

Installing MongoDB
MongoDB can be installed on various OSs in the following manner.

Red Hat
  • Create a /etc/yum.repos.d/mongodb.repo file to hold the following configuration information for the MongoDB repository:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
  • Install the MongoDB packages and associated tools as follows:
sudo yum install mongodb-org

Ubuntu

  • Import the public key used by the package management system:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

  • Create the /etc/apt/sources.list.d/mongodb.list list file using the following command:
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
  • Reload the local package database:
sudo apt-get update
  • Install the MongoDB packages:
sudo apt-get install mongodb-org

Debian

  • Import the public key used by the package management system:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
  • Create the /etc/apt/sources.list.d/mongodb.list list file using the following command:
echo 'deb http://downloads-distro.mongodb.org/repo/debian-sysvinit dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
  • Reload the local package database:
sudo apt-get update
  • Install the MongoDB packages:
sudo apt-get install mongodb-org

Getting started with MongoDB
From a system prompt, start the Mongo shell by running the mongo command, as follows:

mongo

Select a database
From the Mongo shell, display the list of databases, with the following operation:

show dbs

Switch to a new database named mydb, with the following operation:

use mydb

Note: If ‘mydb’ is not present, MongoDB creates it when you first store data in it.

Confirm that your session has the mydb database as context, by checking the value of the db object, which returns the name of the current database, as follows:

db

Create a collection and insert documents
A collection can be created and documents can be inserted by the following methods.
Using Python PyMongo:

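Install PyMongo:
sudo pip install pymongo

python
>>> from pymongo import MongoClient
>>> client = MongoClient()  # connects to localhost:27017 by default
>>> db = client.test_database
>>> collection = db.test_collection
>>> # insert_many() replaces the legacy insert() in PyMongo 3 and later
>>> collection.insert_many([{'i': i} for i in range(10000)])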

Directly from the Mongo shell:

for (var i = 1; i <= 25; i++) db.testData.insert( { x : i } )

or

j = { name : "mongo" }
k = { x : 3 }
db.testData.insert( j )
db.testData.insert( k )

Display collections

show collections

or

db.getCollectionNames()

MongoDB replication
In order to replicate the database, the following steps may be taken:

  • Edit /etc/mongodb.conf on all the servers and add the following settings:
replSet=<Replication set name>
rest=true
  • Start mongod with this configuration file:
mongod -f /etc/mongodb.conf
  • On the master MongoDB, enter the following commands:
mongo
rs.initiate()
rs.add("<hostname/IP of slave>:<port number on which Mongo is running, by default it runs on 27017>")
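  • On the slave MongoDB, open the Mongo shell and confirm that the server has joined the set. Do not run rs.initiate() here; it is run only once, on the master:
mongo
rs.status()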
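
For reference, rs.initiate() can also be given an explicit configuration document instead of relying on the defaults followed by rs.add(). The Python sketch below only builds such a document. The set name rs0 and the host names are placeholders, and build_replset_config is a hypothetical helper, not part of any MongoDB library.

```python
def build_replset_config(set_name, hosts):
    """Return a replica set configuration document.

    Each member gets a unique integer _id and a "host:port" string.
    """
    if not hosts:
        raise ValueError("a replica set needs at least one member")
    return {
        "_id": set_name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


# Placeholder host names, for illustration only.
config = build_replset_config(
    "rs0", ["db1.example.com:27017", "db2.example.com:27017"]
)
print(config)
```

With PyMongo, a document of this shape can be sent to a server through the replSetInitiate admin command, for example client.admin.command("replSetInitiate", config). That call needs a running mongod started with the replSet option, so it is left out of the sketch.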

Note: If the replication set has an even number of voting members, it’s advisable to add an arbiter to the setup. An arbiter holds no data but votes in elections, so it keeps the number of votes odd and prevents the tied votes that would otherwise leave the set without a primary.
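
The tie problem can be illustrated with a little arithmetic. A primary is elected only when a candidate gets votes from a strict majority of the voting members. The sketch below is plain Python, not MongoDB code, and the function names are made up for illustration.

```python
def majority(voting_members):
    """Votes needed to elect a primary: more than half of all voting members."""
    return voting_members // 2 + 1


def can_elect_after_even_split(voting_members):
    """Can one side still elect a primary when a network partition splits
    the voting members as evenly as possible?"""
    larger_half = voting_members - voting_members // 2
    return larger_half >= majority(voting_members)


for n in (2, 3, 4, 5):
    print(n, majority(n), can_elect_after_even_split(n))
```

With four voting members, an even split leaves two votes on each side, short of the three needed. Adding an arbiter makes five, and the larger side of an even split then holds three votes, which is a majority.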

  • Set up the arbiter: install MongoDB on the arbiter machine, create the arbiter data directory (any path will do; /data/arb is used here) and start the arbiter service:
mkdir -p /data/arb
mongod --port 3000 --dbpath /data/arb/ --replSet <replication set name used>
  • Enter the following commands on the master MongoDB to add the arbiter to the setup:
mongo
rs.addArb("<hostname/IP of arbiter>:3000")
  • Use rs.status() to display the status of the replication set
  • Another way of checking for the members in replica is to use the URL http://<machine-name>:<port>/_replSe

Note: The port number is 1000 ahead of the MongoDB port; for example, if the Mongo port is 27017, here the port will be 28017.

Sharding
The config server processes are mongod instances that store the cluster’s metadata. You designate a mongod as a config server using the --configsvr option. Each config server stores a complete copy of the cluster’s metadata. In production deployments, you must deploy exactly three config server instances, each running on a different server, to ensure good uptime and data safety. In test environments, you can run all three instances on a single server.

Steps to add sharding to the current setup

  • Install MongoDB as described earlier.
  • Create data directories for each of the three config server instances. By default, a config server stores its data files in the /data/configdb directory. You can choose a different location. To create a data directory, run a command similar to the following:
mkdir -p /data/configdb
  • Start the config server instances as follows (you can use any port):
mongod --configsvr --dbpath /data/configdb --port 22019
  • Start the mongos instance. The mongos instances are lightweight and do not require data directories. You can run a mongos instance on a system that runs other cluster components, such as on an application server or a server running a mongod process.
mongos --configdb <config server hostnames/IP>

Add shards to the cluster

  • On the primary Mongo, issue the following command:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
  • In the mongos shell opened by the above command, enter the following command:
sh.addShard( "<replication set name>/<Mongo server1>:<port>,<Mongo server2>:<port>,<Mongo server3>:<port>" )
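
The argument to sh.addShard() is a single string made of the replication set name, a slash and a comma-separated list of host:port pairs. A small Python helper (hypothetical, with placeholder host names) shows how such a string is put together:

```python
def shard_seed(replset_name, members):
    """Build "<set>/<host1>:<port1>,<host2>:<port2>,..." from (host, port) pairs."""
    if not members:
        raise ValueError("at least one member is required")
    hosts = ",".join("{}:{}".format(host, port) for host, port in members)
    return "{}/{}".format(replset_name, hosts)


# Placeholder replication set name and hosts.
print(shard_seed("rs0", [("db1.example.com", 27017), ("db2.example.com", 27017)]))
# rs0/db1.example.com:27017,db2.example.com:27017
```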

Enable sharding for the database:

sh.enableSharding("<Name of the database you need to shard>")

Now check the sharding status:

mongo
use admin
db.printShardingStatus()

You can enable sharding for a collection as follows:

  • Determine what you will use for the shard key.
  • If the collection already contains data, you must create an index on the shard key using ensureIndex(). If the collection is empty, then MongoDB will create the index as part of the sh.shardCollection() step.
  • Enable sharding for a collection by issuing the sh.shardCollection() method in the Mongo shell. The method uses the following syntax:
sh.shardCollection("<database>.<collection>", <shard key pattern selected above>)
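
Conceptually, a sharded collection is cut into chunks, each covering a contiguous range of shard key values and living on one shard, and mongos routes each operation by looking up the chunk that contains the key. The following Python sketch illustrates that lookup only. The split points, shard names and the route function are invented for the example; the real cluster keeps this metadata on the config servers and manages it automatically.

```python
from bisect import bisect_right

# Boundaries between chunks for a numeric shard key (hypothetical values).
# Chunk 0 holds keys below 100, chunk 1 holds 100..499, chunk 2 holds 500 and up.
SPLIT_POINTS = [100, 500]
# Which shard currently owns each chunk (hypothetical shard names).
CHUNK_TO_SHARD = ["shard0000", "shard0001", "shard0000"]


def route(shard_key_value):
    """Return the shard that owns the chunk containing the given key value."""
    chunk_index = bisect_right(SPLIT_POINTS, shard_key_value)
    return CHUNK_TO_SHARD[chunk_index]


for key in (42, 100, 499, 500):
    print(key, route(key))
```

Because each chunk covers a range, a key that grows steadily (such as a timestamp) sends all new inserts to the last chunk, which is one reason the choice of shard key deserves care.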

Some errors with MongoDB
A list of a few possible errors with MongoDB, along with solutions to them, is given below.
Error 1: Couldn’t connect to server 127.0.0.1 shell/mongo.js:84 while starting Mongo shell.
Solution: This error takes place when Mongo is not shut down properly and the solution is as follows:

sudo rm /var/lib/mongodb/mongod.lock
sudo -u mongodb mongod -f /etc/mongodb.conf --repair 
sudo start mongodb

Error 2: In a replication setup, the primary server goes down and the secondary server fails to become primary. Only secondaries remain.

Solution A

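// Increase the priority of a server so that it is preferred when a primary is elected
cfg = rs.conf()   // keep in mind the id of each member
cfg.members[0].priority = 0.5
cfg.members[1].priority = 0.5
cfg.members[2].priority = 1
// If no primary is available, run rs.reconfig(cfg, {force: true}) on a secondary instead
rs.reconfig(cfg)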
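
The same change can be pictured as an edit to the configuration document that rs.conf() returns. The Python sketch below works on a plain dictionary with invented hosts, and with_priorities is a hypothetical helper. It also bumps the version field, which the rs.reconfig() shell helper does for you but which has to be done by hand when the replSetReconfig command is sent directly.

```python
import copy


def with_priorities(config, priorities):
    """Return a copy of a replica set config with new member priorities.

    priorities maps an index in config["members"] to the new priority.
    The original config is left untouched.
    """
    new_config = copy.deepcopy(config)
    for index, priority in priorities.items():
        new_config["members"][index]["priority"] = priority
    new_config["version"] = config["version"] + 1
    return new_config


# A made-up configuration, shaped like the output of rs.conf().
current = {
    "_id": "rs0",
    "version": 3,
    "members": [
        {"_id": 0, "host": "db1.example.com:27017"},
        {"_id": 1, "host": "db2.example.com:27017"},
        {"_id": 2, "host": "db3.example.com:27017"},
    ],
}
updated = with_priorities(current, {0: 0.5, 1: 0.5, 2: 1})
print(updated["version"], [m["priority"] for m in updated["members"]])
# 4 [0.5, 0.5, 1]
```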

Solution B 

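// Adjust the votes of the servers so that the chosen server can be elected primary.
// The field is named votes; it takes whole numbers only (0 or 1 in current releases),
// and a member with 0 votes must also have priority 0.
cfg = rs.conf()   // keep in mind the id of each member
cfg.members[0].votes = 0
cfg.members[0].priority = 0
cfg.members[1].votes = 0
cfg.members[1].priority = 0
cfg.members[2].votes = 1
rs.reconfig(cfg)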

Error 3: Child process failed; exited with error number 100.

Solution:

mongod --smallfiles
add nojournal = true to /etc/mongodb.conf

Error 4: A server needs to be removed from the replication set.
Solution: Edit the /etc/mongodb.conf file on that server and delete the entry replSet=<replication name>. In the Mongo shell on that server, enter the following commands:

use local
db.dropDatabase()

Restart the Mongo service. Then, on the master Mongo, enter the following command:

rs.remove("<Hostname/IP of the server>:<port>")
