MongoDB is a popular open source document database known for its performance and scalability. It stores data as flexible, JSON-like documents, a model well suited to the large volumes of data found in enterprise applications. The R programming language has several packages that work with MongoDB to extract and manipulate this data.
Eliot Horowitz and Dwight Merriman began developing MongoDB in 2007 to address the problem of scaling data. In 2009, it was released as an open source project and developers began adopting it.
The many prominent features of MongoDB include the following.
- High performance: It supports embedded documents, which reduce the number of I/O operations. It also supports indexes, which can include keys from embedded documents and arrays, so queries run faster.
- Rich query features: It supports all the CRUD (create, read, update and delete) operations, along with text search and data aggregation.
- High availability: It replicates data automatically across servers, so the data remains available if one server fails.
- Horizontal scalability: It provides sharding, which distributes data across multiple machines.
- Multiple storage engines: It supports multiple storage engines, so data can be kept in memory as well as on disk.
The R programming language offers several interfaces for connecting to MongoDB. With these interfaces, we can extract data and perform data analytics tasks.
R packages for MongoDB
R provides several packages to deal with MongoDB, which are listed below.
- mongolite: This is a fast and simple MongoDB client for R. It is based on the mongo-c-driver and jsonlite. It supports indexing, encryption, map-reduce, aggregation, streaming, etc.
- RMongo: This is a user friendly MongoDB database interface for R. The interface is provided via Java calls to the mongo-java-driver.
- rmongodb: This is a powerful package for interacting with MongoDB. It supports querying, inserting and updating data with JSON and BSON, handling BSON objects, creating indices on MongoDB collections, running aggregation pipelines, etc.
The mongolite package for MongoDB
On the Windows operating system, mongolite can be installed directly from CRAN, as follows:
install.packages("mongolite")
You also need access to a MongoDB server, for example one set up locally. Once the mongolite package is installed, load it by using the following command:
library(mongolite)
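The mongo() function connects to a MongoDB server and returns a connection object for one collection. Its main parameters are:
mongo(collection, db, url)
Here, collection is the name of the collection, db is the name of the database, and url is the address of the server in MongoDB connection string format.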
For example:
Test <- mongo("mydataset", url = "mongodb://cmpica:admin@mango.cmpica.org:40123/mongo_test")
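To show how the parts of a connection string fit together, here is a minimal sketch that assembles the URL from its pieces before handing it to mongo(). The helpers build_mongo_url() and connect_collection() are hypothetical names invented for this illustration and are not part of mongolite. The host, port and database values are placeholders.

```r
# Hypothetical helper (not part of mongolite): assemble a connection
# string of the form mongodb://user:password@host:port/db
build_mongo_url <- function(host = "localhost", port = 27017, db = "test",
                            user = NULL, password = NULL) {
  credentials <- ""
  if (!is.null(user) && !is.null(password)) {
    # Percent-encode characters such as '@' and ':' so that they are
    # not mistaken for URL separators
    credentials <- sprintf("%s:%s@",
                           utils::URLencode(user, reserved = TRUE),
                           utils::URLencode(password, reserved = TRUE))
  }
  sprintf("mongodb://%s%s:%d/%s", credentials, host, as.integer(port), db)
}

# Hypothetical wrapper: open a connection to one collection.
# Needs the mongolite package and a reachable MongoDB server.
connect_collection <- function(collection, db, url) {
  mongolite::mongo(collection = collection, db = db, url = url)
}

url <- build_mongo_url(host = "localhost", port = 27017, db = "mongo_test")
print(url)

# Uncomment when a MongoDB server is running on this machine:
# Test <- connect_collection("mydataset", "mongo_test", url)
```

Keeping the URL construction in one place makes it easier to switch between a local test server and a remote one without touching the rest of the script.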
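Once connected, data can be inserted into a collection. The following commands create a new connection to a collection named Result on the local server and insert a data frame into it:
Newc <- mongo("Result")
Newc$insert(Result)  # Result: a data frame already loaded in the R session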
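To make the insertion step concrete, here is a minimal sketch that builds a tiny data frame and stores it in a collection. The column names Semester and CGPA follow the queries used in this article. The Name column, the sample values and the helper names make_sample_results() and load_results() are made up for illustration. The commented lines at the end assume a MongoDB server running on the local machine.

```r
# Made-up sample data: the Semester and CGPA columns match the queries
# in this article; the values are purely illustrative.
make_sample_results <- function() {
  data.frame(
    Name     = c("A", "B", "C", "D"),
    Semester = c("First", "First", "Second", "Second"),
    CGPA     = c(8.4, 7.1, 9.0, 6.5),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: replace the contents of a collection with a
# data frame. 'con' is a connection object returned by mongolite::mongo().
load_results <- function(con, df) {
  if (con$count() > 0) con$drop()  # start from an empty collection
  con$insert(df)                   # each row becomes one document
  con$count()                      # number of documents now stored
}

Result <- make_sample_results()
str(Result)

# With a MongoDB server running locally:
# Newc <- mongolite::mongo("Result")
# load_results(Newc, Result)
```

Dropping the collection first keeps repeated runs of the script from inserting the same rows again.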
Once the data is inserted, we can run queries on the collection. Queries are written in a JSON-based syntax.
Newc$count('{}')              # '{}' selects all records
# [1] 270
Readdata <- Newc$find('{}')   # Read all the data
print(Readdata)               # Display the data
# Retrieve all records having First Semester and CGPA > 8
Q1 <- Newc$find('{"Semester": "First", "CGPA": {"$gt": 8}}')
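Besides the query itself, find() in mongolite accepts fields, sort and limit arguments that control which fields are returned, in what order, and how many records. The sketch below combines them. The helpers cgpa_query() and top_students() are hypothetical names for this illustration. Building JSON with sprintf() is only suitable for simple, trusted values like the ones shown here.

```r
# Hypothetical helper: build the JSON query shown above from R values.
# Only suitable for simple values without quotes or special characters.
cgpa_query <- function(semester, min_cgpa) {
  sprintf('{"Semester": "%s", "CGPA": {"$gt": %s}}',
          semester, format(min_cgpa))
}

# Hypothetical helper: the 'n' best students of a semester.
# 'con' is a connection object returned by mongolite::mongo().
top_students <- function(con, semester, min_cgpa, n = 5) {
  con$find(
    query  = cgpa_query(semester, min_cgpa),
    fields = '{"_id": 0, "Semester": 1, "CGPA": 1}',  # return only these fields
    sort   = '{"CGPA": -1}',                          # -1 sorts in descending order
    limit  = n                                        # return at most n records
  )
}

# The query string can be inspected without a server:
cat(cgpa_query("First", 8), "\n")

# With a connection such as Newc from the text:
# top_students(Newc, "First", 8)
```

Sorting and limiting on the server means only the records actually needed are transferred to R.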
The find() method returns all the matching records at once as a data frame. Another method, iterate(), also runs a query, but instead of collecting the records into a data frame it returns an iterator that reads them one by one. The iterator has the methods one() and batch(n), which let you step through a single record or n records at a time.
Q2 <- Newc$iterate('{"Semester": "First"}', sort = '{"CGPA": 1}')
# Read the records from the iterator, one at a time
while (!is.null(u1 <- Q2$one())) {
  cat(sprintf("Result of First Semester is %.2f CGPA\n", u1$CGPA))
}
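Reading with batch(n) follows the same pattern as one(), but handles several records per step. The following sketch computes an average CGPA without loading the whole collection into memory. The helpers batch_cgpa() and mean_cgpa_in_batches() are hypothetical names for this illustration, and the code assumes every record has a CGPA field.

```r
# Pull the CGPA value out of each record in a batch.
# A batch is a list of records; each record is a named list.
batch_cgpa <- function(batch) {
  vapply(batch, function(rec) as.numeric(rec$CGPA), numeric(1))
}

# Hypothetical helper: average CGPA for one semester, read in batches.
# 'con' is a connection object returned by mongolite::mongo().
mean_cgpa_in_batches <- function(con, semester, batch_size = 100) {
  it <- con$iterate(sprintf('{"Semester": "%s"}', semester))
  total <- 0
  n <- 0
  # length() is 0 for both NULL and an empty list, so the loop ends
  # once the iterator has no more records
  while (length(batch <- it$batch(batch_size)) > 0) {
    values <- batch_cgpa(batch)
    total <- total + sum(values)
    n <- n + length(values)
  }
  if (n == 0) NA_real_ else total / n
}

# The batch helper can be tried without a server on a hand-made batch:
fake_batch <- list(list(Semester = "First", CGPA = 8.5),
                   list(Semester = "First", CGPA = 7.5))
print(batch_cgpa(fake_batch))

# With a connection such as Newc from the text:
# mean_cgpa_in_batches(Newc, "First")
```

A larger batch size means fewer steps through the loop, at the cost of holding more records in memory at once.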
R has a number of packages that work with MongoDB. In this article, we have covered three of the main packages and their uses. Of the three, mongolite, a fast and simple MongoDB client, has been discussed in detail. We have explored how to install and load the package, connect to MongoDB with the mongo() function, insert data using the insert() method, and run queries with find() and iterate().