Exploring Software: ZODB, a NoSQL Database

Let’s explore how to use ZODB, a NoSQL database, from Python, with an example that stores and retrieves ‘album’ and ‘track’ data from the database.

Most of us are accustomed to using a relational database to store large volumes of data. We rarely look for alternatives unless we run into a bottleneck. Even then, we are likely to put far more effort into optimising the database than into stepping outside the relational model.

Non-relational databases have been around for many years. When object-oriented programming became popular, a number of object databases were created, but none captured any substantial mindshare. Object-relational mapping software like Hibernate for Java, SQLAlchemy for Python and ActiveRecord for Ruby fulfilled the need to use relational databases within the object-oriented programming paradigm.

SQL is a wonderful tool for arbitrary queries on a relational database. However, you may overestimate the need for it. For example, when dealing with a content-management system, you are more likely to need keyword retrieval than a flexible SQL query.

I use keyword search with Gmail, and I have rarely felt the need to narrow the search to, say, the subject only. Even if I search based on the subject line, I still need a keyword search. I can’t recall any search where an index on the subject (for example, to match a prefix) would have been beneficial. Hence, a keyword-search tool like Apache Lucene, along with any database, whether relational or not, can be a superb solution.

In the last few years, the need for Web-scale databases has increased interest in NoSQL databases. The term is misleading, and is now often interpreted as “not only SQL”. One category of such databases is object database management systems (ODBMS), and among them is ZODB, a native object database for Python.

Object databases provide ACID support. They reduce the friction of transforming objects into relational table rows and vice versa, which makes accessing and manipulating objects more efficient. There is also no need to map all your information needs onto a well-defined schema, which can be very difficult at times.

Imagine a shopping engine. Each category, or even each product group, may need its own unique combination of attributes. So do we create a superset of all attributes, or do we store them as key-value pairs? Or, better still, should we just dump them in a string description and interpret the string at runtime?

ZODB, in practice

ZODB is like a (Python) dictionary. It stores data as key-value pairs, where each value is a pickled (serialised) object. An object could be a container, which is like a dictionary for storing a very large number of elements. Let us look at a simple example that would be perfectly suitable for a relational database, and see how it may be implemented in ZODB.
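The key-value idea can be sketched with nothing but the standard library. This is only an illustration of the concept, not how ZODB is implemented: a plain dict stands in for the storage, and the key 't1' and the record contents are made up for the example.

```python
import pickle

# A plain dict stands in for the storage (an assumption for illustration).
store = {}

# The value is serialised to bytes before it is stored under its key.
record = {'title': 'Like a Rolling Stone', 'artist': 'Bob Dylan'}
store['t1'] = pickle.dumps(record)

# Reading the value back means unpickling the stored bytes.
restored = pickle.loads(store['t1'])
print(type(store['t1']).__name__)
print(restored['title'])
```

ZODB does this serialisation for you, so application code only ever sees ordinary Python objects.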

We have a set of albums, and a set of tracks. You may wish to access a track and, from there, if need be, the album of which it is a part. On the other hand, you may access an album, and then want to access the tracks that make up that album.

In the relational model, you would need a table each for albums and tracks, and a foreign key from a track to its album.

When you realise that a track can be in multiple albums, you’d have to replace that foreign key with one more table that holds the relationship between albums and tracks.
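For comparison, here is a minimal sketch of that relational layout using Python's built-in sqlite3 module and an in-memory database. The table and column names (albums, tracks, album_tracks) are assumptions chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Two entity tables plus a link table for the many-to-many relationship.
conn.executescript("""
    CREATE TABLE albums (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE album_tracks (
        album_id INTEGER NOT NULL REFERENCES albums(id),
        track_id INTEGER NOT NULL REFERENCES tracks(id),
        PRIMARY KEY (album_id, track_id)
    );
""")

conn.execute("INSERT INTO albums (id, name) VALUES (1, 'Ultimate Collection')")
conn.executemany(
    "INSERT INTO tracks (id, title) VALUES (?, ?)",
    [(1, 'Blowing in the Wind'), (2, 'Like a Rolling Stone')],
)
conn.executemany(
    "INSERT INTO album_tracks (album_id, track_id) VALUES (?, ?)",
    [(1, 1), (1, 2)],
)

# Getting the tracks of an album requires a join through the link table.
titles = [row[0] for row in conn.execute(
    """
    SELECT t.title
    FROM tracks AS t
    JOIN album_tracks AS link ON link.track_id = t.id
    WHERE link.album_id = ?
    ORDER BY t.id
    """,
    (1,),
)]
print(titles)
conn.close()
```

Every navigation from album to track, or back, goes through the link table, which is the mapping work an object database lets you skip.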

Now, let us look at how to do this using ZODB. The initial step is to create/open the database, open a connection and access its root.

Let’s write this basic code in app_db.py, as you will need to use it in each script that uses the application database.

from ZODB import FileStorage, DB 

class app_db(object):
    def __init__(self, path='./Data.fs'): 
        self.storage = FileStorage.FileStorage(path) 
        self.db = DB(self.storage) 
        self.connection = self.db.open() 
        self.dbroot = self.connection.root()


    def close(self): 
        self.connection.close() 
        self.db.close() 
        self.storage.close()

Next, let’s write a script, create_containers.py, to create BTree containers for albums and tracks.

from app_db import app_db 
import transaction 
from BTrees.OOBTree import OOBTree 
db = app_db() 
dbroot = db.dbroot 
dbroot['Albums']=OOBTree() 
dbroot['Tracks']=OOBTree() 
transaction.commit() 
db.close()

The next step is to define the models you need. Let’s write them in app_models.py. Each track can belong to multiple albums, and each album contains multiple tracks. The only noteworthy line is the assignment of 1 to the _p_changed attribute, which tells ZODB that a mutable structure like a list or a dictionary has changed.

from persistent import Persistent 

class Track(Persistent): 
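    def __init__(self, title, artist=None, albums=None): 
        self.title = title 
        self.artist = artist 
        # Use a fresh list per track; a default of albums=[] would be shared by all tracks
        self.albums = list(albums) if albums is not None else []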

    def add_album(self, album): 
        self.albums.append(album) 
        self._p_changed = 1 

class Album(Persistent): 
    def __init__(self, name, year=None): 
        self.name = name 
        self.year = year 
        self.tracks = []

    def add_track(self, track): 
        self.tracks.append(track) 
        self._p_changed = 1
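Why the explicit flag is needed can be shown with a toy class. This is a simplified stand-in written for this illustration, not ZODB's Persistent class: the class Tracked and its changed attribute are hypothetical, and changed plays the role that _p_changed plays in the models above.

```python
class Tracked:
    """Toy stand-in for a persistent object (not ZODB's Persistent).

    It notices attribute assignment only, through __setattr__.
    """

    def __init__(self):
        self.changed = False

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Any assignment other than to the flag itself marks the object dirty.
        if name != 'changed':
            object.__setattr__(self, 'changed', True)


album = Tracked()
album.tracks = []               # attribute assignment: noticed
after_assign = album.changed

album.changed = False           # pretend the object has just been saved
album.tracks.append('a track')  # in-place mutation: __setattr__ is never called
after_append = album.changed

album.changed = True            # so the flag has to be set by hand
print(after_assign, after_append, album.changed)
```

Appending to the list changes the list, not the attribute that refers to it, so the owning object has no way of noticing unless it is told.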

Let us create a simple script, store_data.py, to add some tracks and an album.

from app_db import app_db 
from app_models import Album, Track 
import transaction 
db = app_db() 
albums = db.dbroot['Albums'] 
tracks = db.dbroot['Tracks'] 
tracks['Blowing in the Wind'] = Track('Blowing in the Wind', artist='Bob Dylan') 
tracks['Like a Rolling Stone'] = Track('Like a Rolling Stone', artist='Bob Dylan') 
# the key can be any unique id 
albums['U1'] = Album('Ultimate Collection') 
# add relationships 
album = albums['U1'] 
track = tracks['Blowing in the Wind'] 
track.add_album(album) 
album.add_track(track) 
track = tracks['Like a Rolling Stone'] 
track.add_album(album) 
album.add_track(track) 
transaction.commit() 
db.close()

Finally, print the data to see how objects are accessed in ZODB. Iterate over each album and each track, and print the values of the object. The details flag prevents infinite recursion, since albums refer to tracks and tracks refer back to albums.

from app_db import app_db 
from app_models import Track, Album 

def print_album(album, details=True): 
    print('Name: %s in %s '%(album.name, album.year)) 
    if details: 
        for track in album.tracks: 
            print_track(track, details=False) 
        print('') 

def print_track(track,details=True): 
    print('Title: %s by %s'%(track.title, track.artist)) 
    if details: 
        for album in track.albums: 
            print_album(album,details=False) 
        print('') 

db = app_db() 
# iterate over albums and tracks 
print('List of Albums') 
for album in db.dbroot['Albums'].values(): 
    print_album(album) 
print('List of Tracks') 
for track in db.dbroot['Tracks'].values(): 
    print_track(track)   
db.close()

Working with ZODB is almost as easy as dealing with dictionaries. You can use the Python built-in function isinstance to determine the type of an object you are dealing with, and write very versatile and flexible code. ZODB has been around for over a decade, and has been used in various production environments, though the Zope community does not seem to have been successful in marketing it to developers for use outside the Zope (or Plone) environments.
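As a rough sketch of the isinstance check mentioned above, the function below handles a container that holds objects of more than one type. The plain Album and Track classes here are simplified stand-ins for the persistent models, and describe is a hypothetical helper.

```python
class Album:
    """Simplified stand-in for the persistent Album model."""
    def __init__(self, name):
        self.name = name


class Track:
    """Simplified stand-in for the persistent Track model."""
    def __init__(self, title):
        self.title = title


def describe(obj):
    # Branch on the runtime type of whatever the container returned.
    if isinstance(obj, Album):
        return 'Album: %s' % obj.name
    if isinstance(obj, Track):
        return 'Track: %s' % obj.title
    return 'Unknown: %r' % (obj,)


# A single mapping holding a mix of object types.
mixed = {
    'U1': Album('Ultimate Collection'),
    'T1': Track('Like a Rolling Stone'),
}
lines = [describe(value) for value in mixed.values()]
print('\n'.join(lines))
```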

Feature image courtesy: Tim Morgan. Reused under the terms of CC-BY 2.0 License.

3 COMMENTS

  1. How do I get the values of a dictionary that is saved in ZODB? Right now we are getting the values in some hex format, so I want to know how to convert them into string format.

  2. Is there any admin interface for exploring (viewing and editing) a ZODB database? zodb browser is a Python package with which we can view a ZODB database, but I want to view and edit. Do you know of any tool like that?
