This column is aimed at enthusiasts who are interested in fast development in Python, with a minimal learning period. A rudimentary level of programming expertise is assumed, apart from an ability to visualise the task to be performed and to break it down into smaller parts using standard structures of procedural programming, like if or for constructs. I will also introduce the basic knowledge required for object-based programming, based on the Python libraries to be used. The column does not assume that readers have any level of Python knowledge.
I also aim to show how one can get started with programming in Python with an easier learning curve and a fast debugging cycle. Developers, testers and even managers or architects can write small pieces of useful Python code to help improve existing products and make everyone's job of developing and maintaining the core product easier, even if the core product is developed in some other language. Also, Python is fun to program in.
Most real-world products need a large set of attributes for their programs to do their jobs. These could be inputs regarding paths of installation, options for installation, parameters for program functionality and other configuration options for things like logging levels, output files, etc. While passing them through program execution parameters and through environment variables is an option, most professional programs pass and control such inputs using configuration files.
When the configuration files are correct, the programs work fine. But when they are malformed, or erroneously edited, most programs fail to function well. This may result in some aspect of the product's availability or functionality being hampered. One way to handle this would be for the programs parsing the parameters to also check for valid entries in the configuration files. The programs could then fail safely and instruct users about the erroneous configurations. This, however, could make the programs bulkier. Also, there may not be valid ways for server programs or code to communicate these errors through an interface that's visible to users.
An easier alternative may be to ensure that these configuration files are well formed. This can be done by using validator programs that run immediately after installation to confirm that the main programs making up the product will not have any functional issues. These validator programs could then print the errors in a user-friendly interface and reduce the need for the main programs to check parameters. In this article, I aim to provide one such example of a validator program, which checks that the configuration file is well formed and that valid options are provided.
Before you begin on Python
For the purpose of this article, I will assume that you have installed Python 2.7.3. There are good discussion points about choosing Python 2 over Python 3. The main difference is that Python 3 is an improvement over Python 2, but these very improvements have resulted in Python 3 not being backward compatible with some of the already deployed code. Also, not all libraries have been ported to Python 3. These arguments make it simpler to use Python 2, for now.
The development environment used for this article is Windows XP. But the code and the examples should work on any Python 2.7 installation, independent of the OS.
Figure 1 shows the menu items that should be visible if your installation went through fine.
I strongly recommend that you start building the program given below using IDLE (Python GUI), so that you can frequently test and understand whether the program's parts are working the way they were envisaged.
IDLE GUI usage is shown like this in this article, with the prompt >>>:
>>> print "using idle gui"
using idle gui
The Python Manuals are a great starting point for learning Python, especially The Python Tutorial.
I would urge you to frequently use these manuals for reference. They also have a very good Search capability, if you want to find out something more on a particular topic.
The format of config files
Config files are normally in the following format, organised into sections, with each section having different options. The options are normally organised as name-value pairs.
[section1]
Option1=Value1
Option2=Value2
[section2]
Option3=Value1
Option4=Value2
An example of a config file could be as follows:
#example.cfg
[General]
Customer = CustomerX
[customerx]
BrandingTitle=CustomerX Title
CustomerContact=CustomerX contact
[customery]
BrandingTitle=CustomerY Title
CustomerContact=CustomerY contact
ConfigParser – parsing config files the Python way
Python 2.7 ships with the ConfigParser module in its standard library. Before using it, you should import it as shown below:
>>> import ConfigParser
>>> config1=ConfigParser.ConfigParser()
After importing it, we create a ConfigParser object called config1. We do this by calling the ConfigParser constructor and assigning the resulting object to config1.
ConfigParser provides the following interfaces:
- read(filepath): This method initialises the ConfigParser object with the sections and options in a .cfg file, present in the path indicated by filepath.
- sections(): This method returns the sections present in the config object. When initialised with a valid config file, this will return a list of section names.
- options(section): This method returns the options or keys present in a named section as a list.
- has_option(section, option): This method checks whether the section named has the option requested. It returns True if the option exists and False if it doesn't.
- get(section, option): This method returns the value associated with a particular option or key in the section named.
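One detail of read() is worth keeping in mind for a validator: it returns the list of file names it managed to parse and silently skips any file it cannot open. The sketch below illustrates this with a path that is guaranteed not to exist. The names folder, missing_path and file_was_loaded are made up for this illustration, and the import fallback is my addition so that the snippet also runs on Python 3, where the module is called configparser.

```python
import os
import tempfile

try:
    import ConfigParser as configparser  # Python 2.7, as used in this article
except ImportError:
    import configparser  # the module was renamed in Python 3

# A path inside a fresh, empty temporary directory cannot exist yet.
folder = tempfile.mkdtemp()
missing_path = os.path.join(folder, "missing.cfg")

parser = configparser.ConfigParser()
parsed_files = parser.read(missing_path)  # no exception is raised here
os.rmdir(folder)

# An empty result means nothing was loaded, so later checks would see no sections.
file_was_loaded = len(parsed_files) > 0
```

Checking the return value of read() is one simple way to tell a wrong path apart from a config file that is genuinely empty.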
Let us put together all of the above, using example.cfg, which we can assume to be in the following path: c:\example.cfg
>>> config1.sections()
[]
>>> config1.read("c:\example.cfg")
['c:\\example.cfg']
>>> config1.sections()
['General', 'customerx', 'customery']
>>> config1.options('General')
['customer']
>>> config1.options('customerx')
['brandingtitle', 'customercontact']
>>> config1.has_option('customerx','brandingtitle')
True
>>> config1.has_option('customerx','phonenumber')
False
>>> config1.get('customerx','brandingtitle')
'CustomerX Title'
Note 1: Strings can be represented using single quotes ('General') or double quotes ("c:\example.cfg").
Note 2: ConfigParser treats all values as strings. This is an important distinction if you want to treat the values as different data types, depending upon usage. You will need to convert them using built-in converters like int(), float(), etc.
Note 3: ConfigParser treats all values as strings, which also implies that when you want to set values into a config file, you will have to convert them using str() so that they can be written.
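As a small illustration of Notes 2 and 3, the sketch below reads two numeric settings and converts them in both directions. The options LogLevel and Timeout are invented for this example and are not part of example.cfg, the file is written to a temporary location rather than to c:\, and the import fallback is my addition so that the snippet also runs on Python 3.

```python
import os
import tempfile

try:
    import ConfigParser as configparser  # Python 2.7, as used in this article
except ImportError:
    import configparser  # the module was renamed in Python 3

# Hypothetical settings, written to a temporary file for illustration only.
sample = "[General]\nLogLevel = 3\nTimeout = 2.5\n"
handle, path = tempfile.mkstemp(suffix=".cfg")
os.write(handle, sample.encode("ascii"))
os.close(handle)

config = configparser.ConfigParser()
config.read(path)
os.remove(path)

raw_level = config.get("General", "LogLevel")       # always comes back as a string
log_level = int(raw_level)                          # convert before doing arithmetic
timeout = float(config.get("General", "Timeout"))

# Going the other way, a value must be turned into a string before it is set.
config.set("General", "LogLevel", str(log_level + 1))
```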
A procedure to check a config file
Now that we have the basic tools, let us put together a function to validate a given config file. To do so, we need to know the correct version of the config file, which we shall call the golden file. We want to be able to use a ConfigParser object to read from the golden file and then use that information to check the second config file. To keep the interface simple, we could simply take both files as arguments to the proposed function.
The proposed interface will look like what's shown below:
def ValidateConfigFile(goldenfilepath = None, filetobevalidated = None):
As an alternative, we could incorporate the golden file inside our program. This would avoid the need to distribute the golden file. But in such a design, whenever the config file changes to include newer parameters, our validation program would also have to change to incorporate them. This is clearly not a good design.
Coming back to the API, here are a few pointers:
- Defining a function is done using the def keyword.
- The types of the parameters need not be declared; Python works them out at run time from the values passed, which are strings in this case.
- We can assign default values to the parameters in case the caller does not pass them. In this case, we use None as the default value so that we can do some input parameter validation.
- It is also Python syntax to indicate the beginning of a block of code, in this case a function, by ending the declaration with a colon (:).
- The same applies for other constructs like if, for, etc.
Note: Python uses indentation to indicate that a set of lines are to be treated as part of a block. There is no end statement for the block of code. In the case of our function, the whole function needs to be indented by one level to indicate that the set of lines belong to the function.
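The pointers above can be seen together in a short sketch. The function describe_paths is a made-up helper, not part of the validator; it only shows def, default values of None, a simple input check, the trailing colon and the indentation that marks each block.

```python
def describe_paths(goldenfilepath=None, filetobevalidated=None):
    # Everything indented under the def line belongs to the function.
    if goldenfilepath is None or filetobevalidated is None:
        # The caller left out at least one argument, so the default None is still in place.
        return "both file paths are required"
    return "checking " + filetobevalidated + " against " + goldenfilepath


no_arguments = describe_paths()
both_arguments = describe_paths("golden.cfg", "site.cfg")
```

Here no_arguments holds the error message, while both_arguments holds "checking site.cfg against golden.cfg".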
Now, let us use a couple of ConfigParser objects and get our processing done. We have an erroneous config file created in the following manner, where the keys Custome, BrandingTitl, CustomContact and tomerContact are misspelt and therefore invalid:
#error.cfg
[General]
Custome = CustomerX
[customerx]
BrandingTitl=CustomerX Title
CustomContact=CustomerX contact
[customery]
BrandingTitle=CustomerY Title
tomerContact=CustomerY contact
Now, let's create a ConfigParser object for the golden file and another for the file to be checked:
#parse golden file
goldenconfig = ConfigParser.ConfigParser()
goldenconfig.read(goldenfilepath)
#parse file to be validated
filetovalidate = ConfigParser.ConfigParser()
filetovalidate.read(filetobevalidated)
Let us run through those lines in IDLE GUI:
>>> goldenconfig = ConfigParser.ConfigParser()
>>> goldenconfig.read(goldenfilepath)
['c:\\example.cfg']
>>> filetovalidate = ConfigParser.ConfigParser()
>>> filetovalidate.read(filetobevalidated)
['c:\\error.cfg']
For each section in the file being validated, and for each option or key in that section, we confirm that the same option exists in goldenconfig. If the check fails, we store the invalid line in a simple list of strings, called incorrectlines. A list has a simple append method with which to add a new element. There are many more operations possible with a list, which we could look into in later columns.
When set up for the above purpose, the code looks like what follows:
incorrectlines = []
for section in filetovalidate.sections():
    #check each key is present in corresponding golden section
    for key in filetovalidate.options(section):
        if not goldenconfig.has_option(section,key):
            incorrectlines.append(key + '=' + filetovalidate.get(section,key))
When we execute the same, we get the following output in IDLE GUI:
>>> for section in filetovalidate.sections():
        for key in filetovalidate.options(section):
            if not goldenconfig.has_option(section,key):
                incorrectlines.append(key + '=' + filetovalidate.get(section,key))

>>> incorrectlines
['custome=CustomerX', 'brandingtitl=CustomerX Title', 'customcontact=CustomerX contact', 'tomercontact=CustomerY contact']
We could choose to print these to stdout for now. Note that we can also simply return the incorrectlines list as the result of our function.
if len(incorrectlines) > 0 :
    print "The following lines are incorrect"
    for k in incorrectlines:
        print k
else:
    print "The config file is fine"
return incorrectlines
len(list) returns the number of elements in the list. If this is non-zero, we iterate through the list using a for construct and print each value as we iterate. If there are no incorrect lines, we print that the config file is fine.
Executing the same in IDLE GUI returns the following output:
>>> len(incorrectlines)
4
>>> if len(incorrectlines) > 0 :
        print "The following lines are incorrect"
        for k in incorrectlines:
            print k
else:
        print "The config file is fine"

The following lines are incorrect
custome=CustomerX
brandingtitl=CustomerX Title
customcontact=CustomerX contact
tomercontact=CustomerY contact
Testing the function
We could add some self-test documentation in the source file to illustrate the usage.
We would expect ValidateConfigFile("c:\example.cfg","c:\example.cfg") to return an empty list.
print "invoking ValidateConfigFile(validfile,validfile)"
lines = ValidateConfigFile("c:\example.cfg","c:\example.cfg")
if len(lines)==0:
    print "self test ok"
lines = ValidateConfigFile("c:\example.cfg","c:\error.cfg")
if len(lines)>0:
    print "errors present"
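If you would like to try the same kind of self test without creating files under c:\, the following sketch may help. It is a condensed variant of the check developed above, not the article's listing: the helper names load and unknown_keys and the shortened GOLDEN and FAULTY texts are my own, the printing is left out, and the import fallback is added so that the snippet also runs on Python 3.

```python
import os
import tempfile

try:
    import ConfigParser as configparser  # Python 2.7, as used in this article
except ImportError:
    import configparser  # the module was renamed in Python 3

# Shortened stand-ins for example.cfg and error.cfg.
GOLDEN = "[General]\nCustomer = CustomerX\n[customerx]\nBrandingTitle = CustomerX Title\n"
FAULTY = "[General]\nCustome = CustomerX\n[customerx]\nBrandingTitl = CustomerX Title\n"


def load(text):
    """Hypothetical helper: parse config text by way of a temporary file."""
    handle, path = tempfile.mkstemp(suffix=".cfg")
    os.write(handle, text.encode("ascii"))
    os.close(handle)
    parser = configparser.ConfigParser()
    parser.read(path)
    os.remove(path)
    return parser


def unknown_keys(golden, candidate):
    """Return 'key=value' strings for options the golden config does not have."""
    incorrect = []
    for section in candidate.sections():
        for key in candidate.options(section):
            if not golden.has_option(section, key):
                incorrect.append(key + "=" + candidate.get(section, key))
    return incorrect


same_file_result = unknown_keys(load(GOLDEN), load(GOLDEN))
faulty_file_result = unknown_keys(load(GOLDEN), load(FAULTY))
```

Comparing the golden text with itself should give an empty list, while the faulty text should yield one entry per misspelt key.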
Documentation strings
We can add documentation strings in the following way:
"""
Validates a given Config File in filetobevalidated against a correct config file
pointed to by goldenfilepath
returns a list of erroneous lines as a list[strings]
if config file is fine, it should return an empty list
len(ValidateConfigFile("c:\example.cfg", "c:\example.cfg")) == 0
"""
The complete file listing
This is as follows:
#ValidateConfigFile.py
import ConfigParser

def ValidateConfigFile(goldenfilepath = None, filetobevalidated = None):
    """
    Validates a given Config File in filetobevalidated against a correct config file
    pointed to by goldenfilepath
    returns a list of erroneous lines as a list[strings]
    if config file is fine, it should return an empty list
    len(ValidateConfigFile("c:\example.cfg", "c:\example.cfg")) == 0
    """
    #learn golden file
    goldenconfig = ConfigParser.ConfigParser()
    goldenconfig.read(goldenfilepath)
    #learn file to be validated
    filetovalidate = ConfigParser.ConfigParser()
    filetovalidate.read(filetobevalidated)
    incorrectlines = []
    for section in filetovalidate.sections():
        #check each key is present in corresponding golden section
        for key in filetovalidate.options(section):
            if not goldenconfig.has_option(section,key):
                incorrectlines.append(key + '=' + filetovalidate.get(section,key))
    # print incorrect lines
    if len(incorrectlines) > 0 :
        print "The following lines are incorrect"
        for k in incorrectlines:
            print k
    else:
        print "All keys are fine"
    return incorrectlines
Note: Single-line comments start with #.
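Unlike a comment, a documentation string stays attached to the function at run time: Python stores it in the function's __doc__ attribute, which is also what help() displays. The sketch below uses a hypothetical stub, validate_stub, that stands in for ValidateConfigFile and does no real checking.

```python
def validate_stub(goldenfilepath=None, filetobevalidated=None):
    """Validates filetobevalidated against goldenfilepath.

    Returns a list of erroneous lines; an empty list means the file is fine.
    """
    # Hypothetical stand-in: the real checks are left out of this sketch.
    return []


# The documentation string can be read back from the function object itself.
summary_line = validate_stub.__doc__.splitlines()[0]
```

In IDLE, typing help(validate_stub) at the prompt shows the same text.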
Invoking the same using the Python interpreter
We can invoke the same using the Python interpreter. Before you do that though, make sure the Python installation directory has been added to the PATH environment variable.
C:\Users\ktrichy>python C:\ValidateConfigFile.py
invoking ValidateConfigFile(validfile,validfile)
All keys are fine
So, what have we accomplished so far? We started with a quick introduction to why someone should be interested in using Python. The answer we have suggested in this column is that Python could act as an ally to an existing program or product, and helps to solve small issues so that the main product can function properly.
One problem that could be tormenting products is incorrect configuration files. So we looked at the format of a configuration file, represented by a .cfg extension, and how Python provides the ConfigParser module to parse such files. We then looked at the interface offered by the ConfigParser class and the parts of it that are of interest in solving such a problem, before introducing the elements of a small function that validates a config file against a properly formed reference file, called a golden file in this article. Along the way, we were also introduced to IDLE (Python GUI) and how it makes debugging a lot easier. We saw examples of such debugging outputs, before we put together the whole solution and ran it through the Python interpreter.
This article is the first in a series to help you kick-start what could prove to be a lifelong interest in Python. If you liked what you've read so far, you can be sure there are more such articles in the pipeline exploring the power and flexibility of Python.