This article, which is part of the series on Linux device drivers, demonstrates the creation and usage of files under the /proc virtual filesystem.
After many months, Shweta and Pugs got together for some peaceful technical romancing. All through, they had been using all kinds of kernel windows, especially through the /proc virtual filesystem (using cat), to help them decode various details of Linux device drivers. Here’s a non-exhaustive summary listing:
- /proc/modules: dynamically loaded modules
- /proc/devices: registered character and block major numbers
- /proc/iomem: on-system physical RAM and bus device addresses
- /proc/ioports: on-system I/O port addresses (especially for x86 systems)
- /proc/interrupts: registered interrupt request numbers
- /proc/softirqs: registered soft IRQs
- /proc/kallsyms: running kernel symbols, including from loaded modules
- /proc/partitions: currently connected block devices and their partitions
- /proc/filesystems: currently active filesystem drivers
- /proc/swaps: currently active swaps
- /proc/cpuinfo: information about the CPU(s) on the system
- /proc/meminfo: information about the memory on the system, viz., RAM, swap, …
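What cat does with these windows can be sketched in a few lines of user-space C. This is a minimal sketch and not part of the original listing; the helper name dump_window() is hypothetical, and it simply copies a file such as /proc/modules to an output stream.

```c
#include <stdio.h>

/*
 * Copy the contents of a file such as /proc/modules to `out`.
 * dump_window() is a hypothetical helper name, used only for illustration.
 * Returns the number of bytes copied, or -1 if the file cannot be opened.
 */
long dump_window(const char *path, FILE *out)
{
    char buf[4096];
    size_t n;
    long total = 0;
    FILE *in = fopen(path, "r");

    if (in == NULL)
        return -1;
    /* Read until end-of-file rather than relying on a reported file size */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        fwrite(buf, 1, n, out);
        total += (long)n;
    }
    fclose(in);
    return total;
}
```

For instance, calling dump_window("/proc/modules", stdout) on a Linux system should print the same listing as cat /proc/modules.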
Custom kernel windows
“Yes, these have been really helpful in understanding and debugging Linux device drivers. But is it possible for us to also provide some help? Yes, I mean, can we create one such kernel window through /proc?” asked Shweta.
“Why just one? You can have as many as you want. And it’s simple — just use the right set of APIs, and there you go.”
“For you, everything is simple,” Shweta grumbled.
“No yaar, this is seriously simple,” smiled Pugs. “Just watch me creating one for you,” he added.
And in a jiffy, Pugs created the proc_window.c file below:
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/proc_fs.h>
    #include <linux/jiffies.h>
    #include <linux/uaccess.h>

    static struct proc_dir_entry *parent, *file, *link;
    static int state = 0;

    /* Read handler: reports the state and the time since boot in the unit it selects */
    int time_read(char *page, char **start, off_t off, int count, int *eof, void *data)
    {
        int len, val;
        unsigned long act_jiffies;

        len = sprintf(page, "state = %d\n", state);
        act_jiffies = jiffies - INITIAL_JIFFIES;
        val = jiffies_to_msecs(act_jiffies);
        switch (state)
        {
            case 0:
                len += sprintf(page + len, "time = %lu jiffies\n", act_jiffies);
                break;
            case 1:
                len += sprintf(page + len, "time = %d msecs\n", val);
                break;
            case 2:
                len += sprintf(page + len, "time = %ds %dms\n", val / 1000, val % 1000);
                break;
            case 3:
                val /= 1000;
                len += sprintf(page + len, "time = %02d:%02d:%02d\n",
                    val / 3600, (val / 60) % 60, val % 60);
                break;
            default:
                len += sprintf(page + len, "<not implemented>\n");
                break;
        }
        len += sprintf(page + len, "{offset = %ld; count = %d;}\n", off, count);

        return len;
    }

    /* Write handler: accepts a single digit, optionally followed by a newline */
    int time_write(struct file *file, const char __user *buffer, unsigned long count, void *data)
    {
        char kbuf[2];

        if ((count < 1) || (count > 2))
            return count;
        /* User-space memory must be copied in, not dereferenced directly */
        if (copy_from_user(kbuf, buffer, count))
            return -EFAULT;
        if ((count == 2) && (kbuf[1] != '\n'))
            return count;
        if ((kbuf[0] < '0') || ('9' < kbuf[0]))
            return count;
        state = kbuf[0] - '0';
        return count;
    }

    static int __init proc_win_init(void)
    {
        if ((parent = proc_mkdir("anil", NULL)) == NULL)
        {
            return -1;
        }
        if ((file = create_proc_entry("rel_time", 0666, parent)) == NULL)
        {
            remove_proc_entry("anil", NULL);
            return -1;
        }
        file->read_proc = time_read;
        file->write_proc = time_write;
        if ((link = proc_symlink("rel_time_l", parent, "rel_time")) == NULL)
        {
            remove_proc_entry("rel_time", parent);
            remove_proc_entry("anil", NULL);
            return -1;
        }
        link->uid = 0;
        link->gid = 100;
        return 0;
    }

    static void __exit proc_win_exit(void)
    {
        /* Remove the entries in the reverse order of their creation */
        remove_proc_entry("rel_time_l", parent);
        remove_proc_entry("rel_time", parent);
        remove_proc_entry("anil", NULL);
    }

    module_init(proc_win_init);
    module_exit(proc_win_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("Kernel window /proc Demonstration Driver");
And then Pugs did the following:
- Built the driver file (proc_window.ko) using the usual driver’s Makefile.
- Loaded the driver using insmod.
- Showed various experiments using the newly created proc windows. (Refer to Figure 1.)
- And finally, unloaded the driver using rmmod.
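The experiments in Figure 1 were done from the shell, but the same interaction could also be sketched as a user-space C program. This is only a sketch under assumptions: the helper names set_state() and read_window() are hypothetical, and the path /proc/anil/rel_time exists only while the driver above is loaded.

```c
#include <stdio.h>

/* Write a single digit followed by a newline, like: echo 2 > path.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int set_state(const char *path, int digit)
{
    FILE *f;
    int ok;

    if (digit < 0 || digit > 9)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = (fprintf(f, "%d\n", digit) == 2);
    /* fclose() flushes the buffered write, so check its result too */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read up to size - 1 bytes into buf and NUL-terminate, like: cat path.
 * Returns the number of bytes read, or -1 on failure. (Hypothetical helper.) */
long read_window(const char *path, char *buf, size_t size)
{
    FILE *f;
    size_t n;

    if (size == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}
```

With the driver loaded, set_state("/proc/anil/rel_time", 2) followed by read_window() on the same path would be expected to show the time in seconds and milliseconds.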
Demystifying the details
Starting from the constructor proc_win_init(), three proc entries have been created:
- Directory anil under /proc (i.e., NULL parent) with default permissions 0755, using proc_mkdir()
- Regular file rel_time in the above directory, with permissions 0666, using create_proc_entry()
- Soft link rel_time_l to the file rel_time, in the same directory, using proc_symlink()
The corresponding removal of these is done with remove_proc_entry() in the destructor, proc_win_exit(), in the reverse order of their creation.
For every entry created under /proc, a corresponding struct proc_dir_entry is created. For each, many of its fields could be further updated as needed:
- mode — Permissions of the file
- uid — User ID of the file
- gid — Group ID of the file
Additionally, for a regular file, the following two function pointers for reading and writing over the file could be provided, respectively:
int (*read_proc)(char *page, char **start, off_t off, int count, int *eof, void *data)
int (*write_proc)(struct file *file, const char __user *buffer, unsigned long count, void *data)
write_proc() is very similar to the character driver’s file operation write(). The above implementation lets the user write a digit from 0 to 9, and accordingly sets the internal state. read_proc() in the above implementation provides the current state, and the time since the system has been booted up, in different units, based on the current state. These are jiffies in state 0; milliseconds in state 1; seconds and milliseconds in state 2; hours, minutes and seconds in state 3; and <not implemented> in other states.
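The input check and the unit conversions described above can be tried outside the kernel. The following is a user-space sketch that mirrors the logic of time_write() and time_read(); the function names parse_state() and format_time() are hypothetical, and state 0 is left out because it needs the kernel's jiffies counter.

```c
#include <stdio.h>
#include <stddef.h>

/* Mirrors the checks in time_write(): accept one digit, optionally followed
 * by a newline. Returns the new state, or -1 if the input is ignored.
 * Empty input is also ignored here. (Hypothetical helper.) */
int parse_state(const char *buf, size_t count)
{
    if (count < 1 || count > 2)
        return -1;
    if (count == 2 && buf[1] != '\n')
        return -1;
    if (buf[0] < '0' || buf[0] > '9')
        return -1;
    return buf[0] - '0';
}

/* Mirrors the switch in time_read(): format a time given in milliseconds
 * according to the state. Returns the formatted length. (Hypothetical helper.) */
int format_time(int state, unsigned int msecs, char *out, size_t size)
{
    unsigned int secs = msecs / 1000;

    switch (state) {
    case 1:
        return snprintf(out, size, "time = %u msecs\n", msecs);
    case 2:
        return snprintf(out, size, "time = %us %ums\n", secs, msecs % 1000);
    case 3:
        return snprintf(out, size, "time = %02u:%02u:%02u\n",
                        secs / 3600, (secs / 60) % 60, secs % 60);
    default:
        return snprintf(out, size, "<not implemented>\n");
    }
}
```

For example, an uptime of 3725042 ms comes out as 3725s 42ms in state 2, and as 01:02:05 in state 3.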
And to check the computation accuracy, Figure 2 highlights the system uptime in the output of top.
read_proc’s page parameter is a page-sized buffer, typically to be filled up with count bytes from offset off. But more often than not (because of less content), just the page is filled up, ignoring all other parameters.
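To illustrate what filling up count bytes from offset off would mean, here is a user-space sketch of the slicing arithmetic only. It does not model the full read_proc contract (start and eof are left out), and the function name copy_slice() is hypothetical.

```c
#include <string.h>
#include <stddef.h>

/* Copy at most `count` bytes of `content` (of length `len`), starting at
 * offset `off`, into `page`. Returns the number of bytes copied; 0 means
 * the offset is at or past the end, i.e. nothing is left to read. */
size_t copy_slice(const char *content, size_t len, size_t off, size_t count, char *page)
{
    size_t n;

    if (off >= len)
        return 0;
    n = len - off;      /* bytes remaining from the offset */
    if (n > count)
        n = count;      /* never hand out more than was asked for */
    memcpy(page, content + off, n);
    return n;
}
```

In time_read() above, len plays a similar bookkeeping role: each sprintf() returns the number of characters it wrote, len accumulates that total, and page + len is where the next piece of output goes.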
All the /proc-related structure definitions and function declarations are available through <linux/proc_fs.h>. The jiffies-related function declarations and macro definitions are in <linux/jiffies.h>. As a special note, the actual jiffies are calculated by subtracting INITIAL_JIFFIES, since on boot-up, jiffies is initialised to INITIAL_JIFFIES instead of zero.
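Why the subtraction still works after the counter overflows can be seen with a small user-space model. This is a sketch under stated assumptions: a 32-bit counter, a tick rate of 1000 jiffies per second, and a starting value 5 minutes short of the wrap-around. The DEMO_ names and both functions are hypothetical and are not kernel definitions.

```c
#include <stdint.h>

/* Assumption for illustration: 1000 jiffies per second.
 * In the kernel, HZ is fixed when the kernel is configured and built. */
#define DEMO_HZ 1000u

/* Model of a 32-bit jiffies counter that starts 5 minutes before it wraps */
#define DEMO_INITIAL_JIFFIES ((uint32_t)(0u - 300u * DEMO_HZ))

/* Jiffies elapsed since boot; unsigned subtraction stays correct
 * even after the counter has wrapped around to 0. */
uint32_t elapsed_jiffies(uint32_t now)
{
    return now - DEMO_INITIAL_JIFFIES;
}

/* Convert jiffies to milliseconds for the assumed tick rate */
uint32_t demo_jiffies_to_msecs(uint32_t j)
{
    return j * (1000u / DEMO_HZ);
}
```

In this model, ten minutes after boot the counter reads 300000, which is smaller than its starting value of 4294667296, yet elapsed_jiffies() still yields 600000.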
Summing up
“Hey Pugs! Why did you set the folder name to anil? Who is this Anil? You could have used my name, or maybe yours,” suggested Shweta.
“Ha! That’s a surprise. My real name is Anil; it’s just that everyone in college knows me as Pugs,” smiled Pugs.
Watch out for further technical romancing from Pugs, a.k.a. Anil.
Awesome! This can be used to modify hardware register contents, or to read them, which can be really helpful in debugging drivers.
Yes, you are right. That could be one of its powerful uses, and it is in fact one of the techniques for “debugging by querying”.
@anil_pugalia Sir, what do jiffies and INITIAL_JIFFIES stand for, and what is the meaning of HZ? Actually, I am trying to calculate the jiffies taken by the write and read operations in a character driver, but when I print the jiffies value, it never changes between the start and the end of the operation. Can you clarify the concept behind this?
jiffies is the unit of resolution of kernel time. Nowadays, on a typical PC, it is 1 msec, which could be around a million instructions. So, your read or write is finishing before that, and hence you do not see any change in jiffies.
@anil_pugalia If INITIAL_JIFFIES is the value of jiffies at boot time, then how can INITIAL_JIFFIES be greater than jiffies? Actually, when I print INITIAL_JIFFIES and jiffies, INITIAL_JIFFIES is the greater one. Can you clarify, sir?
jiffies is a 32-bit variable (on 32-bit systems), so it has a maximum value, after which it overflows and wraps back to 0. Moreover, INITIAL_JIFFIES is set to that maximum minus the number of jiffies in 5 minutes, so jiffies wraps around about 5 minutes after boot, and hence most of the time you’d find INITIAL_JIFFIES to be the greater of the two.
Hi Sir,
I am not able to understand the usage of the len variable. Can you please explain it?