So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn’t it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang; not inside her head, but a real one at the door. And sure enough, there was Pugs.
“How come you’re here?” exclaimed Shweta.
“I saw your tweet. It’s cool that you cracked your first character driver all on your own. That’s amazing. So, what are you up to now?” asked Pugs.
“I’ll tell you, on the condition that you do not play spoil sport,” replied Shweta.
Pugs smiled, “Okay, I’ll only give you advice.”
“And that too, only if I ask for it! I am trying to understand character device file operations,” said Shweta.
Pugs perked up, saying, “I have an idea. Why don’t you decode and then explain what you’ve understood about it?”
Shweta felt that was a good idea. She tail’ed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.
static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}

static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}
Based on the earlier understanding of the return values of functions in the kernel, my_open() and my_close() are trivial: their return type is int, and both return zero, which means success.
However, the return type of both my_read() and my_write() is not int but ssize_t. On further digging through the kernel headers, that turns out to be a signed word. So, returning a negative number would be the usual error. But a non-negative return value has additional meaning: for the read operation, it is the number of bytes read, and for the write operation, it is the number of bytes written.
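The three cases can be summed up from the caller's point of view. The following is only a user-space sketch; the enum and the helper name classify_io_result() are made up for illustration and are not part of the driver or of any kernel API.

```c
#include <sys/types.h>   /* ssize_t */

/* Hypothetical labels for the three kinds of return value. */
enum io_result {
    IO_ERROR,   /* negative: the call failed (e.g. -EFAULT inside the driver) */
    IO_EOF,     /* zero: for a read, nothing more to deliver */
    IO_DATA     /* positive: that many bytes were read or written */
};

/* Hypothetical helper: interpret the ssize_t returned by a read or write. */
static enum io_result classify_io_result(ssize_t n)
{
    if (n < 0)
        return IO_ERROR;
    if (n == 0)
        return IO_EOF;
    return IO_DATA;
}
```

With this reading, the null driver's my_read(), which returns 0, always falls into the IO_EOF case, and its my_write(), which returns len, falls into IO_DATA for any non-empty write.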
Reading the device file
To understand this in detail, the complete flow has to be given a relook. Let’s take the read operation first. When the user does a read from the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple, and figures out that it needs to redirect it to the driver’s function my_read(), which is registered with it. So from that angle, my_read() is invoked as a request to read, from us, the device-driver writers. And hence, its return value indicates to the requesters (i.e., the users) how many bytes they are getting from the read request.
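The redirection described above boils down to a table of function pointers looked up by device number. Here is a toy user-space model of that idea, not the kernel's actual implementation: struct toy_fops, toy_register(), toy_vfs_read() and toy_vfs_write() are all hypothetical names, and the real VFS does considerably more (it also uses the minor number and caches the operations at open time).

```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Toy stand-in for the kernel's struct file_operations. */
struct toy_fops {
    ssize_t (*read)(char *buf, size_t len);
    ssize_t (*write)(const char *buf, size_t len);
};

/* The null driver's behaviour: read gives nothing, write accepts all. */
static ssize_t null_read(char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

static ssize_t null_write(const char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len;
}

static const struct toy_fops null_fops = {
    .read = null_read,
    .write = null_write,
};

#define TOY_MAX_MAJOR 8
static const struct toy_fops *toy_registry[TOY_MAX_MAJOR];

/* A "driver" registers its operations under a major number. */
static int toy_register(unsigned int major, const struct toy_fops *fops)
{
    if (major >= TOY_MAX_MAJOR || toy_registry[major] != NULL)
        return -1;
    toy_registry[major] = fops;
    return 0;
}

/* The "VFS" looks up the major number and calls the driver's function. */
static ssize_t toy_vfs_read(unsigned int major, char *buf, size_t len)
{
    if (major >= TOY_MAX_MAJOR || toy_registry[major] == NULL)
        return -1;
    return toy_registry[major]->read(buf, len);
}

static ssize_t toy_vfs_write(unsigned int major, const char *buf, size_t len)
{
    if (major >= TOY_MAX_MAJOR || toy_registry[major] == NULL)
        return -1;
    return toy_registry[major]->write(buf, len);
}
```

After toy_register(3, &null_fops), a call to toy_vfs_read(3, buf, len) ends up in null_read() and returns 0, just as a read on /dev/mynull ends up in my_read().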
In our null driver example, we returned zero — which meant no bytes available, or in other words, the end of the file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it.
“Hmmm… So, if I change it to 1, would it start giving me some data?” asked Pugs, by way of verifying.
Shweta paused for a while, looked at the parameters of the function my_read(), and answered in the affirmative, but with a caveat: the data sent would be some junk data, since my_read() is not really populating data into buf (the buffer variable that is the second parameter of my_read(), provided by the user). In fact, my_read() should write data into buf, according to len (the third parameter to the function), the count in bytes requested by the user.
To be more specific, it should write less than, or equal to, len bytes of data into buf, and the number of bytes written should be passed back as the return value. No, this is not a typo: in the read operation, device-driver writers “write” into the user-supplied buffer. We read the data from (possibly) an underlying device, and then write that data into the user buffer, so that the user can read it.
“That’s really smart of you,” said Pugs, sarcastically.
Writing into the device file
The write operation is the reverse. The user provides len (the third parameter of my_write()) bytes of data to be written, in buf (the second parameter of my_write()). The my_write() function would read that data and possibly write it to an underlying device, and return the number of bytes that have been successfully written.
“Aha!! That’s why all my writes into /dev/mynull have been successful, without actually doing any read or write,” exclaimed Shweta, filled with happiness at understanding the complete flow of device file operations.
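Because write returns the number of bytes actually accepted, a careful user-space program does not assume that everything went through in one call. A minimal sketch of such a caller follows; write_all() is a hypothetical helper name, while write() and errno are the standard POSIX interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */
#include <unistd.h>      /* write() */

/* Hypothetical helper: keep calling write() until all len bytes are accepted. */
static ssize_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;      /* real error; errno says why */
        }
        if (n == 0)
            break;          /* nothing accepted; stop rather than spin */
        done += (size_t)n;  /* n bytes were accepted this time */
    }
    return (ssize_t)done;
}
```

Against the null driver, the loop finishes after a single call, since my_write() returns len straight away and so claims to have taken every byte.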
Preserving the last character
With Shweta not giving Pugs any chance to correct her, he came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write fundamentals; so, here’s a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”
Confidently, Shweta took on the challenge, and modified my_read() and my_write() as follows, adding a static global character variable:
static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    buf[0] = c;
    return 1;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    c = buf[len - 1];
    return len;
}
“Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out? Wouldn’t this direct access of the user-space buf just crash and oops the kernel?” pounced Pugs.
Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs, copy_to_user() and copy_from_user(), meant precisely for accessing user-space buffers safely. With a complete understanding of the APIs, she rewrote the above code snippet as follows:
#include <linux/uaccess.h> /* for copy_to_user() and copy_from_user() */

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (copy_to_user(buf, &c, 1) != 0)
        return -EFAULT;
    else
        return 1;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    if (copy_from_user(&c, buf + len - 1, 1) != 0)
        return -EFAULT;
    else
        return len;
}
Then Shweta repeated the usual build-and-test steps as follows:
- Build the modified “null” driver (.ko file) by running make.
- Load the driver using insmod.
- Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull
- Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C)
- Unload the driver using rmmod.
On cat’ing /dev/mynull, the output was a non-stop infinite sequence of s, as my_read() gives the last one character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read()).”
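What Pugs is hinting at can be tried out without loading a module. The following is a user-space model only, under loud assumptions: model_read() and read_until_eof() are hypothetical names, memcpy() stands in for copy_to_user(), a plain long long stands in for loff_t, and the saved character is hard-coded to 's'.

```c
#include <stddef.h>
#include <string.h>      /* memcpy() */
#include <sys/types.h>   /* ssize_t */

static char c = 's';     /* stands in for the driver's saved last character */

/* Model of the read logic: hand out the character once, then report EOF. */
static ssize_t model_read(char *buf, size_t len, long long *off)
{
    if (len == 0 || *off > 0)
        return 0;        /* already delivered (or nothing asked for): EOF */
    memcpy(buf, &c, 1);  /* a real driver would use copy_to_user() here */
    (*off)++;            /* remember that one byte has been handed out */
    return 1;
}

/* A cat-like loop: keep reading until the "driver" returns 0. */
static size_t read_until_eof(char *out, size_t cap)
{
    long long off = 0;   /* a freshly opened file starts at offset 0 */
    size_t total = 0;
    ssize_t n;

    while (total < cap &&
           (n = model_read(out + total, cap - total, &off)) > 0)
        total += (size_t)n;
    return total;
}
```

Without the *off check, model_read() would return 1 on every call and the loop would only stop when the output buffer filled up, which is the user-space picture of the endless stream of s characters seen above.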
Shweta nodded her head obligingly, just to bolster Pugs’ ego.
Could you explain how the fourth parameter ‘off’ can be used to prevent the infinite output sequence? Thanks in advance.
For that, please first understand why there is an infinite sequence in the first place. It is because the read never signals end of file by returning zero, but always keeps giving data whenever asked. With this, user code that reads till end of file goes into an infinite loop. Hence, to fix this, you need a case that returns zero. In our case, we chose that case to be “when you try to read the second time or the second byte”, which is well captured by the fourth parameter ‘off’, telling us where exactly the reading had already reached.
You might want to add the linux header uaccess.h to access those api calls.
You are right. I missed mentioning that.
Please mention to add the header in order for the copy_to_user and copy_from_user function calls to work properly. I’m following each and every step of your guide. :)
Thanks for the addendum.
I made the changes in my write and read functions and built the driver. However, after writing to the driver using echo, when I tried “cat /dev/mynull” nothing happened. The output was same as in last article. Any idea where I might have gone wrong?
I am able to compile successfully, then I insmod the module, then I did the write operation as you mentioned: “$ echo -1 “Pugs” > /dev/mynull”. Now I tried to read the file using “$ sudo cat /dev/mynull” and see something in an infinite loop, but I am not able to see any character. I can’t see anything, and the read() operation is in a loop. Where am I going wrong?
It should be “echo -n” not “echo -1”. Please correct it & try.
Firstly, great article. I bought the Linux device driver book by O’Reilly but could not understand much. Most of my learning has been from your website. If you write a book, please do let me know. I will be the first to buy it :-). I tried to use the value of *off and printed it to my logs, but it appears that it is always 0. I had to use a separate variable to make it print only once. Could you please explain how to do it using the long offset pointer?
Thanks for reading & appreciating the article. *off would change only if the driver changes it. So, in read, when it is 0, you need to put the value in buf & then increment *off, i.e. do a (*off)++;
Okay. That clears things up. Thanks
thanks sir , for your contribution…
sir, is there any way to track down how control is going from when we call open to .open in operation struct
any tool or any other way to know the control flow in device driver..
You may try strace, printk in kernel, … to name a few.
Awesome article ….
Thanks for reading & appreciating.
I can’t understand what you are trying to say regarding stopping infinite loop using off parameter…Can you elaborate ..?
Basically, by using the off parameter, one can return 0 (i.e. create an end-of-file scenario), say after returning the first character the first time. It would make more sense if you had tried Shweta’s test steps above. After that, check out the following and see the difference for yourself:
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (*off == 0)
    {
        if (copy_to_user(buf, &c, 1) != 0)
            return -EFAULT;
        else
        {
            (*off)++;
            return 1;
        }
    }
    else
        return 0;
}
Hi
How to implement a select interface for a char driver?
Implement the poll system call.
The post is very very good.
But when I try to use your ….
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (*off == 0)
    {
        if (copy_to_user(buf, &c, 1) != 0)
            return -EFAULT;
        else
        {
            (*off)++;
            return 1;
        }
    }
    else
        return 0;
}
code ,it give me only the last character of the word what ever I enter as input for the file.
so, please tell me what to do to see the whole word ?
That’s an exercise for you. Try it out and post the solution. :)
did any one got the solution how to read the whole word , please post it ….. thanks :)
brajesh
Hint: You have to use a local buffer, big enough to store the whole word, written into it. And then in read you need copy from that to user’s buffer, as much requested by the user. Don’t forget taking care of the various length combinations of the various buffers.
My solution is to create a “word” buffer and write as many characters as in buf. len is stored in the global variable word_len so that it can be reused in my_read. I’ve cheated a little bit here as I wasn’t able to get the length of the word buffer with strlen().
static char *w;
size_t word_len = 0;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (*off == 0)
    {
        if (copy_to_user(buf, &w, word_len) != 0)
        {
            return -EFAULT;
        }
        else
        {
            (*off)++;
            return word_len;
        }
    }
    else
        return 0;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    if (copy_from_user(&w, buf, len) != 0)
    {
        return -EFAULT;
    }
    else
    {
        word_len = len;
        printk(KERN_DEBUG "Written %d characters\n", (int)len);
        return len;
    }
}
Logic flow is fine. However, many more conditions have to be taken care of, e.g. what if len < word_len or len > word_len, in case of both read & write? How would you handle & manage such situations?
Hi,
I don’t see the problem with word_len. Maybe I’m missing something.
In my_write I assign len to word_len, where len is provided from user space, through copy_from_user. Hence word_len can’t be different from len.
In my_read, I can get one of the two situations:
word_len=len (previously assigned in my_write)
word_len=0 (in case the write failed or hasn’t been done since the module load)
Can you please tell me what I am missing?
Thanks
#1) Where & how much have you allocated for w? I do not see any allocation for the buffer of word_len.
#2) The len in read could be any value depending on the user space application – it need not be 0 or word_len – note that it comes from the user space and not under driver’s control.
Hi, actually I didn’t allocate any buffer for w. I believe (correct me if I’m wrong) it gets allocated in copy_from_user.
Regarding #2, I’m getting whatever is passed from user space. I know, it is a terrible mistake.
Would you define a max buffer size and copy up to that limit if the data size from user is bigger?
#1) copy_from_user just verifies & copies – no allocation – it expects a valid buffer, so you’d have to allocate it for the kernel space – user space one anyways comes from the user.
#2) Yes, typically you’d have to do that – but then it depends on the requirement from your driver.
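The points raised in this exchange (a buffer the driver itself owns, a maximum size, and a read that honours both len and *off) can be put together in a small user-space model. This is only a sketch under stated assumptions, not a finished driver: word_write(), word_read() and WORD_MAX are hypothetical names, memcpy() stands in for copy_from_user() and copy_to_user(), and a plain long long stands in for loff_t.

```c
#include <stddef.h>
#include <string.h>      /* memcpy() */
#include <sys/types.h>   /* ssize_t */

#define WORD_MAX 64      /* assumed capacity; a driver would pick its own */

static char word[WORD_MAX];  /* storage owned by the "driver" */
static size_t word_len;      /* how many bytes of word[] are valid */

/* Model of write: accept at most WORD_MAX bytes and say how many were taken. */
static ssize_t word_write(const char *buf, size_t len)
{
    if (len > WORD_MAX)
        len = WORD_MAX;      /* clamp: keep only what fits */
    memcpy(word, buf, len);  /* copy_from_user() in a real driver */
    word_len = len;
    return (ssize_t)len;
}

/* Model of read: give out up to len bytes, starting from *off. */
static ssize_t word_read(char *buf, size_t len, long long *off)
{
    size_t avail;

    if (*off < 0)
        return -1;           /* a real driver would return -EINVAL */
    if ((size_t)*off >= word_len)
        return 0;            /* everything already delivered: EOF */
    avail = word_len - (size_t)*off;
    if (len > avail)
        len = avail;         /* never hand out more than is stored */
    memcpy(buf, word + *off, len);   /* copy_to_user() in a real driver */
    *off += (long long)len;  /* the next read continues from here */
    return (ssize_t)len;
}
```

One consequence of clamping is worth noting: if a write is longer than WORD_MAX, the caller sees a short count and will typically retry with the remainder, which in this model overwrites the stored word. Whether to truncate, reject, or grow the buffer depends on what the driver is meant to do. For the read side, the kernel also provides simple_read_from_buffer(), which performs the same offset and length bookkeeping.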
can i use
if(*pos!=len)
{
if (copy_to_user(buf, &w, word_len) != 0)
{
return -EFAULT;
}
else
{
(*off)++;
return word_len;
}
}
else
return 0;
*pos means *off
i done make
after that using insmod i inserted the .ko file
then done echo -n “Pugs” > /dev/mynull
then done
cat /dev/mynull
but not getting any output
Do strace cat /dev/mynull, to checkout what is the system call sequence happening.
Hi anil ;
can u tell me how can i read the complete data written in the device file.
say for example i follow the above example and and i write “my name is xyz ” in the mynull file created above . now can u tell me how can i read the same and complete data written in the file.
Check out the discussion below, regarding the same.
final code for copying all data?
Follow the discussions below for the same.
i made the code. its working fine.
it outputs infinite whitespace for me . not the last character !
UPDATE: Should it be buf + len - 2, as the newline is also taken as input?
Obviously, if you give a newline as input, it would also be taken as part of the data. Possibly, you didn’t use the -n option of echo, which directs echo to not write a newline at the end. Changing it to buf + len - 2 is not a good solution. Instead, use -n with echo.
ah that’s what its for , brilliant , thank you :D
you should have given one header otherwise “copy_from_user” and “copy_to_user” will show error, and also ” echo -n “Pugs” > /dev/ mynull “is given in build steps ,the space before “mynull” shouldnt be there.
Thanks for pointing out the mistakes. Yes, the uaccess.h header needs to be included. And, I guess the space before mynull is a typo introduced while uploading the article.
hey thanx, in my case I was getting an error even after including the uaccess header file; later I solved the error by changing the line if (copy_from_user(&c, buf + len – 1, 1) != 0) to
if (copy_from_user(&c, &buf[ len-1 ] , 1) != 0) and then it worked for me.
That doesn’t seem to be a reason. What was the error you were getting?
then what may be the issue for that error!!
error was after make
make -C /usr/src/linux-headers-3.11.0-26-generic SUBDIRS=/home/gowda/mydev/char_perm modules
make[1]: Entering directory `/usr/src/linux-headers-3.11.0-26-generic’
CC [M] /home/gowda/mydev/char_perm/charprm.o
/home/gowda/mydev/char_perm/charprm.c: In function ‘my_write’:
/home/gowda/mydev/char_perm/charprm.c:48:5: warning: passing argument 2 of ‘copy_from_user’ makes pointer from integer without a cast [enabled by default]
/usr/src/linux-headers-3.11.0-26-generic/arch/x86/include/asm/uaccess_64.h:55:42: note: expected ‘const void *’ but argument is of type ‘char’
/home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘342’ in program
/home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘200’ in program
/home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘223’ in program
/home/gowda/mydev/char_perm/charprm.c:49:35: error: expected ‘)’ before numeric constant
/home/gowda/mydev/char_perm/charprm.c:49:35: error: too few arguments to function ‘copy_from_user’
/usr/src/linux-headers-3.11.0-26-generic/arch/x86/include/asm/uaccess_64.h:55:42: note: declared here
/home/gowda/mydev/char_perm/charprm.c:50:6: error: expected ‘;’ before ‘return’
/home/gowda/mydev/char_perm/charprm.c:54:1: warning: control reaches end of non-void function [-Wreturn-type]
make[2]: *** [/home/gowda/mydev/char_perm/charprm.o] Error 1
make[1]: *** [_module_/home/gowda/mydev/char_perm] Error 2
make[1]: Leaving directory `/usr/src/linux-headers-3.11.0-26-generic’
make: *** [default] Error 2
Seems like you copy pasted the code from the above article. If you had done that, then you have copied the long dash (–) instead of minus (-) in the code. That is causing all the above errors. Just replace that and should be fine.
thank you sir,yes i copied :) now its working fine… even that line worked for me!!
It has to work. I cannot escape from working. :)
I’m unable to write into /dev/mynull. thx in advance
vik@Sony:~/work/swetha$ sudo insmod ofd.ko
vik@Sony:~/work/swetha$ chmod a+w /dev/mynull
chmod: changing permissions of `/dev/mynull’: Operation not permitted
vik@Sony:~/work/swetha$ sudo chmod a+w /dev/mynull
vik@Sony:~/work/swetha$ sudo echo "helloee" > /dev/mynull
echo: write error: Bad address
vik@Sony:~/work/swetha$
“Bad address” means the “return -EFAULT” statement is getting executed. Check out your code.
Hi,
I want to transfer a buffer content from the kernel space to the user space not only one character.
Can you help on this ?
If you already have an equivalent kernel space buffer for that, just go ahead and modify the copy_to_user, accordingly.
HI sir,
One thing I did not understand, Why did you force to use copy_from_user functions. What exactly is the problem if we use the earlier code w/o these functions. Please explain. And How to use last argument of my_read(). Thank you.
Reason for the two functions is explained in the article itself, in the paragraphs before the last piece of code. For how to use the last argument of read, check out my updated blog at http://sysplay.in/blog/linux-device-drivers/2013/07/decoding-the-character-device-file-operations/
Sir I am not able to understand how to use *off so that my_read returns 0 second time onwards
Check out the SysPlay’s blog at http://sysplay.in/blog/linux-device-drivers/2013/07/decoding-the-character-device-file-operations/
Sir, can you please explain the role of loff_t *off. I am not able to get its meaning, how is it initialised……….Thanx in Advance
sir,please write articles on interrupt context.
Like I say to my friends …….
Bhai Saab gajab!!!! :)
awesome article and over the top explanation, great work.