
Device Drivers, Part 6: Decoding Character Device File Operations

May 1, 2011


This article, which is part of the series on Linux device drivers, continues to cover the various concepts of character drivers and their implementation, which was dealt with in the previous two articles [1, 2].

So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn’t it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang — not inside her head, but a real one at the door. And for sure, there was Pugs.

“How come you’re here?” exclaimed Shweta.

“I saw your tweet. It’s cool that you cracked your first character driver all on your own. That’s amazing. So, what are you up to now?” asked Pugs.

“I’ll tell you, on the condition that you do not play spoil sport,” replied Shweta.

Pugs smiled, “Okay, I’ll only give you advice.”

“And that too, only if I ask for it! I am trying to understand character device file operations,” said Shweta.

Pugs perked up, saying, “I have an idea. Why don’t you decode and then explain what you’ve understood about it?”

Shweta felt that was a good idea. She tail’ed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}

Based on the earlier understanding of the return values of kernel functions, my_open() and my_close() are trivial: both have the return type int, and both return zero, which means success.

However, my_read() and my_write() return ssize_t rather than int. Further digging through the kernel headers shows that ssize_t is a signed word. So, as usual, a negative return value indicates an error, but a non-negative return value carries additional meaning: for the read operation, it is the number of bytes read, and for the write operation, it is the number of bytes written.

Reading the device file

To understand this in detail, the complete flow has to be given a relook. Let’s take the read operation first. When the user does a read from the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple, and figures out that it needs to redirect it to the driver’s function my_read(), that’s registered with it. So from that angle, my_read() is invoked as a request to read, from us — the device-driver writers. And hence, its return value would indicate to the requesters (i.e., the users), how many bytes they are getting from the read request.

In our null driver example, we returned zero — which meant no bytes available, or in other words, the end of the file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it.
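The “zero means end of file” convention can also be seen from the user side. The following is a minimal user-space sketch, not part of the driver; the function name count_until_eof() is made up for illustration. It keeps calling read() until it returns zero, much as cat does, and treats a negative return as an error.

```c
#include <assert.h>
#include <unistd.h>

/* Read fd until read() returns 0 (end of file).
 * Returns the total number of bytes delivered, or -1 on a read error.
 * count_until_eof() is a hypothetical helper, not from the article. */
long count_until_eof(int fd)
{
    char buf[64];
    long total = 0;
    ssize_t n;

    assert(fd >= 0);
    while ((n = read(fd, buf, sizeof(buf))) != 0) {
        if (n < 0)
            return -1;  /* negative return value: an error */
        total += n;     /* positive return value: bytes handed to us */
    }
    return total;       /* zero return value: end of file */
}
```

Run against the null driver above, the very first read() would return zero, so the count would be 0, no matter what had been written earlier.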

“Hmmm… So, if I change it to 1, would it start giving me some data?” asked Pugs, by way of verifying.

Shweta paused for a while, looked at the parameters of the function my_read() and answered in the affirmative, but with a caveat — the data sent would be some junk data, since my_read() is not really populating data into buf (the buffer variable that is the second parameter of my_read(), provided by the user). In fact, my_read() should write data into buf, according to len (the third parameter to the function), the count in bytes requested by the user.

To be more specific, it should write less than, or equal to, len bytes of data into buf, and the number of bytes written should be passed back as the return value. No, this is not a typo — in the read operation, device-driver writers “write” into the user-supplied buffer. We read the data from (possibly) an underlying device, and then write that data into the user buffer, so that the user can read it. “That’s really smart of you,” said Pugs, sarcastically.

Writing into the device file

The write operation is the reverse. The user provides len (the third parameter of my_write()) bytes of data to be written, in buf (the second parameter of my_write()). The my_write() function would read that data and possibly write it to an underlying device, and return the number of bytes that have been successfully written.

“Aha!! That’s why all my writes into /dev/mynull have been successful, without actually doing any read or write,” exclaimed Shweta, filled with happiness at understanding the complete flow of device file operations.
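From the caller’s side, the value returned by write() says how many bytes the driver accepted, which may be fewer than requested. Here is a hedged user-space sketch of a caller that honours this; write_all() is a hypothetical helper name, not something from the article’s driver.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Keep calling write() until all len bytes have been accepted.
 * Returns the number of bytes written (len, unless the device stops
 * accepting data), or -1 if write() reports an error.
 * write_all() is a hypothetical helper, not from the article. */
ssize_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* negative return value: an error */
        }
        if (n == 0)
            break;              /* nothing accepted: stop trying */
        done += (size_t)n;      /* a short write: continue with the rest */
    }
    return (ssize_t)done;
}
```

Since the null driver’s my_write() always returns len, such a loop would finish after a single write() call, which is why every write into /dev/mynull appears to succeed.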

Preserving the last character

With Shweta not giving Pugs any chance to correct her, he came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write fundamentals; so, here’s a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”

Confidently, Shweta took on the challenge, and modified my_read() and my_write() as follows, adding a static global character variable:

static char c;
 
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    buf[0] = c;
    return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    c = buf[len - 1];
    return len;
}

“Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out? Wouldn’t this direct access of the user-space buf just crash and oops the kernel?” pounced Pugs.

Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs, copy_to_user() and copy_from_user(), meant just to ensure that user-space buffers are safe to access. With a complete understanding of these APIs, she rewrote the above code snippet as follows:

#include <linux/uaccess.h> /* declares copy_to_user() and copy_from_user() */

static char c;
 
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (copy_to_user(buf, &c, 1) != 0)
        return -EFAULT;
    else
        return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    if (copy_from_user(&c, buf + len - 1, 1) != 0)
        return -EFAULT;
    else
        return len;
}

Then Shweta repeated the usual build-and-test steps as follows:

Build the modified “null” driver (.ko file) by running make.
Load the driver using insmod.
Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull
Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C)
Unload the driver using rmmod.

On cat’ing /dev/mynull, the output was a non-stop sequence of the character ‘s’, as my_read() gives out the last character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read()).”

Shweta nodded her head obligingly, just to bolster Pugs’ ego.
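Pugs’ point about off can be tried out without loading a module. What follows is only a plain user-space model under stated assumptions: read_forever() and read_once() are hypothetical stand-ins for the two my_read() variants, an ordinary array store replaces copy_to_user(), and drain() imitates a cat-like loop with a cap on the number of calls so that the “infinite” case terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for the driver's saved character (assumed to be 's'). */
static char saved_char = 's';

/* Model of the article's my_read(): hands out one byte on every call. */
long read_forever(char *buf, size_t len, long long *off)
{
    (void)off;                  /* the offset is ignored, as in the article */
    if (len == 0)
        return 0;
    buf[0] = saved_char;        /* a real driver would use copy_to_user() */
    return 1;
}

/* Model of the suggested fix: one byte at offset 0, end of file after. */
long read_once(char *buf, size_t len, long long *off)
{
    assert(buf != NULL && off != NULL);
    if (len == 0 || *off != 0)
        return 0;               /* zero tells the caller: end of file */
    buf[0] = saved_char;
    (*off)++;                   /* the offset moves only if we move it */
    return 1;
}

/* A cat-like loop: read until 0 is returned, or until max_calls is hit.
 * Returns the number of bytes received. */
int drain(long (*reader)(char *, size_t, long long *), int max_calls)
{
    char buf[16];
    long long off = 0;
    int bytes = 0;

    for (int i = 0; i < max_calls; i++) {
        long n = reader(buf, sizeof(buf), &off);
        if (n <= 0)
            break;
        bytes += (int)n;
    }
    return bytes;
}
```

In this model, drain(read_once, 1000) stops after receiving a single byte, while drain(read_forever, 1000) stops only because of the cap, which mirrors the non-stop output seen with cat.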

61 COMMENTS

Jerrin Shaji George April 7, 2012 At 5:11 PM

Could you explain how the fourth parameter ‘off’ can be used to prevent the infinite output sequence. Thanks in advance.

- Anil Pugalia July 10, 2012 At 12:03 PM
  
  For that, first understand why there is an infinite sequence in the first place. It is because the read never signals end of file by returning zero, but keeps giving data whenever asked. With this, user code that reads till end of file goes into an infinite loop. Hence, to fix this, you need a case that returns zero. Here, the case chosen is “when you try to read the second time, or the second byte”, which is captured very well by the fourth parameter ‘off’, which tells us how far the reading has already progressed.
  
Guest April 24, 2012 At 1:50 PM

You might want to add the linux header uaccess.h to access those api calls.

- Anil Pugalia July 10, 2012 At 11:59 AM
  
  You are right. I missed mentioning that.
  
Haris Ibrahim K. V. May 8, 2012 At 4:38 PM

Please mention to add the header in order for the copy_to_user and copy_from_user function calls to work properly. I’m following each and every step of your guide. :)

- Anil Pugalia July 10, 2012 At 11:58 AM
  
  Thanks for the addendum.
  
Haris Ibrahim K. V. May 8, 2012 At 4:55 PM

I made the changes in my write and read functions and built the driver. However, after writing to the driver using echo, when I tried “cat /dev/mynull” nothing happened. The output was same as in last article. Any idea where I might have gone wrong?

PeterHiggs September 24, 2012 At 9:53 AM

I am able to compile successfully, then I insmod the module, then I did the write operation as you mentioned the “$ echo -1 “Pugs” > /dev/mynull “. Now I tried to read the file using “$ sudo cat /dev/mynull” but see something in infinite loop, but I am not able to see any character. I can’t see anything and the read() operation in loop. Where I am going wrong?

- anil_pugalia November 28, 2012 At 9:01 PM
  
  It should be “echo -n” not “echo -1”. Please correct it & try.
  
Sab December 7, 2012 At 3:59 PM

Firstly, great article. I bought the Linux device driver book by O’Reilly but could not understand much. Most of my learning has been from your website. If you write a book please do let me know. I will be the first to buy it :-). I tried to use the value of *off and I printed it to my logs, but it appears that it is always 0. I had to use a separate variable to make it print only once. Could you please explain how to do it using the long offset pointer?

- Anil Pugalia December 8, 2012 At 7:34 PM
  
  Thanks for reading & appreciating the article. *off would change
  only if the driver changes it. So, in read when it is 0, you need to put
  the value in buf & then increment the *off, i.e. do a (*off)++;
  
  - Sab December 9, 2012 At 8:53 AM
    
    Okay. That clears things up. Thanks
    
Amit January 3, 2013 At 5:24 PM

thanks sir , for your contribution…
sir, is there any way to track down how control is going from when we call open to .open in operation struct
any tool or any other way to know the control flow in device driver..

- Anil Pugalia January 4, 2013 At 9:38 AM
  
  You may try strace, printk in kernel, … to name a few.
  
Rahul May 13, 2013 At 2:36 PM

Awesome article ….

- Anil Pugalia May 14, 2013 At 11:22 AM
  
  Thanks for reading & appreciating.
  
Audhil June 1, 2013 At 3:57 PM

I can’t understand what you are trying to say regarding stopping infinite loop using off parameter…Can you elaborate ..?

- anil_pugalia June 7, 2013 At 8:11 PM
  
  Basically, by using the off parameter, one can return 0 (i.e. create an end-of-file scenario), say, after returning the first character the first time. It would make more sense if you had tried Shweta’s test steps, above. After that, check out the following and see the difference for yourself:
  
  static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
  {
      printk(KERN_INFO "Driver: read()\n");
      if (*off == 0)
      {
          if (copy_to_user(buf, &c, 1) != 0)
              return -EFAULT;
          else
          {
              (*off)++;
              return 1;
          }
      }
      else
          return 0;
  }
  
Subash July 15, 2013 At 10:10 AM

Hi
How to implement a select interface for a char driver?

- anil_pugalia July 15, 2013 At 12:58 PM
  
  Implement the poll system call.
  
ans July 24, 2013 At 4:45 PM

The post is very very good.
But when I try to use your ….

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (*off == 0)
    {
        if (copy_to_user(buf, &c, 1) != 0)
            return -EFAULT;
        else
        {
            (*off)++;
            return 1;
        }
    }
    else
        return 0;
}

code ,it give me only the last character of the word what ever I enter as input for the file.
so, please tell me what to do to see the whole word ?

- anil_pugalia July 25, 2013 At 5:09 PM
  
  That’s an exercise for you. Try it out and post the solution. :)
  
  - Brajesh August 28, 2013 At 11:42 AM
    
    did any one got the solution how to read the whole word , please post it ….. thanks :)
    brajesh
    
    - anil_pugalia August 31, 2013 At 12:46 PM
      
      Hint: You have to use a local buffer, big enough to store the whole word, written into it. And then in read you need copy from that to user’s buffer, as much requested by the user. Don’t forget taking care of the various length combinations of the various buffers.
      
      - Gianluca Busiello September 19, 2013 At 4:59 AM
        
        My solution is to create a “word” buffer and write as many characters as in buf. len is stored in the global variable word_len so that it can be reused in my_read. I’ve cheated a little bit here as I wasn’t able to get the length of the word buffer with strlen().
        
        static char *w;
        size_t word_len = 0;
        
        static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
        {
            printk(KERN_INFO "Driver: read()\n");
            if (*off == 0)
            {
                if (copy_to_user(buf, &w, word_len) != 0)
                {
                    return -EFAULT;
                }
                else
                {
                    (*off)++;
                    return word_len;
                }
            }
            else
                return 0;
        }
        
        static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
        {
            printk(KERN_INFO "Driver: write()\n");
            if (copy_from_user(&w, buf, len) != 0)
            {
                return -EFAULT;
            }
            else
            {
                word_len = len;
                printk(KERN_DEBUG "Written %d characters\n", (int)len);
                return len;
            }
        }
      - anil_pugalia September 19, 2013 At 9:11 AM
        
        Logic flow is fine. However, many more conditions have to be taken care of, e.g. what if len differs from word_len, in case of both read & write? How would you handle & manage such situations?
      - Gianluca Busiello September 19, 2013 At 3:21 PM
        
        Hi,
        
        I don’t see the problem with word_len. Maybe I’m missing something.
        
        In my_write I assign len to word_len, where len is provided from user space, through copy_from_user. Hence word_len can’t be different from len.
        
        In my_read, I can get one of the two situations:
        word_len=len (previously assigned in my_write)
        
        word_len=0 (in case the write failed or hasn’t been done since the module load)
        
        Can you please tell me what I am missing?
        Thanks
      - anil_pugalia September 19, 2013 At 7:58 PM
        
        #1) Where & how much have you allocated for w? I do not see any allocation for the buffer of word_len.
        
        #2) The len in read could be any value depending on the user space application – it need not be 0 or word_len – note that it comes from the user space and not under driver’s control.
      - Gianluca Busiello September 20, 2013 At 6:43 PM
        
        Hi, actually I didn’t allocate any buffer for w. I believe (correct me if I’m wrong) it gets allocated in copy_from_user.
        
        Regarding #2, I’m getting whatever is passed from user space. I know, it is a terrible mistake.
        Would you define a max buffer size and copy up to that limit if the data size from user is bigger?
      - anil_pugalia September 21, 2013 At 11:50 AM
        
        #1) copy_from_user just verifies & copies – no allocation – it expects a valid buffer, so you’d have to allocate it for the kernel space – user space one anyways comes from the user.
        
        #2) Yes, typically you’d have to do that – but then it depends on the requirement from your driver.
      - Noufal P February 15, 2014 At 11:12 PM
        
        can i use
        if(*pos!=len)
        {
        if (copy_to_user(buf, &w, word_len) != 0)
        {
        return -EFAULT;
        }
        else
        {
        (*off)++;
        return word_len;
        }
        }
        else
        return 0;
      - Noufal P February 15, 2014 At 11:12 PM
        
        *pos means *off
rama December 26, 2013 At 11:13 AM

i done make

after that using insmod i inserted the .ko file

then done echo -n “Pugs” > /dev/mynull

then done
cat /dev/mynull

but not getting any output

- anil_pugalia December 30, 2013 At 10:57 PM
  
  Do strace cat /dev/mynull, to checkout what is the system call sequence happening.
  
AMIT KUMAR December 26, 2013 At 7:02 PM

Hi anil ;

can u tell me how can i read the complete data written in the device file.
say for example i follow the above example and and i write “my name is xyz ” in the mynull file created above . now can u tell me how can i read the same and complete data written in the file.

- anil_pugalia December 30, 2013 At 10:58 PM
  
  Check out the discussion below, regarding the same.
  
Noufal P February 14, 2014 At 4:29 PM

final code for copying all data?

- anil_pugalia February 17, 2014 At 10:32 AM
  
  Follow the discussions below for the same.
  
  - Noufal P March 4, 2014 At 3:37 PM
    
    i made the code. its working fine.
    
karthik nayak May 19, 2014 At 7:49 PM

it outputs infinite whitespace for me . not the last character !
UPDATE : Should be buf + len – 2 as newline is also taken as input?

- anil_pugalia May 20, 2014 At 1:17 PM
  
  Obviously, if you give newline as input, it would also be taken, as part of the data. Possibly, you didn’t use the -n option of echo, which directs echo to not write a newline at the end. Making it to buf + len – 2 is not a good solution. Instead, use -n with echo.
  
  - karthik nayak May 20, 2014 At 10:06 PM
    
    ah that’s what its for , brilliant , thank you :D
    
Akhil Nandepu July 17, 2014 At 3:09 PM

you should have given one header otherwise “copy_from_user” and “copy_to_user” will show error, and also ” echo -n “Pugs” > /dev/ mynull “is given in build steps ,the space before “mynull” shouldnt be there.

- anil_pugalia July 31, 2014 At 11:37 AM
  
  Thanks for pointing out the mistakes. Yes, the uaccess.h header needs to be included. And, I guess the space before mynull is a typo introduced while uploading the article.
  
abhishek gowda September 9, 2014 At 10:54 AM

hey thanx, in my case i was getting error after including uaccess header file also, later i solved the error by line- if (copy_from_user(&c, buf + len – 1, 1) != 0) to
if (copy_from_user(&c, &buf[ len-1 ] , 1) != 0) then it worked for me.

- anil_pugalia September 9, 2014 At 1:27 PM
  
  That doesn’t seem to be a reason. What was the error you were getting?
  
  - abhishek gowda September 9, 2014 At 2:19 PM
    
    then what may be the issue for that error!!
    
    error was after make
    
    make -C /usr/src/linux-headers-3.11.0-26-generic SUBDIRS=/home/gowda/mydev/char_perm modules
    
    make[1]: Entering directory `/usr/src/linux-headers-3.11.0-26-generic’
    
    CC [M] /home/gowda/mydev/char_perm/charprm.o
    
    /home/gowda/mydev/char_perm/charprm.c: In function ‘my_write’:
    
    /home/gowda/mydev/char_perm/charprm.c:48:5: warning: passing argument 2 of ‘copy_from_user’ makes pointer from integer without a cast [enabled by default]
    
    /usr/src/linux-headers-3.11.0-26-generic/arch/x86/include/asm/uaccess_64.h:55:42: note: expected ‘const void *’ but argument is of type ‘char’
    
    /home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘342’ in program
    
    /home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘200’ in program
    
    /home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘223’ in program
    
    /home/gowda/mydev/char_perm/charprm.c:49:35: error: expected ‘)’ before numeric constant
    
    /home/gowda/mydev/char_perm/charprm.c:49:35: error: too few arguments to function ‘copy_from_user’
    
    /usr/src/linux-headers-3.11.0-26-generic/arch/x86/include/asm/uaccess_64.h:55:42: note: declared here
    
    /home/gowda/mydev/char_perm/charprm.c:50:6: error: expected ‘;’ before ‘return’
    
    /home/gowda/mydev/char_perm/charprm.c:54:1: warning: control reaches end of non-void function [-Wreturn-type]
    
    make[2]: *** [/home/gowda/mydev/char_perm/charprm.o] Error 1
    
    make[1]: *** [_module_/home/gowda/mydev/char_perm] Error 2
    
    make[1]: Leaving directory `/usr/src/linux-headers-3.11.0-26-generic’
    
    make: *** [default] Error 2
    
    - anil_pugalia September 9, 2014 At 2:48 PM
      
      Seems like you copy pasted the code from the above article. If you had done that, then you have copied the long dash (–) instead of minus (-) in the code. That is causing all the above errors. Just replace that and should be fine.
      
      - abhishek gowda September 9, 2014 At 3:07 PM
        
        thank you sir,yes i copied :) now its working fine… even that line worked for me!!
      - anil_pugalia September 10, 2014 At 1:50 PM
        
        It has to work. I cannot escape from working. :)
vikram_Pinto September 19, 2014 At 10:38 PM

I’m unable to write into /dev/mynull. thx in advance

vik@Sony:~/work/swetha$ sudo insmod ofd.ko
vik@Sony:~/work/swetha$ chmod a+w /dev/mynull
chmod: changing permissions of `/dev/mynull’: Operation not permitted
vik@Sony:~/work/swetha$ sudo chmod a+w /dev/mynull
vik@Sony:~/work/swetha$ sudo echo “helloee” > /dev/mynull
echo: write error: Bad address
vik@Sony:~/work/swetha$

- anil_pugalia September 20, 2014 At 8:39 AM
  
  “Bad address” means the “return -EFAULT” statement is getting executed. Check out your code.
  
niz October 14, 2014 At 9:13 PM

Hi,
I want to transfer a buffer content from the kernel space to the user space not only one character.
Can you help on this ?

- anil_pugalia October 15, 2014 At 6:22 PM
  
  If you already have an equivalent kernel space buffer for that, just go ahead and modify the copy_to_user, accordingly.
  
Aniket Anand January 30, 2015 At 1:27 AM

HI sir,
One thing I did not understand, Why did you force to use copy_from_user functions. What exactly is the problem if we use the earlier code w/o these functions. Please explain. And How to use last argument of my_read(). Thank you.

- anil_pugalia February 12, 2015 At 10:56 AM
  
  Reason for the two functions is explained in the article itself, in the paragraphs before the last piece of code. For how to use the last argument of read, check out my updated blog at http://sysplay.in/blog/linux-device-drivers/2013/07/decoding-the-character-device-file-operations/
  
praneet February 11, 2015 At 10:14 PM

Sir I am not able to understand how to use *off so that my_read returns 0 second time onwards

- anil_pugalia February 12, 2015 At 10:57 AM
  
  Check out the SysPlay’s blog at http://sysplay.in/blog/linux-device-drivers/2013/07/decoding-the-character-device-file-operations/
  
Abhinav Jain April 10, 2017 At 10:02 PM

Sir, can you please explain the role of loff_t *off. I am not able to get its meaning, how is it initialised……….Thanx in Advance

rajasekhar July 13, 2017 At 11:50 PM

sir,please write articles on interrupt context.

Shubham April 13, 2018 At 12:12 AM

Like I say to my friends …….
Bhai Saab gajab!!!! :)
awesome article and over the top explanation, great work.
