Device Drivers, Part 6: Decoding Character Device File Operations

Gearing up for character drivers

This article, which is part of the series on Linux device drivers, continues to cover the various concepts of character drivers and their implementation, which was dealt with in the previous two articles [1, 2].

So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn’t it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang — not inside her head, but a real one at the door. And for sure, there was Pugs.

“How come you’re here?” exclaimed Shweta.

“I saw your tweet. It’s cool that you cracked your first character driver all on your own. That’s amazing. So, what are you up to now?” asked Pugs.

“I’ll tell you, on the condition that you do not play spoilsport,” replied Shweta.

Pugs smiled, “Okay, I’ll only give you advice.”

“And that too, only if I ask for it! I am trying to understand character device file operations,” said Shweta.

Pugs perked up, saying, “I have an idea. Why don’t you decode and then explain what you’ve understood about it?”

Shweta felt that was a good idea. She tailed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}

Going by the usual convention for return values in the kernel, my_open() and my_close() are trivial: their return type is int, and both return zero, which means success.

However, the return type of both my_read() and my_write() is not int but ssize_t. On digging further through the kernel headers, that turns out to be a signed integer type as wide as a machine word. So, returning a negative number would, as usual, indicate an error. But a non-negative return value has an additional meaning: for the read operation, it is the number of bytes read, and for the write operation, it is the number of bytes written.
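As seen from user space, these three cases can be told apart as in the sketch below. It is only an illustration: the enum and the function classify_read() are hypothetical names, not part of the driver. It also relies on the usual C library behaviour, where a negative error code returned by the driver shows up as -1 from read(), with errno holding the error.

```c
#include <sys/types.h>

/* Hypothetical names, used only for this illustration. */
enum read_result { READ_ERROR, READ_EOF, READ_DATA };

/* Interpret the value returned by read() on a device file. */
enum read_result classify_read(ssize_t n)
{
    if (n < 0)
        return READ_ERROR;  /* the driver returned a negative error code */
    if (n == 0)
        return READ_EOF;    /* zero bytes available: end of file */
    return READ_DATA;       /* n bytes were placed in the caller's buffer */
}
```

With the null driver above, every read() would fall into the READ_EOF case, since my_read() always returns zero.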

Reading the device file

To understand this in detail, the complete flow has to be given a relook. Let’s take the read operation first. When the user does a read from the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple, and figures out that it needs to redirect it to the driver’s function my_read(), that’s registered with it. So from that angle, my_read() is invoked as a request to read, from us — the device-driver writers. And hence, its return value would indicate to the requesters (i.e., the users), how many bytes they are getting from the read request.

In our null driver example, we returned zero — which meant no bytes available, or in other words, the end of the file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it.

“Hmmm… So, if I change it to 1, would it start giving me some data?” asked Pugs, by way of verifying.

Shweta paused for a while, looked at the parameters of the function my_read() and answered in the affirmative, but with a caveat — the data sent would be some junk data, since my_read() is not really populating data into buf (the buffer variable that is the second parameter of my_read(), provided by the user). In fact, my_read() should write data into buf, according to len (the third parameter to the function), the count in bytes requested by the user.

To be more specific, it should write less than, or equal to, len bytes of data into buf, and the number of bytes written should be passed back as the return value. No, this is not a typo — in the read operation, device-driver writers “write” into the user-supplied buffer. We read the data from (possibly) an underlying device, and then write that data into the user buffer, so that the user can read it. “That’s really smart of you,” said Pugs, sarcastically.
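The rule "write at most len bytes into buf and return the count" can be tried out in user space with a small model. This is a sketch under stated assumptions, not driver code: device_data is made-up content, model_read_simple() is a hypothetical name, and memcpy() stands in for the safe copy that a real driver must use.

```c
#include <string.h>
#include <sys/types.h>

/* Made-up device contents, for illustration only. */
static const char device_data[] = "hello";

/* Model of a read handler: copy at most len bytes and return how many. */
ssize_t model_read_simple(char *buf, size_t len)
{
    size_t avail = sizeof(device_data) - 1;   /* exclude the trailing '\0' */
    size_t n = len < avail ? len : avail;     /* never more than requested */

    memcpy(buf, device_data, n);   /* stand-in for the copy into user space */
    return (ssize_t)n;
}
```

A request for 3 bytes gets 3 bytes, while a request for 8 bytes gets only the 5 that are available. In both cases, the return value tells the caller how much of buf is valid.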

Writing into the device file

The write operation is the reverse. The user provides len (the third parameter of my_write()) bytes of data to be written, in buf (the second parameter of my_write()). The my_write() function would read that data and possibly write it to an underlying device, and return the number of bytes that have been successfully written.

“Aha!! That’s why all my writes into /dev/mynull have been successful, without actually doing any read or write,” exclaimed Shweta, filled with happiness at understanding the complete flow of device file operations.
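Since a write handler reports how many bytes it accepted, a careful user-space caller checks that count and retries with the remainder when it is smaller than requested. The helper below is a minimal sketch of that common idiom; write_all() is a hypothetical name, not something the article's driver or the C library provides.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call write() until all len bytes are accepted. */
ssize_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before anything was written: retry */
            return -1;      /* real error; errno says which one */
        }
        if (n == 0)
            break;          /* nothing accepted: stop instead of spinning */
        done += (size_t)n;  /* n bytes were accepted this time */
    }
    return (ssize_t)done;
}
```

With the null driver, my_write() returns len straight away, so the loop body would run exactly once.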

Preserving the last character

With Shweta not giving Pugs any chance to correct her, he came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write fundamentals; so, here’s a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”

Confidently, Shweta took on the challenge, and modified my_read() and my_write() as follows, adding a static global character variable:

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    buf[0] = c;
    return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    c = buf[len - 1];
    return len;
}

“Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out. Wouldn’t this direct access of the user-space buf just crash and oops the kernel?” pounced Pugs.

Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs meant just for accessing user-space buffers safely: copy_to_user() and copy_from_user(). Both return the number of bytes that could not be copied, so zero means success. With a complete understanding of the APIs, she rewrote the above code snippet as follows:

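#include <linux/uaccess.h> /* copy_to_user(), copy_from_user() */

static char c; /* last character written into the device */

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    /* Safely copy the stored character into the user's buffer */
    if (copy_to_user(buf, &c, 1) != 0)
        return -EFAULT;
    else
        return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    /* Safely fetch the last byte of the user's data */
    if (copy_from_user(&c, buf + len - 1, 1) != 0)
        return -EFAULT;
    else
        return len;
}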

Then Shweta repeated the usual build-and-test steps as follows:

  1. Build the modified “null” driver (.ko file) by running make.
  2. Load the driver using insmod.
  3. Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull
  4. Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C)
  5. Unload the driver using rmmod.
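Steps 3 and 4 can also be done from a small C program instead of echo and cat. The sketch below makes some assumptions: write_once() and read_one() are hypothetical helper names, and the device node is expected to exist already with suitable permissions. Unlike cat, read_one() reads only once, so it does not loop forever.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes of msg to path, like echo -n. Returns 0 on success, -1 otherwise. */
int write_once(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, msg, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Read a single byte from path into *out. Returns read()'s result, or -1 if open() fails. */
ssize_t read_one(const char *path, char *out)
{
    int fd = open(path, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = read(fd, out, 1);
    close(fd);
    return n;
}
```

With the modified driver loaded, calling write_once("/dev/mynull", "Pugs", 4) followed by read_one("/dev/mynull", &ch) should, going by the behaviour described here, return 1 and leave the last character written in ch.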

On cat’ing /dev/mynull, the output was a non-stop, infinite sequence of the character s, as my_read() keeps returning the last character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read()).”

Shweta nodded her head obligingly, just to bolster Pugs’ ego.
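Pugs’ suggestion can be modelled in user space as follows. This is a sketch under assumptions, not the driver itself: off_t stands in for the kernel’s loff_t, memcpy() stands in for copy_to_user(), and last_char and model_read_once() are hypothetical names. The offset starts at zero when the file is opened, and the handler is the one that advances it.

```c
#include <string.h>
#include <sys/types.h>

static char last_char = 's';   /* sample data: the last character written */

/* Model of a my_read() that hands out the character only once per open file. */
ssize_t model_read_once(char *buf, size_t len, off_t *off)
{
    if (len == 0 || *off > 0)
        return 0;                /* nothing requested, or already delivered: EOF */

    memcpy(buf, &last_char, 1);  /* stand-in for copy_to_user(buf, &c, 1) */
    (*off)++;                    /* remember that one byte has been read */
    return 1;
}
```

The first call returns 1 and the second returns 0, which is what lets a reader such as cat see the end of the file and stop.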

61 COMMENTS

  1. Could you explain how the fourth parameter ‘off’ can be used to prevent the infinite output sequence. Thanks in advance. 

    • For that, first understand why there is an infinite sequence in the first place. It is because the read never signals end of file by returning zero, but keeps on giving data whenever asked. With this, user code that reads till end of file goes into an infinite loop. Hence, to fix this, you need a case that returns zero. Here, that case is “when you try to read the second time, or the second byte”, which is captured well by the fourth parameter ‘off’, telling us how far the reading had already got.

  2. Please mention that the header <linux/uaccess.h> needs to be added for the copy_to_user and copy_from_user function calls to work properly. I’m following each and every step of your guide. :)

  3. I made the changes in my write and read functions and built the driver. However, after writing to the driver using echo, when I tried “cat /dev/mynull” nothing happened. The output was same as in last article. Any idea where I might have gone wrong?

  4. I am able to compile successfully, then I insmod the module, then I did the write operation as you mentioned: “$ echo -1 “Pugs” > /dev/mynull”. Now I tried to read the file using “$ sudo cat /dev/mynull” and see something in an infinite loop, but I am not able to see any character. I can’t see anything, and the read() operation is in a loop. Where am I going wrong?

  5. Firstly, great article. I bought the Linux device driver book by O’Reilly but could not understand much. Most of my learning has been from your website. If you write a book please do let me know. I will be the first to buy it :-). I tried to use the value of *off and I printed it to my logs, but it appears that it is always 0. I had to use a separate variable to make it print only once. Could you please explain how to do it using the long offset pointer?

  6. thanks sir , for your contribution…
    sir, is there any way to track down how control is going from when we call open to .open in operation struct
    any tool or any other way to know the control flow in device driver..

  7. I can’t understand what you are trying to say regarding stopping infinite loop using off parameter…Can you elaborate ..?

    • Basically, by using the off parameter, one can return 0 (i.e. create an end-of-file scenario), say after returning the first character for the first time. It would make more sense if you had tried Shweta’s test steps above. After that, try the following and see the difference for yourself:

      static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
      {
          printk(KERN_INFO "Driver: read()\n");
          if (*off == 0)
          {
              if (copy_to_user(buf, &c, 1) != 0)
                  return -EFAULT;
              else
              {
                  (*off)++;
                  return 1;
              }
          }
          else
              return 0;
      }

  8. The post is very very good.
    But when I try to use your ….

    static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: read()\n");
        if (*off == 0)
        {
            if (copy_to_user(buf, &c, 1) != 0)
                return -EFAULT;
            else
            {
                (*off)++;
                return 1;
            }
        }
        else
            return 0;
    }

    code ,it give me only the last character of the word what ever I enter as input for the file.
    so, please tell me what to do to see the whole word ?

        • Hint: You have to use a local buffer, big enough to store the whole word, written into it. And then in read you need copy from that to user’s buffer, as much requested by the user. Don’t forget taking care of the various length combinations of the various buffers.

          • My solution is to create a “word” buffer and write as many characters as in buf. len is stored in the global variable word_len so that it can be reused in my_read. I’ve cheated a little bit here as I wasn’t able to get the length of the word buffer with strlen().

            static char *w;
            size_t word_len = 0;

            static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
            {
                printk(KERN_INFO "Driver: read()\n");
                if (*off == 0)
                {
                    if (copy_to_user(buf, &w, word_len) != 0)
                    {
                        return -EFAULT;
                    }
                    else
                    {
                        (*off)++;
                        return word_len;
                    }
                }
                else
                    return 0;
            }

            static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
            {
                printk(KERN_INFO "Driver: write()\n");
                if (copy_from_user(&w, buf, len) != 0)
                {
                    return -EFAULT;
                }
                else
                {
                    word_len = len;
                    printk(KERN_DEBUG "Written %d characters\n", (int)len);
                    return len;
                }
            }

          • Logic flow is fine. However, many more conditions have to be taken care of, e.g. what if len < word_len, or len > word_len, in case of both read & write? How would you handle and manage such situations?

          • Hi,

            I don’t see the problem with word_len. Maybe I’m missing something.

            In my_write I assign len to word_len, where len is provided from user space, through copy_from_user. Hence word_len can’t be different from len.

            In my_read, I can get one of the two situations:
            word_len=len (previously assigned in my_write)

            word_len=0 (in case the write failed or hasn’t been done since the module load)

            Can you please tell me what I am missing?
            Thanks

          • #1) Where, and how much, have you allocated for w? I do not see any allocation for the buffer of word_len.

            #2) The len in read could be any value depending on the user space application – it need not be 0 or word_len – note that it comes from the user space and not under driver’s control.

          • Hi, actually I didn’t allocate any buffer for w. I believe (correct me if I’m wrong) it gets allocated in copy_from_user.

            Regarding #2, I’m getting whatever is passed from user space. I know, it is a terrible mistake.
            Would you define a max buffer size and copy up to that limit if the data size from user is bigger?

          • #1) copy_from_user just verifies & copies – no allocation – it expects a valid buffer, so you’d have to allocate it for the kernel space – user space one anyways comes from the user.

            #2) Yes, typically you’d have to do that – but then it depends on the requirement from your driver.

          • can i use
            if(*pos!=len)
            {
                if (copy_to_user(buf, &w, word_len) != 0)
                {
                    return -EFAULT;
                }
                else
                {
                    (*off)++;
                    return word_len;
                }
            }
            else
                return 0;
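The hints given in this thread (a buffer owned by the driver, a maximum size, and a read that respects both the stored length and the requested len) can be put together in a user-space model. It is a sketch under assumptions, not a finished driver: WORD_MAX is an arbitrary capacity, off_t stands in for loff_t, memcpy() stands in for copy_from_user() and copy_to_user(), and all the names are hypothetical.

```c
#include <string.h>
#include <sys/types.h>

#define WORD_MAX 64            /* assumed capacity; choose what the driver needs */

static char word[WORD_MAX];    /* storage owned by the "driver" */
static size_t word_len;        /* number of valid bytes in word */

/* Model of my_write(): keep at most WORD_MAX bytes of what the user sent. */
ssize_t model_write(const char *buf, size_t len)
{
    size_t n = len < WORD_MAX ? len : WORD_MAX;

    memcpy(word, buf, n);      /* stand-in for copy_from_user() */
    word_len = n;
    return (ssize_t)n;         /* report how many bytes were really stored */
}

/* Model of my_read(): hand out the stored bytes, starting at *off. */
ssize_t model_read(char *buf, size_t len, off_t *off)
{
    size_t pos, n;

    if (*off < 0 || (size_t)*off >= word_len)
        return 0;              /* nothing left: end of file */

    pos = (size_t)*off;
    n = word_len - pos;        /* bytes still unread */
    if (n > len)
        n = len;               /* never exceed the caller's buffer */

    memcpy(buf, word + pos, n);   /* stand-in for copy_to_user() */
    *off += (off_t)n;          /* advance by the number of bytes handed out */
    return (ssize_t)n;
}
```

Here the offset advances by the number of bytes actually returned, so a reader asking for two bytes at a time still gets the whole word before seeing end of file. Truncating an oversized write to WORD_MAX is only one possible choice; a driver could equally reject it with an error, depending on its requirements.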

  9. i done make

    after that using insmod i inserted the .ko file

    then done echo -n “Pugs” > /dev/mynull

    then done
    cat /dev/mynull

    but not getting any output

  10. Hi anil ;

    can u tell me how can i read the complete data written in the device file.
    say for example i follow the above example and and i write “my name is xyz ” in the mynull file created above . now can u tell me how can i read the same and complete data written in the file.

  11. it outputs infinite whitespace for me . not the last character !
    UPDATE : Should be buf + len - 2 as newline is also taken as input?

    • Obviously, if you give a newline as input, it would also be taken as part of the data. Possibly, you didn’t use the -n option of echo, which directs echo not to write a newline at the end. Changing it to buf + len - 2 is not a good solution. Instead, use -n with echo.

  12. you should have given one header otherwise “copy_from_user” and “copy_to_user” will show error, and also ” echo -n “Pugs” > /dev/ mynull “is given in build steps ,the space before “mynull” shouldnt be there.

    • Thanks for pointing out the mistakes. Yes, <linux/uaccess.h> needs to be included. And, I guess the space before mynull is a typo introduced while uploading the article.

  13. hey thanx ,in my case i was getting error after including uaccess header file also, later i solved the error by line- if (copy_from_user(&c, buf + len – 1, 1) != 0) to
    if (copy_from_user(&c, &buf[ len-1 ] , 1) != 0) then it worked for me.

      • then what may be the issue for that error!!

        error was after make

        make -C /usr/src/linux-headers-3.11.0-26-generic SUBDIRS=/home/gowda/mydev/char_perm modules

        make[1]: Entering directory `/usr/src/linux-headers-3.11.0-26-generic’

        CC [M] /home/gowda/mydev/char_perm/charprm.o

        /home/gowda/mydev/char_perm/charprm.c: In function ‘my_write’:

        /home/gowda/mydev/char_perm/charprm.c:48:5: warning: passing argument 2 of ‘copy_from_user’ makes pointer from integer without a cast [enabled by default]

        /usr/src/linux-headers-3.11.0-26-generic/arch/x86/include/asm/uaccess_64.h:55:42: note: expected ‘const void *’ but argument is of type ‘char’

        /home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘342’ in program

        /home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘200’ in program

        /home/gowda/mydev/char_perm/charprm.c:49:1: error: stray ‘223’ in program

        /home/gowda/mydev/char_perm/charprm.c:49:35: error: expected ‘)’ before numeric constant

        /home/gowda/mydev/char_perm/charprm.c:49:35: error: too few arguments to function ‘copy_from_user’

        /usr/src/linux-headers-3.11.0-26-generic/arch/x86/include/asm/uaccess_64.h:55:42: note: declared here

        /home/gowda/mydev/char_perm/charprm.c:50:6: error: expected ‘;’ before ‘return’

        /home/gowda/mydev/char_perm/charprm.c:54:1: warning: control reaches end of non-void function [-Wreturn-type]

        make[2]: *** [/home/gowda/mydev/char_perm/charprm.o] Error 1

        make[1]: *** [_module_/home/gowda/mydev/char_perm] Error 2

        make[1]: Leaving directory `/usr/src/linux-headers-3.11.0-26-generic’

        make: *** [default] Error 2

        • Seems like you copy pasted the code from the above article. If you had done that, then you have copied the long dash (–) instead of minus (-) in the code. That is causing all the above errors. Just replace that and should be fine.

  14. I’m unable to write into /dev/mynull. thx in advance

    vik@Sony:~/work/swetha$ sudo insmod ofd.ko
    vik@Sony:~/work/swetha$ chmod a+w /dev/mynull
    chmod: changing permissions of `/dev/mynull’: Operation not permitted
    vik@Sony:~/work/swetha$ sudo chmod a+w /dev/mynull
    vik@Sony:~/work/swetha$ sudo echo “helloee” > /dev/mynull
    echo: write error: Bad address
    vik@Sony:~/work/swetha$

  15. HI sir,
    One thing I did not understand, Why did you force to use copy_from_user functions. What exactly is the problem if we use the earlier code w/o these functions. Please explain. And How to use last argument of my_read(). Thank you.

  16. Sir, can you please explain the role of loff_t *off. I am not able to get its meaning, how is it initialised……….Thanx in Advance
