In this fifth article in the FOSS security series, we will explore the use of the binwalk tool.
Binwalk is used to scan firmware for embedded files and executable code. This includes compressed files, firmware headers, archive files, bootloaders, file systems, etc. It is supported on GNU/Linux, Mac OS X, Windows, and FreeBSD. An API allows you to access its classes and methods to write your own binwalk scripts. The source code is written in Python and released under the MIT licence.
Installation
A Parabola GNU/Linux-libre (x86_64) system is used to install binwalk and its dependency packages.
$ cat /etc/os-release
NAME=Parabola
PRETTY_NAME="Parabola GNU/Linux-libre"
ID=parabola
ID_LIKE=arch
BUILD_ID=rolling
VARIANT="x86_64 SystemD Edition"
VARIANT_ID="x86_64-systemd"
ANSI_COLOR="1;35"
HOME_URL="https://www.parabola.nu/"
DOCUMENTATION_URL="https://wiki.parabola.nu/"
SUPPORT_URL="ircs://irc.libera.chat/#parabola"
BUG_REPORT_URL="https://labs.parabola.nu/"
LOGO="parabola-logo"
You can use the pacman package manager to install binwalk as shown below:
$ sudo pacman -S binwalk squashfs-tools python-matplotlib python-gobject
The squashfs-tools package is needed during the extraction of a squashfs file system present in firmware. The python-matplotlib and python-gobject packages are required for producing entropy charts.
Help
Binwalk version 2.4.1 is installed at the time of writing this article on Parabola GNU/Linux-libre. You can view the command line options using the ‘-h’ help option as shown below:
$ binwalk -h
Binwalk v2.4.1
Original author: Craig Heffner, ReFirmLabs
https://github.com/OSPG/binwalk

Usage: binwalk [OPTIONS] [FILE1] [FILE2] [FILE3] ...

Signature Scan Options:
    -B, --signature    Scan target file(s) for common file signatures
    -R, --raw=<str>    Scan target file(s) for the specified sequence of bytes
    -A, --opcodes      Scan target file(s) for common executable opcode signatures
Signature
Consider the Buffalo WHR-G125 2007 router firmware available at the dd-wrt.com website. You can view the firmware signature with the ‘-B’ option as follows:
$ binwalk -B dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
0             0x0             TRX firmware header, little endian, image size: 3543040 bytes, CRC32: 0x85472C8C, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8E4, rootfs offset: 0x8E7CC
28            0x1C            gzip compressed data, maximum compression, from Unix, last modified: 1970-01-01 00:00:00 (null date)
2276          0x8E4           LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 1982464 bytes
583628        0x8E7CC         Squashfs filesystem, little endian, DD-WRT signature, version 3.0, size: 2954340 bytes, 587 inodes, blocksize: 65536 bytes, created: 2007-06-14 00:34:20
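The fields on the TRX header line can be checked by hand. The following is a minimal Python sketch, not binwalk's own parser. It assumes the usual TRX version 1 layout (magic, image size, CRC32, flags, version and three offsets, all little endian) and uses the first 28 bytes of this image as they appear in the hexdump section of this article. The function name parse_trx_header is illustrative.

```python
import struct

# First 28 bytes of dd-wrt.v24_whr-g125.bin, copied from the hexdump
# output shown in the Hexdump section of this article.
HEADER = bytes.fromhex(
    "48445230" "00103600" "8C2C4785" "00000100"
    "1C000000" "E4080000" "CCE70800"
)

def parse_trx_header(raw):
    """Decode a 28-byte TRX v1 header (assumed layout, little endian)."""
    magic, size, crc32, flags, version, loader, kernel, rootfs = struct.unpack(
        "<4sIIHHIII", raw[:28]
    )
    if magic != b"HDR0":
        raise ValueError("not a TRX image")
    return {
        "image_size": size,
        "crc32": crc32,
        "flags": flags,
        "version": version,
        "loader_offset": loader,
        "kernel_offset": kernel,
        "rootfs_offset": rootfs,
    }

header = parse_trx_header(HEADER)
for name, value in header.items():
    print(f"{name}: {value} (0x{value:X})")
```

The decoded offsets (0x1C, 0x8E4 and 0x8E7CC) are the same offsets at which binwalk reports the gzip, LZMA and Squashfs entries.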
Extract
The ‘-e’ option with binwalk will extract the individual files in the firmware as shown below:
$ binwalk -e dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------
0             0x0             TRX firmware header, little endian, image size: 3543040 bytes, CRC32: 0x85472C8C, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8E4, rootfs offset: 0x8E7CC
28            0x1C            gzip compressed data, maximum compression, from Unix, last modified: 1970-01-01 00:00:00 (null date)
2276          0x8E4           LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 1982464 bytes
The list of files can be viewed from the extracted folder:
$ ls _dd-wrt.v24_whr-g125.bin.extracted/
1C  8E4  8E4.7z  8E7CC.squashfs  squashfs-root
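The extracted file names (1C, 8E4, 8E7CC) are the hexadecimal offsets at which each signature was found. The idea of carving data at an offset and decompressing it can be sketched in Python as follows. This is only an illustration on a synthetic blob and is not how binwalk implements extraction; the function name carve_gzip and the sample data are made up for the example.

```python
import gzip
import zlib

def carve_gzip(blob, offset):
    """Decompress the gzip member that starts at `offset`, ignoring what follows it."""
    # wbits = 16 + MAX_WBITS tells zlib to expect gzip framing.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    return decomp.decompress(blob[offset:])

# Synthetic image: 0x1C placeholder header bytes, a gzip member, trailing data.
image = bytes(0x1C) + gzip.compress(b"loader code") + b"\xff" * 16
print(carve_gzip(image, 0x1C))
```

A decompression object is used instead of gzip.decompress() because it stops at the end of the compressed stream, so the bytes that follow it in the image do not cause an error.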
Recursive extraction
You can also perform recursive extraction using the ‘-M’ option with ‘-e’ on the dd-wrt.v24_whr-g125.bin firmware as illustrated below:
$ binwalk -Me dd-wrt.v24_whr-g125.bin
Scan Time:     2024-08-18 17:24:57
Target File:   /home/guest/dd-wrt.v24_whr-g125.bin
MD5 Checksum:  8a60810685fa5be6221936034b81fd3a
Signatures:    436
...
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
13309         0x33FD          JBOOT STAG header, image id: 2, timestamp 0x12A200, image size: 134217728 bytes, image JBOOT checksum: 0x200, header JBOOT checksum: 0x6824
...
280470        0x44796         ESP Image (ESP32): flash mode: QUIO, flash speed: 40MHz, flash size: 1MB, entry address: 0x90000
...
620573        0x9781D         JBOOT STAG header, image id: 5, timestamp 0x5008310, image size: 1064960 bytes, image JBOOT checksum: 0x0, header JBOOT checksum: 0x2000
...
1014905       0xF7C79         JBOOT STAG header, image id: 6, timestamp 0x803E3, image size: 83886080 bytes, image JBOOT checksum: 0xA000, header JBOOT checksum: 0x10
...
1632592       0x18E950        CRC32 polynomial table, little endian
1634928       0x18F270        Linux kernel version 2.4.34
1648720       0x192850        Unix path: /usr/lib/libc.so.1
Extract <type> signature
To extract only a specific signature type, pass the type to the ‘-D’ option. For example, the following command extracts the .7z compressed files from the input firmware file:
$ binwalk -D '7z' dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             TRX firmware header, little endian, image size: 3543040 bytes, CRC32: 0x85472C8C, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8E4, rootfs offset: 0x8E7CC
28            0x1C            gzip compressed data, maximum compression, from Unix, last modified: 1970-01-01 00:00:00 (null date)
2276          0x8E4           LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 1982464 bytes
583628        0x8E7CC         Squashfs filesystem, little endian, DD-WRT signature, version 3.0, size: 2954340 bytes, 587 inodes, blocksize: 65536 bytes, created: 2007-06-14 00:34:20
$ ls _dd-wrt.v24_whr-g125.bin.extracted/
8E4  8E4.7z
Hexdump
The ‘-W’ option provides a hexdump of the binary file as shown below:
$ binwalk -W dd-wrt.v24_whr-g125.bin
OFFSET      dd-wrt.v24_whr-g125.bin
-------------------------------------------------------------
0x00000000  48 44 52 30 00 10 36 00 8C 2C 47 85 00 00 01 00 |HDR0..6..,G.....|
0x00000010  1C 00 00 00 E4 08 00 00 CC E7 08 00 1F 8B 08 00 |................|
0x00000020  00 00 00 00 02 03 A5 56 41 6C 1C 67 15 FE F6 9F |.......VAl.g....|
0x00000030  B1 BD 76 BC 61 BC DE 44 9B 10 45 F3 67 C7 EB 55 |..v.a..D..E.g..U|
0x00000040  9C C3 14 96 E0 A2 39 0C BB 9B CA 87 22 19 A7 87 |......9....."...|
0x00000050  1E 22 B4 38 16 58 A8 07 AB 58 C2 07 0E 23 27 95 |.".8.X...X...#'.|
0x00000060  0C DA 66 96 CA 15 0B A7 D5 DA 4E 73 70 BC A6 A5 |..f.......Nsp...|
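Each hexdump row shows an offset, sixteen bytes in hexadecimal, and the same bytes as ASCII with non-printable values replaced by a dot. A minimal Python sketch of that row layout is given below. The exact spacing is an assumption and the function names are illustrative; it does not reproduce binwalk's colouring.

```python
def hexdump_row(offset, chunk):
    """Format one row: offset, hex bytes, printable ASCII (others shown as '.')."""
    hex_part = " ".join(f"{b:02X}" for b in chunk)
    ascii_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
    return f"0x{offset:08X}  {hex_part} |{ascii_part}|"

def hexdump(data, width=16):
    """Return the rows for `data`, `width` bytes per row."""
    return [hexdump_row(i, data[i:i + width]) for i in range(0, len(data), width)]

# The first 32 bytes of the firmware, taken from the output above.
sample = bytes.fromhex(
    "48445230001036008C2C478500000100"
    "1C000000E4080000CCE708001F8B0800"
)
for row in hexdump(sample):
    print(row)
```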
Display of selective lines
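The hexdump colour-codes each byte, and the ‘--red’, ‘--blue’ and ‘--green’ options restrict the output to lines containing bytes of that colour. The colours have the following meaning:
Colour | Description
Red    | Bytes different in all files
Blue   | Bytes different in some files
Green  | Bytes same in all files
Only one file is passed in the examples below, so every byte is reported as red and the ‘--blue’ and ‘--green’ filters print nothing.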
$ binwalk -W --blue dd-wrt.v24_whr-g125.bin
OFFSET      dd-wrt.v24_whr-g125.bin
-------------------------------------------------------------
*
$ binwalk -W --green dd-wrt.v24_whr-g125.bin
OFFSET      dd-wrt.v24_whr-g125.bin
-------------------------------------------------------------
*
$ binwalk -W --red dd-wrt.v24_whr-g125.bin
OFFSET      dd-wrt.v24_whr-g125.bin
-------------------------------------------------------------
0x00000000  48 44 52 30 00 10 36 00 8C 2C 47 85 00 00 01 00 |HDR0..6..,G.....|
0x00000010  1C 00 00 00 E4 08 00 00 CC E7 08 00 1F 8B 08 00 |................|
0x00000020  00 00 00 00 02 03 A5 56 41 6C 1C 67 15 FE F6 9F |.......VAl.g....|
0x00000030  B1 BD 76 BC 61 BC DE 44 9B 10 45 F3 67 C7 EB 55 |..v.a..D..E.g..U|
0x00000040  9C C3 14 96 E0 A2 39 0C BB 9B CA 87 22 19 A7 87 |......9....."...|
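The colour rules are easier to follow when several files are compared. The Python sketch below is one reading of the colour table, written for illustration and not taken from binwalk's source. The function names are made up. Under this reading a single file always yields red, which is consistent with the output above.

```python
def classify(values):
    """`values` holds the byte found at one offset in each input file."""
    distinct = len(set(values))
    if distinct == len(values):
        return "red"    # no two files agree (also true for a single file)
    if distinct == 1:
        return "green"  # every file has the same byte
    return "blue"       # some files agree, others differ

def classify_offsets(files):
    """Classify every offset up to the length of the shortest input."""
    shortest = min(len(f) for f in files)
    return [classify([f[i] for f in files]) for i in range(shortest)]

# Three made-up four-byte inputs, compared offset by offset.
print(classify_offsets([b"HDR0", b"HDR1", b"HXR2"]))
```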
Raw LZMA compression stream
The ‘-Z’ option identifies raw LZMA compression data streams from the firmware file as shown below:
$ binwalk -Z dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
2289          0x8F1           Raw LZMA compression stream, properties: 0x5D [pb: 2, lp: 0, lc: 0], dictionary size: 1048576
Partial compression streams
The ‘--partial’ option performs a more superficial but faster scan for raw compression streams. It can be combined with ‘-Z’ to speed up the LZMA scan as follows:
$ binwalk --partial -Z dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
2289          0x8F1           Raw LZMA compression stream, properties: 0x5D [pb: 2, lp: 0, lc: 0], dictionary size: 33554432
583747        0x8E843         Raw LZMA compression stream, properties: 0x5D [pb: 2, lp: 0, lc: 0], dictionary size: 33554432
607501        0x9450D         Raw LZMA compression stream, properties: 0x5D [pb: 2, lp: 0, lc: 0], dictionary size: 33554432
633288        0x9A9C8         Raw LZMA compression stream, properties: 0x5D [pb: 2, lp: 0, lc: 0], dictionary size: 33554432
658249        0xA0B49         Raw LZMA compression stream, properties: 0x5D [pb: 2, lp: 0, lc: 0], dictionary size: 33554432
Length limit
The ‘--length’ option limits the number of bytes of the firmware file that are scanned. For example:
$ binwalk --length=0x500 dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
0             0x0             TRX firmware header, little endian, image size: 3543040 bytes, CRC32: 0x85472C8C, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8E4, rootfs offset: 0x8E7CC
28            0x1C            gzip compressed data, maximum compression, from Unix, last modified: 1970-01-01 00:00:00 (null date)
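A length of 0x500 is 1280 bytes, so the LZMA data at offset 2276 and the Squashfs file system at offset 583628 fall outside the scanned window and are not reported. The effect can be sketched in Python with a toy scanner. The two-entry magic table and the synthetic image are assumptions made for the example; binwalk's real signatures validate far more than a few magic bytes.

```python
import re

# Toy signature table (illustrative only).
MAGICS = {
    b"HDR0": "TRX firmware header",
    b"\x1f\x8b\x08": "gzip compressed data",
}

def scan(data, length=None):
    """Return sorted (offset, name) pairs for magics found in the first `length` bytes."""
    window = data if length is None else data[:length]
    hits = []
    for magic, name in MAGICS.items():
        for match in re.finditer(re.escape(magic), window):
            hits.append((match.start(), name))
    return sorted(hits)

# Synthetic image: TRX magic at 0, gzip magic at 28 and again at 2031.
image = b"HDR0" + bytes(24) + b"\x1f\x8b\x08" + bytes(2000) + b"\x1f\x8b\x08"
print(scan(image, length=0x500))  # the hit at 2031 lies beyond the 1280-byte window
print(scan(image))
```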
Opcodes
The CPU architecture opcodes can be viewed using the ‘-A’ option as indicated below:
$ binwalk -A dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
2322822       0x237186        ARM instructions, function prologue
Raw string
The ‘-R’ option allows you to search for a custom string in the target file. The header string ‘HDR0’ is matched in the following example:
$ binwalk -R "HDR0" dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
0             0x0             Raw signature (HDR0)
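A raw search amounts to finding every offset at which a given byte sequence occurs. The Python sketch below shows the idea on synthetic data. The function names are illustrative, and the handling of escaped bytes such as \x1f is a convenience of this sketch.

```python
def decode_pattern(pattern):
    r"""Turn a string such as 'HDR0' or '\x1f\x8b' (typed with backslashes) into bytes."""
    return pattern.encode("latin-1").decode("unicode_escape").encode("latin-1")

def find_raw(data, pattern):
    """Return every offset at which the decoded pattern occurs in `data`."""
    needle = decode_pattern(pattern)
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets

# Synthetic data with the 'HDR0' magic at offsets 0 and 32.
blob = b"HDR0" + bytes(28) + b"HDR0"
for off in find_raw(blob, "HDR0"):
    print(f"{off}  0x{off:X}  Raw signature (HDR0)")
```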
Disabling signature match
You can disable the ‘smart’ signature match with the ‘-b’ option as indicated below:
$ binwalk -b dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------------
0             0x0             TRX firmware header, little endian, image size: 3543040 bytes, CRC32: 0x85472C8C, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8E4, rootfs offset: 0x8E7CC
28            0x1C            gzip compressed data, maximum compression, from Unix, last modified: 1970-01-01 00:00:00 (null date)
2276          0x8E4           LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 1982464 bytes
583628        0x8E7CC         Squashfs filesystem, little endian, DD-WRT signature, version 3.0, size: 2954340 bytes, 587 inodes, blocksize: 65536 bytes, created: 2007-06-14 00:34:20
1674548       0x198D34        rzip compressed data - version 59.-98 (1020936805 bytes)
Entropy
The ‘-E’ option performs entropy analysis on the target file and generates the entropy graph.
$ binwalk -E dd-wrt.v24_whr-g125.bin
DECIMAL       HEXADECIMAL     ENTROPY
-------------------------------------------------------------
0             0x0             Rising entropy edge (0.984773)
3536896       0x35F800        Falling entropy edge (0.626462)
You can also use the ‘--verbose’ option for displaying a more detailed entropy calculation as follows:
$ binwalk -E --verbose dd-wrt.v24_whr-g125.bin
Scan Time:     2024-08-18 21:06:24
Target File:   /home/shakthi/operation-gladiator/osfy/security-series/5-binwalk/dd-wrt.v24_whr-g125.bin
MD5 Checksum:  8a60810685fa5be6221936034b81fd3a
DECIMAL       HEXADECIMAL     ENTROPY
-------------------------------------------------------------
0             0x0             0.984773
2048          0x800           0.988507
4096          0x1000          0.988300
6144          0x1800          0.989242
8192          0x2000          0.988839
10240         0x2800          0.989643
12288         0x3000          0.989161
14336         0x3800          0.987165
16384         0x4000          0.988030
18432         0x4800          0.989496
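The verbose listing advances 2048 bytes (0x800) per row and reports a value between 0 and 1 for each block. Values close to 1, as seen here, are typical of compressed or encrypted data. A per-block calculation of this kind can be sketched in Python as below. It assumes Shannon entropy in bits per byte divided by 8, and is not guaranteed to reproduce binwalk's exact figures. The function names are illustrative.

```python
import math
from collections import Counter

def block_entropy(block):
    """Shannon entropy of `block`, scaled to the range 0..1 (bits per byte / 8)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    bits = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return bits / 8

def entropy_profile(data, block_size=2048):
    """Return (offset, entropy) for each consecutive block of `data`."""
    return [
        (off, block_entropy(data[off:off + block_size]))
        for off in range(0, len(data), block_size)
    ]

# One block of identical bytes (lowest entropy) followed by one block in
# which every byte value occurs equally often (highest entropy).
sample = bytes(2048) + bytes(range(256)) * 8
for off, value in entropy_profile(sample):
    print(f"{off:<10d}0x{off:<10X}{value:.6f}")
```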
Do read the binwalk GitHub documentation to learn more about its options, usage and API.
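As a starting point for the API, the Python sketch below collects the offset and description of each signature result. It is based on the binwalk 2.x Python interface as commonly documented (binwalk.scan() with the signature and quiet keywords, returning modules whose results carry offset and description attributes), so do verify it against the version you have installed. The helper names summarize and scan_firmware are illustrative, and stand-in objects are used so that the helper can be tried without binwalk.

```python
from types import SimpleNamespace

def summarize(modules):
    """Flatten scan results into (offset, description) pairs."""
    return [(r.offset, r.description) for m in modules for r in m.results]

def scan_firmware(path):
    """Run a signature scan on `path` (assumes the binwalk 2.x Python module)."""
    # Imported here so that summarize() can be used without binwalk installed.
    import binwalk
    return summarize(binwalk.scan(path, signature=True, quiet=True))

# Stand-in objects with the same attribute names as the assumed API.
fake_modules = [
    SimpleNamespace(results=[
        SimpleNamespace(offset=0, description="TRX firmware header"),
        SimpleNamespace(offset=28, description="gzip compressed data"),
    ])
]
for offset, description in summarize(fake_modules):
    print(f"0x{offset:X}  {description}")
```

With binwalk installed, scan_firmware("dd-wrt.v24_whr-g125.bin") would be the equivalent call on the firmware used in this article.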