Bencode, Bdecode in shell script

I have been playing around with OpenWRT recently, to see what IT tools you could deploy to the bush using cheap, low-power computers as a base.

One of the really interesting things about writing systems to live in the embedded Linux world is that you want to try to do as much as possible with the stuff that’s there out of the box. This is because smarter people than me have stuffed the tiny 8 meg flash images with useful tools, leaving no room for my bloated Perl-ware. So the question becomes: how can I live with what they gave me and still get what I want done?

I wanted to do some secure file exchange stuff, and have it work over multiple protocols. So I was investigating OpenSSL. It turns out you can do everything you need to run a custom PKI with the command-line OpenSSL program and shell scripts. You need a way to marshal the various pieces together (the public key, the session key, the data itself, a hash, etc.) and wrap them up, then unwrap them at the other end. The professionals use complicated systems based on BER and so on. Then they need to link against huge binaries to make it all work right. But I only had BusyBox available to me. Could I still make something useful?

I started off thinking about tar, which can put a bundle of data together. But I really like the tight encoding in BitTorrent’s Bencode format, and sending around tarfiles seemed kind of hokey (but don’t laugh, a .deb file is little more than an ar archive wrapping a couple of tarballs, and RPMs carry a CPIO payload). So I thought it would be nice to have Bencode/Bdecode available for me to use. But the only standalone tools I could find used interpreted languages, and adding Python to my little Linksys box was exactly what I was trying to avoid.
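To show how tight the format is, here is a minimal sketch of encoding a small dictionary by hand. The helper name `bstr` and the sample keys and values are made up for illustration. A string is written as its byte length, a colon, then the bytes. A dictionary is a `d`, then key/value pairs with the keys in sorted order, then an `e`.

```shell
#!/bin/sh
# bstr is a hypothetical helper: encode one string as <length>:<bytes>.
bstr () {
  len=`printf '%s' "$1" | wc -c`
  # $len is left unquoted so any padding from wc is dropped
  printf '%d:%s' $len "$1"
}

# A dictionary with two made-up entries, keys already in sorted order.
bdict=`printf 'd'; bstr hash; bstr abc123; bstr name; bstr file.txt; printf 'e'`
echo "$bdict"
# prints d4:hash6:abc1234:name8:file.txte
```

The only overhead is the length prefixes plus two bytes for the dictionary markers, and nothing in the values needs escaping.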

I ended up, basically as a kind of sick joke, writing my own Bencode/Bdecode in shell script. Don’t laugh… people were doing this in the ’80s with shar files, which had the added benefit that they could unpack themselves! Let’s see a Bencoded structure do THAT!

I succeeded, at least for what I needed. My Bencode/Bdecode pair is below. Note that these guys only implement non-recursive dictionaries (a flat directory of files maps to one dictionary of strings), which was all I needed for my application. Adding full support would be easy, though deciding on a filesystem representation of an ordered list is a bit of a problem.

Bencode in shell script

#!/bin/sh

# usage: bencode [dir]
# encodes all the files in [dir] in Bencode format.
# See http://en.wikipedia.org/wiki/Bencode for the defn.
# Note this tool does not implement recursive structures,
# only dictionaries.

# for the inevitable port because some echos are less compatible
e="echo -n"

# errors go to stderr so they never end up inside the encoded output
if [ "$1" = "" ]; then
  echo "usage: bencode [dir]" >&2
  exit 1
fi

dir=$1
if [ ! -d "$dir" ]; then
  echo "$dir is not a directory." >&2
  exit 1
fi

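# Bencode wants dictionary keys in sorted (raw byte) order. The shell
# already expands the glob in sorted order, and piping a one-line echo
# through sort changed nothing, so the loop walks the glob directly.
LC_ALL=C
export LC_ALL

$e d
for f in "$dir"/*
do
  if [ ! -f "$f" ]; then
    echo "$f is not a file." >&2
    exit 1
  fi

  key=`basename "$f"`
  # wc -c also counts the newline that echo appends, so subtract one
  klen=`echo "$key" | wc -c`
  klen=`expr $klen - 1`

  # the fifth column of ls -l is the file size in bytes
  vlen=`ls -l "$f" | awk '{print $5}'`

  $e "$klen:$key"
  $e "$vlen:"
  cat "$f"

done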
$e e
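Bencode dictionaries are expected to list their keys in sorted order, compared as raw byte strings. Here is a minimal sketch of getting that ordering from `sort`, using made-up key names. Forcing the C locale is my assumption about how to stop a UTF-8 locale from interleaving upper and lower case.

```shell
#!/bin/sh
# Hypothetical key names, chosen so byte order and dictionary order differ.
keys='beta
Alpha
alpha
Beta'

# LC_ALL=C makes sort compare raw bytes, so uppercase sorts before lowercase.
sorted=`printf '%s\n' "$keys" | LC_ALL=C sort`
echo "$sorted"
```

With byte ordering, `Alpha` and `Beta` come out ahead of `alpha` and `beta`.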

Bdecode in shell script

#!/bin/sh

# usage: bdecode [dir]

# decodes stdin into [dir]
# Stdin should be a Bencoded dictionary of strings (but not general
# Bencoded forms, just a dict of strings)

# dd wrapper: skip $1 bytes of stdin, then copy $2 bytes
# (or everything that is left, if $2 is omitted)
substr () {
  if [ "$2" != "" ]; then
    count="count=$2"
  else
    count=
  fi
  dd bs=1 skip=$1 $count 2>/dev/null
}

# put tmp2 on top of tmp
m () {
  mv $tmp2 $tmp
}

dir=$1
if [ "$dir" = "" ]; then
  echo "usage: bdecode [dir]"
  exit 1
fi

mkdir -p "$dir"

# $$ is the shell's process ID, which keeps the temp files unique per run
tmp=/tmp/bd1.$$
tmp2=/tmp/bd2.$$
cat > $tmp

d=`substr 0 1 < $tmp`
if [ "$d" != "d" ]; then
  echo "Not a dictionary."
  exit 1
fi
substr 1 < $tmp > $tmp2
m

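while [ -f $tmp ]
do
  # the text before the first ':' is the length of the next string
  n=`awk -F: '{print $1; exit}' < $tmp`

  if [ "$n" != "e" ]; then
    # echo appends a newline, so nlen is the digit count plus one,
    # which is exactly the width of the "<length>:" prefix
    nlen=`echo $n | wc -c`
    substr $nlen < $tmp > $tmp2
    m
    key=`substr 0 $n < $tmp`
    substr $n < $tmp > $tmp2
    m

    # same again for the value, which is written to $dir/$key
    n=`awk -F: '{print $1; exit}' < $tmp`
    nlen=`echo $n | wc -c`
    substr $nlen < $tmp > $tmp2
    m

    substr 0 $n < $tmp > "$dir/$key"
    substr $n < $tmp > $tmp2
    m
  else
    # a lone 'e' closes the dictionary
    rm $tmp
  fi
done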
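The two tricks the decoder leans on can be tried in isolation: the dd wrapper slices bytes out of a file, and `echo $n | wc -c` measures the length prefix together with its colon. This is a minimal sketch that pulls one string off the front of a scratch file. The scratch file name and the sample data are made up for the example.

```shell
#!/bin/sh
# Same dd wrapper as in bdecode: skip $1 bytes, copy $2 bytes (or the rest).
substr () {
  if [ "$2" != "" ]; then
    count="count=$2"
  else
    count=
  fi
  dd bs=1 skip=$1 $count 2>/dev/null
}

sample=/tmp/bd-sample.$$            # hypothetical scratch file
printf '3:key5:valuee' > "$sample"

n=`awk -F: '{print $1; exit}' < "$sample"`   # length prefix: 3
nlen=`echo $n | wc -c`                       # digits plus newline: width of "3:"
key=`substr $nlen $n < "$sample"`            # skip "3:", take 3 bytes
off=`expr $nlen + $n`
rest=`substr $off < "$sample"`               # whatever follows the key

echo "$key"
echo "$rest"
rm -f "$sample"
```

This prints `key` and then `5:valuee`. The decoder runs the same steps again on the remainder to read the value.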

Note: these routines won’t win any speed records, but they do work reliably when all you have is BusyBox’s ash, dd, and awk.

