A simpler way to embed data

In my post about how to efficiently put data into a Go binary, I mentioned that strings are immutable, and can be accessed without causing the Go runtime to copy them. This turns out to be the key to a simpler way to achieve what I wanted to do.

By simpler I mean, “no cgo”. That’s a nice simplification, because up until recently, your final static binary image linked to the cgo code dynamically, and that made using my technique impossible in the context of the Tiny runtime, where there is no dynamic linker. Recently cgo has changed, but at the same time, I’ve discovered how to use native strings to do what I want, so let’s see how it works.

I shied away from strings at first because I understood them to be “unicode strings”, and thus not eligible to hold arbitrary bytes (i.e. bytes which turn out to create a non-valid unicode rune). That’s not true at all. In Go, the string type is in some ways an alias for “an immutable array of 8-bit octets”, i.e. an […]byte. True, many of the built-in functions that operate on string expect what’s inside of it to be valid UTF-8, and might malfunction if you give them random bytes. But there’s nothing to keep you from putting bad UTF-8 in, then never using the functions that expect good UTF-8.

So, that’s the plan. We put our individual bytes into the string, then we do brain surgery to turn them into a []byte:

package main

import (
        "unsafe"
        "reflect"
)

var empty [0]byte
var str1 string = "the string"
var str2 string = "another string"

func fix(s string) (b []byte) {
        sx := (*reflect.StringHeader)(unsafe.Pointer(&s))
        b = empty[:]
        bx := (*reflect.SliceHeader)(unsafe.Pointer(&b))
        bx.Data = sx.Data
        bx.Len = len(s)
        bx.Cap = len(s)
        return
}

func main() {
        b := fix(str1)
        println(b[0])
        b = fix(str2)
        println(b[0])
        b[0] = 'x'              // crash: write to ro segment
}

When you read the assembly of that program, there’s not a memcpy to be seen. The []byte you get points directly at the original bytes. You could also see that by taking the address of str[0] and b[0] and seeing that they are the same byte in memory.

The last line shows why Go is going to so much trouble to prevent me from doing this: the memory that is now underlying my []byte is read-only. At link time, the linker put it into a read-only segment, and now when I write to it, I get this (the equivalent of a segfault in Go):

unexpected fault address 0x80640f8
throw: fault

panic PC=0xf765b048
runtime.throw+0x3e /home/jeffall/go/src/pkg/runtime/runtime.c:73
	runtime.throw(0x80a3916, 0x80640f8)
runtime.sigpanic+0xc7 /home/jeffall/go/src/pkg/runtime/linux/thread.c:288
	runtime.sigpanic()
main.main+0xd8 /home/jeffall/go-stuff/str.go:27
	main.main()
runtime.mainstart+0xf /home/jeffall/go/src/pkg/runtime/386/asm.s:84
	runtime.mainstart()
runtime.goexit /home/jeffall/go/src/pkg/runtime/proc.c:148
	runtime.goexit()

Working on this has made me ask myself a few times, “why am I so intent on turning read-only memory into a []byte, thereby corrupting Go’s type safety?” I’m still grappling with that, stay tuned. (One reason why is that this whole idea came from working in the Tiny Go environment, where there’s currently almost no memory protection offered anyway. But that’s a dumb reason; if the non-existent OS can’t save you from yourself, you certainly should NOT stop the compiler from saving you!) Maybe there’s a third version coming which manages to keep it type safe and still do what I want. I suspect it’s going to have something to do with changing the interface of my filesystem object to keep the string itself internal, and only expose a method that returns an io.Reader.

RTM’s puzzle

Here is a little program to that implements RTM’s series for Cliff Stoll.

package main

import "fmt"

func count(s []int) int {
  i := 1
  x := s[0]
  for ; i < len(s); i++ {
    if s[i] != x {
      break
    }
  }
  return i
}

func next(s []int) []int {
  res := []int{}
  for len(s) > 0 {
    n := count(s)
    res = append(res, n, s[0])
    if n == len(s) {
      break
    }
    s = s[n:]
  }
  return res
}

func max(s []int) (m int) {
  for _, x := range s {
    if x > m { m = x }
  }
  return
}

func main() {
  s := []int{1}
  for i := 0; i < 200; i++ {
    s = next(s)
    fmt.Println(len(s), max(s))
  }
}

Where is bytes.NewReaderAt?

I have a nice source of []byte slices (see last post), and now I’d like to do something with them that resembles a filesystem. I was planning on just using a map from name to file contents, which would work ok. But then I remembered seeing the archive/zip package, and I thought how much cooler it would be to just make my prototype filesystem, zip it up, put the zip file into my Go program (see last post) and then access the filesystem via the archive/zip package later.

zip.Decode doesn’t want an io.Reader like lots of Go libraries. Instead it wants an io.ReaderAt. This makes sense, because a Zip file has a table of contents, allowing you to efficiently jump around the archive and pull out just one file. So then I went looking for how to turn a []byte into an io.ReaderAt. I looked all through the bytes package and came up short.

So I did it myself. Here’s how, in case you come across the same kind of problem:

type seekable []byte

func NewSeekable(buf []byte) seekable {
  return buf
}

func (r seekable)ReadAt(p []byte, off int64) (n int, err os.Error) {
  o := int(off)
  copy(p, r[o:o+len(p)])
  return len(p), nil
}

Now you can do something like this:

  b := fs.FileMap["/tmp/test.zip"]
  n, err := zip.NewReader(NewSeekable(b), int64(len(b)))

Warning: The implementation of ReadAt probably has some bugs around the edge cases. I wrote it very fast while just trying to see if this approach would work. There’s also a mismatch of integer sizes that is not handled well by this code. It could be you could reason that you’d never see a problem, but the way it is currently written (by me!) does not pass my “smell test”.

Fat Constants, Thin Constants

I play from time to time with a patch for Go that makes the Tiny runtime more capable. My current goal is to get a new backend for exp/draw working which writes to the SVGA screen. It would be cool to be able to decode the Go mascot and have him flying around the screen or something.

In the Tiny runtime, there’s no OS, so there’s no disk drivers, so there’s no filesystem, so there’s no files. Which makes decoding a PNG and showing it kind of hard. But if the program carries along the data with it, in the form of a []byte, then you could use bytes.NewReader to turn it into an io.Reader and then pass it to image/png.Decode. So that’s what I set out to do.

I got surprised by something, and that’s what I’m writing about today. Here’s the code that surprised me:

package main
func main() {
  println([...]byte{'f', 'o', 'o'}[:])
}

The resulting assembly is:

...
8048c26 8b442404             | (4)  MOVL  4(SP),AX
8048c2a 89c3                 | (4)  MOVL  AX,BX
8048c2c 83c300               | (4)  ADDL  $0,BX
// 102 = 'f'
8048c2f c60366               | (4)  MOVB  $102,(BX)
// put it in the array
8048c32 89c3                 | (4)  MOVL  AX,BX
8048c34 43                   | (4)  INCL  ,BX
// increment and do it again...
8048c35 c6036f               | (4)  MOVB  $111,(BX)
8048c38 89c3                 | (4)  MOVL  AX,BX
8048c3a 83c302               | (4)  ADDL  $2,BX
8048c3d c6036f               | (4)  MOVB  $111,(BX)
8048c40 890424               | (4)  MOVL  AX,(SP)
8048c43 c744240403000000     | (4)  MOVL  $3,4(SP)
8048c4b c744240800000000     | (4)  MOVL  $0,8(SP)
8048c53 c744240c00000000     | (4)  MOVL  $0,12(SP)
8048c5b c744241000000000     | (4)  MOVL  $0,16(SP)
8048c63 c744241403000000     | (4)  MOVL  $3,20(SP)
8048c6b c744241800000000     | (4)  MOVL  $0,24(SP)
8048c73 c744241c01000000     | (4)  MOVL  $1,28(SP)
8048c7b c744242000000000     | (4)  MOVL  $0,32(SP)
8048c83 e89d840000           | (4)  CALL  ,8051125+runtime.slicearray

This is a pretty inefficient way to build a constant! I wouldn’t even have noticed, except that when I changed the simple “foo” above into the actual PNG I wanted to include in my program, 8g took so much memory the out of memory killer woke up and killed my Firefox! And we were talking about a little 80 kb PNG! The resulting ELF binary was 1.3 megs!

The syntax I used, namely “make an array then take a slice of it” turns out to be excessively complicated. The routine I was using expected a slice, so that’s how I ended up tacking the [:] on the end. Word to the wise: when the compiler is complaining about wanting a slice instead of an array, don’t just add the [:] on the end, give the compiler a chance to make the slice itself:

package main
func main() {
  // there are 57 0's here:
  println([]byte{ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, })
}

Here’s the resulting assembly:

8048c17 83ec2c               | (3)  SUBL  $44,SP
// allocate a new array that's 57 bytes long
8048c1a c7042439000000       | (4)  MOVL  $57,(SP)
8048c21 e85d1d0000           | (4)  CALL  ,804a983+runtime.new
8048c26 8b542404             | (4)  MOVL  4(SP),DX
// load the data from this place in the program's image (my zeros are there; the file would be later)
8048c2a 8d35dc590508         | (4)  LEAL  main.statictmp_0001+0(SB),SI
8048c30 89d7                 | (4)  MOVL  DX,DI
8048c32 fc                   | (4)  CLD ,
// with a loop count of 14
8048c33 b90e000000           | (4)  MOVL  $14,CX
// repeat
8048c38 f3                   | (4)  REP ,
// a move long
8048c39 a5                   | (4)  MOVSL ,
// and then one more byte (4 * 14 + 1 = 57)
8048c3a a4                   | (4)  MOVSB ,
// slice cap = 57
8048c3b c744242439000000     | (4)  MOVL  $57,main.autotmp_0000+36(SP)
// slice len = 57
8048c43 c744242839000000     | (4)  MOVL  $57,main.autotmp_0000+40(SP)
// underlying array = the new array
8048c4b 89542420             | (4)  MOVL  DX,main.autotmp_0000+32(SP)

OK, so this is closer to Not Totally Insane, which is what we want. But why? I suppose this is either a bug or a very deep part of the compiler. In the first program, the compiler thinks it needs to compute and put into the array every item. In the second program, the compiler understands that every item is unchanging and can live in the program text.

Though it is no longer assembling the array at runtime byte by byte, it is still calling runtime.new and then making a copy from the program text into the new array. Can we do better? If we move the array out of the function call and give it a local variable, it doesn’t change anything, it is still copied (not shown, check for yourself if you want). If we move the initialization of the array out to the global scope, then the Go compiler moves it into main.init() for us, doing the copy just the same.

So, can we do it without the copy at all? Yes! But it requires unsafe, slice brain surgery, and Cgo. Hang on tight…

A slice is a data structure on top of an underlying zone of memory. The definition of a slice from reflect.SliceHeader is:

type SliceHeader struct {
    Data uintptr
    Len  int
    Cap  int
}

So what we want to do is get a pointer to a place in memory where our contiguous bytes are stored, then set the Data pointer of the slice to that pointer (as well as making sure Len and Cap are correct). Mucking around with pointers like this can get you in trouble: if the garbage collector decides that the only reference it has to the thing at slice.Data, and it is ready to garbage collect your slice, then it’s going to try to garbage collect your underlying array as well. If that array came from someplace other than the Go heap, you’re going to corrupt something. But it is relatively easy to prevent a slice from ever getting GC’d; all you have to do is hold it in a map at global scope. For my application (a set of byte arrays representing contents of files), I already want to hold them in a map with keys equal to the filename and values equal to the byte slice.

I said we’d need Cgo too. But why? The answer is that if we let the Go compiler put the bytes away into the program text, it will know where they are, but we won’t. We can’t reference a symbol like “main.statictmp_0001” to find the bytes themselves. And no matter what we do when we are trying to find out where the bytes are stored, Go copies them into memory alloced with new. Why? Probably because it needs to guarantee that the array will be writeable later, since it is not marked as immutable as strings are (the fact that strings are immutable means there might be another way to achieve our goal; we’ll have to come back to that in a bit) it can’t leave the bytes pointing to the read-only memory where the program text is stored.

So we need Cgo not because we want to call into a C subroutine, but because we want to have C put the bytes into the read-only segment of the program text with a symbol on them that we know, and can pass to unsafe.Pointer when we are doing our slice brain surgery.

Here is the resulting test program:

package main

/*
unsigned char const data__tmp_test_zip[] = { 0, 1, 2 };
*/
import "C"

import (
	"unsafe"
	"reflect"
)

var FileMap map[string][]byte

func main() {
	FileMap = make(map[string][]byte)

	s1 := C.data__tmp_test_zip[:]
	s1x := (*reflect.SliceHeader)(unsafe.Pointer(&s1))

	var buf [1]byte
	s2 := buf[:]
	s2x := (*reflect.SliceHeader)(unsafe.Pointer(&s2))
	s2x.Data = s1x.Data
	s2x.Len = s1x.Len
	s2x.Cap = s1x.Cap

	FileMap["/tmp/test.zip"] = s2
        println(FileMap["/tmp/test.zip"])
}

In this case, we are trying to store the three bytes 0, 1, 2. We do that inside the /* */ block right before the import "C". The slice brain surgery happens in main(). C.data__tmp_test_zip is type [3]C.uchar. To get a slice on top of it, we just ask for C.data__tmp_test_zip[:] and put it in s1. We use reflect to dig inside the slice and find the pointer to the underlying bytes.

(Thanks to Gustavo, who showed me how to do this with his mmap module.)

The next part, with s2 and s2x, is a second piece of brain surgery. The underlying array is a [3]C.uchar. But we want an API based on []byte, because that way we can pass them to things like bytes.NewReader to get an io.Reader that we can pass onwards to things like image/png.Decode. To do that, we make a new slice with an underlying [1]byte array, then we swap out the pointer to point to the array of bytes the C made for us. Voila, a []byte that points to the contents of our file, without a copy.

Passing function pointers through channels in Go

Is is possible? Of course!

package main

func add(x, y int) int {
	return x + y
}

func mul(x, y int) int {
	return x * y
}

func runner(ch chan func(int, int)(int)) {
	for f := range ch {
		println("f(1, 2) = ", f(1, 2))
	}
}

func main() {
	ch := make(chan func(int, int)(int))
	go runner(ch)

	ch <- add
	ch <- mul
	ch <- add
	close(ch)
}

Is it useful? Probably, but this demo doesn't show how yet... have to think about what patterns this makes possible.