String Building the Go way

To build strings gradually, C# has StringBuilder. What does Go have? On the surface, Go doesn't appear to provide anything like StringBuilder, but it actually has something much more powerful.

Let’s look at Go’s idiomatic way of building strings.

My intuitive code

Here is my pretty intuitive code that clearly shows what I’m trying to do:
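
A minimal sketch of what such piece-by-piece code might look like. This is an illustration only, not the original listing; the function name buildNaive and its inputs are made up:

```go
package main

import "fmt"

// buildNaive is a hypothetical example: it grows a string with += and
// fmt.Sprintf, one piece per loop iteration.
func buildNaive(names []string) string {
	s := "Report:"
	for i, name := range names {
		// Strings are immutable, so += has to produce a new string
		// holding everything built so far plus the new piece.
		s += fmt.Sprintf(" %d=%s;", i, name)
	}
	return s
}

func main() {
	fmt.Println(buildNaive([]string{"a", "b", "c"}))
}
```

For a handful of pieces this is perfectly fine; the cost only shows up when the loop runs many times or the string gets long.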

Problems

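The code is so intuitive that no explanation is necessary: I’m just building strings piece by piece. However, when it comes to efficiency, this is not good. Why? Because strings in Go are immutable, so each concatenation allocates a new string and copies the existing contents into it.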

The Go way

The discussion “How to efficiently concatenate strings in Go” suggests using var buffer bytes.Buffer and buffer.WriteString, which is “incredibly fast. Made some naive “+” string concat in my program go from 3 minutes to 1.3 seconds.” That’s a huge improvement. However, I need some kind of printf to do the appending. Should I go with buf.WriteString(fmt.Sprintf(...)) or fmt.Fprintf(buf, "%s more %s", buf, more)…?

Long story short, here it is, the Go way:

The code is by Dan Kortschak (adelaide.edu.au), posted to golang-nuts: http://play.golang.org/p/8Jq_dPuwNF.
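
The pattern can be sketched roughly as follows. This is not the playground listing itself, just a minimal illustration of writing formatted pieces into a bytes.Buffer with fmt.Fprintf; the function name buildWithBuffer and its inputs are made up:

```go
package main

import (
	"bytes"
	"fmt"
)

// buildWithBuffer is a hypothetical example: it appends formatted pieces
// to a bytes.Buffer and converts to a string once at the end.
func buildWithBuffer(names []string) string {
	var buf bytes.Buffer // the zero value is an empty buffer, ready to use
	buf.WriteString("Report:")
	for i, name := range names {
		// &buf is a *bytes.Buffer, which implements io.Writer,
		// so Fprintf appends its output to the end of the buffer.
		fmt.Fprintf(&buf, " %d=%s;", i, name)
	}
	return buf.String()
}

func main() {
	fmt.Println(buildWithBuffer([]string{"a", "b", "c"}))
}
```

Writing with fmt.Fprintf(&buf, ...) avoids the intermediate string that buf.WriteString(fmt.Sprintf(...)) would create for each piece.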

At first glance, I didn’t believe it would work, because it keeps writing to the same place, &buf, i.e., the address of buf. I thought that, done this way, later output would overwrite earlier output. For it to work, each fmt.Fprintf(&buf, ...) would have to advance buf magically, and in the end return buf.String() would have to magically go back to the top of the buffer at its original place.

I.e., there is too much magic happening here for me to understand, but it actually works; http://play.golang.org/p/aG925WNos7 is a quick proof. Here are the things that helped me understand, most of which are from Dan Kortschak:

  • Seen from the outside, buf is a bytes.Buffer value that stands for the whole buffer from its beginning, whereas &buf, a *bytes.Buffer, is used as an io.Writer. I.e., “all the semantics of writing to a file-like object hold here”.
  • On the inside, Buffer.Write advances the write offset (which is just the length of the []byte used internally). Buffer.String returns the string conversion of the bytes from the buffer's read offset to the current write offset; here the read offset is zero, since no read has been done. (The String method is not a read operation, by the way.)

So that is the perfect solution I was looking for, one I couldn’t possibly have come up with myself today.
