
Go is a systems language. Doesn't that mean it's more suitable for writing a web server, database, driver or OS than as a web site scripting language?

I mean, no one would write systems code in PHP, and it's rare to see web sites written in C, so why write a web site in Go?



Go turned out to be a pretty nice general-purpose language. You get garbage collection, sane native string handling, and native maps and lists. Mix that with static typing and you have a pretty nice language that enables fast and sane web development.


Having done a lot of CJK development, I felt that Go's Unicode strings were pretty kludgey last time I looked. Go's strings are all UTF-8, so unless you're working only in the ASCII subset, you have to manually iterate over multi-byte runes to get the Unicode code points out. That's really not what I'd call friendly Unicode handling, on a par with scripting languages or even Java.

Please correct me if things have changed. I haven't revisited Go for a little while now, and would be very interested in any updates to its Unicode handling.


1. Range on a string iterates over Unicode code points (runes):

    s := "Какая-то строка"
    for _, r := range s {
        // do something with r: 'К', 'а', ...
    }

2. Converting to []rune gives you a slice of runes:

    s := "Какая-то строка"
    runes := []rune(s)

    sub := string(runes[:8]) // "Какая-то"

However, slicing a string directly will slice it by byte:

    s[:8] // "Кака"

3. With package utf8 (http://golang.org/pkg/utf8/) you can manipulate runes manually.

While none of this is intuitive (you have to know what does what), I find it rather easy.
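For anyone who wants to try the three points above, here is a minimal sketch. It assumes a current Go toolchain, where the rune-slice conversion is written []rune(s) and the utf8 package is imported as unicode/utf8. The helper name runeDemo is made up for illustration, and it does not guard against n being larger than the string.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeDemo (hypothetical helper) returns the first n code points of s,
// obtained through a []rune conversion, next to the first n bytes of s.
// It assumes n does not exceed the rune count of s.
func runeDemo(s string, n int) (byRune, byByte string) {
	runes := []rune(s)
	return string(runes[:n]), s[:n]
}

func main() {
	s := "Какая-то строка"

	// 1. range yields the byte offset and the decoded code point.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	// 2. Slicing by rune versus slicing by byte.
	byRune, byByte := runeDemo(s, 8)
	fmt.Println(byRune) // Какая-то
	fmt.Println(byByte) // Кака

	// 3. Manual decoding with unicode/utf8: first rune, its width in
	// bytes, then the byte length and rune count of the whole string.
	r, size := utf8.DecodeRuneInString(s)
	fmt.Printf("%c %d %d %d\n", r, size, len(s), utf8.RuneCountInString(s))
}
```

The byte slice only comes out as valid text here because each Cyrillic letter happens to take two bytes; an odd byte count would cut a letter in half.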


Iteration over strings is rune-by-rune in Go. However, somewhat counter-intuitively, string indexing and slicing are byte-by-byte, so you can't just write "str[0:10]" and get the first 10 code points. Then again, that's true of UTF-16 as well, if I'm not mistaken. But if you want an array of runes instead of a UTF-8 encoded string, you can just do "[]rune(mystring)" and it'll do the conversion for you.
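If you only need the first few code points, one possible alternative to converting the whole string is to let range find the byte offset to cut at. This is a rough sketch, assuming a current Go toolchain; firstRunes is a name invented for this example, not a standard library function.

```go
package main

import "fmt"

// firstRunes (hypothetical helper) returns the prefix of s that holds
// at most n code points. range reports the byte offset at which each
// rune starts, so the string is cut at the offset of rune number n+1.
func firstRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s holds n or fewer code points
}

func main() {
	fmt.Println(firstRunes("Какая-то строка", 8)) // Какая-то
	fmt.Println(firstRunes("日本語", 10))            // 日本語
}
```

The result is still an ordinary UTF-8 string, and no intermediate rune slice is allocated.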


Same in Java, C# and Python. To get the codepoints out or access the nth codepoint you need to iterate over the string if you want your code to be correct and safe.
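The UTF-16 side of this can be checked from Go itself with the standard unicode/utf16 package. The sketch below only models UTF-16 code units (the unit Java and C# strings are indexed by) and assumes a current Go toolchain; utf16Units is a name made up for the example.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16Units (hypothetical helper) returns how many UTF-16 code units
// are needed to encode s.
func utf16Units(s string) int {
	return len(utf16.Encode([]rune(s)))
}

func main() {
	// U+0061 followed by U+1D11E, which lies outside the Basic
	// Multilingual Plane and needs a surrogate pair in UTF-16.
	s := "a\U0001D11E"

	fmt.Println(len([]rune(s))) // 2 code points
	fmt.Println(utf16Units(s))  // 3 UTF-16 code units
	fmt.Println(len(s))         // 5 UTF-8 bytes
}
```

Since the code unit count and the code point count differ, indexing by UTF-16 code unit can land in the middle of a surrogate pair, just as byte indexing in Go can land in the middle of a rune.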



