2011/09/25

Great time in Kanda, Tokyo: Erlounge

Yesterday (9/23) was a great day: the Erlang Workshop was held as a satellite event of ICFP, the ACM SIGPLAN international conference. Although I did not participate in the workshop, I joined the party because Francesco Cesarini and Ryosuke Nakai told me to come. Seeing living legends from Europe (indeed, just community members in Western countries) was very exciting.
I was introduced by Kenji Rikitake (AKA @kenji_rikitake) as the author of the MessagePack Erlang port. That made me think I should contribute more and more to the open source community and the Erlang/OTP community. Until that day I had been thinking of giving up reading, writing and saying anything about Erlang/OTP, because of babysitting (many thanks to my wife for helping me work for the community and study, neither of which makes money to live on) and some personal disgust about my work... But their energy, the amount of beer they drank, how late the party finished, their positive way of speaking English and their positive attitude made me think positively about keeping in touch with Erlang.
Don't ask permission. Ask for forgiveness.
All thanks to the Erlangers who were in Kanda, Tokyo on 2011/9/23.

2011/07/20

Use MessagePack/Erlang and write a message queue in an hour

I wrote a toy program within an hour (plus additional debugging time): a message queue server accessible from clients written in many kinds of languages, such as C, C++, Ruby, Java and Python. Erezrdfh (pronounced "e-re' zerd f") is a simple, in-memory message queue with the nine-nines availability of Erlang/OTP. It doesn't need a dedicated client library; users can write a client in a minute with MessagePack-RPC. A Ruby one-liner is as follows:
c = MessagePack::RPC::Client.new(host,port); c.call(:push, "name", "message"); c.call(:pop, "name");

and C++ code is like this:
msgpack::rpc::client c(host,port);
c.call("push", "name", "message").get<bool>();
c.call("pop", "name").get<std::string>();

and Java code is like:
Client c = new Client(host,port);
c.callApply("push", new Object[]{"name","message"});
c.callApply("pop", new Object[]{"name"});
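
To make the push/pop semantics used in the snippets above concrete, here is a minimal in-memory sketch in Python of named queues. It is not erezrdfh's actual code (which is written in Erlang/OTP and served over MessagePack-RPC). The class name `NamedQueues`, the FIFO ordering, the `True` returned by push and the `None` returned when a queue is empty are all assumptions made for illustration.

```python
from collections import defaultdict, deque
import threading


class NamedQueues:
    """Hypothetical in-memory stand-in: one FIFO queue per name."""

    def __init__(self):
        self._queues = defaultdict(deque)
        self._lock = threading.Lock()  # guard against concurrent callers

    def push(self, name, message):
        # Append to the tail of the queue called `name`, creating it on first use.
        with self._lock:
            self._queues[name].append(message)
        return True  # assumption: push reports success as a boolean

    def pop(self, name):
        # Remove and return the oldest message, or None if nothing is queued
        # (assumption: the real server may signal an empty queue differently).
        with self._lock:
            queue = self._queues.get(name)
            if not queue:
                return None
            return queue.popleft()


q = NamedQueues()
q.push("name", "message")
first = q.pop("name")   # the message pushed above
second = q.pop("name")  # the queue is empty now
```

A real server would expose `push` and `pop` as RPC methods; the lock here only stands in for whatever serialization of access the server provides.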

MessagePack is a software suite consisting of a serializer, an RPC layer and an IDL compiler. It is a great library thanks to its performance, simplicity and language diversity. Erlang is also great software that promises scalability, simplicity and robustness. Don't miss these great technologies!

Its performance is also so great that I can't believe it is less than 250 LOC. On my quad-core Phenom machine, with the load-generation tool and the erezrdfh server running on the same machine, push/pop throughput was 20000 qps. Thanks to Erlang/OTP's scalability, erezrdfh should scale further if you install it on a dedicated machine with more cores. The source code includes a basho_bench driver, so just try it!

2010/03/01

Gave a talk at the 4th Tokyo Erlang Workshop

(This post is a translation of my Japanese post)
I participated in the 4th Tokyo Erlang Workshop at Aoyama Oracle Center in Tokyo, Japan. I gave a talk about Yatce in the sessions and went to the party. The organizer was cooldaemon, who did a great job preparing the workshop and the party. Everyone there, including me, very much appreciated his contribution. I also thank Oracle Japan, Inc. for providing its great conference room and for its understanding of open-source communities.

The workshop:
Takeru Inoue's "(maybe) useful Algorithms for distributed storage" was the first and the most difficult talk. He talked about 'Sinfonia', which took the best paper award at SOSP'07, beating Amazon's famous Dynamo paper. This algorithm is very difficult but revolutionary, because it shows a method that is much faster than two-phase commit and much more consistent than vector clocks. He also introduced BDD and ZDD. ZDD is the only algorithm found by a Japanese researcher that appears in Knuth's "The Art of Computer Programming".

Higepon's talk about his implementation of a skip graph key-value store was also excellent. I've been very interested in his simple design for concurrent joins in the skip graph ring. It tolerates three kinds of broken state in the skip list during reads, which buys read throughput. Moreover, he had created a sample bulletin board application whose CSS design was cool. Stay tuned for mio!
My only question was about mio's design for fault tolerance and replication (and I forgot to ask it). And what he said was "Good programmers never forget automation of unit-testing," ...

My talk on Yatce and general Erlang-C bindings was like this (partially Japanese, partially English):

I have a few additional points: a linked-in driver may be the best choice, because ERTS's prim_file and prim_inet (the backends of file I/O and network I/O) are implemented with linked-in drivers. In general, a linked-in driver suits I/O-intensive tasks and a NIF seems to suit CPU-intensive tasks. That will become the usual style. And my talk was Ust'ed.

@sleepy_yoshi's "Badly-educated guy seems in tutorial of Erlang" brought great laughter to a hard-boiled workshop (which consisted of hard algorithms, hard software design and hard implementation). Of course he is far from badly educated.

The party was a great time because there were many great hackers from Erlang and other cool technologies such as the Linux kernel, Linux distributions, OCaml and Python.

2010/02/27

Re: Toke

Matthew replied to me with some objections. I'm posting my answer on my blog both because I failed to post a comment on the original blog post and because it's a bit long, late and not so cool.
I do take objection to your comment in your blog post: “It’s a good example case that not expressing nor publishing in English leads products/opinions shall be ignored, and bad practice”. The simple fact is that I was not aware of yatce, and indeed, googling for “erlang tokyo cabinet” does not return any result for yatce until the 3rd page. If I had known about yatce in advance, then I would have tried to use it in preference to writing Toke: we have absolutely no desire to reinvent the wheel unnecessarily.
Yes, I used 'ignore' to mean 'not being aware', only because I'm not so good at English. I'm sorry.

I don’t however understand your question about 16 tables. I’ve not come across limits on Erlang drivers and ports but maybe there’s something I’ve missed?
No, I vaguely remembered some limitation, having seen that the crypto module uses 16 drivers for faster calculation. But I searched the Erlang documentation again and found no particular statement about a limit on the number of linked-in drivers.

If anything, it’s slightly slower than the current code (as well as being slightly more messy). It would seem the context switch is not hurting me at all. Even for tests which I would have thought would most expose control as being faster (eg lots of gets), it’s in fact no faster at all.
Yes, I was wrong. I assumed too simply that context switching makes ERTS slower, because I believed it breaks some optimization (I don't know the details of what it is) in the Erlang runtime schedulers. They're using spinlocks...

After all, you're totally right, Matthew.

2009/12/23

Toke: an alternative to Yatce

Toke, a Tokyo Cabinet driver for Erlang, has just been announced by LShift Inc., as Voluntas informed me. LShift Inc. is famous for its relationship with Rabbit Tech. and its excellent product RabbitMQ.

The most famous and earliest TC-Erlang driver is tcerl. Their claim is that tcerl is buggy, hard to get working, slow (because it is a port driver), and apparently not even maintained. I agree. And they say they still want to use TC because bare TC is blazingly fast. That's all.

I've been developing yatce since April 2009, and it has been getting more stable since around August 2009. To readers who have followed my Erlang diary since then, Toke seems nothing but a reinvention of the wheel... It's a good example case that not expressing nor publishing in English leads products/opinions shall be ignored, and bad practice. And that is even though @kenji_rikitake introduced me on erlang-questions. orz...

(P.S.: Matthew, the author of Toke, answered that he was not ignoring me; he just wasn't aware of yatce. Ignoring and not being aware are different things. Sorry for my poor English, and thanks for replying!)

BTW, without looking at any specs I went straight to the source code of Toke and found it too simple; no, I was surprised. The pros are:
  • so fast because it uses a linked-in driver
  • the code and interface are so simple that it should be stable and have fewer bugs
  • reliable from a maintenance viewpoint, because Erlang professionals are dedicated full-time to Toke (and other Erlang products)
And as the creator of much the same kind of software, I know the problem well (con):
  • There is only one port per table, so Toke can't realize TC's real potential in a multi-threaded environment (as they say, they got 1/3 of the performance).
I know how we can get multi-threaded: doing it like the crypto module will be the answer. But NIFs will blow them away... As a rival with much the same kind of software, I see many differences:
  • pure difference:
    • Toke requires binaries as keys/values, while yatce can deal with raw terms.
    • Yatce uses port_control for driver-Erlang communication and seems superior in latency (at the design level) because port_control doesn't cause any context switch. Toke uses port_command for driver-Erlang communication, which seems good for throughput (as its insert_async interface indicates... no doubt for RabbitMQ!). Eventual consistency.
  • toke is better:
    • putcat, fold, insert_async and tuning options are supported.
    • good error codes (indicating errors), beautiful interface.
    • Reliable maintainers, more than me :P
  • yatce is better:
    • size, sync, a more TC-like interface.
    • It uses TCADB, so it supports on-disk B+-tree, on-memory hash and on-memory B+-tree (while Toke supports only hash)
    • Multiple tables in one driver (is that good?!)
    • also a NIF'd version!!!
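
The binary versus raw term difference listed above can be sketched as a thin wrapper: serialize any value to bytes before handing it to a store that only accepts binaries. This is a Python analogy, not code from Toke or yatce. `BytesOnlyStore` and `TermStore` are hypothetical names, and `pickle` merely stands in for Erlang's `term_to_binary/1` and `binary_to_term/1`; how yatce actually encodes terms is not shown here.

```python
import pickle


class BytesOnlyStore:
    """Hypothetical store that rejects non-bytes keys and values."""

    def __init__(self):
        self._data = {}

    def insert(self, key, value):
        if not isinstance(key, bytes) or not isinstance(value, bytes):
            raise TypeError("keys and values must be bytes")
        self._data[key] = value

    def get(self, key):
        if not isinstance(key, bytes):
            raise TypeError("keys must be bytes")
        return self._data.get(key)


class TermStore:
    """Wrapper that accepts arbitrary picklable values by serializing them first."""

    def __init__(self, backend):
        self._backend = backend

    def insert(self, key, value):
        # Equal keys must serialize to identical bytes for lookups to work;
        # that holds for the simple tuple keys used below.
        self._backend.insert(pickle.dumps(key), pickle.dumps(value))

    def get(self, key):
        raw = self._backend.get(pickle.dumps(key))
        return None if raw is None else pickle.loads(raw)


store = TermStore(BytesOnlyStore())
store.insert(("user", 42), {"name": "alice", "tags": ["a", "b"]})
value = store.get(("user", 42))
missing = store.get(("user", 43))
```

The trade-off is the usual one: the binary-only interface leaves encoding to the caller and keeps the driver simple, while the wrapped interface is more convenient but pays for serialization on every call.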
This post is a translation of my Japanese post.