2009/12/23

Toke: an alternative to Yatce

Toke, a TokyoCabinet driver for Erlang, has just been announced by LShift Inc., as Voluntas informed me. LShift Inc. is well known for its relationship with Rabbit Tech. and for the excellent product RabbitMQ.

The earliest and best-known TC-Erlang driver is tcerl. Their claim is that tcerl is buggy, hard to get working, slow (because of its port driver), and apparently no longer maintained. I agree. They also say they still want to use TC because bare TC is blazingly fast. That's all.

I've been developing yatce since April 2009, and it has been getting more stable since around August 2009. To anyone who has read my Erlang diary since then, Toke looks like nothing but a reinvention of the wheel... It's a good example of how products and opinions that are not expressed or published in English end up being ignored, so staying out of English is bad practice. And that's even though @kenji_rikitake introduced me on erlang-questions. orz...

(P.S.: Matthew, the author of Toke, replied that he was not ignoring me; he simply wasn't aware of yatce. Ignoring something and not being aware of it are different things. Sorry for my poor English, and thanks for replying!)

BTW, without looking at any specs, I went straight to the source code of Toke and was surprised by how simple it is. The pros are:
  • It is very fast because it uses a linked-in driver.
  • The code and the interface are so simple that it should be stable and have few bugs.
  • It is reliable from a maintenance viewpoint, because Erlang professionals work full-time on Toke (and other Erlang products).
And as the creator of a very similar piece of software, I know the problem well (the con):
  • There is only one port per table, so Toke can't realize TC's real potential in a multi-threaded environment (as they say, they got 1/3 of the performance).
I know how to make it multi-threaded: doing it the way the crypto module does will be the answer. But NIFs will blow all of that away... As the author of a rival, very similar piece of software, I see many differences:
  • Pure differences:
    • Toke requires binaries as keys and values, while yatce can handle raw terms.
    • Yatce uses port_control for driver-Erlang communication, which seems superior in latency (at the design level) because port_control doesn't cause any context switch. Toke uses port_command for driver-Erlang communication, which seems good for throughput (as its insert_async interface indicates... no doubt that is for RabbitMQ!). Eventual consistency.
  • Toke is better:
    • putcat, fold, insert_async, and tuning options are supported.
    • Good error codes (indicating errors) and a beautiful interface.
    • Reliable maintainers, more than me :P
  • Yatce is better:
    • size, sync, and a more TC-like interface.
    • It uses TCADB, so it supports on-disk B+ tree, on-memory hash, and on-memory B+ tree (while Toke supports only hash).
    • Multiple tables in one driver (is that good?!)
    • There is also a NIF'd version!!!
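To make the binary-versus-term difference above concrete, here is a minimal sketch of what a caller might do in front of a binary-only store: serialize each term with the standard BIFs term_to_binary/1 and binary_to_term/1. The module and function names (term_kv_sketch, encode/1, decode/1, roundtrip/1) are hypothetical and belong to neither Toke nor yatce; no driver is loaded here, only the encoding step is shown.

```erlang
-module(term_kv_sketch).
-export([encode/1, decode/1, roundtrip/1]).

%% Hypothetical helpers for illustration only; not part of Toke or yatce.

%% Turn any Erlang term into a binary, as a binary-only
%% key/value interface would require from its caller.
encode(Term) ->
    term_to_binary(Term).

%% Turn a stored binary back into the original term.
decode(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).

%% Check that a term survives the encode/decode round trip.
roundtrip(Term) ->
    Term =:= decode(encode(Term)).
```

With a driver that accepts raw terms, this step would presumably happen inside the driver layer instead of in the caller's code.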
This post is a translation of my Japanese post.