2014-06-30

Named let performance

When I was playing around with Sagittarius compiler, I've found that some of named let expression was compiled to (looks) slow code. Following piece of code is one of them;
(lambda (pred lst)
  (let loop ((lst lst))
    (cond ((null? lst) '())
          ((pred (car lst)) (loop (cdr lst)))
          (else (cons (car lst) (loop (cdr lst)))))))
You can see this type of code anywhere (although it's not tail recursive so I don't like it). And you would expect this not to have any performance issue other than stack consumption because of the recursive call.

Now, I was also thinking like that however things are bit worse. The compiled code describes everything;
;; size: 3
;;    0: CLOSURE #<code-builder #f (2 0 0)>
;;   size: 13
;;      0: UNDEF
;;      1: PUSH
;;      2: BOX(0) !!!! IMPLICIT BOXING !!!!
;;      3: LREF_PUSH(0)
;;      4: LREF_PUSH(2)
;;      5: CLOSURE #<code-builder loop (1 0 2)> !!!! CLOSURE CREATION !!!!
;;     size: 26
;;        0: LREF(0)
;;        1: BNNULL 3                  ; ((null? lst) '())
;;        3: CONST_RET ()
;;        5: FRAME 4
;;        7: LREF_CAR_PUSH(0)
;;        8: FREF(1)
;;        9: CALL(1)
;;       10: TEST 6                    ; ((pred (car lst)) (loop (cdr l ...
;;       12: LREF_CDR_PUSH(0)
;;       13: FREF(0)
;;       14: UNBOX
;;       15: LOCAL_TAIL_CALL(1)
;;       16: RET
;;       17: LREF_CAR_PUSH(0)
;;       18: FRAME 5
;;       20: LREF_CDR_PUSH(0)
;;       21: FREF(0)
;;       22: UNBOX
;;       23: LOCAL_CALL(1)
;;       24: CONS
;;       25: RET
;;      7: LSET(2)
;;      8: LREF_PUSH(1)
;;      9: LREF(2)
;;     10: UNBOX
;;     11: LOCAL_TAIL_CALL(1)
;;     12: RET
;;    2: RET
I've put comments where it causes performance issue.

The first one, implicit boxing, is because the loop is transformed to letrec and it needs to refer loop itself inside of it. The second one is creating a procedure each time the top procedure (I haven't named it though) is called because it has a free variable pred inside. Sounds reasonable? No, it doesn't at all to me now. If you write a piece of code with named let, then you would expect the calling named procedure would be compiled to just a jump. However this is compiled to a procedure call. Well, on the other hand this is not a tail recursive call so you wouldn't expect to be a jump though. Actually if the code was tail recursive then compiler would compile it to a simple jump without implicit boxing and creating a closure.

Now, then should users always be careful or adjust their code how compiler compiles to maximumised performance instructions? My answer is yes and no. In above case, I expect at least compiler should emit non implicit boxing code. In general, compiler should be smart enough to emit good quality code. Then problem is how? ... that's something I need to think, though.

2014-06-28

SRFI-17の紹介

(LISP Library 365参加エントリ)

SRFI-17は一般化したset!です。発案者はKawaの製作者Per Bothner氏ですね。(他のSRFIでも既にGuileやらLarcenyやらの製作者が出ていたのですが、文量の水増しです。)

CL使ってる方はsetfに近いものだと思ってもらえればいいです。使い方は以下のようになります。
(set! (car a) 'b) ;; -> (set-car! a 'b)
(set! (cdr a) 'c) ;; -> (set-cdr! a 'b)

(set! (vector-ref v 0) 'a) ;; -> (vector-set! v 0 'a)
実際にこのように展開されるかは実装次第なところもあるのですが、SRFI内では以下のようになると書かれています。
(set! (proc args ...) value) 
 ;; -> ((setter proc) args ... value)
例えば、Gaucheでは(set! (car a) 'b))はSRFIのように展開されますが、Sagittariusはset-car!が直接呼ばれるようにコンパイルされます(それがいいかどうかは別の問題ですが)。他の処理系は調べてません。

あまりに文量が少ないので多少余計な情報も。

このSRFIはどうやらかなり不評だったようです。
I have gotten no support for my proposal. While the srfi process allows for srfis that are controversal and don't have consensus, at this stage it seems kind of pointless to pursue it. I have no particular desire to push for something most people dislike, at least in this context. So if anybody out there thinks the generalized set! is a good idea, now is the time to speak up.
http://srfi.schemers.org/srfi-17/mail-archive/msg00049.html
この投稿が投げられる前にあったMatthias Felleisen氏が投げた投稿から始まる議論(罵り合い?)がなかなか面白いです。個人的には便利なSRFIだと思っているのですが、Scheme的ではないかもしれませんね。

今回はSRFI-17を紹介しました。

2014-06-15

How to emulate pthread_cond_wait with SRFI-18

I was implementing multi thread queue like Gauche has but in pure Scheme with SRFI-18. Then I've got confused how to wait condition variable. If it's POSIX then you can use pthread_cond_wait. However it's SRFI-18 and it doesn't provide a procedure to wait condition variable directly. Well, in the end there are some example code which explains how to do it but at that moment I couldn't figure it out. So there may be some people who also have the same issue as me. (It's basically because of my lack of knowledge of multi threading and, well, if there aren't at least this could be my memo to remember...)

The answer was a combination of mutex-unlock! and mutex-lock!. I was always thinking that pthread_cond_wait or similar procedure is the only way to wait for a condition variable. I skipped reading this section (and caused my confusion...)
NOTE: mutex-unlock! is related to the "wait" operation on condition variables available in other thread systems. The main difference is that "wait" automatically locks mutex just after the thread is unblocked. This operation is not performed by mutex-unlock! and so must be done by an explicit call to mutex-lock!. This has the advantages that a different timeout and exception handler can be specified on the mutex-lock! and mutex-unlock! and the location of all the mutex operations is clearly apparent. A typical use with a condition variable is:
(I don't complain this is very confusing but don't you think?) So what I needed to do is instead of searching something that specifically says condition-wait! or something, I needed to use mutex-unlock! and mutex-lock!. Following piece of code is more concrete example;
(import (rnrs) (srfi :18))

(define-record-type (<foo> make-foo foo?)
  (fields (immutable mutex foo-mutex)
          (immutable cv    foo-cv)
          (mutable   count foo-count foo-count-set!))
  (protocol (lambda (p)
              (lambda ()
                (p (make-mutex)
                   (make-condition-variable)
                   0)))))

;; Utilities for above foo record
(define-syntax lock-foo!
  (syntax-rules ()
    ((_ foo) (mutex-lock! (foo-mutex foo)))))
(define-syntax unlock-foo!
  (syntax-rules ()
    ((_ foo) (mutex-unlock! (foo-mutex foo)))))
(define (with-locking-foo foo thunk)
  (dynamic-wind
      (lambda () (lock-foo! foo))
      thunk 
      (lambda () (unlock-foo! foo))))

(define-syntax wait-cv
  (syntax-rules ()
    ((_ foo)
     (let ((r (mutex-unlock! (foo-mutex foo) (foo-cv foo))))
       (display "unlocked! with cv") (newline)
       ;; mutex is not locked so lock it if you need it.
       (mutex-lock! (foo-mutex foo))
       r))))

(define-syntax notify-foo
  (syntax-rules ()
    ((_ foo)
     (condition-variable-broadcast! (foo-cv foo)))))

(let ()
  (define foo (make-foo))

  (define (producer)
    ;; increment count
    (with-locking-foo foo
     (lambda ()
       (display "Increment count!") (newline)
       (foo-count-set! foo 1)
       (notify-foo foo))))

  (define (consumer)
    ;; wait until the count is one
    (with-locking-foo foo 
     (lambda ()
       (let loop ()
         (cond ((zero? (foo-count foo))
                (display "It's zero need to wait!") (newline)
                (if (wait-cv foo)
                    (loop)
                    (error #f "something went wrong")))
               (else (foo-count foo)))))))

  (let ((ct (thread-start! (make-thread consumer)))
        (pt (make-thread producer)))
    ;; consumer is waiting but make sure
    (thread-sleep! 10)
    ;; let producer increment
    (thread-start! pt)
    (thread-join! pt)
    (display (thread-join! ct)) (newline)))

#|
It's zero need to wait!
Increment count!
unlocked! with cv
1
|#
This just tries to emulate pthread_cond_wait using mutex-unlock! and mutex-lock!. The wait-cv is the emulation macro. notify-foo assumes that given foo
's mutex is locked. It took couple of hours to figure out this simple thing for me...
I haven't met any case that this mutex model is convenient but if this how it is I need to get used to it. (though, my case was just the name confusion...)

2014-06-14

SRFI-16の紹介

(LISP Library 365参加エントリ)

SRFI-16は可変長引数を受け取る手続きに対する構文です。要するにcase-lambdaです。R6RSから標準になった構文なので、あまり目新しいものは無いかもしれません。

以下のように使います。
;; because it's very famous I've become a bit lazy
;; so just a bit of taste...
(define foo
  (case-lambda
    ((c key) (foo c key #f))
    ((c key default) (ref c key default))))
上記の(いい加減な)例では、束縛される手続きは2つもしくは3つの引数を受け取ります。2つの場合は3つ目の引数にデフォルトの値として#fを設定して自身を呼びなおします。以下のように書くのとそんなに変わりません。
(define (foo c key . rest)
  (let ((default (if (null? rest) #f (car rest))))
    (ref c key default)))
ちなみに、参照実装だと上記のcase-lambdaは以下のように展開されます。
(define foo
  (lambda args
    (let ((l (length args)))
      (if (= l (length '(c key)))
          (apply (lambda (c key) (foo c key #f)) args)
          (if (= l (length '(c key default)))
              (apply (lambda (c key default) (ref c key default)) args)
              (error #f "Wrong number of arguments to CASE-LAMBDA"))))))
これだと多少手間が減る程度の恩恵しかありません。(もちろん処理系によっては引数のパックとapplyが異常なまでに安い処理系もあるかもしれませんが・・・) また、コンパイル時に手続きの呼び出しを行わない処理系だと、lengthの呼び出しが定義された分だけ呼び出されるので精神衛生上あまり好ましくありません。

さすがにこれはどうかとも思ったらしく、以前syntax-rulesのみを使った多少効率のいいcase-lambdaの実装を作っていました。(ちなみに、Sagittariusではもう少し前に書き直しています。やってることは一緒なのでsyntax-rules版にしてもいいかとは思いますが。)

これならば引数がパックされるだけでapplyを呼び出すことがないので、参照実装よりは多少効率がいいはずです(オプション引数として受け取るのと同等程度)。もう少し進めるのであれば、組み込みの構文にしてしまってコンパイラが受け取る引数をうまいこと解決するということも可能かもしれません。

今回はSRFI-16を紹介しました。RnRS準拠でもっと効率のいい実装があるよ!という方がいらしたらご一報くださいm(_ _)m