Running the Graph 500 code

Graph 500 | large-scale benchmarks

Graph 500 is a supercomputer benchmark whose code is publicly available, so I gave it a try. I don't own a supercomputer, so the test machine is an ordinary laptop running Ubuntu 16.04.

First, grab the v3.0.0 code: https://github.com/graph500/graph500/archive/graph500-3.0.0.tar.gz

Install the MPI packages.

apt install mpich

Extract it.

tar xzf graph500-3.0.0.tar.gz
cd graph500-graph500-3.0.0/

Building as-is fails at link time, so move the -lpthread and -lm flags that the Makefile specifies in its compile arguments to the end of the command line: the linker resolves libraries left to right, so they must come after the objects that reference them.

Build.

make

The build produces four executables. The ones with reference in the name are runnable; the ones with custom are not (they are placeholders for plugging in your own implementation, which is the benchmark's real purpose).

Try a local MPI run with 4 processes.

time mpiexec -n 4 ./graph500_reference_bfs 20

The output is long, so here is just the tail.

SCALE:                          20
edgefactor:                     16
NBFS:                           64
graph_generation:               4.61012
num_mpi_processes:              4
construction_time:              1.36205
bfs  min_time:                  0.277266
bfs  firstquartile_time:        0.283024
bfs  median_time:               0.291698
bfs  thirdquartile_time:        0.302793
bfs  max_time:                  0.349763
bfs  mean_time:                 0.295806
bfs  stddev_time:               0.0167295
min_nedge:                      16775818
firstquartile_nedge:            16775818
median_nedge:                   16775818
thirdquartile_nedge:            16775818
max_nedge:                      16775818
mean_nedge:                     16775818
stddev_nedge:                   0
bfs  min_TEPS:                  4.79634e+07
bfs  firstquartile_TEPS:        5.54037e+07
bfs  median_TEPS:               5.75108e+07
bfs  thirdquartile_TEPS:        5.92735e+07
bfs  max_TEPS:                  6.05044e+07
bfs  harmonic_mean_TEPS:     !  5.67123e+07
bfs  harmonic_stddev_TEPS:      404095
bfs  min_validate:              1.18584
bfs  firstquartile_validate:    1.21473
bfs  median_validate:           1.22582
bfs  thirdquartile_validate:    1.26528
bfs  max_validate:              1.44657
bfs  mean_validate:             1.24752
bfs  stddev_validate:           0.0512569

real    1m50.265s
user    6m41.675s
sys     0m25.753s

It more or less ran.

Looking at Complete Results | Graph 500, even single-node PC-class entries seem to manage around 1 GTEPS, so my max of 0.06 GTEPS feels quite slow. Is that just what the reference implementation gets?
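As a sanity check on these numbers, TEPS is simply the number of traversed edges divided by the BFS time, so the extremes in the log can be reproduced by hand (a rough reconstruction; the harness's exact accounting may differ in detail):

```python
# Values copied from the log above: every BFS traversed the same
# number of edges, so min_TEPS pairs with max_time and vice versa.
nedge = 16775818

min_teps = nedge / 0.349763  # bfs max_time -> bfs min_TEPS
max_teps = nedge / 0.277266  # bfs min_time -> bfs max_TEPS

print("min_TEPS ~ %e" % min_teps)
print("max_TEPS ~ %e" % max_teps)
```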

Implementing the fastest fizzbuzz in Common Lisp

Kazuho's blog had a post like this, so I tried the same thing in Common Lisp.

blog.kazuhooku.com

Implement it in Common Lisp.

(defmacro fizzbuzz (n)
  (format nil "~{~a ~}" (loop for i from 1 below n
                              collect (cond
                                        ((= 0 (mod i 15)) "fizzbuzz")
                                        ((= 0 (mod i 5)) "buzz")
                                        ((= 0 (mod i 3)) "fizz")
                                        (t i)))))

(defun main ()
  (print (fizzbuzz 100)))

Disassemble it with SBCL 1.4.1.

* (disassemble #'main)

; disassembly for MAIN
; Size: 33 bytes. Origin: #x100195397C
; 7C:       498B4C2460       MOV RCX, [R12+96]                ; no-arg-parsing entry point
                                                              ; thread.binding-stack-pointer
; 81:       48894DF8         MOV [RBP-8], RCX
; 85:       488B15A4FFFFFF   MOV RDX, [RIP-92]                ; "1 2 fizz 4 buzz fizz 7 8 fizz buzz 11 fizz 13 14 fizzbuzz 16 17 fizz 19 buzz fizz 22 23 fizz buzz 26 fizz 28 29 fizzbuzz 31 32 fizz 34 buzz fizz 37 38 fizz buzz 41 fizz 43 44 fizzbuzz 46 47 fizz 49 buzz fizz 52 53 fizz buzz 56 fizz 58 59 fizzbuzz 61 62 fizz 64 buzz fizz 67 68 fizz buzz 71 fizz 73 74 fizzbuzz 76 77 fizz 79 buzz fizz 82 83 fizz buzz 86 fizz 88 89 fizzbuzz 91 92 fizz 94 buzz fizz 97 98 fizz "
; 8C:       B902000000       MOV ECX, 2
; 91:       FF7508           PUSH QWORD PTR [RBP+8]
; 94:       B8D8213620       MOV EAX, #x203621D8              ; #<FDEFN PRINT>
; 99:       FFE0             JMP RAX
; 9B:       CC10             BREAK 16                         ; Invalid argument count trap
NIL

The result is a main function that does nothing but print a string; the entire fizzbuzz was computed at macro-expansion time.
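For reference, the string the macro bakes in at expansion time can be reproduced outside Lisp; a quick Python sketch of the same loop (1 up to but excluding n, each element followed by a space, mirroring the ~{~a ~} format directive):

```python
def fizzbuzz_string(n):
    """Build the same string the Lisp macro produces at expansion time."""
    parts = []
    for i in range(1, n):  # 'from 1 below n' in the loop macro
        if i % 15 == 0:
            parts.append("fizzbuzz")
        elif i % 5 == 0:
            parts.append("buzz")
        elif i % 3 == 0:
            parts.append("fizz")
        else:
            parts.append(str(i))
    # "~{~a ~}" puts a space after every element, including the last
    return "".join(p + " " for p in parts)

print(fizzbuzz_string(100))
```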

Execution time of Python multiprocessing and pipes

Measuring the execution time of Python's multiprocessing primitives and pipes.

multiprocessing

code

#!/usr/bin/env python
#
# see also : http://docs.python.jp/2.7/library/multiprocessing.html

from sys import stdout
from time import time
from multiprocessing import Process, Pipe

NUMBER = 1000

def measurement(f):
    time_start = time()
    for n in range(NUMBER):
        f()
    time_end = time()
    return time_end - time_start

def dummy():
    pass

def create_process():
    Process()

def start_process():
    Process().start()

def create_pipe():
    Pipe()

def join_process():
    process = Process()
    process.start()
    process.join()

def null_process(pipe1, number):
    for n in range(number):
        pipe1.recv()

def measurement_simplex_pipe():
    time_start = time()
    pipe0, pipe1 = Pipe()
    process = Process(target=null_process, args=(pipe1, NUMBER))
    process.start()
    for n in range(NUMBER):
        pipe0.send(n)
    process.join()
    time_end = time()
    return time_end - time_start

def echo_process(pipe1, number):
    for n in range(number):
        pipe1.send(pipe1.recv())

def measurement_duplex_pipe():
    time_start = time()
    pipe0, pipe1 = Pipe()
    process = Process(target=echo_process, args=(pipe1, NUMBER))
    process.start()
    for n in range(NUMBER):
        pipe0.send(n)
        pipe0.recv()
    process.join()
    time_end = time()
    return time_end - time_start

stdout.write("pass, %f\n" % measurement(dummy))
stdout.write("create process, %f\n" % measurement(create_process))
stdout.write("start process, %f\n" % measurement(start_process))
stdout.write("join process, %f\n" % measurement(join_process))
stdout.write("create pipe, %f\n" % measurement(create_pipe))
stdout.write("simplex pipe, %f\n" % measurement_simplex_pipe())
stdout.write("duplex pipe, %f\n" % measurement_duplex_pipe())

Python 2.7

Operation  Time (ms)
pass 0.000079
create process 0.003357
start process 0.245117
join process 0.976598
create pipe 0.011329
simplex pipe 0.003099
duplex pipe 0.011246

Python 3.5

Operation  Time (ms)
pass 0.000067
create process 0.003868
start process 0.415079
join process 1.492507
create pipe 0.035219
simplex pipe 0.007089
duplex pipe 0.047015
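A note on the unit in these tables: measurement returns total wall-clock seconds over NUMBER = 1000 calls, and seconds-per-1000-calls is numerically identical to milliseconds per call, which is presumably why the column is labeled ms. A trivial check of that conversion (my reading, not stated explicitly above):

```python
NUMBER = 1000

def per_call_ms(total_seconds):
    """Convert a total over NUMBER calls into milliseconds per call."""
    return total_seconds / NUMBER * 1000.0

# For NUMBER = 1000 the numeric value is unchanged:
# 0.976598 s total over 1000 joins -> 0.976598 ms per join.
print(per_call_ms(0.976598))
```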

pipe

code

#!/usr/bin/env python
#
# see also : http://docs.python.jp/2.7/library/multiprocessing.html
#            http://docs.chainer.org/en/stable/tutorial/basic.html

from sys import stdout
from time import time
from multiprocessing import Process, Pipe
from chainer import Chain, ChainList
from chainer.functions import relu
from chainer.links import Linear

NUMBER = 1000

def measurement(f):
    time_start = time()
    for n in range(NUMBER):
        f()
    time_end = time()
    return time_end - time_start

def null_process(pipe1, number):
    for n in range(number):
        pipe1.recv()

def measurement_simplex_pipe(klass):
    time_start = time()
    pipe0, pipe1 = Pipe()
    process = Process(target=null_process, args=(pipe1, NUMBER))
    process.start()
    for n in range(NUMBER):
        pipe0.send(klass)
    process.join()
    time_end = time()
    return time_end - time_start

def echo_process(pipe1, number):
    for n in range(number):
        pipe1.send(pipe1.recv())

def measurement_duplex_pipe(klass):
    time_start = time()
    pipe0, pipe1 = Pipe()
    process = Process(target=echo_process, args=(pipe1, NUMBER))
    process.start()
    for n in range(NUMBER):
        pipe0.send(klass)
        pipe0.recv()
    process.join()
    time_end = time()
    return time_end - time_start

class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__(
            l1=Linear(784, 100),
            l2=Linear(100, 100),
            l3=Linear(100, 10)
        )

    def __call__(self, x):
        h1 = relu(self.l1(x))
        h2 = relu(self.l2(h1))
        return self.l3(h2)

class MyChainList(ChainList):
    def __init__(self):
        super(MyChainList, self).__init__(
            Linear(784, 100),
            Linear(100, 100),
            Linear(100, 10)
        )

    def __call__(self, x):
        h1 = relu(self[0](x))
        h2 = relu(self[1](h1))
        return self[2](h2)

classes = [Chain, ChainList, MyChain, MyChainList]

for k in classes:
    stdout.write("create instance %s, %f\n" % (k, measurement(lambda: k())))

for k in classes:
    stdout.write("simplex pipe %s, %f\n" % (k, measurement_simplex_pipe(k)))

for k in classes:
    stdout.write("duplex pipe %s, %f\n" % (k, measurement_duplex_pipe(k)))

Python 2.7

Operation  Time (ms)
create instance Chain 0.003006
create instance ChainList 0.001593
create instance MyChain 3.788916
create instance MyChainList 3.780761
simplex pipe Chain 0.005608
simplex pipe ChainList 0.005627
simplex pipe MyChain 0.006494
simplex pipe MyChainList 0.006871
duplex pipe Chain 0.029776
duplex pipe ChainList 0.034982
duplex pipe MyChain 0.033020
duplex pipe MyChainList 0.035376

Python 3.5

Operation  Time (ms)
create instance Chain 0.002134
create instance ChainList 0.001779
create instance MyChain 4.150180
create instance MyChainList 4.152344
simplex pipe Chain 0.013076
simplex pipe ChainList 0.013038
simplex pipe MyChain 0.013047
simplex pipe MyChainList 0.010454
duplex pipe Chain 0.065646
duplex pipe ChainList 0.070427
duplex pipe MyChain 0.076342
duplex pipe MyChainList 0.079283
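One thing worth noting about the pipe rows: measurement_simplex_pipe and measurement_duplex_pipe send the class object itself, not an instance. As far as I know, pickle (which Connection.send uses) serializes a class by reference, recording only its module and name, so every class costs roughly the same to transfer; that would explain why MyChain is no slower through the pipe than bare Chain even though instantiating it is about a thousand times slower. A small check of that pickling behavior (Small and Big are hypothetical classes for illustration):

```python
import pickle

class Small(object):
    pass

class Big(object):
    def __init__(self):
        self.data = list(range(100000))  # large instance state

# Classes pickle by reference (module + name only), so both payloads
# are tiny and nearly the same size, regardless of instance state.
print(len(pickle.dumps(Small)), len(pickle.dumps(Big)))
```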

Benchmarking the ZenBook 3

I benchmarked the ZenBook 3 with SysBench.

Installation

apt install sysbench

File I/O

Prepare the test files.

cd /tmp
sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=rndrw prepare
sysbench 0.4.12:  multi-threaded system evaluation benchmark

128 files, 24576Kb each, 3072Mb total
Creating files for the test...

Random read/write

Run the benchmark as shown in the help.

sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=rndrw run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16

Extra file open flags: 0
128 files, 24Mb each
3Gb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Done.

Operations performed:  6013 Read, 4009 Write, 12805 Other = 22827 Total
Read 93.953Mb  Written 62.641Mb  Total transferred 156.59Mb  (16.71Mb/sec)
 1069.46 Requests/sec executed

Test execution summary:
    total time:                          9.3711s
    total number of events:              10022
    total time taken by event execution: 0.0953
    per-request statistics:
         min:                                  0.00ms
         avg:                                  0.01ms
         max:                                  5.05ms
         approx.  95 percentile:               0.01ms

Threads fairness:
    events (avg/stddev):           626.3750/185.14
    execution time (avg/stddev):   0.0060/0.00

I can't really read this output yet, but I specified 3 GB and it doesn't look like 3 GB was actually accessed? Looking at the options again, the test issues a fixed 10000 random requests of 16 KB blocks, which is only about 160 MB; maybe the read/write sizes need adjusting. Anyway, varying the thread count, the best numbers came at 8192.
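At least part of the output can be decoded mechanically: the throughput figure is total transferred divided by total time, and the request rate is (reads + writes) divided by total time. Reconstructing the 16-thread numbers above:

```python
# From the 16-thread rndrw run above.
total_transferred_mb = 156.59   # "Total transferred 156.59Mb"
total_time_s = 9.3711           # "total time: 9.3711s"
reads, writes = 6013, 4009

throughput = total_transferred_mb / total_time_s   # "(16.71Mb/sec)"
req_rate = (reads + writes) / total_time_s         # "1069.46 Requests/sec"

print("throughput: %.2f Mb/sec" % throughput)
print("requests:   %.2f /sec" % req_rate)
```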

sysbench --num-threads=8192 --test=fileio --file-total-size=3G --file-test-mode=rndrw run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 8192

Extra file open flags: 0
128 files, 24Mb each
3Gb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Done.

Operations performed:  4091 Read, 6009 Write, 11916 Other = 22016 Total
Read 63.922Mb  Written 93.891Mb  Total transferred 157.81Mb  (448.62Mb/sec)
28711.73 Requests/sec executed

Test execution summary:
    total time:                          0.3518s
    total number of events:              10100
    total time taken by event execution: 275.5066
    per-request statistics:
         min:                                  0.00ms
         avg:                                 27.28ms
         max:                                148.04ms
         approx.  95 percentile:             134.86ms

Threads fairness:
    events (avg/stddev):           1.2329/4.82
    execution time (avg/stddev):   0.0336/0.04

Around 56 MB/s. Is it really running 8192 threads, though?

Sequential read

sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=seqrd run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16

Extra file open flags: 0
128 files, 24Mb each
3Gb total file size
Block size 16Kb
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing sequential read test
Threads started!
FATAL: Too large position discovered in request!
Done.

Operations performed:  196607 Read, 0 Write, 0 Other = 196607 Total
Read 3Gb  Written 0b  Total transferred 3Gb  (16.015Gb/sec)
1049527.74 Requests/sec executed

Test execution summary:
    total time:                          0.1873s
    total number of events:              196607
    total time taken by event execution: 2.5037
    per-request statistics:
         min:                                  0.00ms
         avg:                                  0.01ms
         max:                                 41.00ms
         approx.  95 percentile:               0.00ms

Threads fairness:
    events (avg/stddev):           12287.9375/3411.79
    execution time (avg/stddev):   0.1565/0.04

Around 2 GB/s. The spec sheet says the drive is on PCIe 3.0 x4, whose one-way theoretical bandwidth is 4 GB/s, so I can't tell whether it's hitting its real-world limit, but at least the measurement seems to land in the right order of magnitude.

Sequential write

sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=seqwr run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16

Extra file open flags: 0
128 files, 24Mb each
3Gb total file size
Block size 16Kb
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing sequential write (creation) test
Threads started!
Done.

Operations performed:  0 Read, 196608 Write, 128 Other = 196736 Total
Read 0b  Written 3Gb  Total transferred 3Gb  (581.51Mb/sec)
37216.69 Requests/sec executed

Test execution summary:
    total time:                          5.2828s
    total number of events:              196608
    total time taken by event execution: 25.3832
    per-request statistics:
         min:                                  0.00ms
         avg:                                  0.13ms
         max:                                 48.05ms
         approx.  95 percentile:               0.03ms

Threads fairness:
    events (avg/stddev):           12288.0000/1694.04
    execution time (avg/stddev):   1.5864/0.01

Around 73 MB/s.

Cleanup

sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=rndrw cleanup

Conclusion

I'll need to do this much more carefully to get trustworthy numbers...

Using Docker on Ubuntu 16.04

I want to do various things with Circle CI and the like, so I'm finally installing Docker on Ubuntu 16.04. First, install it as described in Install Docker on Ubuntu - Docker and run hello-world.

  • apt install docker.io
  • service docker start
  • usermod -aG docker $USER
  • Log out and log back in
  • docker run hello-world

If you see something like this, it worked.

Hello from Docker!

So much for "first"; that's already all there is to it.

Couldn't use the Raspberry Pi's Bluetooth from Common Lisp

I want to use the Raspberry Pi 3 Model B's Bluetooth from Common Lisp. It should just be a matter of going through a bluez wrapper, but can c2ffi even run on this machine?

First, try loading hu.dwim.bluez.

ros use sbcl-bin
ros run
(ql:quickload :hu.dwim.bluez)
; caught ERROR:
;   READ error during COMPILE-FILE: unmatched close parenthesisLine: 82, Column: 19, File-Position: 2874Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/toolchain/c-toolchain.lisp" {520B3241}>
While evaluating the form starting at line 30, column 0
  of #P"/home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd":

debugger invoked on a LOAD-SYSTEM-DEFINITION-ERROR: Error while trying to load definition for system cffi-libffi from pathname /home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd: COMPILE-FILE-ERROR while compiling #<CL-SOURCE-FILE "cffi-toolchain" "toolchain" "c-toolchain">

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [RETRY                        ] Retry compiling #<CL-SOURCE-FILE "cffi-toolchain" "toolchain" "c-toolchain">.
  1: [ACCEPT                       ] Continue, treating compiling #<CL-SOURCE-FILE "cffi-toolchain" "toolchain" "c-toolchain"> as having been successful.
  2: [RETRY                        ] Retry EVAL of current toplevel form.
  3: [CONTINUE                     ] Ignore error and continue loading file "/home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd".
  4: [ABORT                        ] Abort loading file "/home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd".
  5:                                 Retry ASDF operation.
  6: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
  7:                                 Give up on "hu.dwim.bluez"
  8:                                 Exit debugger, returning to top level.

((FLET #:H0 :IN LOAD-ASD) #<COMPILE-FILE-ERROR {5219A129}>)
0] 

It's a compile error about an unmatched close parenthesis. What is going on...

Installing c2ffi

I want to build GitHub - rpav/c2ffi: Clang-based FFI wrapper generator, so install llvm and related packages.

sudo apt-get install clang llvm libtool-bin libclang-dev

The README on master says 3.5 isn't supported; what now? Oh, there's an llvm-3.5 branch.

git clone https://github.com/rpav/c2ffi
cd c2ffi
git checkout llvm-3.5
./autogen
mkdir build
cd build
../configure
make
./src/c2ffi -h
make install

Attempting to fix hu.dwim.bluez

First, grab the source.

git clone https://github.com/attila-lendvai/hu.dwim.bluez
cd .roswell/local-projects/
ln -s ~/hu.dwim.bluez/
cd
ros run
(ql:quickload :hu.dwim.bluez)
To load "hu.dwim.bluez":
  Load 1 ASDF system:
    hu.dwim.bluez
; Loading "hu.dwim.bluez"
; 
; caught ERROR:
;   READ error during COMPILE-FILE: unmatched close parenthesisLine: 82, Column: 19, File-Position: 2874Stream: #<SB-INT:FORM-TRACKING-STREAM for "file /home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/toolchain/c-toolchain.lisp" {516C5241}>
While evaluating the form starting at line 30, column 0
  of #P"/home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd":

debugger invoked on a LOAD-SYSTEM-DEFINITION-ERROR: Error while trying to load definition for system cffi-libffi from pathname /home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd: COMPILE-FILE-ERROR while compiling #<CL-SOURCE-FILE "cffi-toolchain" "toolchain" "c-toolchain">

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [RETRY                        ] Retry compiling #<CL-SOURCE-FILE "cffi-toolchain" "toolchain" "c-toolchain">.
  1: [ACCEPT                       ] Continue, treating compiling #<CL-SOURCE-FILE "cffi-toolchain" "toolchain" "c-toolchain"> as having been successful.
  2: [RETRY                        ] Retry EVAL of current toplevel form.
  3: [CONTINUE                     ] Ignore error and continue loading file "/home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd".
  4: [ABORT                        ] Abort loading file "/home/pi/.roswell/lisp/quicklisp/dists/quicklisp/software/cffi_0.17.1/cffi-libffi.asd".
  5:                                 Retry ASDF operation.
  6: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
  7:                                 Give up on "hu.dwim.bluez"
  8:                                 Exit debugger, returning to top level.

((FLET #:H0 :IN LOAD-ASD) #<COMPILE-FILE-ERROR {517AB901}>)
0] 

Looking more closely, this is actually cffi's problem...

Get cffi from GitHub

git clone https://github.com/cffi/cffi
cd ~/.roswell/local-projects
ln -s ~/cffi
cd
ros run
(ql:quickload :hu.dwim.bluez)
To load "hu.dwim.bluez":
  Load 1 ASDF system:
    hu.dwim.bluez
; Loading "hu.dwim.bluez"
; pkg-config libffi --cflags

.; cc -marm -o /home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel-tmp7LQ0A0VI -I/home/pi/cffi/ /home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel.c
/home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel.c: In function ‘main’:
/home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel.c:82:41: error: ‘FFI_UNIX64’ undeclared (first use in this function)
   fprintf(output, "%"PRIiMAX, (intmax_t)FFI_UNIX64);
                                         ^
/home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel.c:82:41: note: each undeclared identifier is reported only once for each function it appears in

debugger invoked on a CFFI-GROVEL:GROVEL-ERROR: Subprocess #<UIOP/RUN-PROGRAM::PROCESS-INFO {524CBF21}>
 with command ("cc" "-marm" "-o" "/home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel-tmp7LQ0A0VI" "-I/home/pi/cffi/" "/home/pi/.cache/common-lisp/sbcl-1.3.9-linux-arm/sbcl-bin-1.3.9/home/pi/cffi/libffi/libffi-types__grovel.c")
 exited with error code 1

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [RETRY                        ] Retry PROCESS-OP on #<GROVEL-FILE "cffi-libffi" "libffi" "libffi-types">.
  1: [ACCEPT                       ] Continue, treating PROCESS-OP on #<GROVEL-FILE "cffi-libffi" "libffi" "libffi-types"> as having been successful.
  2:                                 Retry ASDF operation.
  3: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
  4: [ABORT                        ] Give up on "hu.dwim.bluez"
  5:                                 Exit debugger, returning to top level.

(CFFI-GROVEL:GROVEL-ERROR "~a" #<UIOP/RUN-PROGRAM:SUBPROCESS-ERROR {524CC061}>)
0] 

FFI_UNIX64 apparently doesn't exist in libffi on 32-bit ARM.

diff --git a/libffi/libffi-types.lisp b/libffi/libffi-types.lisp
index 939a87b..1b47c98 100644
--- a/libffi/libffi-types.lisp
+++ b/libffi/libffi-types.lisp
@@ -69,6 +69,7 @@
 (cenum abi
  ((:default-abi "FFI_DEFAULT_ABI"))
  ((:sysv "FFI_SYSV"))
+ #-arm
  ((:unix64 "FFI_UNIX64")))
 
 (ctype ffi-abi "ffi_abi")

After applying the fix, load hu.dwim.bluez once more.

(ql:quickload :hu.dwim.bluez)
; Loading "hu.dwim.bluez"
; CFFI/C2FFI is generating the file #P"/home/pi/hu.dwim.bluez/c2ffi-spec/bluez.arm-pc-linux-gnu.lisp"
........
debugger invoked on a COMMON-LISP:SIMPLE-ERROR: Key :STORAGE-CLASS not found in json entry ((:TAG . "function") (:NAME . "close") (:LOCATION . "/usr/include/unistd.h:353:12") (:VARIADIC) (:INLINE) (:STORAGE--CLASS . "extern") (:PARAMETERS ((:TAG . "parameter") (:NAME . "__fd") (:TYPE (:TAG . ":int")))) (:RETURN-TYPE (:TAG . ":int"))).

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [RETRY                        ] Retry GENERATE-LISP-OP on #<C2FFI-FILE "hu.dwim.bluez" "c2ffi-spec" "bluez.h">.
  1: [ACCEPT                       ] Continue, treating GENERATE-LISP-OP on #<C2FFI-FILE "hu.dwim.bluez" "c2ffi-spec" "bluez.h"> as having been successful.
  2:                                 Retry ASDF operation.
  3: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
  4: [ABORT                        ] Give up on "hu.dwim.bluez"
  5:                                 Exit debugger, returning to top level.

(CFFI/C2FFI::JSON-VALUE #<unavailable argument> #<unavailable argument> :OTHERWISE COMMON-LISP:NIL)
0] 

:STORAGE-CLASS not found? The JSON entry actually contains :STORAGE--CLASS (double hyphen), so the key name doesn't match what the loader expects.

This is just not going smoothly...

Common Lisp starts slowly on the Raspberry Pi

Startup felt slow while poking around in "Raspberry Pi 3 Model B買ってCommon Lisp入れた - gos-k’s blog", so here are rough timings for the implementations whose versions I could confirm.

ros use sbcl-bin
time ros run -e "(quit)"
real    0m26.860s
user    0m24.720s
sys     0m0.480s
ros use ccl-bin
time ros run -e "(quit)"
real    0m8.136s
user    0m7.890s
sys     0m0.170s
ros use ecl
time ros run -e "(quit)"
real    0m7.412s
user    0m7.050s
sys     0m0.100s

Is sbcl extremely slow?

On the PC side, casually run strace and casually wc the output.

strace -o sbcl-strace-quit.txt ros run -L sbcl-bin -e "(quit)"
strace -o ccl-strace-quit.txt ros run -L ccl-bin -e "(quit)"
strace -o ecl-strace-quit.txt ros run -L ecl -e "(quit)"
wc *-strace-quit.txt
   1407    9939  106260 ccl-strace-quit.txt
   6254   38242  632461 ecl-strace-quit.txt
   7705   46700  728444 sbcl-strace-quit.txt

ccl's trace is on the small side. Same casual strace and wc on the raspi side.

   1216    7387   82418 ccl-strace-quit.txt
   6867   40908  699180 ecl-strace-quit.txt
  25343  151090 1769082 sbcl-strace-quit.txt

The others differ by a few tens of percent, but sbcl's line count differs by more than 3x; what is that about? Even granting the different processor architecture, does the same implementation on the same OS really make that many more syscalls?

I also started python, ruby, and node on the raspi. I didn't time them, but there is essentially no perceptible lag between pressing enter and the interpreter coming up, so they must be far under a second.

sbclの結果をざっと眺めると cacheflush を連発してるからこれが遅いのか? あとopenの前にパスの全部にlstat出しまくってるのもこれなんんだろ? まあいずれにしてもCommon Lispは現状処理系選んで起動に8秒か。。。