July 2011

So the glibc 2.14 release notes say that the malloc hooks are thread-unsafe
and will be removed before long.

http://sourceware.org/ml/libc-alpha/2011-05/msg00103.html

* The malloc hook implementation is marked deprecated and will be removed
from the default implementation in the next version. The design never
worked ever since the introduction of threads. Even programs which do
not create threads themselves can use multiple threads created internally.




Someone complained about this on libc-alpha recently:


From: Emery Berger
Date: Tue, 26 Jul 2011 16:41:14 +0200
Message-ID:
Subject: glibc malloc hook deprecation considered harmful
To: libc-alpha@sourceware.org

Hi all,

I see that malloc hooks have been deprecated and are scheduled for removal:

http://sourceware.org/ml/libc-alpha/2011-05/msg00103.html

Unless there is a better replacement in the offing, I would like to
argue that they remain.

The claim has been made that these hooks do not work in the presence
of threads. I agree that it is unsafe for a program to change these on
the fly.

However, as far as I am aware, they are the only safe / reliable means
of intercepting malloc for an entire program, at least on Linux
platforms.

Invoking them at startup (via init_hook) works well. Hoard relies on
it (www.hoard.org), and it works just fine for massively multithreaded
programs used in a large variety of industrial settings (British
Telecom, Credit Suisse, Reuters, to cite just a few). In addition to
Hoard and DieHard (www.diehard-software.org), I am sure there are
other allocators that rely on malloc hooks.

Without malloc hooks, there is no way to be certain that all
allocation functions have been intercepted. Authors of malloc
implementations otherwise have to play catch-up with every new version
of glibc to (hopefully) intercept all functions that accessed malloc
internally (via __libc_malloc or somesuch). The hooks work well --
please keep them or provide a suitable replacement.

Best,
-- Emery Berger


I was short on material for my talk, but I somehow managed to fill the 30 minutes.
My apologies to anyone who was expecting something denser.



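As I thought, --disable-gems matters.
Also, I get the feeling that mutex performance at the head of the 1.9.2 branch has dropped compared with 1.9.2-p180.
Why does a benchmark run casually take something like 3000 seconds? The measurement never finishes!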


% /usr/bin/ruby ../benchmark/driver.rb -v --executables="192r31932-nogems::~/ruby/bin/ruby-192-r31932 --disable-gems; 192r31932::~/ruby/bin/ruby-192-r31932; trunk-nogems::~/ruby/bin/ruby-trunk --disable-gems; trunk::~/ruby/bin/ruby-trunk -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext --" --pattern='bm_vm_thread3' --directory=../benchmark -r 1
Sat Jul 02 20:27:37 +0900 2011
target 0: 192r31932-nogems
target 1: 192r31932
target 2: trunk-nogems
target 3: trunk


name 192r31932-nogems 192r31932 trunk-nogems trunk
------------------------------------------------------------------
app_answer 0.138 0.139 0.155 0.227
app_erb 2.493 2.537 2.538 2.795
app_factorial 2.596 2.602 2.481 2.980
app_fib 1.826 1.822 1.766 1.961
app_mandelbrot 3.218 3.277 3.470 3.685
app_pentomino 32.446 32.607 32.352 36.201
app_raise 1.129 1.153 1.009 1.381
app_strconcat 2.725 2.795 2.897 2.999
app_tak 2.808 2.773 2.721 2.813
app_tarai 2.154 2.139 2.160 2.251
app_uri 1.636 1.623 1.651 1.953
io_file_create 2.445 2.419 1.934 2.354
io_file_read 7.095 7.250 7.816 8.636
io_file_write 2.280 2.243 2.204 2.328
io_select 2.242 2.277 2.555 2.753
io_select2 6.327 6.385 7.355 6.452
io_select3 0.843 0.857 0.890 0.995
loop_for 2.977 2.986 2.867 2.997
loop_generator 1.107 1.074 0.896 0.969
loop_times 2.587 2.611 2.692 2.745
loop_whileloop 1.591 1.594 1.572 1.622
loop_whileloop2 0.342 0.346 0.321 0.404
so_ackermann 1.992 2.010 1.978 2.165
so_array 2.368 2.391 2.414 2.562
so_binary_trees 0.888 0.904 0.893 1.057
so_concatenate 7.344 7.307 7.466 7.634
so_count_words 0.537 0.564 0.529 0.615
so_exception 2.100 2.150 2.001 2.181
so_fannkuch 2.785 2.822 2.801 3.635
so_fasta 4.173 4.222 4.316 4.853
so_k_nucleotide 3.348 3.375 3.232 3.463
so_lists 1.666 1.657 1.655 1.751
so_mandelbrot 11.346 11.704 11.628 11.946
so_matrix 1.515 1.548 1.530 1.628
so_meteor_contest 9.426 8.618 9.199 9.242
so_nbody 8.235 8.526 8.310 8.488
so_nested_loop 2.492 2.455 2.464 2.510
so_nsieve 7.357 7.330 7.616 7.456
so_nsieve_bits 4.865 4.914 4.875 5.058
so_object 1.741 1.772 1.706 1.833
so_partial_sums 10.470 10.792 10.632 11.158
so_pidigits 1.718 1.747 1.665 2.002
so_random 1.816 1.894 1.953 2.034
so_reverse_complement 3.146 3.320 3.518 3.731
so_sieve 2.136 2.156 2.068 2.208
so_spectralnorm 7.949 8.035 7.764 7.464
vm1_block* 3.850 3.512 3.994 3.595
vm1_const* 0.959 0.989 1.047 1.140
vm1_ensure* 1.029 1.008 1.148 1.177
vm1_ivar* 1.099 1.274 0.850 0.619
vm1_ivar_set* 1.876 1.291 1.094 1.130
vm1_length* 1.397 1.474 1.277 0.983
vm1_neq* 0.937 0.887 1.054 0.514
vm1_not* 0.349 0.372 0.366 0.208
vm1_rescue* 0.135 0.146 0.314 0.428
vm1_simplereturn* 2.430 2.418 2.964 2.215
vm1_swap* 1.996 2.042 2.047 2.100
vm2_array* 1.429 1.420 1.438 1.969
vm2_case* 0.322 0.328 0.329 0.287
vm2_defined_method* 6.156 6.156 6.816 6.433
vm2_eval* 29.259 29.789 31.640 40.637
vm2_method* 3.465 3.484 3.800 3.662
vm2_mutex* 2.014 1.742 2.081 2.111
vm2_poly_method* 5.346 5.250 5.999 5.595
vm2_poly_method_ov* 0.581 0.558 0.529 0.478
vm2_proc* 0.907 0.926 1.026 1.007
vm2_regexp* 2.341 2.358 2.452 2.465
vm2_send* 0.573 0.574 0.624 0.502
vm2_super* 1.158 1.097 1.336 1.185
vm2_unif1* 0.545 0.551 0.603 0.530
vm2_zsuper* 1.250 1.274 1.457 1.316
vm3_clearmethodcache 4.810 5.028 0.820 1.197
vm3_gc 2.094 2.198 2.295 5.449
vm_thread_alive_check1 0.248 0.356 0.469 0.610
vm_thread_create_join 5.634 5.658 5.986 5.994
vm_thread_mutex1 1.579 1.560 1.678 1.772
vm_thread_mutex2 1.613 1.671 7.859 5.480
vm_thread_mutex3 2968.896 3245.605 4.160 4.415
vm_thread_pass 0.139 0.163 1.660 1.872
vm_thread_pass_flood 0.324 0.307 0.535 0.781
vm_thread_pipe 1.114 1.103 2.486 2.452
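
The vm_thread_mutex rows are the ones that blow up here. As a rough illustration of the kind of workload that stresses a lock shared between threads, here is a minimal sketch. It is not the actual benchmark script: the method name `contended_count` and the thread and iteration counts are made up for this example and kept small so that it finishes quickly.

```ruby
require 'benchmark'

# Hypothetical sizes, chosen only so the sketch runs in well under a second.
THREADS = 8
ITERATIONS = 10_000

# Every thread increments one shared counter under one shared mutex,
# so all threads compete for the same lock.
def contended_count(threads, iterations)
  mutex = Mutex.new
  counter = 0
  workers = Array.new(threads) do
    Thread.new do
      iterations.times { mutex.synchronize { counter += 1 } }
    end
  end
  workers.each(&:join)
  counter
end

total = nil
elapsed = Benchmark.realtime { total = contended_count(THREADS, ITERATIONS) }
puts format("%d increments in %.3f s", total, elapsed)
```

Running the same file under each Ruby binary and comparing the elapsed time gives a quick feel for how much lock handoff costs, without waiting for the full benchmark suite.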

With --disable-gems trunk is slightly faster, but without any options trunk is slower. This appears to be a side effect of removing gem prelude, and it is being discussed at:

http://redmine.ruby-lang.org/issues/4962


/usr/bin/ruby ../benchmark/driver.rb -v --executables="192r31932-nogems::~/ruby/bin/ruby-192-r31932 --disable-gems; 192r321932::~/ruby/bin/ruby-192-r31932; trunk-nogems::~/ruby/bin/ruby-trunk --disable-gems; trunk::~/ruby/bin/ruby-trunk -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext --" --pattern='bm_' --directory=../benchmark -r 5

Elapesed time: 6812.316032 (sec)
-----------------------------------------------------------
benchmark results:
minimum results in each 5 measurements.
name 192r31932-nogems 192r321932 trunk-nogems trunk
app_answer 0.065 0.067 0.066 0.110
app_erb 1.694 1.717 1.797 1.794
app_factorial 0.988 0.992 0.993 1.170
app_fib 0.812 0.816 0.792 0.849
app_mandelbrot 2.041 2.128 2.231 2.090
app_pentomino 25.944 25.357 25.277 24.689
app_raise 0.830 0.835 0.808 0.847
app_strconcat 1.703 1.725 1.796 1.862
app_tak 1.131 1.142 1.131 1.157
app_tarai 0.916 0.919 0.921 0.930
app_uri 1.332 1.343 1.383 1.389
io_file_create 3.075 3.088 3.105 3.133
io_file_read 2.402 2.420 2.392 2.702
io_file_write 1.088 1.086 1.078 1.123
io_select 2.982 2.839 3.373 3.364
io_select2 7.102 6.991 7.307 7.265
io_select3 0.110 0.111 0.116 0.159
loop_for 1.733 1.810 1.755 1.765
loop_generator 0.611 0.615 1.050 1.096
loop_times 1.608 1.585 1.582 1.622
loop_whileloop 0.719 0.721 0.712 0.755
loop_whileloop2 0.151 0.152 0.150 0.194
so_ackermann 0.907 0.901 0.896 0.952
so_array 1.910 1.905 1.856 1.918
so_binary_trees 0.475 0.481 0.480 0.546
so_concatenate 5.122 5.094 5.233 5.237
so_count_words 0.351 0.339 0.331 0.380
so_exception 1.481 1.508 1.505 1.578
so_fannkuch 2.250 2.303 2.456 2.434
so_fasta 2.938 2.949 2.816 2.932
so_k_nucleotide 1.768 1.838 1.885 1.963
so_lists 1.286 1.298 1.290 1.373
so_mandelbrot 6.907 7.051 7.155 6.774
so_matrix 1.107 1.101 1.060 1.105
so_meteor_contest 4.922 4.927 4.940 4.896
so_nbody 5.052 5.126 5.328 5.060
so_nested_loop 1.394 1.402 1.395 1.453
so_nsieve 2.963 2.998 2.956 3.004
so_nsieve_bits 3.688 3.726 3.647 3.718
so_object 1.066 1.077 1.079 1.103
so_partial_sums 5.771 5.778 5.837 5.569
so_pidigits 0.672 0.675 0.676 0.821
so_random 0.970 0.993 1.019 1.064
so_reverse_complement 2.016 2.058 2.050 2.119
so_sieve 1.059 1.047 1.010 1.044
so_spectralnorm 4.123 4.204 4.358 4.018
vm1_block* 2.103 2.088 2.036 2.085
vm1_const* 0.730 0.727 0.690 0.700
vm1_ensure* 0.083 0.082 0.053 0.048
vm1_ivar* 0.954 0.950 0.959 0.916
vm1_ivar_set* 1.067 1.072 1.015 1.035
vm1_length* 0.872 0.880 0.852 0.845
vm1_neq* 0.517 0.525 0.509 0.516
vm1_not* 0.297 0.305 0.293 0.302
vm1_rescue* 0.129 0.134 0.129 0.130
vm1_simplereturn* 1.358 1.309 1.187 1.206
vm1_swap* 0.278 0.284 0.271 0.278
vm2_array* 1.288 1.293 1.304 1.353
vm2_case* 0.200 0.198 0.208 0.209
vm2_defined_method* 4.688 4.970 4.025 4.188
vm2_eval* 25.813 25.985 26.154 26.571
vm2_method* 1.940 1.950 1.928 1.918
vm2_mutex* 1.380 1.378 1.398 1.407
vm2_poly_method* 2.767 2.767 2.611 2.594
vm2_poly_method_ov* 0.222 0.222 0.214 0.213
vm2_proc* 0.678 0.671 0.693 0.703
vm2_regexp* 1.549 1.610 1.559 1.554
vm2_send* 0.312 0.308 0.309 0.311
vm2_super* 0.585 0.585 0.506 0.524
vm2_unif1* 0.260 0.268 0.248 0.247
vm2_zsuper* 0.646 0.635 0.539 0.550
vm3_clearmethodcache 2.032 2.050 0.596 0.682
vm3_gc 1.041 1.072 1.104 2.055
vm_thread_alive_check1 0.280 0.292 0.269 0.393
vm_thread_create_join 8.251 8.169 4.863 11.306
vm_thread_mutex1 1.096 1.096 1.116 1.172
vm_thread_mutex2 1.127 1.129 2.141 2.777
vm_thread_mutex3 273.686 274.093 1.711 1.817
vm_thread_pass 0.668 0.649 1.173 1.209
vm_thread_pass_flood 0.629 0.630 0.338 0.440
vm_thread_pipe 0.977 0.982 1.147 1.164
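
The cost of loading gems shows up most clearly in the shortest benchmarks. The sketch below takes three rows copied from the table above and subtracts the --disable-gems time from the plain time for each binary. The three rows were picked by hand for illustration, and `gems_overhead` is a helper name invented for this example.

```ruby
# Minimum times in seconds, copied from three rows of the results table.
# Column order: 192r31932-nogems, 192r321932, trunk-nogems, trunk.
ROWS = {
  "app_answer"      => [0.065, 0.067, 0.066, 0.110],
  "io_select3"      => [0.110, 0.111, 0.116, 0.159],
  "loop_whileloop2" => [0.151, 0.152, 0.150, 0.194],
}

# Extra seconds spent when the same binary runs without --disable-gems.
def gems_overhead(nogems, with_gems)
  (with_gems - nogems).round(3)
end

ROWS.each do |name, (r192_nogems, r192, trunk_nogems, trunk)|
  puts format("%-16s 192: +%.3f s  trunk: +%.3f s",
              name,
              gems_overhead(r192_nogems, r192),
              gems_overhead(trunk_nogems, trunk))
end
```

For these rows the 1.9.2 binary pays 0.001 to 0.002 s for gems, while trunk pays 0.043 to 0.044 s, which looks like a roughly fixed startup cost rather than a slowdown in the benchmark bodies. Longer benchmarks in the table do not all follow this pattern.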