April 2011

Used aspell for the first time today.

% cat ~/.aspell.conf
lang en_US

With that written, install the

aspell-en

package. After that, M-x flyspell-region and M-x flyspell-buffer inside Emacs are apparently all you need.


Addendum:
According to http://lcw-pon.blogspot.com/2009/12/emacs.html, putting something like the following in .emacs


;; Flyspell
;;; Modes in which Flyspell's on-the-fly spell checking is enabled
(defun my-flyspell-mode-enable ()
  (flyspell-mode 1))
(mapc
 (lambda (hook)
   (add-hook hook 'my-flyspell-mode-enable))
 '(changelog-mode-hook
   text-mode-hook
   latex-mode-hook))
;;; Modes in which Flyspell spell-checks comments only
;;; (what counts as a comment is up to each mode)
(mapc
 (lambda (hook)
   (add-hook hook 'flyspell-prog-mode))
 '(c-mode-common-hook
   emacs-lisp-mode-hook))


supposedly makes things even better.


Date: Mon, 25 Apr 2011 03:06:50 +0100
From: Al Viro
To: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org,
Linus Torvalds ,
Christoph Hellwig
Subject: back to life (mostly)

I'd been offline since Mar 25 for a very nasty reason - popped
aneurism in right choroid artery. IOW, a hemorrhagic stroke. A month
in ICU was not fun, to put it very mildly. A shitty local network hadn't
been fun either... According to the hospital folks I've ended up
neurologically intact, which is better (for me) than expected.

Said state is unlikely to continue if I try to dig through ~15K
pending messages in my mailbox; high pressure is apparently _the_ cause
for repeated strokes. So what I'm going to do is
a) delete all pending mail
b) ask people to resend anything important that would get lost in
process.

If/when I end up kicking the bucket (which looked like a very real
possibility for a while during these weeks), as far as I'm concerned hch
inherits the fun job of holding the hordes of morons off and keeping VFS
alive.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



I'd heard he had been ill, but a burst aneurysm of all things. Here's hoping for a quick and full recovery.

This whole weekend went into hacking on Ruby's GVL. Here's where things stand now:


make benchmark COMPARE_RUBY="before::./ruby-193-0416" OPTS='-r 5'

name before newgvl
loop_whileloop 1.132 1.171
vm1_block* 3.794 3.504
vm1_const* 0.781 0.706
vm1_ensure* 1.005 0.972
vm1_ivar* 1.321 1.277
vm1_ivar_set* 1.561 1.611
vm1_length* 1.539 1.468
vm1_neq* 1.133 1.138
vm1_not* 0.634 0.645
vm1_rescue* 0.155 0.175
vm1_simplereturn* 2.859 2.594
vm1_swap* 1.123 0.991
vm2_array 2.146 2.227
vm2_case 0.600 0.568
vm2_eval 39.022 39.539
vm2_method 4.224 4.069
vm2_mutex 2.091 2.138
vm2_poly_method 5.890 5.422
vm2_poly_method_ov 0.748 0.753
vm2_proc 1.343 1.318
vm2_regexp 2.699 2.696
vm2_send 0.930 0.868
vm2_super 1.425 1.367
vm2_unif1 0.859 0.949
vm2_zsuper 1.526 1.537
vm3_gc 3.299 3.331
vm3_thread_create_join 5.358 5.382
vm3_thread_mutex 120.601 3.426

gvl_pipe 23.955 15.006
gvl_thread_pass 19.750 3.079


Another two or three days of stabilization work and it should probably be in shape to show people.

I was looking through the RHEL6 documentation pages:

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/

In the Technical Notes: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Technical_Notes/index.html I found an interesting writeup on tuning parameters, so let me annotate it. It's the section of recommended tuning values for System z (S390).

ーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーー
System z Performance

Some of the default tunables in Red Hat Enterprise Linux 6 are currently not optimally configured for System z workloads. Under most circumstances, System z machines will perform better using the following recommendations.
Dirty Ratio

It is recommended that the dirty ratio be set to 40 (Red Hat Enterprise Linux 6 default 20) Changing this tunable tells the system to not spend as much process time too early to write out dirty pages. Add the following line to /etc/sysctl.conf to set this tunable:
 
vm.dirty_ratio = 40


Scheduler

To increase the average time a process runs continuously and also improve the cache utilization and server style workload throughput at minor latency cost it is recommended to set the following higher values in /etc/sysctl.conf.
 
kernel.sched_min_granularity_ns = 10000000
kernel.sched_wakeup_granularity_ns = 15000000
kernel.sched_tunable_scaling = 0
kernel.sched_latency_ns = 80000000

Additionally, deactivating the Fair-Sleepers feature improves performance on a System z machine. To achieve this, set the following value in /etc/sysctl.conf
 
kernel.sched_features = 15834234


False positive hung task reports

It is recommended to prevent false positive hung task reports (which are rare, but might occur under very heavy overcommitment ratios). This feature can be used, but to improve performance, deactivate it by default by setting the following parameter in /etc/sysctl.conf:
 
kernel.hung_task_timeout_secs = 0



ーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーー

None of this actually depends on special hardware at all. What assuming S390 as the standard environment really boils down to is:

・Virtualized environment assumed. In a virtualized environment, memory tends to be poor relative to CPU
capacity: CPUs can be traded dynamically between guests, but cutting memory usage is hard.
・Batch jobs. Jobs are packed in densely enough to drive IO to its limits. The latency of each individual
job is irrelevant; overall throughput is what matters.

With that in mind:

・Dirty ratio: the dirty ratio, which was 40 in RHEL5, was lowered to 20 in RHEL6 precisely to improve
response times. Even desktops have around 4 GB of memory these days, and 4 GB x 0.4 = 1.6 GB is far too
much. Linus argued this was what made X stutter and pushed it down to 5% by fiat; then, in response to a
string of regression reports, the value was raised back in stages, 5 → 10 → 20.

1) If you know you will be running nothing but batch jobs, response time does not matter.
2) In a virtualized environment CPU is plentiful, but guests with, say, 512 MB of memory are perfectly normal; at 20% that is barely 100 MB, so the limit is hit very easily, and plenty of jobs were found to regress in performance compared to RHEL5.

Given those two points, that is presumably how they ended up recommending the same value as RHEL5.
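A quick sanity check on those numbers in shell (the 4 GB and 512 MB figures are the examples from the text above; results in MB):

```shell
# dirty_ratio is the percentage of memory allowed to hold dirty pages
# before a writing process is throttled into doing writeback:
echo $((4096 * 40 / 100))   # 4 GB desktop at the old 40%       -> 1638 MB of dirty data
echo $((4096 * 20 / 100))   # same machine at the RHEL6 default  -> 819 MB
echo $((512 * 20 / 100))    # a 512 MB guest at 20%              -> 102 MB, hit very easily
```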

・The scheduler settings follow the same logic. If response time does not matter, cutting the number of
context switches and lengthening each individual timeslice is an effective strategy. sched_latency_ns is
exactly the parameter for that, and the auxiliary parameters used in somewhat special cases,
sched_min_granularity_ns and sched_wakeup_granularity_ns, are scaled up by a similar ratio.
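Converted from nanoseconds to milliseconds for readability, the recommended values are:

```shell
# The recommended scheduler knobs from the Technical Notes, in milliseconds:
echo $((10000000 / 1000000))   # sched_min_granularity_ns    -> 10 ms
echo $((15000000 / 1000000))   # sched_wakeup_granularity_ns -> 15 ms
echo $((80000000 / 1000000))   # sched_latency_ns            -> 80 ms
```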

・kernel.sched_tunable_scaling is a peculiar parameter. The scheduler's latency target and related values
depend on the number of CPUs, so they shift when CPUs are added or removed. In a virtualized environment
you naturally want to vary the CPU count with load! But how much to adjust the latency target when the
CPU count changes is a matter of policy, and there are several options. By default
(sched_tunable_scaling=1) the latency target scales logarithmically with the number of CPUs. Setting it
to 0 makes the scheduler ignore the CPU count and follow the manually specified sched_latency_ns exactly,
which is much easier to reason about. There are no 1000-CPU S390 boxes anyway, so presumably the judgment
was that predictability wins. It also stops admins from going mad when adding a CPU changes job
performance even though the new CPU is not being used at all.
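A rough sketch of that scaling, assuming the factor is 1 + ilog2(ncpus) (my reading of the log-scaling case in kernel/sched_fair.c) and a hypothetical 6 ms base latency for illustration:

```shell
awk 'BEGIN {
	base = 6000000; ncpus = 8       # assumed base latency (ns) and CPU count
	il = 0; n = ncpus
	while (n > 1) { n = int(n / 2); il++ }   # integer log2 of ncpus
	print base * (1 + il)           # effective target with log scaling (factor 4)
	print base                      # sched_tunable_scaling=0: value used as-is
}'
```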

・kernel.sched_features = 15834234 looks like an incomprehensible magic value, but it is a bitmask whose
default is 15834235. In other words, the setting just turns off the lowest bit. Per
linux/kernel/sched_features.h, the lowest bit is
/*
 * Only give sleepers 50% of their service deficit. This allows
 * them to run sooner, but does not allow tons of sleepers to
 * rip the spread apart.
 */
SCHED_FEAT(GENTLE_FAIR_SLEEPERS, 1)

and the place where it is used is

static void
place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
{
	u64 vruntime = cfs_rq->min_vruntime;

	/*
	 * The 'current' period is already promised to the current tasks,
	 * however the extra weight of the new task will slow them down a
	 * little, place the new task so that it fits in the slot that
	 * stays open at the end.
	 */
	if (initial && sched_feat(START_DEBIT))
		vruntime += sched_vslice(cfs_rq, se);

	/* sleeps up to a single latency don't count. */
	if (!initial) {
		unsigned long thresh = sysctl_sched_latency;

		/*
		 * Halve their sleep time's effect, to allow
		 * for a gentler effect of sleepers:
		 */
		if (sched_feat(GENTLE_FAIR_SLEEPERS))
			thresh >>= 1;

		vruntime -= thresh;
	}

	/* ensure we never gain time by being placed backwards. */
	vruntime = max_vruntime(se->vruntime, vruntime);

	se->vruntime = vruntime;
}


In short, the timeslice bonus a sleeping task receives on wakeup is sysctl_sched_latency/2 with
GENTLE_FAIR_SLEEPERS=1 (the default) and the full sysctl_sched_latency with GENTLE_FAIR_SLEEPERS=0.
Why that would favor batch jobs, I honestly could not figure out.
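The magic value itself is easy to verify with shell arithmetic; 15834235 is the default mask quoted above:

```shell
# 15834234 is just the default feature mask with bit 0 cleared:
echo $((15834235 & ~1))   # default with GENTLE_FAIR_SLEEPERS off -> 15834234
echo $((15834235 & 1))    # bit 0 set in the default               -> 1
echo $((15834234 & 1))    # bit 0 clear in the recommended value   -> 0
```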

・The last one is the most mundane: hung_task_timeout_secs drives the hung-task watchdog, which monitors
processes for hangs, but packing batch jobs in tightly makes false positives frequent, so they disable it.
Frankly, for a server OS it could just as well have been disabled by default.



That's the whole of it. I hope it is useful as a reference when you are weighing tuning parameters.

You used to be able to change this with gdmsetup, but the command itself is gone now. It appears to be an issue since Fedora 11:
http://ranjith.zfs.in/fedora-11_gdm_background/

Maybe this will be fixed when Fedora 15 switches GNOME to GNOME 3?

According to @kazuhisya on Twitter, the following operation works around it. It worked here; thanks to everyone involved.

sudo cp /usr/share/applications/gnome-appearance-properties.desktop /usr/share/gdm/autostart/LoginWindow


Memo #2:
This must be what Rik was talking about in chat.


On Thursday after LSF, Hugh, Minchan, Mel, Johannes and I were
sitting in the hallway talking about yet more VM things.

During that discussion, we came up with a way to redesign the
swap cache. During my flight home, I came with ideas on how
to use that redesign, that may make the changes worthwhile.

Currently, the page table entries that have swapped out pages
associated with them contain a swap entry, pointing directly
at the swap device and swap slot containing the data. Meanwhile,
the swap count lives in a separate array.

The redesign we are considering moves the swap entry to the
page cache radix tree for the swapper_space and has the pte
contain only the offset into the swapper_space. The swap count
info can also fit inside the swapper_space page cache radix
tree (at least on 64 bits - on 32 bits we may need to get
creative or accept a smaller max amount of swap space).

This extra layer of indirection allows us to do several things:

1) get rid of the virtual address scanning swapoff; instead
we just swap the data in and mark the pages as present in
the swapper_space radix tree

2) free swap entries as they are read in, without waiting for
the process to fault it in - this may be useful for memory
types that have a large erase block

3) together with the defragmentation from (2), we can always
do writes in large aligned blocks - the extra indirection
will make it relatively easy to have special backend code
for different kinds of swap space, since all the state can
now live in just one place

4) skip writeout of zero-filled pages - this can be a big help
for KVM virtual machines running Windows, since Windows zeroes
out free pages; simply discarding a zero-filled page is not
at all simple in the current VM, where we would have to iterate
over all the ptes to free the swap entry before being able to
free the swap cache page (I am not sure how that locking would
even work)

with the extra layer of indirection, the locking for this scheme
can be trivial - either the faulting process gets the old page,
or it gets a new one, either way it'll be zero filled

5) skip writeout of pages the guest has marked as free - same as
above, with the same easier locking

Only one real question remaining - how do we handle the swap count
in the new scheme? On 64 bit systems we have enough space in the
radix tree, on 32 bit systems maybe we'll have to start overflowing
into the "swap_count_continued" logic a little sooner than we are
now and reduce the maximum swap size a little?

--
All rights reversed

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: email@kvack.org


Jan Kara's summary of the IO-less dirty throttling work.
Archiving it here before I forget.


Hi,

below is my summary of problems observed with current throttling scheme.
I guess other guys will have some changes but this I what I remembered from
top of my head.

Honza
---

The problems people see with current balance_dirty_pages() implementation:
a) IO submission from several processes in parallel causes suboptimal IO
patterns in some cases because we interleave several sequential IO streams
which causes more seeks (IO scheduler does not have queue long enough to sort
out everything) and more filesystem fragmentation (currently worked around
in each filesystem separately by writing more than asked for etc.).
b) IO submission from several threads causes increased pressure on shared
data structures and locks. For example inode_bw_list_lock seems to be a
bottleneck on large systems.
c) In cases where there are only a few large files dirty, throttled process
just walks the list of dirty inodes, moves all of them to b_more_io because all
of the inodes have I_SYNC set (other processes are already doing writeback
against these files) and resorts to waiting for some time after which it just
tries again. This creates a possibility for basically locking out a process
in balance_dirty_pages() for arbitrarily long time.

All of the above issues get resolved when IO submission happens from just a
single thread so for the above problems basically any IO-less throttling will
do. We get only a single stream of IO, less contention on shared data
structures from writeback, no problems with not having another inode to write
out.

IO less throttling also offers further possibilities for improvement. If we do
not submit IO from a throttled thread, we have more flexibility in choosing how
often and for how long do we throttle a thread since we are no longer limited
by trying to achieve a sensible IO pattern. This creates a possibility for
achieving lower latencies and smoother wait time behavior. Fengguang is taking
advantage of this in his patch set.
--
Jan Kara
SUSE Labs, CR


