April 2009


Steven Rostedt has developed a script that uses the running kernel's lsmod output to generate a minimal .config with only the required hardware enabled.

As was brought up at the last Kernel Summit, we want to make it easier
for those who report bugs to build their own kernels, and maybe even
bisect with git. Some of these people are not programmers and do not
understand the complexity of the configuration options. But compiling
a distribution-configured kernel on their boxes can take hours.

This patch series comes to the rescue. I wrote the first instance of
streamline config when I bought a new box in 2005 and got frustrated
with finding all the necessary configurations to boot it. It is a
small (yet powerful) perl script.

Here's what it does:

* Reads the modules that are loaded by using lsmod.
* Reads all Makefiles to map modules to CONFIG_* options.
* Reads the Kconfig files to find dependencies and selects.
* Figures out which CONFIGs are needed to compile the loaded modules.
* Reads the .config and prints out a version with all module configurations
that are not needed disabled.
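
The steps above can be sketched roughly as follows. This is not the Perl script from the patch series, only a minimal illustration in Python. It assumes Makefile lines of the form `obj-$(CONFIG_FOO) += foo.o` and skips the Kconfig dependency step entirely. The function names (`loaded_modules`, `module_to_configs`, `streamline`) are hypothetical.

```python
import re

def loaded_modules(lsmod_text):
    """Return module names from lsmod output (first column, header skipped)."""
    lines = lsmod_text.strip().splitlines()[1:]
    # lsmod reports names with underscores; normalise dashes so the names
    # compare equal to the object names found in Makefiles.
    return {line.split()[0].replace("-", "_") for line in lines if line.strip()}

# Matches lines such as: obj-$(CONFIG_E1000E) += e1000e.o
OBJ_RE = re.compile(r"^obj-\$\((CONFIG_\w+)\)\s*[+:]?=\s*(.*)$")

def module_to_configs(makefile_text):
    """Map each object name on an obj-$(CONFIG_X) line to CONFIG_X."""
    mapping = {}
    for line in makefile_text.splitlines():
        m = OBJ_RE.match(line.strip())
        if not m:
            continue
        config, objects = m.groups()
        for obj in objects.split():
            if obj.endswith(".o"):
                name = obj[:-2].replace("-", "_")
                mapping.setdefault(name, set()).add(config)
    return mapping

def streamline(config_text, needed):
    """Disable every '=m' option whose name is not in the needed set."""
    out = []
    for line in config_text.splitlines():
        m = re.match(r"^(CONFIG_\w+)=m$", line)
        if m and m.group(1) not in needed:
            out.append(f"# {m.group(1)} is not set")
        else:
            out.append(line)
    return "\n".join(out)
```

In this sketch, the needed set would be built by looking up every loaded module in the mapping and collecting the CONFIG names. A real implementation also has to follow Kconfig dependencies and selects, which is where most of the complexity lies.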

The next two patches add options to make.

localmodconfig - this will run streamline_config.pl on the .config file
and replace it at the end.

localyesconfig - this will do the same as localmodconfig but will also
sed -i s/=m/=y/ to turn all modules to core. It will also run
'make oldconfig' to fix it up and let the user handle
anything that was changed by converting a module to core.
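
For illustration, the =m to =y conversion can be written in Python as below. This is only a sketch of the idea, not what the patch does. Unlike the unanchored sed expression quoted above, it matches whole `CONFIG_*=m` lines only, so string values that happen to contain "=m" are left alone. The function name `modules_to_builtin` is hypothetical.

```python
import re

def modules_to_builtin(config_text):
    """Rewrite 'CONFIG_FOO=m' lines as 'CONFIG_FOO=y'; leave other lines alone."""
    return re.sub(r"^(CONFIG_\w+)=m$", r"\1=y", config_text, flags=re.MULTILINE)
```

The result would still need a pass through 'make oldconfig', as the message says, because turning a module into built-in code can change what other options are allowed.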

Anyway, this is now in git and as a series of patches here. My git tree
is based off of the latest Linus git tree.

Have fun!

-- Steve

The following patches are in:


branch: kconfig

Steven Rostedt (3):
kconfig: add streamline_config.pl to scripts
kconfig: make localmodconfig to run streamline_config.pl
kconfig: add make localyesconfig option

scripts/kconfig/Makefile | 24 +++-
scripts/kconfig/streamline_config.pl | 291 ++++++++++++++++++++++++++++++++++
2 files changed, 314 insertions(+), 1 deletions(-)

Revert "linux.conf.au 2009: Tuz"

This reverts commit 8032b526d1a3bd91ad633dd3a3b5fdbc47ad54f1.

Hey, it was only meant to be a single release. Now they can all die as
far as I'm concerned.

[ Just kidding. They're cute and cuddly.

Except when they have horrible nasty facial diseases. Oh, and I guess
they're not actually that cuddly even when disease-free. ]

Signed-off-by: Linus Torvalds

From: Alexey Dobriyan 
Date: Sun, 29 Mar 2009 21:56:42 +0400

> Ingo, what are you doing?

Alexey, please crawl back into your cave.

Enough already.

Alexey Dobriyan: Hey Ingo, couldn't this patch be improved further if it were done this way?
David Miller: Go back to your own planet (・A・). It's fine as it is.

That's harsh, David.

> > I agree that we need to be frugal with the addition of trace points. But
> > I don't think the bugs that can be solved with this are always reproducible
> > by the developer.
> >
> > If you have a distribution kernel that is running at a customer's location,
> > you may not have the privilege of shutting down that kernel, patching the
> > code, recompiling and booting up this temporary kernel. It would be nice
> > to have strategic locations in the kernel where we can easily enable a
> > trace point and monitor what is going on.
> >
> > If the customer calls and tells you there's some strange performance
> > issues when running such and such a load, it would be nice to look at
> > things like workqueues to analyze the situation.
> Would it? What's the probability that anyone anywhere will *really*
> solve an on-site problem using workqueue tracepoints? Just one person?
> I think the probability is quite small, and I doubt if it's high enough
> to add permanent code to the kernel.
> Plus: what we _really_ should be looking at is
> p(someone uses this for something) -
> p(they could have used a kprobes-based tracer)

This is starting to sound a lot like catch 22. We don't want it in the
kernel if nobody is using it. But nobody is using it because it is not in
the kernel.

Andrew Morton: Do we really need tracepoints that badly? A kprobes-based tracer would do.
Steven Rostedt: It's a Catch-22 situation. We don't want to add tracepoints that nobody uses, but...


I think that's probably the dominant effect on x86 systems, because
Intel doesn't recommend using the branch hint prefixes as far as I can
tell (their consumption of icache space outweighs any benefit of priming
the predictor).

Apparently branch hint prefixes are discouraged on today's x86 because they waste icache space.

Someone was saying that an OOM killer improvement didn't get merged because I NAKed it, but I have no memory of that at all.



Now that 64-bit e2fsck can run to completion on a (newly-minted, never
mounted) filesystem, here are some numbers. They must be taken with
a large grain of salt of course, given the unrealistic situation, but
they might be reasonable lower bounds of what one might expect.

First, the disks are 300GB SCSI 15K rpm - there are 28 disks per RAID
controller and they are striped into 2TiB volumes (that's a limitation
of the hardware). 16 of these volumes are striped together using LVM, to
make a 32TiB volume.

The machine is a four-slot quad core AMD box with 128GB of memory and
dual-port FC adapters.

The filesystem was created with default values for everything, except
that the resize_inode feature is turned off. I cleared caches before the

# time e2fsck -n -f /dev/mapper/bigvg-bigvol
e2fsck 1.41.4-64bit (17-Apr-2009)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/bigvg-bigvol: 11/2050768896 files (0.0% non-contiguous), 128808243/8203075584 blocks

real 23m13.725s
user 23m8.172s
sys 0m4.323s

Most of the time (about 22 minutes) is in pass 5. I was taking snapshots of

/proc/{pid of e2fsck}/statm

every 10 seconds during the run[1]. It starts out like this:

27798 3293 217 42 0 3983 0
609328 585760 263 42 0 585506 0
752059 728469 272 42 0 728237 0
752059 728469 272 42 0 728237 0
752059 728469 272 42 0 728237 0
752059 728469 272 42 0 728237 0
752059 728469 272 42 0 728237 0
752059 728469 272 42 0 728237 0
752059 728469 272 42 0 728237 0
717255 693666 273 42 0 693433 0
717255 693666 273 42 0 693433 0
717255 693666 273 42 0 693433 0

and stays at that level for most of the run (the drop occurs a short
time after pass 5 starts). Here is what it looks like at the end:

717255 693666 273 42 0 693433 0
717255 693666 273 42 0 693433 0
717255 693666 273 42 0 693433 0
717499 693910 273 42 0 693677 0
717499 693910 273 42 0 693677 0
717499 693910 273 42 0 693677 0

So in this very simple case, memory required tops out at about 3 GB for the
32TiB filesystem, or 0.4 bytes per block.
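
The arithmetic behind those two figures can be checked with a few lines of Python. The page size is an assumption (4 KiB, the usual value on x86-64); the message does not state it. The page count is the largest "size" value in the snapshots above and the block count comes from the e2fsck output.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; not stated in the message

peak_pages = 752059        # largest "size" value in the statm snapshots
total_blocks = 8203075584  # block count reported by e2fsck

peak_bytes = peak_pages * PAGE_SIZE
bytes_per_block = peak_bytes / total_blocks

print(f"peak size: {peak_bytes / 1e9:.2f} GB")
print(f"per block: {bytes_per_block:.2f} bytes")
```

Under that assumption the peak comes to roughly 3.1 GB and a little under 0.4 bytes per block. Using the resident column (728469 pages) instead of the size column gives a slightly smaller figure, still about 3 GB.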


[1] The numbers are numbers of pages. The format is described in

Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
 Field     Content
 size      total program size (pages)        (same as VmSize in status)
 resident  size of memory portions (pages)   (same as VmRSS in status)
 shared    number of pages that are shared   (i.e. backed by a file)
 trs       number of pages that are 'code'   (not including libs; broken,
                                             includes data segment)
 lrs       number of pages of library        (always 0 on 2.6)
 drs       number of pages of data/stack     (including libs; broken,
                                             includes library text)
 dt        number of dirty pages             (always 0 on 2.6)
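
As a rough illustration of how such snapshots could be collected and read, the sketch below names the seven statm fields after the table above and samples a process at a fixed interval. It is not the method used in the message, which does not say how the snapshots were taken. The names `Statm`, `parse_statm` and `sample_statm` are hypothetical.

```python
import time
from collections import namedtuple

# Field order follows the statm table: all values are page counts.
Statm = namedtuple("Statm", "size resident shared trs lrs drs dt")

def parse_statm(line):
    """Turn one line of /proc/<pid>/statm into a Statm tuple of ints."""
    return Statm(*(int(field) for field in line.split()))

def sample_statm(pid, interval=10, count=6):
    """Read /proc/<pid>/statm `count` times, sleeping `interval` seconds between reads."""
    samples = []
    for i in range(count):
        with open(f"/proc/{pid}/statm") as f:
            samples.append(parse_statm(f.read()))
        if i < count - 1:
            time.sleep(interval)
    return samples
```

For example, `parse_statm("752059 728469 272 42 0 728237 0").resident` gives the resident page count of the peak snapshot shown earlier.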