まあ、ようするにDiscard Requestはまだまだ揉めていますよ。って事なんだけど

Hi Christoph,

(I've added Ccs, hoping for more expertise than we have in linux-mm.)

On Fri, 30 Oct 2009, Christoph Hellwig wrote:
> since 6a6ba83175c029c7820765bae44692266b29e67a the swap code
> unconditionally calls blkdev_issue_discard when swap clusters get freed.
> So far this was harmless because only the mtd driver has discard support
> wired up and it's pretty fast there (entirely done in-kernel).
> We're now adding support for real UNMAP/TRIM support for SCSI arrays and
> SSDs, and so far all the real life ones we've dealt with have too many
> performance issues to just issue the discard requests on the fly.
> Because of that unconditionally enabling this code is a bad idea, it
> really needs an option to disable it or even better just leave it
> disabled by default for now with an option to enable it.

Thanks for the info.

Yes, in practice TRIM seems a huge disappointment: is there a device on
which it is really implemented, and not more trouble than it's worth?

I'd been waiting for OCZ to get a Vertex 1.4* firmware out of Beta
before looking at swap discard again; but even then, the Linux ATA
support is still up in the air, so far as I know.

You don't mention swap's discard of the whole partition (or all
extents of the swapfile) at swapon time: do you believe that usage
is okay to retain? Is it likely on some devices to take so long,
that I ought to make it asynchronous?

Assuming that initial swap discard is good, I wonder whether just
to revert the discard of swap clusters for now: until such time as
we find devices (other than mtd) that can implement it efficiently.

If we do retain the discard of swap clusters, under something more
than an #if 0, any ideas for what I should make it conditional upon?

Something near /sys/block/sda/queue/rotational (nicely rw these days)
seems appropriate: any chance of a /sys/block/sda/queue/discard_is_useful?
I think I'd prefer that to a new option to swapon.

Or is there a sensible measurement I could make in swapfile.c: for
example, does discard of a range complete faster than write of the
same range? (But my guess is that those devices we'd want to avoid
discard on, would give erratic answers to any such test; never mind
the noise of what other I/Os are concurrent to the same device.)

Something I should almost certainly revert: at one stage I made the
non-rotational case spread its swapping evenly over the partition,
in case the device's wear-levelling was inadequate (localized).

But now I think it's better to ignore that possibility, and anchor
swapping to the start of the partition just as in the rotational case:
in the rotational case it's done to minimize seeking, in the non-
rotational case it would be to minimize encroaching upon that
initially discarded total extent.


To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to For more info on Linux MM,
see: .
Don't email: