Bug #1087

cross-device copying / USB improvements

Added by listen over 6 years ago. Updated over 1 year ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

dear dragonflyers!

first of all: congratulations on such a great release!

I tried hammer and it worked without a glitch,
and I tried plugging, unplugging, and replugging USB devices
and was impressed that the shaky USB implementation
inherited from FreeBSD has finally been stabilized!

So this is a mixture of a bug report, a success report and
some ideas on how to improve the user experience :)

I tried the following:
- plug in a USB stick
- mount -t msdos /dev/da0s1 /mnt/usbstick
- cd /mnt/usbstick
- ls
- unplug the stick
... lots of warnings, but no panic
- ls
=> shows the old content, even though the stick is no longer there
- df
=> shows /mnt/usbstick still mounted
- plug in USB stick again
=> surprise! now it is da1! (same slot though)

here I was able to unmount the "ghost" mount on /mnt/usbstick
and to remount the "new" da1 under /mnt/usbstick again.

however, I then tried to unplug the device while copying a larger
file to it.
no panic, but when I did
ls /mnt/usbstick
the system froze.

I think it could be related to the fact that ls showed the old
files in the previous test (some caching issue?).

So I have an idea about what I would consider the best and most
logical behaviour from a user's point of view:

It would be great if the mounted device were "remembered" for
a while (say 10 seconds) in such a way that it:

- would be silently* unmounted when the device is unplugged,
so ls should show nothing in the path

- would put the processes trying to access the device into a
waiting loop

- would be silently* remounted if the same device** is plugged in
again in the same slot within a certain timeframe (the 10 seconds)

- would return errors to the waiting processes after the timeout

This would be a cool way to "recover" from accidentally interrupted
connections or unreliable devices.

Furthermore, for things such as copying or moving files across
devices, it would be really cool if cp behaved like rsync, or used it.

That would make it possible to recover reliably from an interrupted
cp/mv process, without user interaction.

If the device comes up again, only the remaining bits would be
transferred. In the case of mv, only the files that were successfully
transferred would be unlinked.

Even after the timeout has expired, the user could plug in the device
again, mount it again, copy again, and would not start from zero
(without needing to know tools like rsync).
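
To make the "only the remaining bits" idea concrete, here is a minimal
user-space sketch in Python. It is not how cp or rsync actually work:
rsync uses rolling checksums, while this sketch simply compares the
part of the destination that already exists against the source, chunk
by chunk, and rewrites everything from the first mismatch onward. The
name resume_copy and the chunk size are made up for illustration.

```python
import os

CHUNK = 64 * 1024  # bytes per read/write; an arbitrary choice


def resume_copy(src, dst, chunk=CHUNK):
    """Copy src to dst, skipping the leading part of dst that already matches.

    Hypothetical helper, not part of cp or rsync.
    Returns the number of bytes written during this call.
    """
    written = 0
    with open(src, "rb") as fin:
        # Open dst for read+write without truncating; create it if missing.
        fd = os.open(dst, os.O_RDWR | os.O_CREAT, 0o644)
        with os.fdopen(fd, "r+b") as fout:
            offset = 0
            # Phase 1: find how much of dst already matches src.
            while True:
                want = fin.read(chunk)
                if not want:
                    break
                have = fout.read(len(want))
                if have != want:
                    break
                offset += len(want)
            # Phase 2: rewrite everything from the first mismatching chunk.
            fin.seek(offset)
            fout.seek(offset)
            while True:
                data = fin.read(chunk)
                if not data:
                    break
                fout.write(data)
                written += len(data)
            # Drop stale trailing bytes if dst was longer than src.
            fout.truncate(fin.tell())
            fout.flush()
            os.fsync(fout.fileno())
    return written
```

Running it a second time on a finished copy writes nothing, which is
the behaviour wished for above. Re-reading the destination is only
trustworthy if the data really reached the device, which a sudden
unplug does not guarantee.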

If you got this far, it would be possible to show the user the
power of the system by telling him to plug in the device again, if
the transfer is not yet complete (maybe this could be delivered to
desktop users as well via dbus).

Such behaviour would really make a difference compared to how the
current Linux and BSD variants do (not) handle such situations.

Just a thought. My experience with C is very limited, so I am not
qualified to implement such a beast, nor do I know how difficult it is
or whether it might break POSIX.

Wow, that one got longer than expected.

Anyway: keep up the great work!

Regards,
Benny

______________________

* by silently I mean: with a warning on the console but no
interruption

** I don't know if it's possible, but there are usually some specific
bits that could be read off the device (the information shown
in the console or dmesg)

History

#1 Updated by dillon about 6 years ago

:dear dragonflyers!
:
:first of all: congratulations for such a great release!
:
:I tried hammer and it worked without a glitch,
:and I tried USB plugging deplugging, replugging
:and was impressed that the shaky USB implementation
:inherited from FreeBSD has finally been stabilized!
:
:So this is a mixture of a bug report, a success report and
:some ideas how to improve the user experience :)
:
:I tried the following:
:- plug in a USB stick
:- mount -t msdos /dev/da0s1 /mnt/usbstick
:- cd /mnt/usbstick
:- ls
:- unplug the stick
:... lots of warnings, but no panic
:- ls
:=> shows the old content, even though the stick is no longer there
:- df
:=> shows /mnt/usbstick still mounted
:- plug in USB stick again
:=> surprise! now it is da1! (same slot though)

Yah, because there are still references on da0. Even though
the filesystem has failed you still have to unmount it.
(umount -f ought to work, theoretically).

:here I was able to unmount the "ghost" mount on /mnt/usbstick
:and to remount the "new" da1 under /mnt/usbstick again.
:
:however, I then tried to unplug the device while copying a larger
:file to it.
:no panic, but when I did
: ls /mnt/usbstick
:the system froze.
:
:I think it could be related with the fact that ls showed the old
:files on the previous test (some caching issue?).

Possibly. More likely it's because the buffer cache filled up
with dirty buffers which couldn't be flushed because the
device was pulled.

There still needs to be some work done in that regard,
like trying a few times and then discarding the buffer. There's
a problem though in that the buffers represent modified meta-data
that is synchronized with the mount. If the kernel throws the
buffers away the filesystem mount could become even more confused
than just getting I/O errors.
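
The "try a few times and then discard" policy can be sketched as a toy
model in Python. This is only an illustration of the policy, not
kernel code and not DragonFly's buffer cache: flush_dirty, the dirty
dictionary and the write_block callback are all hypothetical names.

```python
def flush_dirty(dirty, write_block, max_tries=3):
    """Flush dirty buffers, discarding any that fail max_tries times.

    dirty maps block number -> bytes. write_block(blkno, data) is a
    hypothetical callback that raises OSError on an I/O failure.
    Returns (flushed, discarded) as lists of block numbers.
    """
    flushed, discarded = [], []
    for blkno in sorted(dirty):
        for _attempt in range(max_tries):
            try:
                write_block(blkno, dirty[blkno])
            except OSError:
                continue  # possibly transient, so try again
            flushed.append(blkno)
            break
        else:
            # Every attempt failed: drop the buffer so it stops pinning
            # memory. Whatever meta-data it held is now lost.
            discarded.append(blkno)
    for blkno in flushed + discarded:
        del dirty[blkno]
    return flushed, discarded
```

The discarded list is where the problem described above shows up: the
caller no longer holds those blocks, yet the mounted filesystem still
assumes they will be written.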

:- would be silently* unmounted when the device is unplugged,
: so ls should show nothing in the path

I think an auto-unmount of some sort would be a good idea.
It would probably solve the remaining panics/lockups.

:- would put the processes trying to access the device into a
: waiting loop
:
:- would be silently* remounted if the same device** is plugged in
: again in the same slot within a certain timeframe (the 10 seconds)

Not possible. Once the filesystem gets that many fatal errors
due to the device being unplugged it is pretty much a lost cause.

If the filesystem were idle when the device got unplugged we might
be able to swing it. That's a big if.

:- would return errors to the waiting processes after the timeout
:
:This would be a cool way to "recover" from accidentally interrupted
:connections or unreliable devices.

Yah, but I don't think it is in the cards. There is no way
for the kernel to know which buffers actually made it to the
device.

:Just a thought. My experiences with C are very limited, so I do not
:qualify to implement such a beast, nor do I know how difficult it is
:or if it might break Posix.
:
:Whow, that one got longer than expected.
:
:Anyway: keep up the great work!
:
:Regards,
:Benny

They are all good ideas :-) Just not so easy to implement.

-Matt

#2 Updated by listen about 6 years ago

Matthew Dillon wrote:

This sounds to me like the umount command is waiting for the buffers to
be flushed, while the buffers are not released because the mount is not
in a proper state. Is this what you are saying?

My naive approach is to let a forced umount blindly clear any related
buffers.

Well, I think it doesn't have to - in conjunction with the rsync based
changes to cp and mv (BTW I am only talking about storage devices, hard
disks, USB sticks and the like in my posting; I should have made this clear).

My naive idea is to let cp/mv simply resume by starting a new transfer
with only the differences, something like this

-> cp starts
-> device unplugged
-> triggers umount -f, buffers cleared, sigwait to cp, locks mount point
-> wait some secs
-> device plugged
-> triggers remount under same mount point
(if not possible: error, give up, release mount point)
-> sighup (or some other signal) to cp
-> cp resumes. i.e. starts copying the same things again, this time
behaving like rsync (I guess rsync solves incomplete transfers by
some kind of checksum / hashing)
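
The user-space half of this flow (wait some seconds, then start the
copy again) could look roughly like the Python sketch below. The
kernel-side steps (forced unmount, locking the mount point, remounting)
are not modelled at all. copy_with_replug, copy_once and
device_present are hypothetical names; copy_once is assumed to raise
OSError when the device disappears and to be safe to run again.

```python
import time


def copy_with_replug(copy_once, device_present, timeout=10.0, poll=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Run copy_once(); on OSError wait for the device and try again.

    copy_once and device_present are hypothetical callables supplied by
    the caller. Returns True once a copy attempt finishes, or False if
    the device does not come back within `timeout` seconds.
    """
    while True:
        try:
            copy_once()
            return True
        except OSError:
            pass  # device went away in the middle of the copy
        deadline = clock() + timeout
        while not device_present():
            if clock() >= deadline:
                return False  # give up; the caller reports the error
            sleep(poll)
```

For a real mount point, device_present could be something like
lambda: os.path.ismount("/mnt/usbstick"), and copy_once would need to
be a copy that picks up where it left off rather than starting over.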

Yeah, that was my fear :-). But since you are digging very deep anyway, it
could make a nice (sub-)project. Isn't the dragonfly project about proving
great ideas by cool implementations? Just to stimulate you ;-)

I had the impression that the reason why USB still isn't getting better in
FreeBSD is that nobody really understood the deeper internals,
and since you had just fixed a lot of issues I thought you were into the
topic anyway and there could be a chance to fix the last remaining issues.

But if somebody who just wrote a filesystem tells me that this is not easy
to implement, this sounds like there is no chance to get somebody on such
work anytime soon...

Cheers,
Benny

#3 Updated by corecode about 6 years ago

If the disk layer ack'ed the block, it should be on stable storage,
right? Otherwise it isn't. To me that would mean that we just have to
write out all blocks in flight (those with errors) and the ones we
didn't even write yet. Of course that's only true for non-caching
media, but what can you do.

cheers
simon

#4 Updated by listen about 6 years ago

benny wrote:

perhaps this one is not so far away :)
let's say the fs on both devices is hammer, then a cp could be thought of as
a simple mirroring (syncing) (and a mv as a cp with deletion on the
source). the only thing needed would be to be able to "mirror" a single
file and to "detach" the mirrored file / directory somehow from the master.
I don't know if it is possible to extend hammer that way though.

For other filesystems something like rsync would still be needed, but this
could be a start.

Regards,
Benny

#5 Updated by listen about 6 years ago

benny wrote:

maybe a normal cross-device hammer-to-hammer cp could then be done like
this:
- create a pseudo fs on the target (marked with a special delete-me flag)
- sync the pseudo fs with the source
- transform it *somehow* into a regular entry (no pseudofs any longer)
- remove the special delete-me flag

if any interruption occurs, we got a pseudo fs entry with a special
delete-me flag that doesn't show up in normal operations and could be
removed by normal pruning

if the copying can be resumed, it would continue the mirroring until its
done

maybe single files could be handled on the target the same way
(pseudofs-root would be the filename and have some special flag and the file
in it)
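
On an ordinary filesystem, the closest everyday analogue of the
delete-me flag is probably copying to a staging name and renaming only
once the copy is complete. A minimal Python sketch, with no HAMMER
pseudo filesystems involved; the ".partial" suffix and both function
names are invented for illustration, and this version restarts an
interrupted copy instead of resuming it:

```python
import os
import shutil

PARTIAL = ".partial"  # hypothetical marker for unfinished copies


def staged_copy(src, dst):
    """Copy src to dst through a staging file that is renamed at the end."""
    staging = dst + PARTIAL
    with open(src, "rb") as fin, open(staging, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())
    # On POSIX the rename is atomic within one filesystem, so dst is
    # either absent (or old) or complete, never half-written.
    os.replace(staging, dst)


def prune_partials(directory):
    """Remove leftover staging files; returns the names removed."""
    removed = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(PARTIAL):
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return removed
```

If the copy is interrupted, only the staging file is left behind, and
prune_partials plays the role of the "normal pruning" mentioned above.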

just some food for thinking...I am sure you would come up with better
solutions..

benny

#6 Updated by corecode about 6 years ago

why not just use rsync?

cheers
simon

#7 Updated by listen about 6 years ago

that depends on "who" should use rsync?

if you mean that *the user* could use rsync instead of cp
in the first place - then you are right

but that was the whole point of my suggestion - to give a better user
experience:
IMHO the user should not have to know about and install a tool like rsync
(or know about mirror-copy) only to be able to transfer data reliably and
resumably from one device to another... I mean, it's such a basic thing
that one tool (cp) should do it.

if you mean that "cp" should use rsync as proposed - then there may be two
caveats against having it as part of the base:
- the license is GPL
- rsync is not maintained by you

so one might suggest a (partial) rewrite under BSD license. that could take
some time - the suggestion for "hammered" devices could serve as a starting
point.

cheers,
Benny

#8 Updated by bastyaelvtars about 6 years ago

Under DF you can use cpdup.

#9 Updated by corecode about 6 years ago

I don't agree. Each tool does its job. cp copies. rsync syncs
files. Anyway, I wonder how you wind up with a partial copy. Do
you ^C it? I can't imagine any other situation. I believe that fewer
than 0.0001% of uses of cp are attempts to resume a copy.

cheers
simon

#10 Updated by listen about 6 years ago

Simon 'corecode' Schubert wrote:

Well, if you are not in a server environment with perfect hardware and
restricted access, there are many situations where such things happen:

- defective (controllers | harddisks | cables | contacts)
- shaky slots
- fluctuation of current
- accidental removals of external drives and sticks by kids or while
travelling with your notebook

the real world is full of interrupted connections...

in this case I unplugged it to test the behaviour, but I have had plenty of
situations where such a feature would have helped (mostly due to defective
hard disks).

cheers,
benny

#11 Updated by listen about 6 years ago

Gergo Szakal wrote:

Thanks, I didn't know about it. At first glance it looks like the only
missing piece would then be to make cpdup resumable on a signal delivered
when the device is accessible again (remounted).

then it should be possible to get the desired effect with
cpdup -o -s0 src dest

this could be aliased in the shell for user comfort.

benny

#12 Updated by dillon about 6 years ago

:If the disk layer ack'ed the block, it should be on stable storage,
:right? Otherwise it isn't. To me that would mean that we just have to
:write out all blocks in flight (those with errors) and the ones we
:didn't even write yet. Of course that's only true for non-caching
:media, but what can you do.
:
:cheers
: simon

Nope. The device is virtually guaranteed to ack the block before
it reaches stable storage, particularly if it is a flash device.

-Matt
Matthew Dillon

#13 Updated by tuxillo over 1 year ago

  • Description updated (diff)
  • Status changed from New to Closed
  • Assignee deleted (0)

Hi,

Not the case anymore:

* No panics / no tons of warning messages.
* No contents shown in /mnt
* No system freeze.

[dfly_i386] /mnt> df -h /mnt
Filesystem Size Used Avail Capacity Mounted on
/dev/da8 29G 22G 7.1G 76% /mnt
[dfly_i386] /mnt> dmesg |tail
da8: <Kingston DT 100 G2 PMAP> Removable Direct Access SCSI-4 device
da8: 40.000MB/s transfers
da8: 29984MB (61408128 512 byte sectors: 255H 63S/T 3822C)
da8: slice starts beyond end of the disk: rejecting it
da8: slice starts beyond end of the disk: rejecting it
da8: slice starts beyond end of the disk: rejecting it
da8: slice starts beyond end of the disk: rejecting it
umass0: at uhub1 port 1 (addr 2) disconnected
(da8:umass-sim0:0:0:0): lost device
umass0: detached
[dfly_i386] /mnt> ls -l /mnt
total 0

Regards,
Antonio Huete
