Thursday, June 6, 2019

Sync or async, that is the question

When you write changes to a config file or create new directories and files, you expect them to be saved on the disk. The reality is different: the data may not have been written yet. That is the difference between asynchronous and synchronous write modes. The first is the default mode for almost all filesystems and does not write your data to the disk immediately. The second forces every change to be written immediately.

So how long can your data stay unwritten in asynchronous mode? Most often this period is 5 seconds: exactly 5 seconds in ext4 and ZFS, and 30 seconds in UFS. Why? For performance. This is a general theme in computer systems: it is all about speed. Is this good? If speed is more important than safety, yes, but in most cases I prefer safety first and speed second in a filesystem.
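
To see this waiting data on a live system, here is a minimal sketch. It assumes Linux, where /proc/meminfo reports the amount of modified data that is still only in RAM (Dirty) and the amount being written right now (Writeback). The function name dirty_report is my own, not a standard tool.

```shell
#!/bin/sh
# Show how much data is still waiting in RAM to be written to disk.
# Assumption: a Linux system with /proc/meminfo.
dirty_report() {
    # $1 = optional meminfo-style file, defaults to /proc/meminfo
    grep -E '^(Dirty|Writeback):' "${1:-/proc/meminfo}"
}

# Run it only where /proc/meminfo exists (not on FreeBSD, for example).
if [ -r /proc/meminfo ]; then
    dirty_report
fi
```

Run it right after a big write and again a few seconds later: the Dirty value should drop as the kernel flushes the cache.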

So, how do we change these settings? For ext4 on Linux we have the mount options sync, dirsync and commit:

dirsync - directory operations (creating, renaming and removing files and directories) are saved immediately, but data operations are not; it is not enabled by default
commit - the time between saves of all data and metadata to disk; the default is 5 seconds
sync - every write operation is saved to disk immediately and the process waits until the operation finishes
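
To check which of these options are active on a mounted filesystem, something like the sketch below may help. The helper has_opt is a hypothetical name of mine. It only parses a comma-separated option string, such as the one printed by mount or by findmnt from util-linux (assumed to be installed).

```shell
#!/bin/sh
# Check whether a comma-separated mount option string contains an option.
# has_opt is a hypothetical helper, not part of any standard tool.
has_opt() {
    # $1 = option string, e.g. "rw,relatime,commit=1"
    # $2 = option name, e.g. "sync" or "commit"
    case ",$1," in
        *",$2,"* | *",$2="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: report the three options for the root filesystem,
# only if findmnt is available.
if command -v findmnt >/dev/null 2>&1; then
    opts=$(findmnt -no OPTIONS /) || opts=""
    for o in sync dirsync commit; do
        if has_opt "$opts" "$o"; then
            echo "$o: set"
        else
            echo "$o: not set"
        fi
    done
fi
```

Note that commit does not appear in the option list when it has its default value, so "not set" there means 5 seconds.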

Here is a simple test with the default options:
# mount  | grep root
/dev/mapper/storage1--vg-root on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.537975 s, 381 MB/s
# rm test
# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.5754 s, 356 MB/s
# rm test
# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.544742 s, 376 MB/s
# rm test
With commit=1:

# mount -o remount,commit=1 /
# mount  | grep root
/dev/mapper/storage1--vg-root on / type ext4 (rw,relatime,errors=remount-ro,commit=1,data=ordered)

# rm test
# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.57459 s, 356 MB/s
# rm test
# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.570971 s, 359 MB/s
# rm test
# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.57707 s, 355 MB/s
# rm test
And now with sync:

# mount -o remount,sync /
# mount  | grep root
/dev/mapper/storage1--vg-root on / type ext4 (rw,relatime,sync,errors=remount-ro,commit=1,data=ordered)
# dd if=/dev/zero of=test bs=1024 count=200000
^C25644+0 records in
25644+0 records out
26259456 bytes (26 MB, 25 MiB) copied, 1177.03 s, 22.3 kB/s
# rm test
So commit=1 is fine, but sync is overkill: the write ran about four orders of magnitude slower and I had to interrupt it. My config is:

# grep root /etc/fstab 
/dev/mapper/storage1--vg-root /               ext4    errors=remount-ro,dirsync,commit=1 0       1

That was Linux, now FreeBSD. When a filesystem works in async mode, it does not mean that every operation uses this mode. A program can use the fsync() system call, which instructs the system to write the given file to disk immediately, even if the filesystem is in async mode. This applies to Linux as well. On FreeBSD we can check this with:

# mount -v
/dev/vtbd0p2 on / (ufs, local, writes: sync 213 async 8228, reads: sync 1064 async 5, fsid 3c98e15c4c4a6b0c)

Most operations are executed in async mode, but not all. So this may depend on the program and the programmer.
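
As an illustration of a program forcing its own data to disk, here is a sketch with dd. It assumes GNU dd, whose conv=fsync option makes dd call fsync() on the output file before it exits. The function name write_durable and the temporary file are only examples.

```shell
#!/bin/sh
# Write zeros to a file and flush that file to disk before returning.
# Assumption: GNU dd, which supports conv=fsync.
write_durable() {
    # $1 = output file, $2 = number of 1 KiB blocks
    dd if=/dev/zero of="$1" bs=1024 count="$2" conv=fsync 2>/dev/null
}

out=$(mktemp)
write_durable "$out" 2000
ls -l "$out"
```

Only this one file is flushed, so the cost should be far smaller than mounting the whole filesystem with sync.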

FreeBSD has a mount option:

noasync - metadata is written synchronously and data asynchronously, so it is like dirsync, but for FreeBSD; this is the default

The default time period for saving all cached data to disk in UFS is:

# sysctl -a | egrep "kern.*delay"
kern.metadelay: 28
kern.dirdelay: 29
kern.filedelay: 30

We can change this:

# vi /etc/sysctl.conf
kern.metadelay=1
kern.dirdelay=1
kern.filedelay=1
# service sysctl restart

and operations do not get much slower:

# dd if=/dev/zero of=test bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes transferred in 0.385911 secs (530692229 bytes/sec)

But FreeBSD has a mechanism called Soft Updates for metadata. I consider it dangerous, because buffered metadata may be saved to disk only after minutes. I turn it off. You must boot the system in single user mode and execute:

tunefs -n disable /

Finally, ZFS has the vfs.zfs.txg.timeout option. It is the period between 'transaction groups' and is equal to 5 seconds. We can change it in sysctl.conf as well. We can also enable sync mode:

# zfs set sync=always my/datasets

So, first of all, a program should force its data to be saved if the data is important. If you do some I/O operations in the shell, you can always use the sync command (for example in your scripts, e.g. backups). The system should not wait too long before saving, and we can change this. Lastly, we can create a special separate filesystem for important data and enable sync mode for it. With btrfs or ZFS datasets this is easy.
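
For the script case, a minimal sketch could look like this. The function name safe_copy and the file names are hypothetical. The only real tools used are cp and the plain sync command, which asks the kernel to write out all cached data.

```shell
#!/bin/sh
# Copy a file, then ask the kernel to flush all cached data to disk.
# safe_copy is a hypothetical helper for backup-style scripts.
safe_copy() {
    # $1 = source file, $2 = destination file
    cp "$1" "$2" || return 1
    sync
}

src=$(mktemp)
dst="$src.bak"
echo "important data" > "$src"
if safe_copy "$src" "$dst"; then
    echo "copy flushed to disk"
fi
```

Plain sync flushes everything on the system, not only the copied file, so in a script it is probably best called once at the end rather than after every file.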
