1. 09 Mar, 2019 28 commits
    • aufs: debug print by MagicSysRq · b41567b1
      J. R. Okajima authored
      
      
      Print the current data status for debugging.
      The trigger key is a module parameter, so you can change it freely. The
      default is 'a' of course.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: debugfs interface · 0f0e211c
      J. R. Okajima authored
      
      
      Aufs provides some info via debugfs such as
      - the branch path
      - the current number of pseudo-links
      - the size and the number of blocks consumed by XINO, XIB and XIGEN.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirwh option · 0b7972a4
      J. R. Okajima authored
      
      
      This feature optimizes rmdir and the renaming of a directory.
      When the target dir holds a very large number of whiteouts, it may
      take a long time to remove them all. To prevent this, the 'dirwh=%d'
      option specifies the watermark that decides when to remove them.
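
The watermark test can be sketched in a few lines of C. This is only an illustration under stated assumptions: `wh_action` and `choose_wh_action` are hypothetical names, not aufs identifiers, and the sketch assumes that a whiteout count above the watermark means the removal is postponed instead of being done inline. The aufs manual remains the authority on the actual behaviour.

```c
#include <assert.h>

/* Hypothetical outcome of the watermark test; not an aufs identifier. */
enum wh_action {
	WH_REMOVE_INLINE,   /* few whiteouts: remove them right away */
	WH_REMOVE_DEFERRED  /* many whiteouts: postpone the removal (assumed) */
};

/* Compare the number of whiteouts in the target dir with the watermark. */
static enum wh_action choose_wh_action(long nr_whiteouts, long dirwh)
{
	assert(nr_whiteouts >= 0 && dirwh >= 0);
	return nr_whiteouts > dirwh ? WH_REMOVE_DEFERRED : WH_REMOVE_INLINE;
}
```

With a watermark of 3, for example, `choose_wh_action(4, 3)` picks the deferred path while `choose_wh_action(3, 3)` still removes inline.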
      
      For details, see aufs manual in aufs-util.git.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: branch management, delete 1/3, file list · b6f19f4f
      J. R. Okajima authored
      
      
      Implement an internal list of opened files to allow deleting a branch
      which has an opened dir. Obviously I don't like such a list.
      
      Linux used to have such a list as sb->s_files, but in linux-3.12
      s_files came to contain just a part of the opened files, and in
      linux-3.13 it was totally gone.
      Aufs still needs the file list, particularly for resetting the branch
      attribute from RW to RO.
      After resetting to RO, aufs should return EROFS for write. In order to
      support this case, aufs keeps the approach of the late s_files and
      mark_files_ro().
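
The idea of walking a list of opened files when a branch turns read-only can be sketched in user-space C. This is a minimal sketch, not the aufs code: `struct open_file`, `mark_branch_files_ro()` and `check_write()` are hypothetical names invented for the illustration, and a plain singly linked list stands in for whatever list aufs keeps internally.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry of the opened-file list; not the aufs structure. */
struct open_file {
	int branch_id;          /* branch the file was opened on */
	bool writable;          /* may this file still be written? */
	struct open_file *next;
};

/* Walk the list and revoke write access for files on the given branch. */
static void mark_branch_files_ro(struct open_file *head, int branch_id)
{
	for (struct open_file *f = head; f != NULL; f = f->next)
		if (f->branch_id == branch_id)
			f->writable = false;
}

/* Return 0 when a write may proceed, -EROFS otherwise. */
static int check_write(const struct open_file *f)
{
	assert(f != NULL);
	return f->writable ? 0 : -EROFS;
}
```

Without such a list there would be no way to reach the files that were already open when the branch attribute changed, which is why the commit keeps one.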
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 5/5, the body aufs_atomic_open() · dc25d54f
      J. R. Okajima authored
      
      
      ->atomic_open() is another monster (the other is ->rename() of course).
      It performs look-up, create, and open in a single method.
      Strictly speaking the behaviour is not atomic from the branch fs's point
      of view, while the atomicity is kept from aufs's. This is a second-best
      approach. For details, refer to the design document in the previous commit.
      
      A simple list 'si_aopen' could be put into the aufs inode object, but I
      don't think it is a good idea because the number of inodes is much larger
      than the number of super_blocks. If I put it into the inode, I am afraid
      it would cause unwelcome memory pressure.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: udba=none has no ->getattr() 1/2, inode_operations array · 46b85e53
      J. R. Okajima authored
      
      
      An enhancement for udba=none, applied when possible.
      The condition is the same as in the 'no ->d_revalidate()' patch series.
      In this mode, i_op->getattr() is less important, and the default VFS
      handler is enough.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, rename 1/2, intro · 2a422a58
      J. R. Okajima authored
      
      
      Implement i_op->rename().
      This is a big monster and I don't like it.
      
      In order to call d_move() inside the aufs lock section,
      FS_RENAME_DOES_D_MOVE is set in fstype.fs_flags.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dentry op, opt-out ->d_revalidate() · ab70d2ad
      J. R. Okajima authored
      
      
      Optimize out ->d_revalidate() if possible.
      If the refreshing failed, then ->d_revalidate() remains. In this case,
      aufs has two types of dentries. One has ->d_revalidate, the other
      doesn't. I am afraid it will confuse me someday.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: remount 2/5, refresh the cached dentries (using d_walk()) · 3d06348e
      J. R. Okajima authored
      
      
      As a part of branch-management, aufs maintains all cached inodes,
      dentries, and opened files in remounting.
      This commit handles the cached dentries by calling the VFS internal
      function d_walk().
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: remount 1/5, an array of the cached inodes · 9dfa0bd3
      J. R. Okajima authored
      
      
      As a part of branch-management, aufs maintains all cached inodes,
      dentries, and opened files in remounting.
      This commit handles the cached inodes by counting the number of cached
      inodes and generating an array of their pointers. I don't like such an
      array approach, but I don't have another idea.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: virtual or vertical directory 1/2, intro · 47118316
      J. R. Okajima authored
      
      
      This commit just prepares for the succeeding commit, and is split out
      to keep the size of a single commit down.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: mount/unmount · 8d4d76d3
      J. R. Okajima authored
      
      
      Now aufs becomes mountable with very few features.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: export via NFS 1/2, body · c470ddd6
      J. R. Okajima authored
      
      
      Implement exporting via NFS.
      The file handle is rather large (40 bytes at most + the file handle on a
      branch).
      The non-virtual filesystems can use an anonymous (disconnected) dentry
      as long as the inode is identified, but aufs needs a dentry with dinfo
      which is usually constructed.  So aufs has to find or generate the
      normal dentry from the file handle in decoding.  I.e. in aufs, there
      should never be an anonymous dentry.
      
      In decoding the file handle, if both the dentry and the inode which
      correspond to the file handle are still in cache, then they are
      returned immediately.  Otherwise aufs has to find the cached parent dir
      from the file handle.  If the parent dir is not cached either, aufs
      tries these steps.
      - decode the branch fs's file handle and get the parent dir
      - generate the path of the parent dir on the branch
      - convert the branch path to aufs's path
      - look up the inode number under aufs's path
      The last one is the slowest case.
      
      exportfs_decode_fh() (actually reconnect_path()) acquires a mutex, and
      this behaviour violates the locking order with aufs si_rwsem.  This
      is not a problem since internal exportfs_decode_fh() is called for the
      branch fs.
      Simply use lockdep_off/on to silence the lockdep message.
      
      See also the document in later commit.
      This is compiled only when CONFIG_AUFS_EXPORT is enabled.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch select policy 2/2, variations · 2ef06b5a
      J. R. Okajima authored
      
      
      Several policies to select one among multiple writable branches.
      See also the document in previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch select policy 1/2, core · 60b24eed
      J. R. Okajima authored
      
      
      Aufs can have multiple writable branches, and there are several
      policies to select one among them.
      This commit implements the default "top-down-parent" for both the
      creating-policy and the copyup-policy.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: enter/leave flag per task · 32bfe3d0
      J. R. Okajima authored
      
      
      In freeing aufs iinfo objects, aufs acquires the internal rw_sem (see
      another commit for details). Since iinfo can be freed anytime, a deadlock
      may happen due to the rw_sem. To prevent this problem, this commit
      introduces a flag per task.
      This is another (very) ugly approach which I don't like.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: pseudo-link and procfs support · c7ae8357
      J. R. Okajima authored
      
      
      Aufs pseudo-link (plink) represents a virtual hardlink across the
      branches. To implement the plink maintenance mode, aufs uses procfs.
      See also the document in this commit.
      
      There is an external user-space utility called 'auplink' in
      aufs-util.git, which has these features.
      - 'list' shows the pseudo-linked inode numbers and filenames.
      - 'cpup' copies up all pseudo-links to the writable branch.
      - 'flush' calls 'cpup', and then 'mount -o remount,clean_plink=inum'
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: sbinfo list · a35f55f4
      J. R. Okajima authored
      
      
      When a user accesses aufs by means other than fs-related system calls,
      aufs needs to identify which superblock is the target.  Here is the
      trick.  It is just a list of aufs superblocks.  Such means will be
      procfs and the MagicSysRq key.  For MagicSysRq support, see the later
      commit.
      This is a dirty approach which I don't like, but I just don't have
      another idea.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xino truncation · 2a150b32
      J. R. Okajima authored
      
      
      As mentioned earlier, sometimes the size of XINO file is a problem.
      Aufs has a feature to truncate it asynchronously using workqueue. But it
      may not be so effective in some cases, and you may want to stop
      discontiguous distribution of the inode numbers on branch fs.
      See also the log in another commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: sysfs interface · a3adef77
      J. R. Okajima authored
      
      
      The branch path can be very long, and then it is not suitable to print
      via /proc/mounts as a part of mount options. Aufs can show it either
      separately via sysfs or via /proc/mounts (as a part of mount options).
      This approach affects the lifetime of aufs objects and sbinfo contains
      kobject (in another commit). Theoretically user can disable
      CONFIG_SYSFS, but the lifetime management is always necessary. So
      supporting sysfs is split into two files, sysaufs.c and sysfs.c.
      sysaufs.c is always compiled, but sysfs.c is compiled only when
      CONFIG_SYSFS is enabled.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xino 1/2, core · 6fe05098
      J. R. Okajima authored
      
      
      XINO and XIB files are to maintain the inode numbers in aufs
      (cf. struct.txt and aufs manual in aufs-util.git).
      
      XINO file contains just a sequence of the inode numbers, and their
      offset in the file is real_inum x sizeof(inum).  So the size is limited
      by s_maxbytes of the filesystem where XINO file is located.  In order to
      support the larger inum, aufs stores XINO files as an internal array.
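
The offset rule and the size limit above can be worked through in a small C sketch. It is an illustration only: `xino_entry_t` and `xino_offset()` are hypothetical names, the entry width of 8 bytes is an assumption, and `max_bytes` merely stands in for s_maxbytes of the filesystem holding the XINO file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical width of one stored inode number (assumed 64-bit). */
typedef uint64_t xino_entry_t;

/*
 * Compute offset = real_inum x sizeof(inum) for a branch inode number.
 * Returns false when the whole record would not fit below max_bytes,
 * which plays the role of s_maxbytes here.
 */
static bool xino_offset(uint64_t real_inum, uint64_t max_bytes,
			uint64_t *offset)
{
	const uint64_t width = sizeof(xino_entry_t);

	assert(offset != NULL);
	if (max_bytes < width)
		return false;
	/* Division-based check avoids overflow in real_inum * width. */
	if (real_inum > (max_bytes - width) / width)
		return false;
	*offset = real_inum * width;
	return true;
}
```

Under these assumptions a file limited to 1024 bytes can hold records for inode numbers 0 through 127 only, which shows why a large or sparse inode number quickly runs into the limit.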
      
      Sometimes the size of the XINO file can be a problem, i.e. too big,
      particularly when XINO files are located on tmpfs. In this case, another
      separate patch tmpfs-ino.patch in aufs4-standalone.git is recommended
      (as well as vfs-ino.patch). The patch makes tmpfs maintain inode
      numbers within itself and suppresses their discontiguous distribution.
      
      See also the document in next commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: workqueue · f04356cb
      J. R. Okajima authored
      
      
      Aufs uses the workqueue both synchronously and asynchronously.
      For sync-use-case, aufs uses its own specific wkq since it doesn't want
      to be disturbed by other tasks on the system. For async-use-case, aufs
      uses the system global workqueue.
      Aufs has to prevent itself from being unmounted while an async-task is
      queued.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: readonly branch 2/2, callers · 66bc346d
      J. R. Okajima authored
      
      
      For details, see previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: readonly branch 1/2, definition · b7051459
      J. R. Okajima authored
      
      
      The branch object is managed by the sbinfo object as an element of its
      internal array. The iinfo and dinfo objects contain the branch id, and
      it will be used to implement the correct order in branch management
      (add/del).
      
      See also the documents in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: infos generation 2/2 · f2ade007
      J. R. Okajima authored
      
      
      The generations of iinfo and dinfo are inherited from sbinfo's.
      Also iinfo generation tracks the branch inode's generation to test the
      matching after the branch management.
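
A generation check of this kind can be sketched as follows. This is a minimal user-space sketch, not the aufs implementation: `sb_gen`, `obj_gen` and the three helpers are hypothetical names, and the sketch assumes the superblock-side counter is incremented on every branch management operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sbinfo side and the iinfo/dinfo side. */
struct sb_gen  { unsigned int generation; };
struct obj_gen { unsigned int generation; };

/* A new or refreshed object inherits the current superblock generation. */
static void obj_inherit(struct obj_gen *obj, const struct sb_gen *sb)
{
	assert(obj != NULL && sb != NULL);
	obj->generation = sb->generation;
}

/* Assumed: each branch add/del bumps the superblock generation. */
static void branches_changed(struct sb_gen *sb)
{
	sb->generation++;
}

/* An object whose generation no longer matches needs refreshing. */
static bool obj_is_stale(const struct obj_gen *obj, const struct sb_gen *sb)
{
	return obj->generation != sb->generation;
}
```

The appeal of the scheme is that a branch change costs a single increment, and each cached object pays for its refresh only when it is next looked at.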
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: sbinfo core · 3742849b
      J. R. Okajima authored
      
      
      The structure is very similar to inode and dentry infos (in previous
      commits), but the internal array is for 'struct au_branch' instead of
      'superblock.' Additionally the lifetime of 'struct au_sbinfo' is managed
      by kobject since it will be connected to sysfs by later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dinfo core · ee166183
      J. R. Okajima authored
      
      
      The structure is very similar to aufs inode info (in previous commit).
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: iinfo core · a3d2caf2
      J. R. Okajima authored
      
      
      See the documents in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>