1. 09 Mar, 2019 28 commits
    • aufs stdalone: full copyright sentence · 4010b936
      J. R. Okajima authored
      
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: fuse branch (including poll(2)) · 08ecca6f
      J. R. Okajima authored
      
      
      FUSE doesn't want callers to access the inode attributes without
      issuing stat, and the attributes are not guaranteed to be valid after
      lookup or iget().
      The inode attributes are critical for aufs, so aufs calls stat every
      time on a FUSE branch.
      Of course, this makes aufs slow, but stat is not called when the
      branch fs is not FUSE.
      
      Currently, only FUSE implements ->poll(), and aufs supports it.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: ioctl, mvdown 1/2, body · bc962de7
      J. R. Okajima authored
      
      
      Another ioctl feature, move-down.
      The behaviour is, as you can guess, the opposite of copy-up.
      The feature called FHSM (file-based hierarchical storage management, in
      a later commit) uses this ioctl aggressively.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: branch management, modify the permission and attribute · 6462d8eb
      J. R. Okajima authored
      
      
      The permissions and attributes of a branch can be modified dynamically.
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 5/5, the body aufs_atomic_open() · dc25d54f
      J. R. Okajima authored
      
      
      ->atomic_open() is another monster (the other is ->rename() of course).
      It performs look-up, create, and open in a single method.
      Strictly speaking, the behaviour is not atomic from the branch fs's
      point of view, while atomicity is kept from aufs's. This is a
      second-best approach. For details, refer to the design document in the
      previous commit.
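
      For orientation only: seen from user space, the combination handled
      here corresponds to one open(2) call that looks up the name, creates
      the file if it is missing, and opens it. The sketch below shows that
      request with O_CREAT|O_EXCL. The helper name create_exclusive is made
      up for this example, and nothing in it reflects how
      aufs_atomic_open() itself is implemented.

```python
import os


def create_exclusive(path, mode=0o600):
    """Look up, create, and open `path` with a single open(2) call.

    O_EXCL makes the call fail with FileExistsError if the name already
    exists, so the caller never opens a file somebody else created.
    """
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
```

      A second call with the same path raises FileExistsError, which is the
      user-visible side of the look-up and create steps being one operation.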
      
      The simple list 'si_aopen' could be put into the aufs inode object, but
      I don't think that is a good idea because there are far more inodes
      than super_blocks. I am afraid that putting it into the inode would
      only add needless memory pressure.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 2/5, introduce vfsub_atomic_open() · be3f6e60
      J. R. Okajima authored
      
      
      Following the design in another commit, aufs calls the branch fs's
      ->atomic_open() if it exists. Ideally it would be better to call
      VFS:do_last, lookup_open() or atomic_open, but that is very hard for
      aufs. This implementation is far from the best.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: file op · a3387b95
      J. R. Okajima authored
      
      
      Implement several f_op functions for non-dir.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xattr and acl · a76ab411
      J. R. Okajima authored
      
      
      Support for XATTR and ACL, including several branch attributes to
      ignore copy errors around XATTR and ACL.

      NFS always sets MS_POSIXACL regardless of its mount option 'noacl'.
      When MS_POSIXACL is set, generic_permission() calls check_acl() (via
      acl_permission_check()) and gets -EOPNOTSUPP because the NFS branch is
      mounted as 'noacl'.
      In aufs, h_permission() should not call generic_permission() in this
      case.
      A similar thing happens when copying up XATTR: vfs_getxattr_alloc()
      returns -EOPNOTSUPP.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, get/set attributes · 25f10296
      J. R. Okajima authored
      
      
      Implement i_op->get/setattr().
      setattr() is another trigger of copy-up. The file may or may not be
      opened. Some of the sub-functions are also used by the XATTR
      operations in a later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, rename 1/2, intro · 2a422a58
      J. R. Okajima authored
      
      
      Implement i_op->rename().
      This is a big monster and I don't like it.
      
      In order to call d_move() inside the aufs lock section,
      FS_RENAME_DOES_D_MOVE is set in fstype.fs_flags.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, del, rmdir · 5755f006
      J. R. Okajima authored
      
      
      Implement i_op->rmdir(), supporting logical deletion by whiteout,
      including all children.

      As described in struct.txt in a previous commit, the target dir is
      renamed to a whiteout-ed temporary unique name in rmdir(2), and then
      removed asynchronously by the system global workqueue.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, del, unlink · 4dc51987
      J. R. Okajima authored
      
      
      Implement i_op->unlink(), supporting logical deletion by whiteout.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, for symlink · 7213fdc8
      J. R. Okajima authored
      
      
      Implement dir.i_op->get_link().
      I am afraid there may be cases where updating the inode atime is
      skipped. This means the aufs inode atime may be silently reverted to
      the real fs inode atime, but I don't think it is a big problem.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: superblock op · 3cb74551
      J. R. Okajima authored
      
      
      Implement basic sb_op->show_options(), statfs() and sync_fs() simply.
      - show_options() doesn't print the default values (AuOpt_Def).
      - statfs() will have an option to summarize the numbers from branches
        (in later commit).
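
      The rule that default values are not printed can be sketched like
      this. The option names and values in the usage note are illustrative
      stand-ins rather than the actual AuOpt_Def set, and the function is a
      user-space model, not the aufs code.

```python
def show_options(current, defaults):
    """Return a comma-separated option string, omitting default values."""
    parts = []
    for key in sorted(current):
        value = current[key]
        if defaults.get(key) == value:
            continue  # same as the default: nothing to print
        # Boolean flags that are switched on are printed by name only.
        parts.append(key if value is True else "%s=%s" % (key, value))
    return ",".join(parts)
```

      With defaults {"create": "top-down-parent", "cpup": "top-down-parent"}
      and a mount that only changes cpup to a hypothetical "bottom-up", the
      function returns just "cpup=bottom-up".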
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: finfo core · c3e1eecf
      J. R. Okajima authored
      
      
      The structure is very similar to iinfo and dinfo (in previous commits).
      This commit is for non-dir files. For directories, see a later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: export via NFS 1/2, body · c470ddd6
      J. R. Okajima authored
      
      
      Implement exporting via NFS.
      The file handle is rather large (40 bytes at most + the file handle on a
      branch).
      Non-virtual filesystems can use an anonymous (disconnected) dentry
      as long as the inode is identified, but aufs needs a dentry with dinfo
      which is usually constructed.  So aufs has to find or generate the
      normal dentry from the file handle in decoding.  I.e. in aufs, there
      should never be an anonymous dentry.

      In decoding the file handle, if both the dentry and the inode
      corresponding to the file handle are still in the cache, they are
      returned immediately.  Otherwise aufs has to find the cached parent dir
      from the file handle.  If the parent dir is not cached either, aufs
      tries these steps.
      - decode the branch fs's file handle and get the parent dir
      - generate the path of the parent dir on the branch
      - convert the branch path to aufs's path
      - look up the inode number under aufs's path
      The last one is the slowest case.
      
      exportfs_decode_fh() (actually reconnect_path()) acquires a mutex, and
      this behaviour violates the locking order with respect to the aufs
      si_rwsem.  This is not a problem since the internal
      exportfs_decode_fh() is called for the branch fs.
      Simply use lockdep_off/on to silence the lockdep message.
      
      See also the document in later commit.
      This is compiled only when CONFIG_AUFS_EXPORT is enabled.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch select policy 1/2, core · 60b24eed
      J. R. Okajima authored
      
      
      Aufs can have multiple writable branches, and there are several
      policies to select one among them.
      This commit implements the default "top-down-parent" policy for both
      the create policy and the copy-up policy.
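
      As a simplified reading of the policy name, the sketch below scans the
      branches from the top and picks the first writable one on which the
      parent directory already exists. The Branch class and the function
      name are hypothetical, and the real policy may differ in details (for
      instance in what happens when no such branch is found), so treat this
      as a model under those assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Branch:
    writable: bool
    dirs: Set[str] = field(default_factory=set)  # directories present on this branch


def select_top_down_parent(branches: List[Branch], parent: str) -> Optional[int]:
    """Return the index of the topmost writable branch that has `parent`.

    `branches` is ordered from top (index 0) to bottom. None means this
    simplified model found no candidate.
    """
    for index, branch in enumerate(branches):
        if branch.writable and parent in branch.dirs:
            return index
    return None
```

      For three branches where only the second writable one contains "/d",
      select_top_down_parent(branches, "/d") returns 1.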
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: copy-up 4/7, body · f63b4f2f
      J. R. Okajima authored
      
      
      The functions to
      - create the copy-up target file
      - copy filedata
      - copy metadata
      
      In copying filedata, I tried splice_direct() instead of repeating
      read/write. Surprisingly, I could not see a big difference. So let's
      keep this approach for a while. If SEEK_DATA/SEEK_HOLE become more
      popular someday, they may help optimize this read/write.
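
      The "repeating read/write" approach looks roughly like the loop
      below. It is a user-space sketch on file descriptors, not the kernel
      code; the function name copy_filedata and the 64 KiB default buffer
      size are assumptions for illustration.

```python
import os


def copy_filedata(src_fd, dst_fd, bufsize=64 * 1024):
    """Copy everything readable from src_fd to dst_fd; return the byte count."""
    total = 0
    while True:
        chunk = os.read(src_fd, bufsize)
        if not chunk:
            break  # end of file
        view = memoryview(chunk)
        while view:
            # write(2) may be partial, so keep writing the remainder.
            written = os.write(dst_fd, view)
            view = view[written:]
        total += len(chunk)
    return total
```

      A hole-aware variant would skip over the gaps reported by
      SEEK_DATA/SEEK_HOLE instead of reading zeros, which is the
      optimization the message alludes to.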
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: copy-up 3/7, internal file I/O · e989fe7b
      J. R. Okajima authored
      
      
      The internal file read/write for copy-up in kernelspace.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: copy-up 1/7, attributes · f7f1bacc
      J. R. Okajima authored
      
      
      Copy the inode attributes between branches.
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: pin or lock the parent dir and the child on a branch · 9f923257
      J. R. Okajima authored
      
      
      To create/delete/rename files including copy-up, aufs acquires several
      locks on the branch fs internally. These lock/unlock operations are
      consolidated into struct au_pin in this commit.
      au_pin handles
      - LOCKDEP class
      - re-validate/verify
      - suspend/resume HNOTIFY
      
      See also lookup.txt in later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch 2/3, body · 59ad1975
      J. R. Okajima authored
      
      
      Actually prepare the whiteout bases on the writable branch being added.
      For details, refer to the previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch 1/3, white-out · 72a970fc
      J. R. Okajima authored
      
      
      The writable branch prepares a few files and dirs for whiteouts.
      For branch filesystems which don't support link(2), there is the
      "nolwh" attribute. On a branch with this attribute, aufs never tries
      link(2) for a whiteout and always uses creat(2) instead.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: pseudo-link and procfs support · c7ae8357
      J. R. Okajima authored
      
      
      Aufs pseudo-link (plink) represents a virtual hardlink across the
      branches. To implement the plink maintenance mode, aufs uses procfs.
      See also the document in this commit.
      
      There is an external user-space utility called 'auplink' in
      aufs-util.git, which has these features.
      - 'list' shows the pseudo-linked inode numbers and filenames.
      - 'cpup' copies up all pseudo-links to the writable branch.
      - 'flush' calls 'cpup', and then 'mount -o remount,clean_plink=inum'.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: white-out 1/2 · 2a7e7277
      J. R. Okajima authored
      
      
      The whiteout represents a logical deletion.
      Although the document in this commit mentions rmdir(2) and rename(2)
      for directories, this commit doesn't contain those functions. They
      will be added in later commits.
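
      Logical deletion can be modelled with a few lines of user-space code.
      In the sketch below each branch is just the set of names in one
      directory, ordered from top to bottom; ".wh." is the whiteout name
      prefix, while the set-based model and the function names are
      assumptions made for illustration.

```python
WH_PFX = ".wh."  # whiteout name prefix


def lookup(branches, name):
    """Return the index of the topmost branch where `name` is visible, or None.

    `branches` is ordered from top to bottom; each entry is the set of names
    present in the same directory on that branch.
    """
    for index, names in enumerate(branches):
        if WH_PFX + name in names:
            return None  # a whiteout hides the name on every lower branch
        if name in names:
            return index
    return None


def unlink(branches, writable_index, name):
    """Logically delete `name` using the writable branch at `writable_index`."""
    branches[writable_index].discard(name)
    # If the name still exists on a lower branch, hide it with a whiteout.
    if any(name in names for names in branches[writable_index + 1:]):
        branches[writable_index].add(WH_PFX + name)
```

      After unlink() on a name that lives on a lower readonly branch, the
      lower entry is left untouched and only the whiteout on the writable
      branch makes lookup() report the name as absent.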
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xino 1/2, core · 6fe05098
      J. R. Okajima authored
      
      
      XINO and XIB files maintain the inode numbers in aufs
      (cf. struct.txt and the aufs manual in aufs-util.git).
      
      XINO file contains just a sequence of the inode numbers, and their
      offset in the file is real_inum x sizeof(inum).  So the size is limited
      by s_maxbytes of the filesystem where XINO file is located.  In order to
      support the larger inum, aufs stores XINO files as an internal array.
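
      The offset rule can be illustrated with a small user-space sketch. It
      is not the aufs implementation: the 8-byte slot width, the assumption
      that each slot holds the aufs-side inode number, and the helper names
      xino_offset, xino_write and xino_read are all chosen for illustration.

```python
import os
import struct

SLOT = struct.Struct("=Q")  # assumed slot width: one unsigned 64-bit number


def xino_offset(real_inum):
    """Offset of the slot for a branch inode number: real_inum * sizeof(inum)."""
    return real_inum * SLOT.size


def xino_write(fd, real_inum, aufs_inum):
    """Store an inode number in the slot indexed by the branch inode number."""
    os.pwrite(fd, SLOT.pack(aufs_inum), xino_offset(real_inum))


def xino_read(fd, real_inum):
    """Return the stored number, or 0 if the slot was never written."""
    data = os.pread(fd, SLOT.size, xino_offset(real_inum))
    if len(data) < SLOT.size:
        return 0  # the slot lies beyond the end of the file
    return SLOT.unpack(data)[0]
```

      Writing the slot for a large real_inum extends the file to at least
      that offset plus one slot, which is why the file size is bounded by
      s_maxbytes and why widely scattered inode numbers make the file large.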
      
      Sometimes the size of the XINO file can be a problem, i.e. too big,
      particularly when XINO files are located on tmpfs. In this case, another
      separate patch, tmpfs-ino.patch in aufs4-standalone.git, is recommended
      (as well as vfs-ino.patch). The patch makes tmpfs maintain inode
      numbers by itself and suppresses their discontiguous distribution.
      
      See also the document in next commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: readonly branch 2/2, callers · 66bc346d
      J. R. Okajima authored
      
      
      For details, see previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: readonly branch 1/2, definition · b7051459
      J. R. Okajima authored
      
      
      The branch object is managed by the sbinfo object as an element of its
      internal array. The iinfo and dinfo objects contain the branch id, and
      it will be used to implement the correct order in branch management
      (add/del).
      
      See also the documents in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>