1. 09 Mar, 2019 40 commits
    • aufs: ioctl, mvdown 1/2, body · bc962de7
      J. R. Okajima authored
      
      
      Another ioctl feature, move-down.
      The behaviour is, as you can guess, the opposite of copy-up.
      The FHSM feature (file-based hierarchical storage management, in a
      later commit) uses this ioctl aggressively.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: ioctl, ibusy · 8ea77cf9
      J. R. Okajima authored
      
      
      Because some inode is in use, the deletion of a branch can fail.
      For those who want to test whether an inode is busy, aufs provides an
      ioctl, and a utility 'aubusy' in aufs-util.git.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: ioctl, rdu (readdir in userspace) · 681a21c3
      J. R. Okajima authored
      
      
      For a directory which has millions of files, aufs VDIR consumes
      much memory. In this case, RDU (readdir(3) in user-space) is definitely
      better.
      If you enable CONFIG_AUFS_RDU when compiling aufs, install libau.so from
      aufs-util.git, and set some environment variables, then you can use this
      feature. When readdir(3) in libau.so receives an aufs dir, it issues
      ioctl(2) instead of regular readdir(3).
      All merging and whiteout handling are done in userspace.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
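
The userspace merging and whiteout handling mentioned in the RDU commit can be modelled roughly as follows. This is a simplified sketch in Python, not the code in libau.so: the function name `merge_branch_listings` and the list-of-lists input are hypothetical, and only the `.wh.` whiteout prefix and the `.wh..wh..opq` opaque-directory marker are taken from aufs's naming convention.

```python
WH_PREFIX = ".wh."              # prefix of a whiteout entry
OPAQUE_MARKER = ".wh..wh..opq"  # marks a dir as opaque ('diropq')


def merge_branch_listings(listings):
    """Merge per-branch directory listings (hypothetical helper).

    `listings` holds one list of entry names per branch, topmost branch first.
    Returns the names visible in the merged directory, in first-seen order.
    """
    visible = []
    seen = set()
    hidden = set()  # names whited-out by some upper branch
    for names in listings:
        whiteouts_here = set()
        opaque = False
        for name in names:
            if name == OPAQUE_MARKER:
                opaque = True
            elif name.startswith(WH_PREFIX):
                # A whiteout hides the same name on lower branches only.
                whiteouts_here.add(name[len(WH_PREFIX):])
            elif name not in seen and name not in hidden:
                seen.add(name)
                visible.append(name)
        if opaque:
            break  # an opaque dir hides everything on lower branches
        hidden |= whiteouts_here
    return visible
```

For example, an upper branch containing `a`, `.wh.b` and `c` over a lower branch containing `b`, `c` and `d` yields `a`, `c`, `d` in this model.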
    • aufs: ioctl, brinfo · 0d7d5cf2
      J. R. Okajima authored
      
      
      Provide info about the branches, which will be used from user-space.
      This is essentially equivalent to the entries under sysfs
      (/sys/fs/aufs/si_*/).
      But the ioctl behaviour is atomic and never confuses the matching of the
      branch id.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: ioctl, wbr_fd · de0ee5c9
      J. R. Okajima authored
      
      
      Provide a file descriptor corresponding to the specified writable branch.
      The file descriptor will be used by user-space programs such as FHSM and
      libau.so. For details, see aufs-util.git.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: debug print by MagicSysRq · b41567b1
      J. R. Okajima authored
      
      
      Print the current data status for debugging.
      The trigger key is a module parameter and you can change it freely. The
      default is 'a' of course.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: debug, several checks only once · a38028d6
      J. R. Okajima authored
      
      
      Simple checks when the module is loaded.
      This feature is compiled when CONFIG_AUFS_DEBUG is enabled.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: debugfs interface · 0f0e211c
      J. R. Okajima authored
      
      
      Aufs provides some info via debugfs such as
      - the branch path
      - the current number of pseudo-links
      - the size and the number of blocks consumed by XINO, XIB and XIGEN.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: diropq_[aw] options · c0ec0e0e
      J. R. Okajima authored
      
      
      These are very old options.
      Since Unionfs created 'diropq' unconditionally in mkdir(2), old users
      may expect the same behaviour. But there are cases where 'diropq' is
      unnecessary. The aufs default behaviour is to create 'diropq' only when
      it is necessary.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: show-whiteout option · 9723961f
      J. R. Okajima authored
      
      
      Generally aufs hides the names of whiteouts. But in some cases, showing
      them is very useful for users, for instance when creating a new middle
      layer (branch) by merging existing layers.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: mount option, warning about the permissions · 59444fad
      J. R. Okajima authored
      
      
      While most people (especially those who use tmpfs as the top writable
      branch) don't care, I care and think it can be a security problem.
      For example, when the lower readonly branch contains
      /etc/{passwd,shadow} and the permission bits of the upper empty
      branch are world-writable, a malicious user can create these files
      manually, by-passing aufs.
      Aufs can do nothing but produce a warning.
      
      For details, see aufs manual in aufs-util.git.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
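
The kind of check behind the permission warning can be illustrated from user space. The sketch below is an assumption-laden example, not aufs code: the helper name `is_world_writable` is hypothetical, and the exact condition aufs tests before warning is not stated in the commit message. It only shows how the "world-writable upper branch" situation can be detected with the standard `os.stat()` call on a POSIX system.

```python
import os
import stat


def is_world_writable(branch_path):
    """Return True if 'other' users may write to the branch root dir.

    Hypothetical helper: it inspects only the S_IWOTH permission bit of
    the directory itself, which is the situation the commit warns about.
    """
    mode = os.stat(branch_path).st_mode
    return bool(mode & stat.S_IWOTH)
```

An administrator could run such a check on the intended writable branch before mounting, and tighten the permission bits if it returns True.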
    • aufs: dirren 6/6, mount options · 852c25f7
      J. R. Okajima authored
      
      
      Introduce the new mount options, dirren and nodirren, which activate
      and deactivate the DIRREN feature.
      On remount and unmount, the inum-list per branch should be flushed to
      the file.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirren 5/6, lookup and revalidate with loading the rename info · 866fd027
      J. R. Okajima authored
      
      
      When aufs meets a new dir inode on a branch in lookup, it tests whether
      the inode is in the list which the branch has. If the inode is found, it
      means the dir has been logically renamed at some point and there is some
      info about the name under that dir. Then aufs tries loading the info, and
      continues looking up on the lower branches using the name from before
      the rename.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
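
The lookup step described in the dirren 5/6 commit can be modelled as a small decision function. Everything in this sketch is hypothetical: the `Branch` class, its fields and `name_for_lower_lookup()` are illustration names, and a plain dict stands in for the on-disk info file. It shows only the decision itself: if the dir inode number is in the branch's inum-list, continue the lookup with the name from before the rename.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """Toy model of a branch with DIRREN data (not an aufs structure)."""
    branch_id: int
    renamed_inums: set = field(default_factory=set)  # the per-branch inum-list
    rename_info: dict = field(default_factory=dict)  # inum -> name before rename


def name_for_lower_lookup(branch, dir_inum, current_name):
    """Pick the name to use when continuing the lookup on lower branches."""
    if dir_inum in branch.renamed_inums:
        # The dir was logically renamed: load the info and use the old name.
        return branch.rename_info.get(dir_inum, current_name)
    return current_name
```

A dir that is not in the list is looked up under its current name, so the common case costs only one set membership test in this model.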
    • aufs: dirren 4/6, rename with saving the rename info · 39d4a736
      J. R. Okajima authored
      
      
      When DIRREN is enabled and activated, the error case where
      aufs rename(2) used to return EXDEV will be gone.
      Aufs rename(2) registers the inum of the dir being renamed in the list in
      the branch, creates the detailed info file, and returns success.
      
      If udba=notify option is specified with dirren, the internal detection
      may not work correctly since aufs may not be able to find the target
      name.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirren 3/6, save the detailed info per a dir · 77277be2
      J. R. Okajima authored
      
      
      The detailed info per renamed directory is stored in a regular file per
      branch, i.e. when each of two lower branches contains the same named
      entry, two info files are created.
      The file is created internally by aufs rename(2) and loaded by lookup.
      Also when the actual rename on the branch fails, the newly created or
      stored info files should all be reverted.
      
      When the renamed dir is renamed back to its previous/original name, then
      the info file has to be removed.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirren 2/6, branch id as a filename of the info · f371c057
      J. R. Okajima authored
      
      
      DIRREN gives an identifier to every branch, and it is used as a part of
      the filename of the detailed info file.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirren 1/6, inum-list of the renamed dir in a branch · 53c63b41
      J. R. Okajima authored
      
      
      This commit adds to each branch a list of the inode numbers of the
      logically renamed dirs. The list is referred to in lookup, and its
      lifetime is equivalent to the branch's, i.e. the list is
      loaded/created when adding a branch, and stored/deleted when deleting a
      branch. The simple storing happens when remounting and unmounting aufs
      too.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: statfs sum options · 4541724e
      J. R. Okajima authored
      
      
      Since aufs can have multiple writable branches, these options are
      useful.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirwh option · 0b7972a4
      J. R. Okajima authored
      
      
      This feature optimizes rmdir and the renaming of a dir.
      When there are very many whiteouts under the target dir, it may
      take a long time to remove them all. To prevent this, 'dirwh=%d' option
      specifies the watermark to decide when to remove them.
      
      For details, see aufs manual in aufs-util.git.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dirperm1 option · b3b9c456
      J. R. Okajima authored
      
      
      Sometimes the aufs policy to respect the branch fs's permission bits
      confuses users, e.g. the directory permission bits on the top branch
      allow users to read, but the lower branch prohibits it. This option may
      be useful for such cases.
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: branch management, modify the permission and attribute · 6462d8eb
      J. R. Okajima authored
      
      
      The permissions and attributes of a branch can be modified dynamically.
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: branch management, delete 3/3, mount option · dea1d77c
      J. R. Okajima authored
      
      
      Implement the user interface.
      Since users often wonder "Why can't I delete this branch?", the
      'verbose' option was introduced.
      
      You may think aufs should not hold several strings for the variations of
      the option, that the mount helper (/sbin/mount.aufs) can convert all
      variations to a single fixed string, and that in kernel space aufs should
      contain only this one string.
      I agree, but in the real world, many users don't install
      /sbin/mount.aufs. For convenience, aufs contains these variations.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: branch management, delete 2/3, body · b79632ef
      J. R. Okajima authored
      
      
      Delete a branch which is not busy.
      Aufs judges whether the branch is deletable by testing the opened files,
      the cached dentries and inodes. Even if a directory is in use, as long as
      the same named entry exists on another branch, the branch is
      deletable.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
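
The deletability rule in the delete 2/3 commit can be written down as a small predicate. This is a deliberately simplified model: the real test walks opened files, cached dentries and inodes inside the kernel, while this sketch reduces them to plain sets of paths. The function name `branch_is_deletable` and its parameters are hypothetical.

```python
def branch_is_deletable(open_files, busy_dirs, other_branches):
    """Decide whether a branch may be deleted (toy model).

    open_files     -- paths of non-directory files in use on the target branch
    busy_dirs      -- paths of directories in use on the target branch
    other_branches -- one set of directory paths per remaining branch
    """
    if open_files:
        return False  # a busy non-directory always blocks the deletion
    for path in busy_dirs:
        # A busy dir is tolerated only if the same named entry
        # exists on some other branch.
        if not any(path in entries for entries in other_branches):
            return False
    return True
```

In this model a branch whose only busy entry is `/etc` remains deletable as long as another branch also has `/etc`.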
    • aufs: branch management, delete 1/3, file list · b6f19f4f
      J. R. Okajima authored
      
      
      Implement an internal list of opened files to allow deleting a branch
      which has an opened dir. Obviously I don't like such a list.
      
      There was such a list in linux as sb->s_files, but in linux-3.12 s_files
      came to contain only a part of the opened files, and in linux-3.13 it
      was totally gone.
      Aufs still needs the file list, particularly for re-setting the branch
      attribute from RW to RO.
      After resetting to RO, aufs should return EROFS for write. In order to
      support such a case, aufs keeps the former s_files and mark_files_ro()
      approach.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: branch management, append and prepend · a8203e4e
      J. R. Okajima authored
      
      
      The interfaces to append and prepend branches.
      They are variations of "branch add" in another commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: directory op · 83ab62e6
      J. R. Okajima authored
      
      
      This is very similar to file operations, including re-open after branch
      management. The major part of readdir(3) is split into another object
      called VDIR (in previous commit).
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 5/5, the body aufs_atomic_open() · dc25d54f
      J. R. Okajima authored
      
      
      ->atomic_open() is another monster (the other is ->rename() of course).
      It performs look-up, create, and open in a single method.
      Strictly speaking the behaviour is not atomic from the branch fs's point
      of view, while the atomicity is kept from aufs's. This is a second-best
      approach. For details, refer to the design document in a previous commit.
      
      A simple list 'si_aopen' could be put into the aufs inode object, but I
      don't think it is a good idea because the number of inodes is much larger
      than the number of super_blocks. If I put it into the inode, I am afraid
      it would cause needless memory pressure.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 4/5, introduce au_aopen_or_create() · 47170bb7
      J. R. Okajima authored
      
      
      This new function au_aopen_or_create() tries calling branch fs's
      ->atomic_open() first. If it is not set, it calls vfs_create() instead.
      By putting this behaviour into aufs's add_simple(), much code can be
      shared.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 3/5, pass h_file to au_do_open() · 51a7f431
      J. R. Okajima authored
      
      
      Extend do_open_dir(), au_do_open_nondir() and au_do_open() to receive an
      additional parameter h_file, which is a file object opened by branch
      fs's ->atomic_open(). By this design/commit, aufs doesn't have to
      duplicate much code into a new aufs_atomic_open() (in a later commit),
      and can simply share it.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: atomic_open 2/5, introduce vfsub_atomic_open() · be3f6e60
      J. R. Okajima authored
      
      
      Following the design in another commit, aufs calls branch fs's
      ->atomic_open() if it exists. Ideally it would be better to call
      VFS:do_last, lookup_open() or atomic_open, but it is very hard for
      aufs. This implementation is far from the best.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: file op · a3387b95
      J. R. Okajima authored
      
      
      Implement several f_op functions for non-dir.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: file op, mmap · 8d319094
      J. R. Okajima authored
      
      
      For details, read the document in this commit.
      I don't like this approach, but there is no other way currently. But it
      seems that UnionMount is trying to add siblings of f_dentry and d_inode
      for linux-4.0 or later. It may become another light for aufs too.
      
      A finfo object which has ever been mmapped is excluded from
      refreshing (based upon fi_mmapped). Otherwise we may corrupt the process
      memory space.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: file op, internal re-open when write · beeeb01e
      J. R. Okajima authored
      
      
      By default, aufs doesn't copy-up the file in open(2).
      The file write operation is one of the triggers of the copy-up.
      Although I understand that O_RDWR or O_WRONLY should trigger the
      copy-up, it is not a good idea for the case of open(O_RDWR) +
      mmap(MAP_PRIVATE). In this case, none of the changes are written back to
      the file on disk, and the copy-up is entirely meaningless.
      In other words, aufs postpones the copy-up as long as possible.
      
      This design also applies to the file operation after branch management.
      Some of the opened files need to be refreshed after add/del/mod
      branches. E.g. detect the revealed same named one, open it, and close the
      old one internally while the virtual file is kept opened.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
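
The "postpone the copy-up until the first write" idea can be imitated in user space with two ordinary directories. The sketch below is an analogy only, not how aufs implements it: the class name `LazyCopyUpFile` and its methods are hypothetical, and it copies a whole file with `shutil.copy2()` where aufs works on kernel file objects and re-opens them internally.

```python
import os
import shutil


class LazyCopyUpFile:
    """Serve reads from the lower copy until the first write (toy model)."""

    def __init__(self, lower_path, upper_path):
        self.lower_path = lower_path  # file on the read-only branch
        self.upper_path = upper_path  # where the copy on the writable branch goes
        self.copied_up = os.path.exists(upper_path)

    def _current_path(self):
        return self.upper_path if self.copied_up else self.lower_path

    def read(self):
        with open(self._current_path(), "rb") as f:
            return f.read()

    def write(self, data):
        """Append data; the first write triggers the copy-up."""
        if not self.copied_up:
            parent = os.path.dirname(self.upper_path)
            if parent:
                os.makedirs(parent, exist_ok=True)
            shutil.copy2(self.lower_path, self.upper_path)
            self.copied_up = True  # later reads and writes use the upper copy
        with open(self.upper_path, "ab") as f:
            f.write(data)
```

Opening and reading never touch the writable directory in this model; only `write()` creates the upper copy, and the lower file stays unmodified.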
    • aufs: file op, open non-dir · b5bd8ebf
      J. R. Okajima authored
      
      
      Implement f_op->open() for non-directory.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xattr and acl · a76ab411
      J. R. Okajima authored
      
      
      Support for XATTR and ACL including several branch attributes to ignore
      the copy error around XATTR and ACL.
      
      NFS always sets MS_POSIXACL regardless of its mount option 'noacl'.
      When MS_POSIXACL is set, generic_permission() calls check_acl() (via
      acl_permission_check()) and gets -EOPNOTSUPP because the NFS branch is
      mounted as 'noacl'.
      In aufs, h_permission() should not call generic_permission() in this
      case.
      A similar thing happens in copying-up XATTR. vfs_getxattr_alloc()
      returns -EOPNOTSUPP.
      
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: udba=none has no ->getattr() 2/2, refresh inode_operations · e4dbdb8c
      J. R. Okajima authored
      
      
      An enhancement for udba=none if possible.
      The condition is the same as in the 'no ->d_revalidate()' patch series.
      Refresh i_op in all cached inodes at the remount-time.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: udba=none has no ->getattr() 1/2, inode_operations array · 46b85e53
      J. R. Okajima authored
      
      
      An enhancement for udba=none if possible.
      The condition is the same as in the 'no ->d_revalidate()' patch series.
      In this mode, i_op->getattr() is less important, and the default VFS
      handler is enough.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, get/set attributes · 25f10296
      J. R. Okajima authored
      
      
      Implement i_op->get/setattr().
      setattr() is another trigger of the copy-up. The file may or may not be
      opened. And some of the sub-functions are shared with the XATTR
      operations in a later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, add, link · 39391364
      J. R. Okajima authored
      
      
      Implement i_op->link().
      As aufs supports 'pseudo-link', aufs_link() can make it without
      copying-up. In the case where aufs_link() has to copy-up, the name of the
      target file is used as-is, and it is pseudo-linked. In other words,
      calling link(2) after the copy-up is unnecessary.
      
      See also struct.txt in previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: inode op, rename 2/2, body · 2902df92
      J. R. Okajima authored
      
      
      Implement i_op->rename().
      This is a big monster and I don't like it.
      
      In this version, the copy-up always happens.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>