1. 09 Mar, 2019 36 commits
    • aufs: copy-up 6/7, directories · 377c9279
      J. R. Okajima authored
      
      
      Copy-up the ancestors of the target.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
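
A rough userspace analogy of copying up ancestors may help here. It is only a sketch under stated assumptions: `cpup_ancestors`, `src_root` and `dst_root` are hypothetical names, branches are modelled as plain directory trees, and only the permission bits are carried over. The real code works on dentries inside the kernel and takes branch locks, which this sketch does not model.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: for rel = "a/b/file", make sure dst_root/a and
 * dst_root/a/b exist, taking the permission bits from the same
 * directories under src_root. The file itself is not touched.
 */
static int cpup_ancestors(const char *src_root, const char *dst_root,
                          const char *rel)
{
    char part[1024], src[2048], dst[2048];
    struct stat st;
    size_t len = strlen(rel);

    assert(rel[0] != '/');              /* rel is relative to the roots */
    if (len >= sizeof(part))
        return -1;

    for (size_t i = 1; i < len; i++) {
        if (rel[i] != '/')
            continue;
        memcpy(part, rel, i);           /* next ancestor, top-down */
        part[i] = '\0';
        if (snprintf(src, sizeof(src), "%s/%s", src_root, part)
            >= (int)sizeof(src))
            return -1;
        if (snprintf(dst, sizeof(dst), "%s/%s", dst_root, part)
            >= (int)sizeof(dst))
            return -1;

        if (stat(src, &st) < 0 || !S_ISDIR(st.st_mode))
            return -1;                  /* ancestor must exist on the source */
        if (mkdir(dst, 0700) < 0) {
            if (errno != EEXIST)
                return -1;
            continue;                   /* already copied up: leave as is */
        }
        if (chmod(dst, st.st_mode & 07777) < 0)
            return -1;
    }
    return 0;
}
```

Walking top-down means each `mkdir` finds its parent already present, so the target file can be created right after the call returns.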
    • aufs: copy-up 5/7, simple interface · 253be9ad
      J. R. Okajima authored
      
      
      Because the copy-up operation is big and has many parameters and
      functions, consolidate them in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: copy-up 4/7, body · f63b4f2f
      J. R. Okajima authored
      
      
      The functions to
      - create the copy-up target file
      - copy filedata
      - copy metadata

      For copying filedata, I tried splice_direct() instead of repeating
      read/write. Surprisingly, I could not see a big difference, so let's
      keep the read/write approach for a while. Once SEEK_DATA/SEEK_HOLE
      become more popular, they may help optimize this read/write.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
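
To illustrate how SEEK_DATA/SEEK_HOLE could help a read/write copy, here is a minimal userspace sketch. It is not the aufs implementation: `copy_range`, `copy_sparse` and `copy_sparse_path` are hypothetical names, and the code assumes a Linux-like system where `lseek(2)` accepts `SEEK_DATA` and `SEEK_HOLE`.

```c
#define _GNU_SOURCE             /* for SEEK_DATA / SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy len bytes at offset off from in to the same offset in out. */
static int copy_range(int in, int out, off_t off, off_t len)
{
    char buf[4096];

    while (len > 0) {
        size_t want = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
        ssize_t n = pread(in, buf, want, off);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* source shrank under us */
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = pwrite(out, buf + done, (size_t)(n - done),
                               off + done);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
        off += n;
        len -= n;
    }
    return 0;
}

/* Copy only the data extents; holes are never read or written. */
static int copy_sparse(int in, int out)
{
    struct stat st;
    off_t pos = 0;

    if (fstat(in, &st) < 0)
        return -1;
    while (pos < st.st_size) {
        off_t data = lseek(in, pos, SEEK_DATA);
        off_t hole;

        if (data < 0) {
            if (errno == ENXIO)
                break;                  /* nothing but a hole up to EOF */
            return -1;
        }
        hole = lseek(in, data, SEEK_HOLE);
        if (hole < 0)
            return -1;
        assert(hole > data);            /* a data extent is never empty */
        if (copy_range(in, out, data, hole - data) < 0)
            return -1;
        pos = hole;
    }
    return ftruncate(out, st.st_size);  /* keep a trailing hole, if any */
}

/* Convenience wrapper taking paths instead of descriptors. */
static int copy_sparse_path(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    int out, err;

    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }
    err = copy_sparse(in, out);
    close(out);
    close(in);
    return err;
}
```

On a filesystem that reports no holes, the loop degenerates into a single data extent, so the result is the same as a plain read/write copy.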
    • aufs: copy-up 3/7, internal file I/O · e989fe7b
      J. R. Okajima authored
      
      
      The internal file read/write for copy-up in kernelspace.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: copy-up 2/7, internal lookup for the target · 3be12838
      J. R. Okajima authored
      
      
      Basically copy-up is done by these steps using au_pin (in another
      commit).
      - lock the target parent mutex
      - lookup a negative dentry with a whiteout-ed temporary unique name
      - create it
      - unlock the target parent mutex
      - copy filedata
      - copy metadata (inode attributes)
      - lock the target parent mutex
      - rename the temporary name to the target name
      - unlock the target parent mutex
      
      This commit mainly contains step 2.
      I hope aufs will use O_TMPFILE for this someday.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
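
The create, copy and rename sequence listed above can be approximated in userspace as follows. This is a sketch under assumptions, not the kernel code: `copy_up`, `copy_data` and the `.cpup-tmp.<pid>` temporary name are hypothetical, the parent-directory locking has no direct userspace equivalent and is omitted, and only the permission bits stand in for "metadata".

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Plain read/write loop which retries on EINTR and short writes. */
static int copy_data(int in, int out)
{
    char buf[4096];
    ssize_t n;

    while ((n = read(in, buf, sizeof(buf))) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
    }
    return 0;
}

/* Copy src to dst_dir/name through a temporary name in dst_dir. */
static int copy_up(const char *src, const char *dst_dir, const char *name)
{
    char tmp[4096], dst[4096];
    struct stat st;
    int in, out, err = -1;

    assert(strchr(name, '/') == NULL);  /* name is a single component */
    snprintf(tmp, sizeof(tmp), "%s/.cpup-tmp.%ld", dst_dir, (long)getpid());
    snprintf(dst, sizeof(dst), "%s/%s", dst_dir, name);

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    if (fstat(in, &st) < 0)
        goto out_in;
    /* the temporary name must not exist yet, then create it */
    out = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (out < 0)
        goto out_in;
    if (copy_data(in, out) < 0)                     /* copy filedata */
        goto out_tmp;
    if (fchmod(out, st.st_mode & 07777) < 0)        /* copy metadata (mode only) */
        goto out_tmp;
    if (rename(tmp, dst) < 0)                       /* publish the target name */
        goto out_tmp;
    err = 0;
    goto out_close;

out_tmp:
    unlink(tmp);
out_close:
    close(out);
out_in:
    close(in);
    return err;
}
```

Because the target name appears only through the final `rename(2)`, other processes never see a half-copied file under that name.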
    • aufs: copy-up 1/7, attributes · f7f1bacc
      J. R. Okajima authored
      
      
      Copy the inode attributes between branches.
      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: pin or lock the parent dir and the child on a branch · 9f923257
      J. R. Okajima authored
      
      
      To create/delete/rename files including copy-up, aufs acquires several
      locks on the branch fs internally. These lock/unlock operations are
      consolidated into struct au_pin in this commit.
      au_pin handles
      - LOCKDEP class
      - re-validate/verify
      - suspend/resume HNOTIFY
      
      See also lookup.txt in a later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: native/real readonly branch · 629cbe32
      J. R. Okajima authored
      
      
      Some filesystems are natively readonly, and aufs can make a few
      optimizations for them. This new attribute tells aufs that the branch
      is on such a filesystem.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch 3/3, diropq · 0ba3f959
      J. R. Okajima authored
      
      
      The functions to create/delete the opaque directory marker (called
      'diropq') on the added writable branch.
      For details, refer to the previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch 2/3, body · 59ad1975
      J. R. Okajima authored
      
      
      Actually prepare the whiteout bases on the writable branch being added.
      For details, refer to the previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: writable branch 1/3, white-out · 72a970fc
      J. R. Okajima authored
      
      
      The writable branch prepares a few files and dirs for whiteouts.
      For branch filesystems which don't support link(2), there is the "nolwh"
      attribute. On a branch with this attribute, aufs never tries link(2)
      for a whiteout and always uses creat(2) instead.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
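
The choice between link(2) and creat(2) can be pictured with a small userspace sketch. It is an illustration only: `make_whiteout`, `nlink_of` and the `no_link` flag (standing in for "nolwh") are hypothetical names, and the whiteout base is modelled as an ordinary empty file on the same filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a whiteout at wh_path. Without no_link it becomes one more
 * hardlink to the pre-made base file; with no_link it is always a
 * freshly created empty file.
 */
static int make_whiteout(const char *base_path, const char *wh_path,
                         int no_link)
{
    int fd;

    assert(base_path && wh_path);
    if (!no_link)
        return link(base_path, wh_path);

    fd = open(wh_path, O_WRONLY | O_CREAT | O_EXCL, 0444);
    if (fd < 0)
        return -1;
    return close(fd);
}

/* Link count of path, or -1 on error. */
static long nlink_of(const char *path)
{
    struct stat st;

    return stat(path, &st) < 0 ? -1 : (long)st.st_nlink;
}
```

With hardlinks, every whiteout on the branch shares one inode; with `no_link`, each whiteout costs an inode of its own.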
    • aufs: 'wh' attribute for RO branch · f5f997d4
      J. R. Okajima authored
      
      
      While a whiteout on the writable branch takes effect unconditionally
      (in a later commit), one on the readonly branch takes effect only when
      this attribute is specified explicitly.
      For the branch attributes, refer to the manual in aufs-util.git.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: enter/leave flag per task · 32bfe3d0
      J. R. Okajima authored
      
      
      When freeing aufs iinfo objects, aufs acquires the internal rw_sem (see
      another commit for details). Since iinfo can be freed anytime, a
      deadlock may happen due to the rw_sem. To prevent this problem, this
      commit introduces a flag per task.
      This is another (very) ugly approach which I don't like.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: pseudo-link and procfs support · c7ae8357
      J. R. Okajima authored
      
      
      Aufs pseudo-link (plink) represents a virtual hardlink across the
      branches. To implement the plink maintenance mode, aufs uses procfs.
      See also the document in this commit.
      
      There is an external user-space utility called 'auplink' in
      aufs-util.git, which has these features:
      - 'list' shows the pseudo-linked inode numbers and filenames.
      - 'cpup' copies up all pseudo-links to the writable branch.
      - 'flush' calls 'cpup', and then 'mount -o remount,clean_plink=inum'
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: sbinfo list · a35f55f4
      J. R. Okajima authored
      
      
      When a user accesses aufs by means other than fs-related system calls,
      aufs needs to identify which superblock is the target.  Here is the
      trick: it is just a list of aufs superblocks.  Such means will be procfs
      and the MagicSysRq key.  For MagicSysRq support, see the later commit.
      This is a dirty approach which I don't like, but I just don't have
      another idea.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: white-out 2/2, diropq · 6be3b122
      J. R. Okajima authored
      
      
      The marker to represent that the directory is opaque (stop digging
      down the branch stack) is implemented as a special whiteout.
      
      See also the document in the previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
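
How a whiteout and an opaque marker cut off a lookup can be shown with a tiny in-memory model. Everything here is an assumption made for illustration: the commit message does not spell out the marker names, so `WH_PFX` and `WH_DIROPQ` are placeholder values, and `struct branch` and `lookup` are hypothetical. Each branch is reduced to the list of names in one directory, ordered from the top of the stack down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define WH_PFX    ".wh."            /* assumed whiteout name prefix */
#define WH_DIROPQ ".wh..wh..opq"    /* assumed opaque-directory marker */

/* The entries of one directory on one branch. */
struct branch {
    const char **names;
    size_t n;
};

static int has_name(const struct branch *br, const char *name)
{
    for (size_t i = 0; i < br->n; i++)
        if (strcmp(br->names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Return the index of the branch which provides name, or -1 when the
 * name does not exist in the merged view.
 */
static int lookup(const struct branch *stack, size_t nbr, const char *name)
{
    char wh[256];

    assert(name != NULL);
    if (snprintf(wh, sizeof(wh), WH_PFX "%s", name) >= (int)sizeof(wh))
        return -1;

    for (size_t i = 0; i < nbr; i++) {
        if (has_name(&stack[i], wh))
            return -1;              /* logically deleted at this level */
        if (has_name(&stack[i], name))
            return (int)i;
        if (has_name(&stack[i], WH_DIROPQ))
            return -1;              /* opaque: stop digging down */
    }
    return -1;
}
```

A whiteout hides one name on the lower branches, while the opaque marker hides every lower entry of that directory at once.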
    • aufs: white-out 1/2 · 2a7e7277
      J. R. Okajima authored
      
      
      The whiteout represents a logical deletion.
      Although the document in this commit mentions rmdir(2) and rename(2)
      for directories, this commit doesn't contain such functions. They will
      be added in later commits.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xino truncation · 2a150b32
      J. R. Okajima authored
      
      
      As mentioned earlier, sometimes the size of the XINO file is a problem.
      Aufs has a feature to truncate it asynchronously using a workqueue. But
      it may not be so effective in some cases, and you may want to stop the
      discontiguous distribution of the inode numbers on the branch fs.
      See also the log in another commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: sysfs interface · a3adef77
      J. R. Okajima authored
      
      
      The branch path can be very long, which makes it unsuitable to print
      via /proc/mounts as a part of the mount options. Aufs can show it either
      separately via sysfs or via /proc/mounts (as a part of the mount
      options). This approach affects the lifetime of aufs objects, and
      sbinfo contains a kobject (in another commit). Theoretically a user can
      disable CONFIG_SYSFS, but the lifetime management is always necessary.
      So supporting sysfs is split into two files, sysaufs.c and sysfs.c.
      sysaufs.c is always compiled, but sysfs.c is compiled only when
      CONFIG_SYSFS is enabled.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xino 2/2, callers · 8fe49c5d
      J. R. Okajima authored
      
      
      XINO and XIB files are read and written frequently after being
      unlinked, which means that remote filesystems are not suitable for
      them. Additionally aufs shows their metadata via debugfs (in a later
      commit). To make this easier, aufs expects branch filesystems to
      maintain their i_size and i_blocks, which means some filesystems are
      not suitable for XINO.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: xino 1/2, core · 6fe05098
      J. R. Okajima authored
      
      
      XINO and XIB files are used to maintain the inode numbers in aufs
      (cf. struct.txt and aufs manual in aufs-util.git).

      The XINO file contains just a sequence of inode numbers, and the
      offset of each in the file is real_inum x sizeof(inum).  So the size is
      limited by s_maxbytes of the filesystem where the XINO file is located.
      In order to support larger inums, aufs stores XINO files as an internal
      array.

      Sometimes the size of the XINO file can be a problem, i.e. too big,
      particularly when XINO files are located on tmpfs. In this case, another
      separate patch tmpfs-ino.patch in aufs4-standalone.git is recommended
      (as well as vfs-ino.patch). The patch makes tmpfs maintain inode
      numbers within itself and suppresses their discontiguous distribution.

      See also the document in the next commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
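
The offset rule real_inum x sizeof(inum) can be tried out in userspace with a sparse file. This is a sketch under assumptions: `xino_ent_t` is taken to be a 64-bit integer, zero is treated as "no entry", and `xino_create`, `xino_offset`, `xino_write` and `xino_read` are hypothetical helpers rather than aufs functions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed entry type: one aufs inode number per branch inode number. */
typedef uint64_t xino_ent_t;

/* Open (or create and empty) the table file. */
static int xino_create(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/* Byte offset of the entry which belongs to a branch inode number. */
static off_t xino_offset(uint64_t real_inum)
{
    return (off_t)(real_inum * sizeof(xino_ent_t));
}

static int xino_write(int fd, uint64_t real_inum, xino_ent_t aufs_inum)
{
    ssize_t n = pwrite(fd, &aufs_inum, sizeof(aufs_inum),
                       xino_offset(real_inum));

    return n == (ssize_t)sizeof(aufs_inum) ? 0 : -1;
}

/* Holes and offsets beyond EOF both read back as 0, i.e. "no entry". */
static int xino_read(int fd, uint64_t real_inum, xino_ent_t *aufs_inum)
{
    ssize_t n = pread(fd, aufs_inum, sizeof(*aufs_inum),
                      xino_offset(real_inum));

    if (n == 0) {
        *aufs_inum = 0;
        return 0;
    }
    return n == (ssize_t)sizeof(*aufs_inum) ? 0 : -1;
}
```

Because the file size follows the largest branch inode number ever stored, widely scattered inode numbers make the file large even when it holds only a few entries.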
    • aufs: workqueue · f04356cb
      J. R. Okajima authored
      
      
      Aufs uses the workqueue both synchronously and asynchronously.
      For the sync use case, aufs uses its own specific wkq since it doesn't
      want to be disturbed by other tasks on the system. For the async use
      case, aufs uses the system global workqueue.
      Aufs has to prevent itself from being unmounted while an async task is
      queued.

      See also the document in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: debug print sbinfo and branch · 2460d8e6
      J. R. Okajima authored
      
      
      Print various info about aufs branch and superblock. This feature is
      enabled when CONFIG_AUFS_DEBUG and the module parameter 'debug' are set.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: readonly branch 2/2, callers · 66bc346d
      J. R. Okajima authored
      
      
      For details, see the previous commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: readonly branch 1/2, definition · b7051459
      J. R. Okajima authored
      
      
      The branch object is managed by the sbinfo object as an element of its
      internal array. The iinfo and dinfo objects contain the branch id, and
      it will be used to implement the correct order in branch management
      (add/del).
      
      See also the documents in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: infos generation 2/2 · f2ade007
      J. R. Okajima authored
      
      
      The generations of iinfo and dinfo are inherited from sbinfo's.
      Also the iinfo generation tracks the branch inode's generation to test
      whether they still match after the branch management.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: infos generation 1/2, dcsub · fa25dbe6
      J. R. Okajima authored
      
      
      The generation of aufs objects will be updated by the branch
      management (add/del branches, in a later commit), and aufs will refresh
      the objects based upon the generation.
      In refreshing dinfos, aufs will find all ancestors of the given dentry
      and store the pointers in dynamically allocated memory. I don't think it
      is beautiful, but I don't have any other idea.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
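
The generation scheme boils down to comparing two counters. A minimal sketch, with hypothetical names throughout (`struct sb_info`, `struct obj_info`, `branch_changed`, `needs_refresh`, `refresh`) and with the actual re-lookup on the branches reduced to a counter:

```c
#include <assert.h>

/* Per-superblock info: bumped whenever a branch is added or deleted. */
struct sb_info {
    unsigned int generation;
};

/* Per-inode or per-dentry info: remembers the generation it was built for. */
struct obj_info {
    unsigned int generation;
    int refreshed;          /* how many times the object was rebuilt */
};

static void branch_changed(struct sb_info *sb)
{
    sb->generation++;
}

static int needs_refresh(const struct sb_info *sb, const struct obj_info *o)
{
    return o->generation != sb->generation;
}

/* Rebuild the object lazily, only if it is stale. */
static void refresh(const struct sb_info *sb, struct obj_info *o)
{
    assert(sb && o);
    if (!needs_refresh(sb, o))
        return;
    /* the real work (looking the object up again) would go here */
    o->refreshed++;
    o->generation = sb->generation;
}
```

Branch management itself stays cheap, since it only bumps one counter; the cost of refreshing is paid later, per object, when the object is next used.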
    • aufs: sbinfo core · 3742849b
      J. R. Okajima authored
      
      
      The structure is very similar to inode and dentry infos (in previous
      commits), but the internal array is for 'struct au_branch' instead of
      'superblock.' Additionally the lifetime of 'struct au_sbinfo' is managed
      by a kobject since it will be connected to sysfs in a later commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: dinfo core · ee166183
      J. R. Okajima authored
      
      
      The structure is very similar to aufs inode info (in previous commit).
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: debug print · 786c90cb
      J. R. Okajima authored
      
      
      Print various info about aufs inode. This feature is enabled when
      CONFIG_AUFS_DEBUG and the module parameter 'debug' are set.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: module parameter, debug · 6011ea2b
      J. R. Okajima authored
      
      
      This parameter is available only when CONFIG_AUFS_DEBUG is enabled, and
      its default value is 0. Setting it to 1 enables the various
      verifications and debug outputs.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: iinfo, debug by rwsem · c74884be
      J. R. Okajima authored
      
      
      This is a very old debugging routine for rw_semaphore which I was using
      privately, and it is less meaningful to other people. It was written
      (probably) before the LOCKDEP feature was introduced, but now it is
      based upon LOCKDEP. This is compiled when CONFIG_AUFS_DEBUG is enabled.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: iinfo core · a3d2caf2
      J. R. Okajima authored
      
      
      See the documents in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: kmalloc/kfree wrappers · c7926831
      J. R. Okajima authored
      
      
      Very basic, simple and stupid wrappers for the kmalloc family.
      They use kfree_rcu() where possible, hoping for better performance.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
    • aufs: intro, public header · 78ac643d
      J. R. Okajima authored
      
      
      A header file for both kernelspace and userspace.

      In the new file fs/aufs/Kconfig, the maximum number of branches is
      customizable, and it determines the type (size) of 'aufs_bindex_t.' The
      type is always 'signed.' If we made it 'unsigned,' then more branches
      would be available. But generally I think 127 (default) is enough and it
      won't be a big issue.

      For those who want more than 127 branches, other values are
      available. But we should care about the size of the internal pointer
      arrays, and it is good for performance to keep each within a page at
      most. AUFS_BRANCH_MAX_511 is mainly for 64bit systems, and it limits
      the internal array size to less than 4k (511 x 8 bytes < 4k). Similarly
      for 32bit systems, there is AUFS_BRANCH_MAX_1023 (1023 x 4 bytes < 4k).

      See also the documents in this commit.
      Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
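
The arithmetic behind these limits can be checked with a few lines of C. The sketch assumes a 4 KiB page and an 8-bit index type for the default limit; `PAGE_BYTES`, `branch_array_bytes`, `fits_in_page` and `max_index` are hypothetical names, not taken from the aufs headers.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u    /* assumption: 4 KiB pages */

/* Size of an array holding one pointer per branch. */
static size_t branch_array_bytes(size_t nbranch, size_t ptr_size)
{
    return nbranch * ptr_size;
}

/* True when the pointer array stays strictly below one page. */
static int fits_in_page(size_t nbranch, size_t ptr_size)
{
    return branch_array_bytes(nbranch, ptr_size) < PAGE_BYTES;
}

/* Largest value of an index type with the given width and signedness. */
static unsigned long max_index(unsigned int bits, int is_signed)
{
    assert(bits >= 2 && bits < 8 * sizeof(unsigned long));
    return (1UL << (bits - (is_signed ? 1u : 0u))) - 1;
}
```

For example, `fits_in_page(511, 8)` holds because 511 x 8 = 4088, and `max_index(8, 1)` gives the default limit of 127, while an unsigned 8-bit index would reach 255.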
    • aufs5.0/00base begins · 187bcc89
      J. R. Okajima authored