- 18 Jun, 2020 1 commit
-
-
Mauricio Faria de Oliveira authored
The 'struct inode.i_readcount' field is maintained at the VFS, and should not be modified by filesystems. But aufs does so in one place, which causes the counter to become unbalanced.

This started with Linux v2.6.39 commit 890275b5eb79 ("IMA: maintain i_readcount in the VFS layer"), which moved the i_readcount updates from IMA into the VFS (at the same places IMA was called previously) and introduced 'mutex_lock(i_mutex)' in the ima_file_check() path.

The former change is functionally equivalent, thus no changes are needed in response to it. The latter change, on the other hand, is _not_; it is reported to cause a deadlock in aufs (see below), so aufs dropped the call to ima_file_check().

However, when dropping the ima_file_check() call, aufs introduced the i_readcount_inc() call as well, which, judging by the changes in that commit, is not necessary.

This can be observed in aufs2-standalone.git commit 1dbd1c864e455 ("aufs2.1 standalone version for linux-2.6."), announced to the aufs-users mailing list on 2011-04-04 [1].

    diff --git a/ChangeLog b/ChangeLog
    ...
    +commit 17eac367b03334e57a93e8051eb712add24d2534
    +Author: J. R. Okajima <hooanon05@yahoo.co.jp>
    +Date:   Fri Apr 1 16:31:22 2011 +0900
    +
    +    aufs: for 2.6.39, limit the support for IMA
    +
    +    Since it acquires i_mutex and causes a deadlock, replace a
    +    ima_file_check() call by i_readcount_inc().
    +
    +    Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
    ...
    diff --git a/fs/aufs/vfsub.c b/fs/aufs/vfsub.c
    ...
     struct file *vfsub_dentry_open(struct path *path, int flags)
    ...
    +       if (!IS_ERR_OR_NULL(file)
    +           && (file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
    +               i_readcount_inc(path->dentry->d_inode);
    -       err = ima_file_check(file, au_conv_oflags(flags));
    ...

Apparently, this might have been a misunderstanding of one hunk in the 2.6.39 commit, which deletes the lines that increment i_readcount and adds the lines that acquire i_mutex. It reuses code from the removed function ima_counts_get() to create ima_rdwr_violation_check(), and another hunk calls the new function from ima_file_check(). But note that the i_readcount increment was _not_ called from ima_file_check() previously, via ima_counts_get():

    -void ima_counts_get(struct file *file)
    +static void ima_rdwr_violation_check(struct file *file)
     {
    ...
    +       mutex_lock(&inode->i_mutex);    /* file metadata: permissions, xattr */
    ...
    -       atomic_inc(&inode->i_readcount);

    @@ -318,6 +308,7 @@ int ima_file_check(struct file *file, int mask)
    ...
    +       ima_rdwr_violation_check(file);

So, in order to avoid the imbalance caused to i_readcount, drop the i_readcount_inc() call.

Note the issue is not the lack of a corresponding i_readcount_dec() call; it is the mere usage of these functions outside of the VFS layer, where i_readcount is maintained.

Links:

[1] https://sourceforge.net/p/aufs/mailman/message/27304125/

    snippet:
    """
    aufs2 Monday GIT release
    From: <sfjro@us...> - 2011-04-04 04:59:18

    o news
    - begin supporting linux-2.6.39-rcN.
    ...
    - aufs2-2.6.git#aufs2.1 branch
    ...
      aufs: for 2.6.39, limit the support for IMA
    ...
    """

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
(cherry picked from commit 515a586eeef31e0717d5dea21e2c11a965340b3c)
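The counter drift described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_inode`, `vfs_open()`, `vfs_close()` and `fs_open_with_extra_inc()` are hypothetical names invented for the illustration, not kernel functions, and the model assumes the VFS does exactly one increment per read-only open and one decrement on the matching release.

```c
#include <stdbool.h>

/* Toy stand-in for struct inode; only the counter is modeled. */
struct toy_inode {
    int i_readcount;
};

/* Model of the VFS side: one increment per read-only open... */
static void vfs_open(struct toy_inode *inode, bool read_only)
{
    if (read_only)
        inode->i_readcount++;
}

/* ...and one matching decrement when the file is released. */
static void vfs_close(struct toy_inode *inode, bool read_only)
{
    if (read_only)
        inode->i_readcount--;
}

/* Hypothetical filesystem open path that also bumps the counter itself. */
static void fs_open_with_extra_inc(struct toy_inode *inode, bool read_only)
{
    vfs_open(inode, read_only);
    if (read_only)
        inode->i_readcount++;   /* extra increment outside the "VFS" */
}

/* Returns the counter value left behind after one open/close cycle. */
static int leftover_after_cycle(bool extra_inc)
{
    struct toy_inode inode = { .i_readcount = 0 };

    if (extra_inc)
        fs_open_with_extra_inc(&inode, true);
    else
        vfs_open(&inode, true);
    vfs_close(&inode, true);
    return inode.i_readcount;
}
```

In this model `leftover_after_cycle(false)` returns to zero while `leftover_after_cycle(true)` leaves one reader behind per cycle. As the commit message stresses, the remedy is to remove the filesystem-side call, not to pair it with a decrement.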
-
- 22 Jan, 2020 1 commit
-
-
J. R. Okajima authored
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com> (cherry picked from commit 56b4b776c84f364384971c4cdfd66b4a2f0696d4)
-
- 09 Mar, 2019 20 commits
-
-
J. R. Okajima authored
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Fuse doesn't want the callers to access the inode attributes without issuing stat, and it is not assured that they are valid after lookup or iget(). The inode attribute is critical for aufs, and aufs decided to call stat every time for fuse. Of course, it makes aufs slow. But when the branch fs is not fuse, stat is not called. Currently, only FUSE implements ->poll(), and aufs supports it. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Following the design in another commit, aufs calls the branch fs's ->atomic_open() if it exists. Ideally it would be better to call VFS:do_last, lookup_open() or atomic_open, but that is very hard for aufs. This implementation is far from the best. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Implement several f_op functions for non-dir. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Support for XATTR and ACL, including several branch attributes to ignore the copy error around XATTR and ACL. NFS always sets MS_POSIXACL regardless of its mount option 'noacl'. When MS_POSIXACL is set, generic_permission() calls check_acl() (via acl_permission_check()) and gets -EOPNOTSUPP because the NFS branch is mounted as 'noacl'. In aufs, h_permission() should not call generic_permission() in this case. A similar thing happens when copying-up XATTR: vfs_getxattr_alloc() returns -EOPNOTSUPP. See also the document in this commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Implement i_op->rename(). This is a big monster and I don't like it. In order to call d_move() in the aufs lock section, FS_RENAME_DOES_D_MOVE is set in fstype.fs_flags. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Implement basic sb_op->show_options(), statfs() and sync_fs() simply.
- show_options() doesn't print the default values (AuOpt_Def).
- statfs() will have an option to summarize the numbers from branches (in later commit).
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
This is the hardest test to support UDBA (users' direct branch access). It uses 'fsnotify' internally. On detecting UDBA, aufs decrements the generation of the cached aufs objects. On the next access to the file, aufs detects that the generation is obsolete and tries refreshing it. Eventually the aufs cache will be updated to the latest status. The fsnotify is set on the cached dirs on the non-RR branches. The RR (real readonly) branches will never be modified, so it is unnecessary to set fsnotify for them. This commit is mainly for the declarations, and the body parts will be in succeeding commits. This feature is compiled only when CONFIG_AUFS_HNOTIFY is enabled. See also the document in this commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
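The generation scheme described above can be sketched in userspace C. This is a minimal model, not aufs code: `toy_sb`, `toy_obj`, `notify_udba()`, `obj_is_stale()` and `access_obj()` are hypothetical names, and comparing the object's generation against a superblock-wide generation is an assumption made for the illustration.

```c
/* Hypothetical superblock-wide generation that cached objects are compared to. */
struct toy_sb {
    unsigned int sigen;
};

/* Hypothetical cached object carrying its own generation. */
struct toy_obj {
    unsigned int gen;
    int cached_value;
};

/* Called when a direct branch access is detected: make the object stale. */
static void notify_udba(struct toy_obj *obj)
{
    obj->gen--;
}

/* An object is stale when its generation no longer matches the superblock's. */
static int obj_is_stale(const struct toy_sb *sb, const struct toy_obj *obj)
{
    return obj->gen != sb->sigen;
}

/*
 * Access path: refresh lazily, only when the object is found stale.
 * Returns 1 if a refresh happened, 0 if the cache was still valid.
 */
static int access_obj(const struct toy_sb *sb, struct toy_obj *obj,
                      int fresh_value)
{
    if (!obj_is_stale(sb, obj))
        return 0;
    obj->cached_value = fresh_value;   /* re-read from the branch */
    obj->gen = sb->sigen;              /* mark as up to date again */
    return 1;
}
```

The point of the design is that the notification handler does almost no work; the cost of refreshing is deferred to the next access of that particular object.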
-
J. R. Okajima authored
Implement exporting via NFS. The file handle is rather large (40 bytes at most + the file handle on a branch). The non-virtual filesystems can use an anonymous (disconnected) dentry as long as the inode is identified, but aufs needs a dentry with dinfo, which is usually constructed. So aufs has to find or generate the normal dentry from the file handle in decoding. I.e., in aufs there should never be an anonymous dentry.
In decoding the file handle, if both the dentry and the inode corresponding to the file handle are still in cache, then they are returned immediately. Otherwise aufs has to find the cached parent dir from the file handle. If the parent dir is not cached either, aufs tries these steps.
- decode the branch fs's file handle and get the parent dir
- generate the path of the parent dir on the branch
- convert the branch path to aufs's path
- lookup the inode number under the aufs's path
The last one is the slowest case.
exportfs_decode_fh() (actually reconnect_path()) acquires a mutex, and this behaviour violates the locking order with respect to aufs's si_rwsem. This is not a problem since the internal exportfs_decode_fh() is called for the branch fs. Simply use lockdep_off/on to silence the lockdep message.
See also the document in later commit. This is compiled only when CONFIG_AUFS_EXPORT is enabled. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Aufs can have multiple writable branches, and there are several policies to select one among them. This commit implements default "top-down-parent" for both of creating-policy and copyup-policy. See also the document in this commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
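A branch-selection policy of this kind can be sketched as follows. The commit message does not spell out the rule, so this assumes "top-down-parent" means picking the topmost writable branch on which the parent directory already exists; `toy_branch` and `select_top_down_parent()` are hypothetical names, and index 0 is taken to be the topmost branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-branch state relevant to the selection. */
struct toy_branch {
    bool writable;         /* branch is mounted read-write */
    bool has_parent_dir;   /* parent dir of the target exists on this branch */
};

/*
 * Walk the branches from the top (index 0) downwards and return the index
 * of the first writable branch that already holds the parent directory,
 * or -1 when no branch qualifies.
 */
static int select_top_down_parent(const struct toy_branch *br, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (br[i].writable && br[i].has_parent_dir)
            return (int)i;
    }
    return -1;
}
```

What happens in the -1 case (for example, copying up the parent directory first) is outside the scope of this sketch.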
-
J. R. Okajima authored
The functions for
- create the copy-up target file
- copy filedata
- copy metadata
In copying filedata, I had tried splice_direct() instead of repeating read/write. Surprisingly, I could not see a big difference. So let's keep this approach for a while. If SEEK_DATA/SEEK_HOLE become more popular someday, they may help optimize this read/write. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
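The "repeating read/write" approach mentioned above has a familiar userspace analogue. The following is only that analogue, using POSIX read(2) and write(2) on file descriptors; it is not the in-kernel aufs code, and `copy_by_read_write()` is a hypothetical name.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything from descriptor 'in' to descriptor 'out' by repeating
 * read/write. Returns the number of bytes copied, or -1 on error.
 */
static long copy_by_read_write(int in, int out)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t nread = read(in, buf, sizeof(buf));

        if (nread == 0)
            return total;            /* end of input */
        if (nread < 0) {
            if (errno == EINTR)
                continue;            /* interrupted, retry the read */
            return -1;
        }

        /* write(2) may be short, so loop until the whole chunk is out. */
        ssize_t done = 0;
        while (done < nread) {
            ssize_t nwritten = write(out, buf + done,
                                     (size_t)(nread - done));
            if (nwritten < 0) {
                if (errno == EINTR)
                    continue;        /* interrupted, retry the write */
                return -1;
            }
            done += nwritten;
        }
        total += nread;
    }
}
```

A hole-aware variant would use lseek(2) with SEEK_DATA/SEEK_HOLE to skip unallocated ranges instead of reading and writing zeros, which is the optimization the commit message anticipates.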
-
J. R. Okajima authored
The internal file read/write for copy-up in kernelspace. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Copy the inode attributes between branches. See also the document in this commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
To create/delete/rename files including copy-up, aufs acquires several locks on the branch fs internally. These lock/unlock operations are consolidated into struct au_pin in this commit. au_pin handles
- LOCKDEP class
- re-validate/verify
- suspend/resume HNOTIFY
See also lookup.txt in later commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Actually prepare the whiteout bases on the writable branch being added. For details, refer to the previous commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
The writable branch prepares a few files and dirs for whiteouts. For branch filesystems which don't support link(2), there is the "nolwh" attribute. On a branch for which this attribute is specified, aufs never tries link(2) for a whiteout and always uses creat(2) instead. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
Aufs pseudo-link (plink) represents a virtual hardlink across the branches. To implement the plink maintenance mode, aufs uses procfs. See also the document in this commit. There is an external user-space utility called 'auplink' in aufs-util.git, which has these features.
- 'list' shows the pseudo-linked inode numbers and filenames.
- 'cpup' copies-up all pseudo-links to the writable branch.
- 'flush' calls 'cpup', and then 'mount -o remount,clean_plink=inum'
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
XINO and XIB files are read and written frequently after being unlinked, which means that remote filesystems are not suitable for them. Additionally aufs shows their metadata via debugfs (in later commit). To make this easier, aufs expects branch filesystems to maintain their i_size and i_blocks. This means some filesystems are not suitable for XINO. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-
J. R. Okajima authored
XINO and XIB files are used to maintain the inode numbers in aufs (cf. struct.txt and the aufs manual in aufs-util.git). The XINO file contains just a sequence of the inode numbers, and their offset in the file is real_inum x sizeof(inum). So the size is limited by s_maxbytes of the filesystem where the XINO file is located. In order to support larger inums, aufs stores XINO files as an internal array. Sometimes the size of the XINO file can be a problem, i.e. too big, particularly when XINO files are located on tmpfs. In this case, another separate patch, tmpfs-ino.patch in aufs4-standalone.git, is recommended (as well as vfs-ino.patch). The patch makes tmpfs maintain inode numbers within itself and suppresses their discontiguous distribution. See also the document in next commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
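The offset rule stated above (real_inum x sizeof(inum)) and the resulting size limit can be worked through in a few lines of C. This is a sketch, not aufs code: `toy_ino_t`, `xino_offset()` and `xino_max_inum()` are hypothetical names, a 64-bit inode number is assumed, and `max_bytes` stands in for s_maxbytes.

```c
#include <stdint.h>

/* Assumed width of an inode number slot in the table. */
typedef uint64_t toy_ino_t;

/* Byte offset of the slot that belongs to the branch's real inode number. */
static uint64_t xino_offset(toy_ino_t real_inum)
{
    return real_inum * sizeof(toy_ino_t);
}

/*
 * Largest real inode number whose slot still fits in a file limited to
 * max_bytes. The caller must pass max_bytes >= sizeof(toy_ino_t).
 */
static toy_ino_t xino_max_inum(uint64_t max_bytes)
{
    return max_bytes / sizeof(toy_ino_t) - 1;
}
```

Because the file is indexed directly by the real inode number, a single very large inode number makes the file (sparsely) large, which is why a compact, contiguous inode numbering on the backing filesystem keeps the XINO file small.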
-
J. R. Okajima authored
For details, see previous commit. Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
-