# Copyright (C) 2005-2020 Junjiro R. Okajima
# 
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

Lookup in a Branch
----------------------------------------------------------------------
Since aufs behaves as a sub-VFS (see Introduction), it performs lookups
on branches as VFS does. This may be heavy work. But almost all lookup
operations in aufs are the simplest case, i.e. looking up only an entry
directly connected to its parent. Digging down the directory hierarchy
is unnecessary. VFS has a function lookup_one_len() for that use, and
aufs calls it.
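
The single-component lookup described above can be modelled in user
space. The following is a minimal Python sketch, not aufs code:
lookup_one() and lookup_in_branches() are hypothetical names, ordinary
directories stand in for the parent dir on each branch, and whiteouts
and opaque dirs are ignored. The sketch refuses any name that is not a
single path component, so a lookup never walks more than one level.

```python
import os
import tempfile

def lookup_one(parent_dir, name):
    """Look up one entry directly under parent_dir, without path walking."""
    if not name or name in (".", "..") or "/" in name:
        raise ValueError("single path component expected: %r" % name)
    try:
        return os.lstat(os.path.join(parent_dir, name))
    except FileNotFoundError:
        return None  # a negative result: no such entry on this branch

def lookup_in_branches(branch_parents, name):
    """Return (branch index, stat) for each branch holding name, highest first."""
    found = []
    for index, parent_dir in enumerate(branch_parents):
        st = lookup_one(parent_dir, name)
        if st is not None:
            found.append((index, st))
    return found

def demo():
    # Hypothetical layout: "dirA" exists only on the lower (second) branch.
    with tempfile.TemporaryDirectory() as rw, tempfile.TemporaryDirectory() as ro:
        os.mkdir(os.path.join(ro, "dirA"))
        return [index for index, _ in lookup_in_branches([rw, ro], "dirA")]
```

Calling demo() reports on which branch indexes "dirA" was found. Each
branch costs exactly one lookup of one name, which is why the common
case stays cheap.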

When a branch is a remote filesystem, aufs basically relies upon its
->d_revalidate(), and aufs also forces the strictest revalidate tests
for such branches.
For d_revalidate, aufs implements three levels of revalidate tests. See
"Revalidate Dentry and UDBA" for details.


Test Only the Highest One for the Directory Permission (dirperm1 option)
----------------------------------------------------------------------
Let's try a case study.
- aufs has two branches, upper readwrite and lower readonly.
  /au = /rw + /ro
- "dirA" exists under /ro, but not under /rw, and its mode is 0700.
- a user invokes "chmod a+rx /au/dirA"
- the internal copy-up is activated, "/rw/dirA" is created, and its
  permission bits are set to world readable.
- then does "/au/dirA" become world readable?

In this case, /ro/dirA is still 0700 since it exists on a readonly
branch, or it may be on a natively readonly filesystem. If aufs respects
the lower branch, it should not respond to readdir requests from other
users. But the user allowed it by chmod. Should aufs really refuse to
show the entries under /ro/dirA?

To be honest, I don't have a good solution for this case. So aufs
implements the 'dirperm1' and 'nodirperm1' mount options, and leaves it
to users.
When dirperm1 is specified, aufs checks only the highest dir for the
directory permission, and shows the entries. Otherwise, as usual, it
checks every dir existing on all branches and rejects the request.

As a side effect, the dirperm1 option improves the performance of aufs
because the number of permission checks is reduced when there are many
branches.
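
The case study can be replayed with a small user-space model. This is a
sketch under stated assumptions, not the aufs implementation:
may_readdir() and other_may_read() are hypothetical names, only the
"other" read bit is examined, and the modes are listed highest branch
first.

```python
import stat

def other_may_read(mode):
    """True if the 'other' class has read permission on a dir with this mode."""
    return bool(mode & stat.S_IROTH)

def may_readdir(branch_modes, dirperm1):
    """Decide a readdir request from an unrelated user.

    branch_modes: modes of the same dir on each branch, highest branch first.
    Returns (allowed, number_of_permission_checks).
    """
    # With dirperm1 only the highest dir is tested; otherwise every dir is.
    to_check = branch_modes[:1] if dirperm1 else branch_modes
    checks = 0
    for mode in to_check:
        checks += 1
        if not other_may_read(mode):
            return False, checks
    return True, checks

# After "chmod a+rx /au/dirA": /rw/dirA is 0755, /ro/dirA is still 0700.
modes = [0o755, 0o700]
with_dirperm1 = may_readdir(modes, dirperm1=True)
without_dirperm1 = may_readdir(modes, dirperm1=False)
```

In this model the dirperm1 result needs a single check no matter how
many branches there are, while the default walks down until a dir
refuses the request.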


Revalidate Dentry and UDBA (User's Direct Branch Access)
----------------------------------------------------------------------
Generally VFS helpers re-validate a dentry as a part of lookup.
0. dig down the directory hierarchy.
1. lock the parent dir by its i_mutex.
2. look up the final (child) entry.
3. revalidate it.
4. call the actual operation (create, unlink, etc.)
5. unlock the parent dir.

If the filesystem implements its ->d_revalidate() (step 3), then it is
called. Actually aufs implements it and checks that the dentry on a
branch is still valid.
But it is not enough, because aufs has to release the lock for the
parent dir on a branch at the end of ->lookup() (step 2) and
->d_revalidate() (step 3) while the i_mutex of the aufs dir is still
held by VFS.
If the file on a branch is changed directly, e.g. bypassing aufs, after
aufs released the lock, then the subsequent operation may cause an
unpleasant result.

This situation is a result of the VFS architecture, in which ->lookup()
and ->d_revalidate() are separated. But I would never say it is wrong.
It is a good design from VFS's point of view. It is just not suitable
for the sub-VFS character of aufs.

Aufs handles such cases with three levels of revalidation, selectable
by the user.
1. Simple Revalidate
   In addition to the native VFS flow, aufs confirms the child-parent
   relationship on the branch just after locking the parent dir on the
   branch in the "actual operation" (step 4). When this validation
   fails, aufs returns EBUSY. ->d_revalidate() (step 3) in aufs still
   checks the validity of the dentry on branches.
2. Monitor Changes Internally by Inotify/Fsnotify
   In addition to the above, in the "actual operation" (step 4) aufs
   looks up the dentry on the branch again, and returns EBUSY if it
   finds a different dentry.
   Additionally, aufs sets an inotify/fsnotify watch on every dir on
   branches while it is in cache. When an event is notified, aufs
   registers a function to the kernel 'events' thread by
   schedule_work(). The function sets a special status on the cached
   aufs dentry and inode private data. If they are not cached, then
   aufs has nothing to do. When the same file is accessed through aufs
   (steps 0-3) later, aufs will detect the status and refresh all
   necessary data.
   In this mode, aufs has to ignore the events which are fired by aufs
   itself.
3. No Extra Validation
   This is the simplest level. It adds no revalidation test, and skips
   the revalidation in step 4. It is useful and improves aufs
   performance when the system reliably hides the aufs branches from
   users, by over-mounting something (or by another method).
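
The step 4 checks of the three levels can be compared in a toy
user-space model. Everything here is an assumption made for
illustration: Level, Dir, Entry, verify(), direct_rename() and
direct_replace() are hypothetical names, plain objects stand in for the
dir and dentry on a branch, and the notification part of level 2 (the
watch, the scheduled work and the status flags) is left out.

```python
import errno
from enum import Enum

class Level(Enum):
    NONE = "no extra validation"
    SIMPLE = "simple revalidate"
    RELOOKUP = "look up again, as done in the monitoring level"

class Dir:
    def __init__(self):
        self.entries = {}  # name -> Entry

class Entry:
    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        parent.entries[name] = self

def verify(level, parent, child):
    """Model of the step 4 check, made after locking parent on the branch.

    Returns 0 when the operation may proceed, errno.EBUSY otherwise.
    """
    if level is Level.NONE:
        return 0  # the revalidation in step 4 is skipped
    if child.parent is not parent:
        return errno.EBUSY  # the child-parent relationship is broken
    if level is Level.RELOOKUP and parent.entries.get(child.name) is not child:
        return errno.EBUSY  # the second lookup found a different entry
    return 0

def direct_rename(child, new_parent):
    """Move an entry to another dir on the branch, bypassing aufs."""
    del child.parent.entries[child.name]
    child.parent = new_parent
    new_parent.entries[child.name] = child

def direct_replace(child):
    """Unlink an entry and create a new one with the same name, bypassing aufs."""
    parent = child.parent
    del parent.entries[child.name]
    return Entry(child.name, parent)

# An entry cached by a lookup through aufs, then moved away directly.
d1, d2 = Dir(), Dir()
f = Entry("f", d1)
before = verify(Level.SIMPLE, d1, f)
direct_rename(f, d2)
after_simple = verify(Level.SIMPLE, d1, f)
after_none = verify(Level.NONE, d1, f)

# An entry replaced directly by a new one of the same name.
g = Entry("g", d1)
direct_replace(g)
after_relookup = verify(Level.RELOOKUP, d1, g)
```

In this model a direct rename is reported as EBUSY by the simple level
and goes unnoticed with no extra validation, and a direct replacement
is reported by the second lookup.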