Discussion:
[PATCH -V4 00/11] New ACL format for better NFSv4 acl interoperability
Aneesh Kumar K.V
2010-09-24 12:48:03 UTC
Permalink
Hi,

The following set of patches implements VFS changes needed to implement
a new acl model for linux. Rich ACLs are an implementation of NFSv4 ACLs,
extended by file masks to fit into the standard POSIX file permission model.
They are designed to work seamlessly locally as well as across the NFSv4 and
CIFS/SMB2 network file system protocols.

The patch set consists of four parts:

The first set of patches, posted as a follow up, contains VFS changes needed
to implement the Rich ACL model. The second set [1] contains the Rich ACL model
and Ext4 implementation. The third set [2] contains mapping of Rich ACL to
NFSv4 ACL (how to apply file mask to access mask) and implementation of
Richacl ACL for NFS server and client. The fourth set [3] contains POSIX ACL
to Rich ACL mapping and its ext4 usage.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-minimal
[2] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-upstream
[3] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-fullset

A user-space utility for displaying and changing richacls is available at [4]
(a number of examples can be found at http://acl.bestbits.at/richacl/examples.html).

[4] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/richacl.git master

To test richacl on ext4 use -o richacl mount option. This mount option may later be
dropped in favour of a feature flag.

More details regarding richacl can be found at
http://acl.bestbits.at/richacl/

Changes from V3:
a) Droped may_delete and may_create inode operations callback and reworked
the patch series to use additional check flags.
b) Rebased to the latest kernel
c) The patch series now contain only the minimal VFS changes.

Changes from V2:
1) Git repo include check-acl branch that drop newly added inode_operations
callback in favour for additional access check flags (MAY_CREATE_FILE,
MAY_CREATE_DIR, MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_TAKE_OWNERSHIP,
MAY_CHMOD, and MAY_SET_TIMES)
2) richacl is now cached in the vfs inode instead of file system inode.
(currently kept as a separate patch. We may want to fold that later)
3) Added a new acl flag ACL4_MASKED. richacl_apply_masks() can skip transforming acls
without this flag, which speeds things up and avoids modifying those acls unnecessarily.
4) Owner always allowed permissions are now explicitly included when synthesizing an acl
from file mode.

Changes from V1:
1) Split the patches into smaller patches
2) Added extensive documentation to the patches.

-aneesh
Aneesh Kumar K.V
2010-09-24 12:48:04 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 24896e8..c5920b5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -200,7 +200,7 @@ static int acl_permission_check(struct inode *inode, int mask,
/**
* generic_permission - check for access rights on a Posix-like filesystem
* @inode: inode to check access rights for
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
* @check_acl: optional callback to check for Posix ACLs
*
* Used to check for read/write/execute permissions on a file.
@@ -242,7 +242,7 @@ int generic_permission(struct inode *inode, int mask,
/**
* inode_permission - check for access rights to a given inode
* @inode: inode to check permission on
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
*
* Used to check for read/write/execute permissions on an inode.
* We use "fsuid" for this, letting us set arbitrary permissions
@@ -288,7 +288,7 @@ int inode_permission(struct inode *inode, int mask)
/**
* file_permission - check for additional access rights to a given file
* @file: file to check access rights for
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
*
* Used to check for read/write/execute permissions on an already opened
* file.
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:05 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Some file permission models differentiate between writing to a file
(MAY_WRITE) and appending to it (MAY_WRITE | MAY_APPEND). Pass all the
mask flags down to iop->check_acl so that filesystems can distinguish
between writing and appending.

All users of iop->check_acl pass the mask value back into
posix_acl_permission(); strip off the additional mask flags there.

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 4 +---
fs/posix_acl.c | 2 ++
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index c5920b5..7cadd07 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,8 +174,6 @@ static int acl_permission_check(struct inode *inode, int mask,
{
umode_t mode = inode->i_mode;

- mask &= MAY_READ | MAY_WRITE | MAY_EXEC;
-
if (current_fsuid() == inode->i_uid)
mode >>= 6;
else {
@@ -192,7 +190,7 @@ static int acl_permission_check(struct inode *inode, int mask,
/*
* If the DACs are ok we don't need any capability check.
*/
- if ((mask & ~mode) == 0)
+ if ((mask & (MAY_READ | MAY_WRITE | MAY_EXEC) & ~mode) == 0)
return 0;
return -EACCES;
}
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 39df95a..c60bddf 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -213,6 +213,8 @@ posix_acl_permission(struct inode *inode, const struct posix_acl *acl, int want)
const struct posix_acl_entry *pa, *pe, *mask_obj;
int found = 0;

+ want &= MAY_READ | MAY_WRITE | MAY_EXEC;
+
FOREACH_ACL_ENTRY(pa, acl, pe) {
switch(pa->e_tag) {
case ACL_USER_OBJ:
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:06 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7cadd07..bf822b8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -167,7 +167,7 @@ EXPORT_SYMBOL(putname);
#endif

/*
- * This does basic POSIX ACL permission checking
+ * This does the basic permission checking
*/
static int acl_permission_check(struct inode *inode, int mask,
int (*check_acl)(struct inode *inode, int mask))
@@ -212,7 +212,7 @@ int generic_permission(struct inode *inode, int mask,
int ret;

/*
- * Do the basic POSIX ACL permission checks.
+ * Do the basic permission checks.
*/
ret = acl_permission_check(inode, mask, check_acl);
if (ret != -EACCES)
@@ -246,6 +246,8 @@ int generic_permission(struct inode *inode, int mask,
* We use "fsuid" for this, letting us set arbitrary permissions
* for filesystem access without changing the "normal" uids which
* are used for other things.
+ *
+ * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
*/
int inode_permission(struct inode *inode, int mask)
{
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:07 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

When IS_POSIXACL() is true, the vfs does not apply the umask. Other acl
models will need the same exception, so introduce a separate IS_ACL()
test.

The IS_POSIX_ACL() test is still needed so that nfsd can determine when
the underlying file system supports POSIX ACLs (as opposed to some other
kind).

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 +++---
fs/nfs/nfs4proc.c | 2 +-
include/linux/fs.h | 8 +++++++-
3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf822b8..855b360 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1486,7 +1486,7 @@ static int __open_namei_create(struct nameidata *nd, struct path *path,
int error;
struct dentry *dir = nd->path.dentry;

- if (!IS_POSIXACL(dir->d_inode))
+ if (!IS_ACL(dir->d_inode))
mode &= ~current_umask();
error = security_path_mknod(&nd->path, path->dentry, mode, 0);
if (error)
@@ -2009,7 +2009,7 @@ SYSCALL_DEFINE4(mknodat, int, dfd, const char __user *, filename, int, mode,
error = PTR_ERR(dentry);
goto out_unlock;
}
- if (!IS_POSIXACL(nd.path.dentry->d_inode))
+ if (!IS_ACL(nd.path.dentry->d_inode))
mode &= ~current_umask();
error = may_mknod(mode);
if (error)
@@ -2086,7 +2086,7 @@ SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, int, mode)
if (IS_ERR(dentry))
goto out_unlock;

- if (!IS_POSIXACL(nd.path.dentry->d_inode))
+ if (!IS_ACL(nd.path.dentry->d_inode))
mode &= ~current_umask();
error = mnt_want_write(nd.path.mnt);
if (error)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 089da5b..c77dbbf 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2042,7 +2042,7 @@ nfs4_atomic_open(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
if (nd->flags & LOOKUP_CREATE) {
attr.ia_mode = nd->intent.open.create_mode;
attr.ia_valid = ATTR_MODE;
- if (!IS_POSIXACL(dir))
+ if (!IS_ACL(dir))
attr.ia_mode &= ~current_umask();
} else {
open_flags &= ~O_EXCL;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 63d069b..485c697 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -199,7 +199,7 @@ struct inodes_stat_t {
#define MS_VERBOSE 32768 /* War is peace. Verbosity is silence.
MS_VERBOSE is deprecated. */
#define MS_SILENT 32768
-#define MS_POSIXACL (1<<16) /* VFS does not apply the umask */
+#define MS_POSIXACL (1<<16) /* Supports POSIX ACLs */
#define MS_UNBINDABLE (1<<17) /* change to unbindable */
#define MS_PRIVATE (1<<18) /* change to private */
#define MS_SLAVE (1<<19) /* change to slave */
@@ -270,6 +270,12 @@ struct inodes_stat_t {
#define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE)
#define IS_PRIVATE(inode) ((inode)->i_flags & S_PRIVATE)

+/*
+ * IS_ACL() tells the VFS to not apply the umask
+ * and use iop->check_acl for acl permission checks when defined.
+ */
+#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL)
+
/* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:08 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Introduce a new MS_RICHACL super-block flag and a new IS_RICHACL() test
which file systems like nfs can use. IS_ACL() is true if IS_POSIXACL()
or IS_RICHACL() is true.

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
include/linux/fs.h | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 485c697..d86e77c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -208,6 +208,7 @@ struct inodes_stat_t {
#define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */
#define MS_I_VERSION (1<<23) /* Update inode I_version field */
#define MS_STRICTATIME (1<<24) /* Always perform atime updates */
+#define MS_RICHACL (1<<25) /* Supports richacls */
#define MS_BORN (1<<29)
#define MS_ACTIVE (1<<30)
#define MS_NOUSER (1<<31)
@@ -264,6 +265,7 @@ struct inodes_stat_t {
#define IS_APPEND(inode) ((inode)->i_flags & S_APPEND)
#define IS_IMMUTABLE(inode) ((inode)->i_flags & S_IMMUTABLE)
#define IS_POSIXACL(inode) __IS_FLG(inode, MS_POSIXACL)
+#define IS_RICHACL(inode) __IS_FLG(inode, MS_RICHACL)

#define IS_DEADDIR(inode) ((inode)->i_flags & S_DEAD)
#define IS_NOCMTIME(inode) ((inode)->i_flags & S_NOCMTIME)
@@ -274,7 +276,7 @@ struct inodes_stat_t {
* IS_ACL() tells the VFS to not apply the umask
* and use iop->check_acl for acl permission checks when defined.
*/
-#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL)
+#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL | MS_RICHACL)

/* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:09 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

if CONFIG_FS_RICHACL is not defined optimize out
the ACL check function.

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/Kconfig | 4 ++++
include/linux/fs.h | 5 +++++
2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 3d18530..cd283dc 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -39,6 +39,10 @@ config FS_POSIX_ACL
bool
default n

+config FS_RICHACL
+ bool
+ default n
+
source "fs/xfs/Kconfig"
source "fs/gfs2/Kconfig"
source "fs/ocfs2/Kconfig"
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d86e77c..26fc8ae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -265,7 +265,12 @@ struct inodes_stat_t {
#define IS_APPEND(inode) ((inode)->i_flags & S_APPEND)
#define IS_IMMUTABLE(inode) ((inode)->i_flags & S_IMMUTABLE)
#define IS_POSIXACL(inode) __IS_FLG(inode, MS_POSIXACL)
+
+#ifdef CONFIG_FS_RICHACL
#define IS_RICHACL(inode) __IS_FLG(inode, MS_RICHACL)
+#else
+#define IS_RICHACL(inode) 0
+#endif

#define IS_DEADDIR(inode) ((inode)->i_flags & S_DEAD)
#define IS_NOCMTIME(inode) ((inode)->i_flags & S_NOCMTIME)
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:10 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 855b360..b0b8a71 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,6 +174,12 @@ static int acl_permission_check(struct inode *inode, int mask,
{
umode_t mode = inode->i_mode;

+ if (IS_RICHACL(inode)) {
+ int error = check_acl(inode, mask);
+ if (error != -EAGAIN)
+ return error;
+ }
+
if (current_fsuid() == inode->i_uid)
mode >>= 6;
else {
--
1.7.0.4
Jeff Layton
2010-09-24 15:50:49 UTC
Permalink
On Fri, 24 Sep 2010 18:18:10 +0530
Post by Aneesh Kumar K.V
From: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 855b360..b0b8a71 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,6 +174,12 @@ static int acl_permission_check(struct inode *inode, int mask,
{
umode_t mode = inode->i_mode;
+ if (IS_RICHACL(inode)) {
+ int error = check_acl(inode, mask);
+ if (error != -EAGAIN)
+ return error;
+ }
+
if (current_fsuid() == inode->i_uid)
mode >>= 6;
else {
This may just be my own ignorance of ACL semantics and unfamiliarity
with the ACL code in general. It seems a bit unusual though...

Just to be clear...this patch implies that with richacls you can deny
or grant access to the owner of the file even if the mode bits say
otherwise. With POSIX acls, this seems to be the other way around.

Hmm....guess I should read the spec...
--
Jeff Layton <jlayton at redhat.com>
Aneesh Kumar K. V
2010-09-24 18:55:51 UTC
Permalink
Post by Jeff Layton
On Fri, 24 Sep 2010 18:18:10 +0530
Post by Aneesh Kumar K.V
From: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 855b360..b0b8a71 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,6 +174,12 @@ static int acl_permission_check(struct inode *inode, int mask,
{
umode_t mode = inode->i_mode;
+ if (IS_RICHACL(inode)) {
+ int error = check_acl(inode, mask);
+ if (error != -EAGAIN)
+ return error;
+ }
+
if (current_fsuid() == inode->i_uid)
mode >>= 6;
else {
This may just be my own ignorance of ACL semantics and unfamiliarity
with the ACL code in general. It seems a bit unusual though...
Just to be clear...this patch implies that with richacls you can deny
or grant access to the owner of the file even if the mode bits say
otherwise. With POSIX acls, this seems to be the other way around.
Hmm....guess I should read the spec...
To be POSIX compatible we need to ensure that additional file access
control mechanisms may only further restrict the access permissions defined
by the file permission bits. So with rich acl, similar to POSIX ACL,
we use file mask as an upper bound of the acess permission
allowed. Unlike POSIX ACL where the 'owner' and 'other' ACL entry access
mask is kept in sync with mode bits, rich acl include a file mask even
for 'owner' and 'everyone' entries.

The patch that gives more details about the permission check algo can be
found at

http://git.kernel.org/?p=linux/kernel/git/agruen/linux-2.6-richacl.git;a=commitdiff;h=442c675aeac85cfc893a2ec600f7fb3da3951177;hp=02456437cf280838a50ef00d7b4df2e7179fe6b2

-aneesh
Andreas Gruenbacher
2010-09-27 13:03:49 UTC
Permalink
Post by Aneesh Kumar K. V
To be POSIX compatible we need to ensure that additional file access
control mechanisms may only further restrict the access permissions defined
by the file permission bits.
That's true but I don't think it fully answers Jeff's question.

With POSIX ACLs, the owner file permission bits are always identical to the
permissions that the owner is granted through the ACL. Therefore, when
acl_permission_check() is invoked on behalf of the owner, the ACL does not
need to be consulted at all. For non-owners, the ACL always needs to be
checked. This optimization is also true for richacls for the base permissions
(read, write, execute), but:

* Some permissions are more fine-grained than the file mode permission
bits: richacls distinguish between write and append, and between creating
directories and non-directories.

* Some permissions go beyond what the owner is implicitly allowed or what can
be expressed with read, write, execute: in a richacl, a user can be granted
the right to delete a specific file even without write access to the
containing directory and to take ownership of a file

(* In addition, a richacl can grant the right to chmod and set the acl of a
file, and to explicitly set the file timestamps. These are permissions
which the owner is implicitly allowed anyway, so they are not relevant to
this change to acl_permission_check().)

To handle those cases correctly too, we always look at the acl for richacls,
even for the owner. (We could still skip the acl check in some, but fewer,
cases.)

Thanks,
Andreas
Aneesh Kumar K.V
2010-09-24 12:48:11 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Some permission models distinguish between the permission to create a
non-directory and a directory. Pass this information down to
inode_permission() as mask flags

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 21 ++++++++++++---------
include/linux/fs.h | 2 ++
2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index b0b8a71..ed786b2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
* for filesystem access without changing the "normal" uids which
* are used for other things.
*
- * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
+ * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
+ * MAY_WRITE must also be set in @mask.
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
* 3. We should have write and exec permissions on dir
* 4. We can't do it if dir is immutable (done in permission())
*/
-static inline int may_create(struct inode *dir, struct dentry *child)
+static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
{
+ int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
+
if (child->d_inode)
return -EEXIST;
if (IS_DEADDIR(dir))
return -ENOENT;
- return inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
}

/*
@@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
struct nameidata *nd)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);

if (error)
return error;
@@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);

int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);

if (error)
return error;
@@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)

int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 1);

if (error)
return error;
@@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)

int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);

if (error)
return error;
@@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;

- error = may_create(dir, new_dentry);
+ error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
if (error)
return error;

@@ -2631,7 +2634,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
return error;

if (!new_dentry->d_inode)
- error = may_create(new_dir, new_dentry);
+ error = may_create(new_dir, new_dentry, is_dir);
else
error = may_delete(new_dir, new_dentry, is_dir);
if (error)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 26fc8ae..5dc1ebd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -55,6 +55,8 @@ struct inodes_stat_t {
#define MAY_ACCESS 16
#define MAY_OPEN 32
#define MAY_CHDIR 64
+#define MAY_CREATE_FILE 128
+#define MAY_CREATE_DIR 256

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.0.4
Jeff Layton
2010-09-24 15:54:23 UTC
Permalink
On Fri, 24 Sep 2010 18:18:11 +0530
Post by Aneesh Kumar K.V
From: Andreas Gruenbacher <agruen at suse.de>
Some permission models distinguish between the permission to create a
non-directory and a directory. Pass this information down to
inode_permission() as mask flags
Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 21 ++++++++++++---------
include/linux/fs.h | 2 ++
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index b0b8a71..ed786b2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
* for filesystem access without changing the "normal" uids which
* are used for other things.
*
+ * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
* 3. We should have write and exec permissions on dir
* 4. We can't do it if dir is immutable (done in permission())
*/
-static inline int may_create(struct inode *dir, struct dentry *child)
+static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
^^^^^
nit: maybe saner as a bool?
Post by Aneesh Kumar K.V
{
+ int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
+
if (child->d_inode)
return -EEXIST;
if (IS_DEADDIR(dir))
return -ENOENT;
- return inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
}
/*
@@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
struct nameidata *nd)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);
int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 1);
if (error)
return error;
@@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;
- error = may_create(dir, new_dentry);
+ error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
^^^^ this is a little
scary, but even if it's
a directory, it'll get
kicked out in a later
check. Would it be
clearer to move up the
S_ISDIR() check in this
function and then pass
this in as false?
Post by Aneesh Kumar K.V
if (error)
return error;
@@ -2631,7 +2634,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
return error;
if (!new_dentry->d_inode)
- error = may_create(new_dir, new_dentry);
+ error = may_create(new_dir, new_dentry, is_dir);
else
error = may_delete(new_dir, new_dentry, is_dir);
if (error)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 26fc8ae..5dc1ebd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -55,6 +55,8 @@ struct inodes_stat_t {
#define MAY_ACCESS 16
#define MAY_OPEN 32
#define MAY_CHDIR 64
+#define MAY_CREATE_FILE 128
+#define MAY_CREATE_DIR 256
/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
Jeff Layton <jlayton at redhat.com>
Aneesh Kumar K. V
2010-09-24 19:16:03 UTC
Permalink
Post by Jeff Layton
On Fri, 24 Sep 2010 18:18:11 +0530
Post by Aneesh Kumar K.V
From: Andreas Gruenbacher <agruen at suse.de>
Some permission models distinguish between the permission to create a
non-directory and a directory. Pass this information down to
inode_permission() as mask flags
Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 21 ++++++++++++---------
include/linux/fs.h | 2 ++
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index b0b8a71..ed786b2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
* for filesystem access without changing the "normal" uids which
* are used for other things.
*
+ * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
* 3. We should have write and exec permissions on dir
* 4. We can't do it if dir is immutable (done in permission())
*/
-static inline int may_create(struct inode *dir, struct dentry *child)
+static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
^^^^^
nit: maybe saner as a bool?
Post by Aneesh Kumar K.V
{
+ int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
+
if (child->d_inode)
return -EEXIST;
if (IS_DEADDIR(dir))
return -ENOENT;
- return inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
}
/*
@@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
struct nameidata *nd)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);
int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 1);
if (error)
return error;
@@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;
- error = may_create(dir, new_dentry);
+ error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
^^^^ this is a little
scary, but even if it's
a directory, it'll get
kicked out in a later
check. Would it be
clearer to move up the
S_ISDIR() check in this
function and then pass
this in as false?
Can you elaborate on this ?

-aneesh
Jeff Layton
2010-09-24 19:23:42 UTC
Permalink
On Sat, 25 Sep 2010 00:46:03 +0530
Post by Aneesh Kumar K. V
Post by Jeff Layton
On Fri, 24 Sep 2010 18:18:11 +0530
Post by Aneesh Kumar K.V
From: Andreas Gruenbacher <agruen at suse.de>
Some permission models distinguish between the permission to create a
non-directory and a directory. Pass this information down to
inode_permission() as mask flags
Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 21 ++++++++++++---------
include/linux/fs.h | 2 ++
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index b0b8a71..ed786b2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
* for filesystem access without changing the "normal" uids which
* are used for other things.
*
+ * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
* 3. We should have write and exec permissions on dir
* 4. We can't do it if dir is immutable (done in permission())
*/
-static inline int may_create(struct inode *dir, struct dentry *child)
+static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
^^^^^
nit: maybe saner as a bool?
Post by Aneesh Kumar K.V
{
+ int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
+
if (child->d_inode)
return -EEXIST;
if (IS_DEADDIR(dir))
return -ENOENT;
- return inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
}
/*
@@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
struct nameidata *nd)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);
int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 1);
if (error)
return error;
@@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);
if (error)
return error;
@@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;
- error = may_create(dir, new_dentry);
+ error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
^^^^ this is a little
scary, but even if it's
a directory, it'll get
kicked out in a later
check. Would it be
clearer to move up the
S_ISDIR() check in this
function and then pass
this in as false?
Can you elaborate on this ?
-aneesh
Hardlinked directories are a no-no, of course. So when I first saw this
patch, it gave me pause. There's a later check in vfs_link though that
explicitly rejects hardlinking directories, so the above is harmless.
It may be more efficient to go ahead and return error if the target
is a directory however and bypass the permission check.

OTOH, maybe there's good reason to do it this way or I'm just being
excessively nitpicky.
--
Jeff Layton <jlayton at redhat.com>
Andreas Gruenbacher
2010-09-27 13:14:00 UTC
Permalink
Post by Jeff Layton
On Fri, 24 Sep 2010 18:18:11 +0530
Post by Aneesh Kumar K.V
@@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;
- error = may_create(dir, new_dentry);
+ error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
^^^^ this is a little
scary, but even if it's
a directory, it'll get
kicked out in a later
check. Would it be
clearer to move up the
S_ISDIR() check in this
function and then pass
this in as false?
Ah, you mean this:

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2450,7 +2450,9 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;

- error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
+ if (S_ISDIR(inode->i_mode))
+ return -EPERM;
+ error = may_create(dir, new_dentry, 0);
if (error)
return error;

@@ -2464,8 +2466,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
return -EPERM;
if (!dir->i_op->link)
return -EPERM;
- if (S_ISDIR(inode->i_mode))
- return -EPERM;

error = security_inode_link(old_dentry, dir, new_dentry);
if (error)

This is a clear improvement; I don't think it matters that user-space will
get -EPERM instead of -EXDEV when trying to hard-link a directory across
devices.

Thanks,
Andreas
Aneesh Kumar K.V
2010-09-24 12:48:12 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Normally, deleting a file requires write access to the parent directory.
Some permission models use a different permission on the parent
directory to indicate delete access. In addition, a process can have
per-file delete access even without delete access on the parent
directory.

Introduce two new inode_permission() mask flags and use them in
may_delete()

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 36 ++++++++++++++++++++++++------------
include/linux/fs.h | 2 ++
2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ed786b2..24c1e8c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -254,7 +254,7 @@ int generic_permission(struct inode *inode, int mask,
* are used for other things.
*
* When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
- * MAY_WRITE must also be set in @mask.
+ * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1298,30 +1298,42 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
* 10. We don't allow removal of NFS sillyrenamed files; it's handled by
* nfs_async_unlink().
*/
-static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
+static int may_delete(struct inode *dir, struct dentry *victim,
+ int isdir, int replace)
{
+ struct inode *inode = victim->d_inode;
+ int mask;
int error;

- if (!victim->d_inode)
+ if (!inode)
return -ENOENT;

BUG_ON(victim->d_parent->d_inode != dir);
audit_inode_child(victim, dir);

- error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
+ if (replace)
+ mask |= S_ISDIR(inode->i_mode) ?
+ MAY_CREATE_DIR : MAY_CREATE_FILE;
+ error = inode_permission(dir, mask);
+ if (error && IS_RICHACL(inode) &&
+ !inode_permission(dir, MAY_EXEC) &&
+ !inode_permission(inode, MAY_WRITE | MAY_DELETE_SELF))
+ error = 0;
+ else if (!error && check_sticky(dir, inode))
+ error = -EPERM;
if (error)
return error;
if (IS_APPEND(dir))
return -EPERM;
- if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
- IS_IMMUTABLE(victim->d_inode) || IS_SWAPFILE(victim->d_inode))
+ if (IS_APPEND(inode) || IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
return -EPERM;
if (isdir) {
- if (!S_ISDIR(victim->d_inode->i_mode))
+ if (!S_ISDIR(inode->i_mode))
return -ENOTDIR;
if (IS_ROOT(victim))
return -EBUSY;
- } else if (S_ISDIR(victim->d_inode->i_mode))
+ } else if (S_ISDIR(inode->i_mode))
return -EISDIR;
if (IS_DEADDIR(dir))
return -ENOENT;
@@ -2150,7 +2162,7 @@ void dentry_unhash(struct dentry *dentry)

int vfs_rmdir(struct inode *dir, struct dentry *dentry)
{
- int error = may_delete(dir, dentry, 1);
+ int error = may_delete(dir, dentry, 1, 0);

if (error)
return error;
@@ -2237,7 +2249,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)

int vfs_unlink(struct inode *dir, struct dentry *dentry)
{
- int error = may_delete(dir, dentry, 0);
+ int error = may_delete(dir, dentry, 0, 0);

if (error)
return error;
@@ -2629,14 +2641,14 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (old_dentry->d_inode == new_dentry->d_inode)
return 0;

- error = may_delete(old_dir, old_dentry, is_dir);
+ error = may_delete(old_dir, old_dentry, is_dir, 0);
if (error)
return error;

if (!new_dentry->d_inode)
error = may_create(new_dir, new_dentry, is_dir);
else
- error = may_delete(new_dir, new_dentry, is_dir);
+ error = may_delete(new_dir, new_dentry, is_dir, 1);
if (error)
return error;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5dc1ebd..845a930 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -57,6 +57,8 @@ struct inodes_stat_t {
#define MAY_CHDIR 64
#define MAY_CREATE_FILE 128
#define MAY_CREATE_DIR 256
+#define MAY_DELETE_CHILD 512
+#define MAY_DELETE_SELF 1024

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:13 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

We will need to call iop->permission and iop->change_acl from
inode_change_ok() for additional permission checks, and both take a
non-const inode.

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/attr.c | 2 +-
include/linux/fs.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 7ca4181..f081b5a 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -26,7 +26,7 @@
* Should be called as the first thing in ->setattr implementations,
* possibly after taking additional locks.
*/
-int inode_change_ok(const struct inode *inode, struct iattr *attr)
+int inode_change_ok(struct inode *inode, struct iattr *attr)
{
unsigned int ia_valid = attr->ia_valid;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 845a930..2616d34 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2402,7 +2402,7 @@ extern int buffer_migrate_page(struct address_space *,
#define buffer_migrate_page NULL
#endif

-extern int inode_change_ok(const struct inode *, struct iattr *);
+extern int inode_change_ok(struct inode *, struct iattr *);
extern int inode_newsize_ok(const struct inode *, loff_t offset);
extern void setattr_copy(struct inode *inode, const struct iattr *attr);
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:14 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Some permission models can allow processes to take ownership of a file,
change the file permissions, and set the file timestamps. Introduce new
permission mask flags and check for those permissions in
inode_change_ok().

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/attr.c | 49 ++++++++++++++++++++++++++++++++++++++-----------
include/linux/fs.h | 3 +++
2 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index f081b5a..00f971a 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -13,6 +13,20 @@
#include <linux/fsnotify.h>
#include <linux/fcntl.h>
#include <linux/security.h>
+#include <linux/richacl.h>
+
+static int richacl_change_ok(struct inode *inode, int mask)
+{
+ if (!IS_RICHACL(inode))
+ return -EPERM;
+
+ if (inode->i_op->permission)
+ return inode->i_op->permission(inode, mask);
+ else if (inode->i_op->check_acl)
+ return inode->i_op->check_acl(inode, mask);
+ else
+ return -EPERM;
+}

/**
* inode_change_ok - check if attribute changes to an inode are allowed
@@ -47,20 +61,29 @@ int inode_change_ok(struct inode *inode, struct iattr *attr)
/* Make sure a caller can chown. */
if ((ia_valid & ATTR_UID) &&
(current_fsuid() != inode->i_uid ||
- attr->ia_uid != inode->i_uid) && !capable(CAP_CHOWN))
- return -EPERM;
+ attr->ia_uid != inode->i_uid) &&
+ (current_fsuid() != attr->ia_uid ||
+ richacl_change_ok(inode, MAY_TAKE_OWNERSHIP)) &&
+ !capable(CAP_CHOWN))
+ goto error;

/* Make sure caller can chgrp. */
- if ((ia_valid & ATTR_GID) &&
- (current_fsuid() != inode->i_uid ||
- (!in_group_p(attr->ia_gid) && attr->ia_gid != inode->i_gid)) &&
- !capable(CAP_CHOWN))
- return -EPERM;
+ if (ia_valid & ATTR_GID) {
+ int in_group = in_group_p(attr->ia_gid);
+ if ((current_fsuid() != inode->i_uid ||
+ (!in_group && attr->ia_gid != inode->i_gid)) &&
+ (!in_group ||
+ richacl_change_ok(inode, MAY_TAKE_OWNERSHIP)) &&
+ !capable(CAP_CHOWN))
+ goto error;
+ }

/* Make sure a caller can chmod. */
if (ia_valid & ATTR_MODE) {
- if (!is_owner_or_cap(inode))
- return -EPERM;
+ if (current_fsuid() != inode->i_uid &&
+ richacl_change_ok(inode, MAY_CHMOD) &&
+ !capable(CAP_FOWNER))
+ goto error;
/* Also check the setgid bit! */
if (!in_group_p((ia_valid & ATTR_GID) ? attr->ia_gid :
inode->i_gid) && !capable(CAP_FSETID))
@@ -69,11 +92,15 @@ int inode_change_ok(struct inode *inode, struct iattr *attr)

/* Check for setting the inode time. */
if (ia_valid & (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)) {
- if (!is_owner_or_cap(inode))
- return -EPERM;
+ if (current_fsuid() != inode->i_uid &&
+ richacl_change_ok(inode, MAY_SET_TIMES) &&
+ !capable(CAP_FOWNER))
+ goto error;
}

return 0;
+error:
+ return -EPERM;
}
EXPORT_SYMBOL(inode_change_ok);

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2616d34..f08e82f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -59,6 +59,9 @@ struct inodes_stat_t {
#define MAY_CREATE_DIR 256
#define MAY_DELETE_CHILD 512
#define MAY_DELETE_SELF 1024
+#define MAY_TAKE_OWNERSHIP 2048
+#define MAY_CHMOD 4096
+#define MAY_SET_TIMES 8192

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.0.4
Aneesh Kumar K.V
2010-09-24 12:48:04 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 24896e8..c5920b5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -200,7 +200,7 @@ static int acl_permission_check(struct inode *inode, int mask,
/**
* generic_permission - check for access rights on a Posix-like filesystem
* @inode: inode to check access rights for
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
* @check_acl: optional callback to check for Posix ACLs
*
* Used to check for read/write/execute permissions on a file.
@@ -242,7 +242,7 @@ int generic_permission(struct inode *inode, int mask,
/**
* inode_permission - check for access rights to a given inode
* @inode: inode to check permission on
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
*
* Used to check for read/write/execute permissions on an inode.
* We use "fsuid" for this, letting us set arbitrary permissions
@@ -288,7 +288,7 @@ int inode_permission(struct inode *inode, int mask)
/**
* file_permission - check for additional access rights to a given file
* @file: file to check access rights for
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
*
* Used to check for read/write/execute permissions on an already opened
* file.
--
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Aneesh Kumar K.V
2010-09-24 12:48:07 UTC
Permalink
From: Andreas Gruenbacher <agruen at suse.de>

When IS_POSIXACL() is true, the vfs does not apply the umask. Other acl
models will need the same exception, so introduce a separate IS_ACL()
test.

The IS_POSIX_ACL() test is still needed so that nfsd can determine when
the underlying file system supports POSIX ACLs (as opposed to some other
kind).

Signed-off-by: Andreas Gruenbacher <agruen at suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
---
fs/namei.c | 6 +++---
fs/nfs/nfs4proc.c | 2 +-
include/linux/fs.h | 8 +++++++-
3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf822b8..855b360 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1486,7 +1486,7 @@ static int __open_namei_create(struct nameidata *nd, struct path *path,
int error;
struct dentry *dir = nd->path.dentry;

- if (!IS_POSIXACL(dir->d_inode))
+ if (!IS_ACL(dir->d_inode))
mode &= ~current_umask();
error = security_path_mknod(&nd->path, path->dentry, mode, 0);
if (error)
@@ -2009,7 +2009,7 @@ SYSCALL_DEFINE4(mknodat, int, dfd, const char __user *, filename, int, mode,
error = PTR_ERR(dentry);
goto out_unlock;
}
- if (!IS_POSIXACL(nd.path.dentry->d_inode))
+ if (!IS_ACL(nd.path.dentry->d_inode))
mode &= ~current_umask();
error = may_mknod(mode);
if (error)
@@ -2086,7 +2086,7 @@ SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, int, mode)
if (IS_ERR(dentry))
goto out_unlock;

- if (!IS_POSIXACL(nd.path.dentry->d_inode))
+ if (!IS_ACL(nd.path.dentry->d_inode))
mode &= ~current_umask();
error = mnt_want_write(nd.path.mnt);
if (error)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 089da5b..c77dbbf 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2042,7 +2042,7 @@ nfs4_atomic_open(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
if (nd->flags & LOOKUP_CREATE) {
attr.ia_mode = nd->intent.open.create_mode;
attr.ia_valid = ATTR_MODE;
- if (!IS_POSIXACL(dir))
+ if (!IS_ACL(dir))
attr.ia_mode &= ~current_umask();
} else {
open_flags &= ~O_EXCL;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 63d069b..485c697 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -199,7 +199,7 @@ struct inodes_stat_t {
#define MS_VERBOSE 32768 /* War is peace. Verbosity is silence.
MS_VERBOSE is deprecated. */
#define MS_SILENT 32768
-#define MS_POSIXACL (1<<16) /* VFS does not apply the umask */
+#define MS_POSIXACL (1<<16) /* Supports POSIX ACLs */
#define MS_UNBINDABLE (1<<17) /* change to unbindable */
#define MS_PRIVATE (1<<18) /* change to private */
#define MS_SLAVE (1<<19) /* change to slave */
@@ -270,6 +270,12 @@ struct inodes_stat_t {
#define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE)
#define IS_PRIVATE(inode) ((inode)->i_flags & S_PRIVATE)

+/*
+ * IS_ACL() tells the VFS to not apply the umask
+ * and use iop->check_acl for acl permission checks when defined.
+ */
+#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL)
+
/* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
--
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Loading...