diff options
author | Tejun Heo | 2014-04-25 18:28:02 -0400 |
---|---|---|
committer | Tejun Heo | 2014-04-25 18:28:02 -0400 |
commit | 842b597ee0a7e1aa5a3148164ffdba00ec17f614 (patch) | |
tree | 545209e6b3830a92bae889d41182d15dcccc01aa /include | |
parent | 50bce01b0ee34ab9f18a2d5a7467053dda355d30 (diff) |
cgroup: implement cgroup.populated for the default hierarchy
cgroup users often need a way to determine when a cgroup's
subhierarchy becomes empty so that it can be cleaned up. cgroup
currently provides release_agent for it; unfortunately, this mechanism
is riddled with issues.
* It delivers events by forking and execing a userland binary
specified as the release_agent. This is a long deprecated method of
notification delivery. It's extremely heavy, slow and cumbersome to
integrate with larger infrastructure.
* There is single monitoring point at the root. There's no way to
delegate management of a subtree.
* The event isn't recursive. It triggers when a cgroup doesn't have
any tasks or child cgroups. Events for internal nodes trigger only
after all children are removed. This again makes it impossible to
delegate management of a subtree.
* Events are filtered from the kernel side. "notify_on_release" file
is used to subscribe to or suppress release event. This is
unnecessarily complicated and probably done this way because event
delivery itself was expensive.
This patch implements interface file "cgroup.populated" which can be
used to monitor whether the cgroup's subhierarchy has tasks in it or
not. Its value is 0 if there is no task in the cgroup and its
descendants; otherwise, 1, and kernfs_notify() notificaiton is
triggers when the value changes, which can be monitored through poll
and [di]notify.
This is a lot ligther and simpler and trivially allows delegating
management of subhierarchy - subhierarchy monitoring can block further
propgation simply by putting itself or another process in the root of
the subhierarchy and monitor events that it's interested in from there
without interfering with monitoring higher in the tree.
v2: Patch description updated as per Serge.
v3: "cgroup.subtree_populated" renamed to "cgroup.populated". The
subtree_ prefix was a bit confusing because
"cgroup.subtree_control" uses it to denote the tree rooted at the
cgroup sans the cgroup itself while the populated state includes
the cgroup itself.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Lennart Poettering <lennart@poettering.net>
Diffstat (limited to 'include')
-rw-r--r-- | include/linux/cgroup.h | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index ada239253ec7..4b38e2d6110d 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -154,6 +154,14 @@ struct cgroup { /* the number of attached css's */ int nr_css; + /* + * If this cgroup contains any tasks, it contributes one to + * populated_cnt. All children with non-zero popuplated_cnt of + * their own contribute one. The count is zero iff there's no task + * in this cgroup or its subtree. + */ + int populated_cnt; + atomic_t refcnt; /* @@ -166,6 +174,7 @@ struct cgroup { struct cgroup *parent; /* my parent */ struct kernfs_node *kn; /* cgroup kernfs entry */ struct kernfs_node *control_kn; /* kn for "cgroup.subtree_control" */ + struct kernfs_node *populated_kn; /* kn for "cgroup.subtree_populated" */ /* * Monotonically increasing unique serial number which defines a @@ -264,6 +273,12 @@ enum { * * - "cgroup.clone_children" is removed. * + * - "cgroup.subtree_populated" is available. Its value is 0 if + * the cgroup and its descendants contain no task; otherwise, 1. + * The file also generates kernfs notification which can be + * monitored through poll and [di]notify when the value of the + * file changes. + * * - If mount is requested with sane_behavior but without any * subsystem, the default unified hierarchy is mounted. * |