Skip to content

Commit

Permalink
Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/g…
Browse files Browse the repository at this point in the history
…it/tj/cgroup

* 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
  cgroup: fix to allow mounting a hierarchy by name
  cgroup: move assignement out of condition in cgroup_attach_proc()
  cgroup: Remove task_lock() from cgroup_post_fork()
  cgroup: add sparse annotation to cgroup_iter_start() and cgroup_iter_end()
  cgroup: mark cgroup_rmdir_waitq and cgroup_attach_proc() as static
  cgroup: only need to check oldcgrp==newgrp once
  cgroup: remove redundant get/put of task struct
  cgroup: remove redundant get/put of old css_set from migrate
  cgroup: Remove unnecessary task_lock before fetching css_set on migration
  cgroup: Drop task_lock(parent) on cgroup_fork()
  cgroups: remove redundant get/put of css_set from css_set_check_fetched()
  resource cgroups: remove bogus cast
  cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()
  cgroup, cpuset: don't use ss->pre_attach()
  cgroup: don't use subsys->can_attach_task() or ->attach_task()
  cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
  cgroup: improve old cgroup handling in cgroup_attach_proc()
  cgroup: always lock threadgroup during migration
  threadgroup: extend threadgroup_lock() to cover exit and exec
  threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem
  ...

Fix up conflict in kernel/cgroup.c due to commit e0197aa: "cgroups:
fix a css_set not found bug in cgroup_attach_proc" that already
mentioned that the bug is fixed (differently) in Tejun's cgroup
patchset. This one, in other words.
  • Loading branch information
torvalds committed Jan 9, 2012
2 parents ac69e09 + 0d19ea8 commit db0c2bf
Show file tree
Hide file tree
Showing 15 changed files with 470 additions and 349 deletions.
51 changes: 21 additions & 30 deletions Documentation/cgroups/cgroups.txt
Original file line number Diff line number Diff line change
Expand Up @@ -594,53 +594,44 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
called multiple times against a cgroup.

int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct task_struct *task)
struct cgroup_taskset *tset)
(cgroup_mutex held by caller)

Called prior to moving a task into a cgroup; if the subsystem
returns an error, this will abort the attach operation. If a NULL
task is passed, then a successful result indicates that *any*
unspecified task can be moved into the cgroup. Note that this isn't
called on a fork. If this method returns 0 (success) then this should
remain valid while the caller holds cgroup_mutex and it is ensured that either
Called prior to moving one or more tasks into a cgroup; if the
subsystem returns an error, this will abort the attach operation.
@tset contains the tasks to be attached and is guaranteed to have at
least one task in it.

If there are multiple tasks in the taskset, then:
- it's guaranteed that all are from the same thread group
- @tset contains all tasks from the thread group whether or not
they're switching cgroups
- the first task is the leader

Each @tset entry also contains the task's old cgroup and tasks which
aren't switching cgroup can be skipped easily using the
cgroup_taskset_for_each() iterator. Note that this isn't called on a
fork. If this method returns 0 (success) then this should remain valid
while the caller holds cgroup_mutex and it is ensured that either
attach() or cancel_attach() will be called in future.

int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
(cgroup_mutex held by caller)

As can_attach, but for operations that must be run once per task to be
attached (possibly many when using cgroup_attach_proc). Called after
can_attach.

void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct task_struct *task, bool threadgroup)
struct cgroup_taskset *tset)
(cgroup_mutex held by caller)

Called when a task attach operation has failed after can_attach() has succeeded.
A subsystem whose can_attach() has some side-effects should provide this
function, so that the subsystem can implement a rollback. If not, not necessary.
This will be called only about subsystems whose can_attach() operation have
succeeded.

void pre_attach(struct cgroup *cgrp);
(cgroup_mutex held by caller)

For any non-per-thread attachment work that needs to happen before
attach_task. Needed by cpuset.
succeeded. The parameters are identical to can_attach().

void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct cgroup *old_cgrp, struct task_struct *task)
struct cgroup_taskset *tset)
(cgroup_mutex held by caller)

Called after the task has been attached to the cgroup, to allow any
post-attachment activity that requires memory allocations or blocking.

void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
(cgroup_mutex held by caller)

As attach, but for operations that must be run once per task to be attached,
like can_attach_task. Called before attach. Currently does not support any
subsystem that might need the old_cgrp for every thread in the group.
The parameters are identical to can_attach().

void fork(struct cgroup_subsy *ss, struct task_struct *task)

Expand Down
45 changes: 28 additions & 17 deletions block/blk-cgroup.c
Original file line number Diff line number Diff line change
Expand Up @@ -30,8 +30,10 @@ EXPORT_SYMBOL_GPL(blkio_root_cgroup);

static struct cgroup_subsys_state *blkiocg_create(struct cgroup_subsys *,
struct cgroup *);
static int blkiocg_can_attach_task(struct cgroup *, struct task_struct *);
static void blkiocg_attach_task(struct cgroup *, struct task_struct *);
static int blkiocg_can_attach(struct cgroup_subsys *, struct cgroup *,
struct cgroup_taskset *);
static void blkiocg_attach(struct cgroup_subsys *, struct cgroup *,
struct cgroup_taskset *);
static void blkiocg_destroy(struct cgroup_subsys *, struct cgroup *);
static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);

Expand All @@ -44,8 +46,8 @@ static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
struct cgroup_subsys blkio_subsys = {
.name = "blkio",
.create = blkiocg_create,
.can_attach_task = blkiocg_can_attach_task,
.attach_task = blkiocg_attach_task,
.can_attach = blkiocg_can_attach,
.attach = blkiocg_attach,
.destroy = blkiocg_destroy,
.populate = blkiocg_populate,
#ifdef CONFIG_BLK_CGROUP
Expand Down Expand Up @@ -1626,30 +1628,39 @@ blkiocg_create(struct cgroup_subsys *subsys, struct cgroup *cgroup)
* of the main cic data structures. For now we allow a task to change
* its cgroup only if it's the only owner of its ioc.
*/
static int blkiocg_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
static int blkiocg_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct cgroup_taskset *tset)
{
struct task_struct *task;
struct io_context *ioc;
int ret = 0;

/* task_lock() is needed to avoid races with exit_io_context() */
task_lock(tsk);
ioc = tsk->io_context;
if (ioc && atomic_read(&ioc->nr_tasks) > 1)
ret = -EINVAL;
task_unlock(tsk);

cgroup_taskset_for_each(task, cgrp, tset) {
task_lock(task);
ioc = task->io_context;
if (ioc && atomic_read(&ioc->nr_tasks) > 1)
ret = -EINVAL;
task_unlock(task);
if (ret)
break;
}
return ret;
}

static void blkiocg_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
static void blkiocg_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct cgroup_taskset *tset)
{
struct task_struct *task;
struct io_context *ioc;

task_lock(tsk);
ioc = tsk->io_context;
if (ioc)
ioc->cgroup_changed = 1;
task_unlock(tsk);
cgroup_taskset_for_each(task, cgrp, tset) {
task_lock(task);
ioc = task->io_context;
if (ioc)
ioc->cgroup_changed = 1;
task_unlock(task);
}
}

void blkio_policy_register(struct blkio_policy_type *blkiop)
Expand Down
31 changes: 25 additions & 6 deletions include/linux/cgroup.h
Original file line number Diff line number Diff line change
Expand Up @@ -456,6 +456,28 @@ int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task);
void cgroup_exclude_rmdir(struct cgroup_subsys_state *css);
void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css);

/*
* Control Group taskset, used to pass around set of tasks to cgroup_subsys
* methods.
*/
struct cgroup_taskset;
struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset);
int cgroup_taskset_size(struct cgroup_taskset *tset);

/**
* cgroup_taskset_for_each - iterate cgroup_taskset
* @task: the loop cursor
* @skip_cgrp: skip if task's cgroup matches this, %NULL to iterate through all
* @tset: taskset to iterate
*/
#define cgroup_taskset_for_each(task, skip_cgrp, tset) \
for ((task) = cgroup_taskset_first((tset)); (task); \
(task) = cgroup_taskset_next((tset))) \
if (!(skip_cgrp) || \
cgroup_taskset_cur_cgroup((tset)) != (skip_cgrp))

/*
* Control Group subsystem type.
* See Documentation/cgroups/cgroups.txt for details
Expand All @@ -467,14 +489,11 @@ struct cgroup_subsys {
int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct task_struct *tsk);
int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
struct cgroup_taskset *tset);
void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct task_struct *tsk);
void (*pre_attach)(struct cgroup *cgrp);
void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
struct cgroup_taskset *tset);
void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct cgroup *old_cgrp, struct task_struct *tsk);
struct cgroup_taskset *tset);
void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
void (*exit)(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct cgroup *old_cgrp, struct task_struct *task);
Expand Down
9 changes: 4 additions & 5 deletions include/linux/init_task.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,11 +23,10 @@ extern struct files_struct init_files;
extern struct fs_struct init_fs;

#ifdef CONFIG_CGROUPS
#define INIT_THREADGROUP_FORK_LOCK(sig) \
.threadgroup_fork_lock = \
__RWSEM_INITIALIZER(sig.threadgroup_fork_lock),
#define INIT_GROUP_RWSEM(sig) \
.group_rwsem = __RWSEM_INITIALIZER(sig.group_rwsem),
#else
#define INIT_THREADGROUP_FORK_LOCK(sig)
#define INIT_GROUP_RWSEM(sig)
#endif

#define INIT_SIGNALS(sig) { \
Expand All @@ -46,7 +45,7 @@ extern struct fs_struct init_fs;
}, \
.cred_guard_mutex = \
__MUTEX_INITIALIZER(sig.cred_guard_mutex), \
INIT_THREADGROUP_FORK_LOCK(sig) \
INIT_GROUP_RWSEM(sig) \
}

extern struct nsproxy init_nsproxy;
Expand Down
73 changes: 54 additions & 19 deletions include/linux/sched.h
Original file line number Diff line number Diff line change
Expand Up @@ -637,13 +637,15 @@ struct signal_struct {
#endif
#ifdef CONFIG_CGROUPS
/*
* The threadgroup_fork_lock prevents threads from forking with
* CLONE_THREAD while held for writing. Use this for fork-sensitive
* threadgroup-wide operations. It's taken for reading in fork.c in
* copy_process().
* Currently only needed write-side by cgroups.
* group_rwsem prevents new tasks from entering the threadgroup and
* member tasks from exiting,a more specifically, setting of
* PF_EXITING. fork and exit paths are protected with this rwsem
* using threadgroup_change_begin/end(). Users which require
* threadgroup to remain stable should use threadgroup_[un]lock()
* which also takes care of exec path. Currently, cgroup is the
* only user.
*/
struct rw_semaphore threadgroup_fork_lock;
struct rw_semaphore group_rwsem;
#endif

int oom_adj; /* OOM kill score adjustment (bit shift) */
Expand Down Expand Up @@ -2394,29 +2396,62 @@ static inline void unlock_task_sighand(struct task_struct *tsk,
spin_unlock_irqrestore(&tsk->sighand->siglock, *flags);
}

/* See the declaration of threadgroup_fork_lock in signal_struct. */
#ifdef CONFIG_CGROUPS
static inline void threadgroup_fork_read_lock(struct task_struct *tsk)
static inline void threadgroup_change_begin(struct task_struct *tsk)
{
down_read(&tsk->signal->threadgroup_fork_lock);
down_read(&tsk->signal->group_rwsem);
}
static inline void threadgroup_fork_read_unlock(struct task_struct *tsk)
static inline void threadgroup_change_end(struct task_struct *tsk)
{
up_read(&tsk->signal->threadgroup_fork_lock);
up_read(&tsk->signal->group_rwsem);
}
static inline void threadgroup_fork_write_lock(struct task_struct *tsk)

/**
* threadgroup_lock - lock threadgroup
* @tsk: member task of the threadgroup to lock
*
* Lock the threadgroup @tsk belongs to. No new task is allowed to enter
* and member tasks aren't allowed to exit (as indicated by PF_EXITING) or
* perform exec. This is useful for cases where the threadgroup needs to
* stay stable across blockable operations.
*
* fork and exit paths explicitly call threadgroup_change_{begin|end}() for
* synchronization. While held, no new task will be added to threadgroup
* and no existing live task will have its PF_EXITING set.
*
* During exec, a task goes and puts its thread group through unusual
* changes. After de-threading, exclusive access is assumed to resources
* which are usually shared by tasks in the same group - e.g. sighand may
* be replaced with a new one. Also, the exec'ing task takes over group
* leader role including its pid. Exclude these changes while locked by
* grabbing cred_guard_mutex which is used to synchronize exec path.
*/
static inline void threadgroup_lock(struct task_struct *tsk)
{
down_write(&tsk->signal->threadgroup_fork_lock);
/*
* exec uses exit for de-threading nesting group_rwsem inside
* cred_guard_mutex. Grab cred_guard_mutex first.
*/
mutex_lock(&tsk->signal->cred_guard_mutex);
down_write(&tsk->signal->group_rwsem);
}
static inline void threadgroup_fork_write_unlock(struct task_struct *tsk)

/**
* threadgroup_unlock - unlock threadgroup
* @tsk: member task of the threadgroup to unlock
*
* Reverse threadgroup_lock().
*/
static inline void threadgroup_unlock(struct task_struct *tsk)
{
up_write(&tsk->signal->threadgroup_fork_lock);
up_write(&tsk->signal->group_rwsem);
mutex_unlock(&tsk->signal->cred_guard_mutex);
}
#else
static inline void threadgroup_fork_read_lock(struct task_struct *tsk) {}
static inline void threadgroup_fork_read_unlock(struct task_struct *tsk) {}
static inline void threadgroup_fork_write_lock(struct task_struct *tsk) {}
static inline void threadgroup_fork_write_unlock(struct task_struct *tsk) {}
static inline void threadgroup_change_begin(struct task_struct *tsk) {}
static inline void threadgroup_change_end(struct task_struct *tsk) {}
static inline void threadgroup_lock(struct task_struct *tsk) {}
static inline void threadgroup_unlock(struct task_struct *tsk) {}
#endif

#ifndef __HAVE_THREAD_FUNCTIONS
Expand Down
Loading

0 comments on commit db0c2bf

Please sign in to comment.