Make release_agent per-cgroup property. Run release_agent in proper ve.

Submitted by Valeriy Vdovin on April 17, 2020, 2:55 p.m.

Details

Reviewer None
Submitted April 17, 2020, 2:55 p.m.
Last Updated April 18, 2020, 12:58 a.m.
Revision 1

Cover Letter

Problems:
1. Currently release_agent is a mount-wide cgroup property, shared by the whole hierarchy. It is
not possible to override its value for a cgroup down the hierarchy that serves as the virtual
root for a container.
2. The code that spawns release_agent notification processes does so from ve0, so inside a
container any logic that waits for notifications about empty cgroups will fail;
see https://jira.sw.ru/browse/PSBM-83887 for an example of such a problem with systemd.

Solution:
In this patchset release_agent is moved from 'struct cgroupfs_root' to 'struct cgroup', making
it possible to set release_agent per ve.
Also, 'struct cgroup' receives a pointer to its owning ve, so that release_agent notifications
can be spawned under the right ve.
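
To illustrate the idea, here is a minimal userspace sketch, not the code from the
patches. The struct layouts, the is_ve_root flag and the helper names below are
hypothetical; only the notions of a per-cgroup release_agent path and a ve_owner
pointer come from this cover letter. The walk up to the nearest ve root is an
assumption about how a per-ve value might be resolved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; these are NOT the kernel structures. */
struct ve {
	const char *name;	/* e.g. "ve0" for the host */
};

struct cgroup {
	struct cgroup *parent;		/* NULL for the hierarchy root */
	struct ve *ve_owner;		/* ve that owns this cgroup */
	int is_ve_root;			/* set on a container's virtual root */
	const char *release_agent_path;	/* meaningful on root cgroups only */
};

/* Walk up to the nearest cgroup marked as a ve root, or to the real root. */
static const struct cgroup *nearest_ve_root(const struct cgroup *cg)
{
	assert(cg != NULL);
	while (cg->parent != NULL && !cg->is_ve_root)
		cg = cg->parent;
	return cg;
}

/* The release_agent that applies to cg is the one set on its (virtual) root. */
static const char *release_agent_for(const struct cgroup *cg)
{
	return nearest_ve_root(cg)->release_agent_path;
}

/* The ve under which a notification for cg would be spawned. */
static const struct ve *release_agent_ve(const struct cgroup *cg)
{
	assert(cg != NULL);
	return cg->ve_owner;
}
```

In this model a cgroup below a container's virtual root resolves to the container's
own release_agent path and ve, while a cgroup directly under the real root resolves
to the host value and ve0.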

v1: Removed complex locking scheme for ve_owner<->cgroup binding.
v2: release_agent_path protected by RCU
v3: cgroup_root_from_opts uses ..set_release_agent helper without lockdep
v4: fixed possible race at cgroup_release_agent
v5: Use per-ve workqueue to maintain per-ve cgroups notifications
v6: rebased to latest branch
v7: Fixed lockdeps, removed dependency on is_running param.
v8: cgroup_rcu_strdup uses strlcpy, ve_get_release_agent releases list spinlock early,
    patchset was split into smaller changes.
v9: rearranged cgroup_mark_ve_root with ve_workqueue_start, added lost kfree
v10: fixed indentation
v11: - patches 6 and 7 have been rearranged into 3 patches.
     - task_cgroup_from_root has been changed to css_cgroup_from_root for ve->init_task
     - cgroup_mount initialized cgroup->ve_owner to ve0
     - removed rarely used optimization branching from css_cgroup_from_root

Valeriy Vdovin (12):
  ve/cgroup: implemented per-ve workqueue.
  cgroup: added rcu node string wrapper for in-cgroup usage.
  cgroup: declared cgroup_mark_ve_root in public header
  cgroup: exported __put_css_set and wrappers to cgroup.h
  ve/cgroup: saving root_css to ve
  ve/cgroup: unmark ve-root cgroups at container stop
  ve/cgroup: Added ve_owner field to cgroup
  ve/cgroup: moved release_agent from system_wq to per-ve workqueues
  ve/cgroup: Implemented logic that uses 'cgroup->ve_owner' to run
    release_agent notifications.
  ve/cgroup: private per-cgroup-root data container
  ve/cgroup: set release_agent_path for root cgroups separately for each
    ve.
  cgroup: reuse css_cgroup_from_root where appropriate

 include/linux/cgroup.h |  40 +++++-
 include/linux/ve.h     |  33 +++++
 kernel/cgroup.c        | 328 ++++++++++++++++++++++++++++++++++++++-----------
 kernel/ve/ve.c         | 223 ++++++++++++++++++++++++++++++++-
 4 files changed, 543 insertions(+), 81 deletions(-)
  

Revisions