LXC WebUI Support

Submitted by Arthur Lockman on June 29, 2016, 6:20 p.m.

Details

Message ID 1467224419-27117-1-git-send-email-alockman@redhat.com
State Rejected
Series "LXC WebUI Support"

Commit Message

Arthur Lockman June 29, 2016, 6:20 p.m.
From: Arthur Lockman <hello@rthr.me>

Adds support for migrating LXC containers with the p.haul WebUI. It uses the checkpoint functionality in LXC and rsync to migrate and restart the containers on different hosts.

Signed-off-by: Arthur Lockman <alockman@redhat.com>
---
 README.md                 |  7 +++-
 phaul/criu_cr.py          |  4 +-
 phaul/criu_req.py         |  4 +-
 phaul/iters.py            | 44 +++++++++++++---------
 phaul/p_haul_vz.py        | 58 ++++++++++++++++++++++++++++
 webgui/p_haul_web_gui.py  | 36 ++++++++++++++++++
 webgui/procs.py           |  7 ++++
 webgui/static/criugui.css | 18 ++++++++-
 webgui/static/migrate.js  | 96 +++++++++++++++++++++++++++++++++++++++--------
 webgui/static/pstree.js   | 47 +++++++++++++----------
 10 files changed, 261 insertions(+), 60 deletions(-)

diff --git a/README.md b/README.md
index 45327ea..46345c2 100644
--- a/README.md
+++ b/README.md
@@ -31,8 +31,11 @@  wiki (http://criu.org/Category:P.Haul).
 How to contribute
 =======
 
-The p.haul patches should be sent to CRIU development mailing list which is
-located at https://openvz.org/mailman/listinfo/criu
+The p.haul patches should be sent to CRIU development mailing list
+(https://openvz.org/mailman/listinfo/criu) with "p.haul" prefix.
+Configure your local git repository using following command to
+set subject prefix automatically:
+* $ git config format.subjectprefix "PATCH p.haul"
 
 Before sending patches please make sure your code formatted according to
 project coding style (we use [PEP8](https://www.python.org/dev/peps/pep-0008/)
diff --git a/phaul/criu_cr.py b/phaul/criu_cr.py
index 4c0f804..8f25cff 100644
--- a/phaul/criu_cr.py
+++ b/phaul/criu_cr.py
@@ -7,9 +7,9 @@  import pycriu.rpc
 import criu_req
 
 
-def criu_predump(pid, img, criu_connection, fs):
+def criu_predump(htype, pid, img, criu_connection, fs):
 	logging.info("\tIssuing pre-dump command to service")
-	req = criu_req.make_predump_req(pid, img, criu_connection, fs)
+	req = criu_req.make_predump_req(pid, htype, img, criu_connection, fs)
 	resp = criu_connection.send_req(req)
 	if not resp.success:
 		raise Exception("Pre-dump failed")
diff --git a/phaul/criu_req.py b/phaul/criu_req.py
index a5bb01c..36d1e43 100644
--- a/phaul/criu_req.py
+++ b/phaul/criu_req.py
@@ -66,10 +66,10 @@  def _make_common_dump_req(typ, pid, htype, img, connection, fs):
 	return req
 
 
-def make_predump_req(pid, img, connection, fs):
+def make_predump_req(pid, htype, img, connection, fs):
 	"""Prepare pre-dump criu request (source side)"""
 	return _make_common_dump_req(
-		pycriu.rpc.PRE_DUMP, pid, None, img, connection, fs)
+		pycriu.rpc.PRE_DUMP, pid, htype, img, connection, fs)
 
 
 def make_dump_req(pid, htype, img, connection, fs):
diff --git a/phaul/iters.py b/phaul/iters.py
index b8b06cf..d1be567 100644
--- a/phaul/iters.py
+++ b/phaul/iters.py
@@ -193,7 +193,8 @@  class phaul_iter_worker:
 			logging.info("* Iteration %d", iter_index)
 			self.target_host.start_iter(True)
 			self.img.new_image_dir()
-			criu_cr.criu_predump(root_pid, self.img, self.criu_connection, self.fs)
+			criu_cr.criu_predump(self.htype, root_pid, self.img,
+				self.criu_connection, self.fs)
 			self.target_host.end_iter()
 
 			# Handle FS migration iteration
@@ -220,31 +221,35 @@  class phaul_iter_worker:
 			# Handle final FS and images sync on frozen htype
 			logging.info("Final FS and images sync")
 			fsstats = self.fs.stop_migration()
-
 			self.img.sync_imgs_to_target(self.target_host, self.htype,
 				self.connection.mem_sk)
 
 			# Restore htype on target
 			logging.info("Asking target host to restore")
 			self.target_host.restore_from_images()
-			logging.info("Restored on target host")
+
 		except:
 			self.htype.migration_fail(self.fs)
 			raise
 
-		# Ack previous dump request to terminate all frozen tasks
-		resp = self.criu_connection.ack_notify()
-		if not resp.success:
-			logging.warning("Bad notification from target host")
+		# Restored on target, can't fail starting from this point
+		try:
+			# Ack previous dump request to terminate all frozen tasks
+			resp = self.criu_connection.ack_notify()
+			if not resp.success:
+				logging.warning("Bad notification from target host")
 
-		dstats = criu_api.criu_get_dstats(self.img)
-		migration_stats.handle_iteration(dstats, fsstats)
+			dstats = criu_api.criu_get_dstats(self.img)
+			migration_stats.handle_iteration(dstats, fsstats)
 
-		logging.info("Migration succeeded")
-		self.htype.migration_complete(self.fs, self.target_host)
-		migration_stats.handle_stop(self)
-		self.img.close()
-		self.criu_connection.close()
+			logging.info("Migration succeeded")
+			self.htype.migration_complete(self.fs, self.target_host)
+			migration_stats.handle_stop(self)
+			self.img.close()
+			self.criu_connection.close()
+
+		except Exception as e:
+			logging.warning("Exception during final cleanup: %s", e)
 
 	def __start_restart_migration(self):
 		"""
@@ -292,16 +297,19 @@  class phaul_iter_worker:
 			# Start htype on target
 			logging.info("Asking target host to start")
 			self.target_host.start_htype()
-			logging.info("Started on target host")
 
 		except:
 			self.htype.migration_fail(self.fs)
 			self.htype.start()
 			raise
 
-		logging.info("Migration succeeded")
-		self.htype.migration_complete(self.fs, self.target_host)
-		migration_stats.handle_stop()
+		# Started on target, can't fail starting from this point
+		try:
+			logging.info("Migration succeeded")
+			self.htype.migration_complete(self.fs, self.target_host)
+			migration_stats.handle_stop()
+		except Exception as e:
+			logging.warning("Exception during final cleanup: %s", e)
 
 	def __check_live_iter_progress(self, index, dstats, prev_dstats):
 
diff --git a/phaul/p_haul_vz.py b/phaul/p_haul_vz.py
index c65a7f0..3bf907a 100644
--- a/phaul/p_haul_vz.py
+++ b/phaul/p_haul_vz.py
@@ -16,6 +16,8 @@  import pycriu.rpc
 vz_global_conf = "/etc/vz/vz.conf"
 vz_conf_dir = "/etc/vz/conf/"
 vzctl_bin = "vzctl"
+cgget_bin = "cgget"
+cgexec_bin = "cgexec"
 
 
 vz_cgroup_mount_map = {
@@ -123,6 +125,8 @@  class p_haul_type:
 
 	def adjust_criu_req(self, req):
 		"""Add module-specific options to criu request"""
+
+		# Specify dump specific options
 		if req.type == pycriu.rpc.DUMP:
 
 			# Specify root fs
@@ -138,6 +142,11 @@  class p_haul_type:
 			# Increase ghost-limit up to 50Mb
 			req.opts.ghost_limit = 50 << 20
 
+		# Specify freezer cgroup for both predump and dump requests
+		if req.type == pycriu.rpc.PRE_DUMP or req.type == pycriu.rpc.DUMP:
+			req.opts.freeze_cgroup = \
+				"/sys/fs/cgroup/freezer/{0}/".format(self._ctid)
+
 	def root_task_pid(self):
 		path = "/var/run/ve/{0}.init.pid".format(self._ctid)
 		with open(path) as pidfile:
@@ -151,7 +160,56 @@  class p_haul_type:
 		pass
 
 	def final_dump(self, pid, img, ccon, fs):
+		"""Perform Virtuozzo-specific final dump"""
+		self.__pre_final_dump(img)
 		criu_cr.criu_dump(self, pid, img, ccon, fs)
+		self.__post_final_dump(img)
+
+	def __pre_final_dump(self, img):
+		"""Create extra images before final dump"""
+		extra_images = (
+			("vz_clock_bootbased.img", "ve.clock_bootbased"),
+			("vz_clock_monotonic.img", "ve.clock_monotonic"),
+			("vz_iptables_mask.img", "ve.iptables_mask"),
+			("vz_os_release.img", "ve.os_release"),
+			("vz_features.img", "ve.features"),
+			("vz_aio_max_nr.img", "ve.aio_max_nr"))
+		for image_name, var_name in extra_images:
+			self.__create_cgget_extra_image(img, image_name, var_name)
+
+	def __post_final_dump(self, img):
+		"""Create extra images after final dump"""
+		extra_images = (
+			("vz_core_pattern.img", ["cat", "/proc/sys/kernel/core_pattern"]),)
+		for image_name, exec_args in extra_images:
+			self.__create_cgexec_extra_image(img, image_name, exec_args)
+
+	def __create_cgget_extra_image(self, img, image_name, var_name):
+		"""Create extra image using cgget output"""
+		proc = subprocess.Popen(
+			[cgget_bin, "-n", "-v", "-r", var_name, self._ctid],
+			stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+		image_data = proc.communicate()[0]
+		if proc.returncode == 0:
+			self.__create_extra_image(img, image_name, image_data)
+		else:
+			logging.warning("cgget failed to create %s", image_name)
+
+	def __create_cgexec_extra_image(self, img, image_name, exec_args):
+		"""Create extra image using cgexec output"""
+		proc = subprocess.Popen(
+			[cgexec_bin, "-g", "ve:{0}".format(self._ctid)] + exec_args,
+			stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+		image_data = proc.communicate()[0]
+		if proc.returncode == 0:
+			self.__create_extra_image(img, image_name, image_data)
+		else:
+			logging.warning("cgexec failed to create %s", image_name)
+
+	def __create_extra_image(self, img, image_name, image_data):
+		image_path = os.path.join(img.image_dir(), image_name)
+		with open(image_path, "w") as f:
+			f.write(image_data)
 
 	def final_restore(self, img, connection):
 		"""Perform Virtuozzo-specific final restore"""
diff --git a/webgui/p_haul_web_gui.py b/webgui/p_haul_web_gui.py
index e06da34..3d9fd7f 100644
--- a/webgui/p_haul_web_gui.py
+++ b/webgui/p_haul_web_gui.py
@@ -96,6 +96,42 @@  def migrate():
     return flask.jsonify({"succeeded": True})
 
 
+@APP.route('/migrate-lxc-tx')
+def migrate_lxc():
+    """
+    Attempt to migrate an LXC container, where the container name is given
+    in the URL parameter "cname".
+    """
+    cname = flask.request.args.get('cname')
+
+    print "Migrating container " + cname
+
+    dest_host = partner, rpc_port
+    cleanup_command = "rm -rf /tmp/" + cname
+    os.system(cleanup_command)
+    lxc_command = "lxc-checkpoint -n " + cname + " -D /tmp/" + cname + " -vvvv -s"
+    os.system(lxc_command)
+    rsync_command = "rsync -a /var/lib/lxc/" + cname + " " + partner + ":/var/lib/lxc"
+    os.system(rsync_command)
+    rsync_command = "rsync -a /tmp/" + cname + "/ " + partner + ":/tmp/" + cname + "/"
+    os.system(rsync_command)
+    ucarp_command = "killall -USR2 ucarp"
+    os.system(ucarp_command)
+    return flask.jsonify({"succeeded": True})
+
+
+@APP.route('/migrate-lxc-rx')
+def migrate_lxc_rx():
+    """
+    Receive a migrated LXC container.
+    parameter "cname" container name
+    """
+    cname = flask.request.args.get('cname')
+    lxc_command = "lxc-checkpoint -n " + cname + " -D /tmp/" + cname + " -vvvv -r"
+    os.system(lxc_command)
+    return flask.jsonify({"succeeded": True})
+
+
 def start_web_gui(migration_partner, _rpc_port, _debug=False):
     global partner
     global myself
diff --git a/webgui/procs.py b/webgui/procs.py
index 1e0eab8..cb0c9c0 100644
--- a/webgui/procs.py
+++ b/webgui/procs.py
@@ -54,6 +54,10 @@  def procs():
                         name = os.path.basename(p.cmdline[0])
                     except:
                         name = p.name
+                is_lxc = False
+                if 'lxc ' in name:
+                    is_lxc = True
+                    name = name.replace('lxc ', 'container ')
                 proc = {
                     # name and ppid are either functions or variables in
                     # different versions of psutil.
@@ -61,6 +65,7 @@  def procs():
                     "id": p.pid,
                     "parent": p.ppid() if callable(p.ppid) else p.ppid,
                     "children": [],
+                    "is_lxc": is_lxc,
                 }
 
                 if p.pid == 1:
@@ -90,6 +95,8 @@  def procs():
 
         for childProc in flatprocs:
             if "parent" in childProc and childProc["parent"] == proc["id"]:
+                if proc["is_lxc"] == True:
+                    childProc["is_lxc"] = True
                 proc["children"].append(childProc)
             else:
                 remainder.append(childProc)
diff --git a/webgui/static/criugui.css b/webgui/static/criugui.css
index a18b46c..233fa93 100644
--- a/webgui/static/criugui.css
+++ b/webgui/static/criugui.css
@@ -47,10 +47,26 @@  svg {
   stroke-width: 1px;
 }
 
+.node circle.lxc-circle {
+  fill: #009933;
+  stroke: #003300;
+}
+
 .node text.node-label {
   fill: #333;
   font-family: "Liberation Mono", monospace;
-  font-size: 10pt;
+  font-size: 12pt;
+}
+
+.node text.lxc-label {
+  fill: #009933;
+  font-family: "Liberation Mono", monospace;
+  text-shadow:
+    -1px -1px 0 #fff,  
+     1px -1px 0 #fff,
+    -1px 1px 0 #fff,
+     1px 1px 0 #fff;
+  font-weight: bold;
 }
 
 .active-node text.node-label {
diff --git a/webgui/static/migrate.js b/webgui/static/migrate.js
index a360b9d..ba49239 100644
--- a/webgui/static/migrate.js
+++ b/webgui/static/migrate.js
@@ -29,21 +29,78 @@  function migrate(proc, source, target) {
     return;
   }
 
-  /* Add an alert to let the user know that the migration has started. */
-  var alert = insertAlert();
-  alert.classed("alert-info", true);
-
-  var p = alert.append("p");
-  p.append("b").text("Info: ");
-  p.append("span").text("Migrating ");
-  p.append("code").text(stringifyProc(proc));
-  p.append("span");
-  p.append("span").text(" from ");
-  p.append("code").text(source.name);
-  p.append("span").text(" to ");
-  p.append("code").text(target.name);
-
-  _migrate(proc, source, target);
+  if (proc.is_lxc)
+  {
+      /* Add an alert to let the user know that the migration has started. */
+      var cname = proc.name.replace(' ', '').replace('container', '').replace('lxc', '');
+      var alert = insertAlert();
+      alert.classed("alert-info", true);
+      var p = alert.append("p");
+      p.append("b").text("Info: ");
+      p.append("span").text("Migrating container ");
+      p.append("code").text(cname);
+      p.append("span");
+      p.append("span").text(" from ");
+      p.append("code").text(source.name);
+      p.append("span").text(" to ");
+      p.append("code").text(target.name);
+      _migrate_container(cname, source, target);
+  } else {
+      /* Add an alert to let the user know that the migration has started. */
+      var alert = insertAlert();
+      alert.classed("alert-info", true);
+      var p = alert.append("p");
+      p.append("b").text("Info: ");
+      p.append("span").text("Migrating ");
+      p.append("code").text(stringifyProc(proc));
+      p.append("span");
+      p.append("span").text(" from ");
+      p.append("code").text(source.name);
+      p.append("span").text(" to ");
+      p.append("code").text(target.name);
+
+      _migrate(proc, source, target);
+  }
+}
+
+function _migrate_container(container, source, target) {
+  var req = new XMLHttpRequest();
+
+  req.onload = function() {
+    console.log(this.responseText);
+    var resp = JSON.parse(this.responseText);
+
+    /* Add an alert to the page with info on the result of the dump. */
+    var alert = insertAlert();
+    var p = alert.append("p");
+
+    if (!resp.succeeded) {
+      alert.classed("alert-danger", true);
+
+      p.append("b").text("Migration Failed: ");
+      p.append("span").text("There was a problem migrating ");
+      p.append("code").text(container);
+      p.append("span").text(" from " );
+      p.append("code").text(source.name);
+
+      alert.append("br");
+      alert.append("pre").text(resp.why);
+    } else {
+      var req = new XMLHttpRequest();
+      req.open("get", target.address + "/migrate-lxc-rx?cname=" + container, true);
+      req.send();
+      alert.classed("alert-success", true);
+      p.append("b").text("Migration Succeeded! Moved container ");
+      p.append("code").text(container);
+      p.append("span");
+      p.append("span").text(" from ");
+      p.append("code").text(source.name);
+      p.append("span").text(" to ");
+      p.append("code").text(target.name);
+    }
+  };
+  req.open("get", source.address + "/migrate-lxc-tx?cname=" + container, true);
+  req.send();
 }
 
 function _migrate(proc, source, target) {
@@ -68,6 +125,15 @@  function _migrate(proc, source, target) {
 
       alert.append("br");
       alert.append("pre").text(resp.why);
+    } else {
+      alert.classed("alert-success", true);
+      p.append("b").text("Migration Succeeded! Moved ");
+      p.append("code").text(stringifyProc(proc));
+      p.append("span");
+      p.append("span").text(" from ");
+      p.append("code").text(source.name);
+      p.append("span").text(" to ");
+      p.append("code").text(target.name);
     }
   };
 
diff --git a/webgui/static/pstree.js b/webgui/static/pstree.js
index 4e5c63d..c5d9a09 100644
--- a/webgui/static/pstree.js
+++ b/webgui/static/pstree.js
@@ -16,11 +16,11 @@ 
  * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
-var nodeLabelOffset = { x:6, y:3 };
+var nodeLabelOffset = { x:9, y:5 };
 var diagonal = d3.svg.diagonal().projection(function(d) { return [ d.y, d.x ]; });
 var dragging = false;
 var tree = d3.layout.tree()
-    .nodeSize([16, 200])
+    .nodeSize([18, 250])
     .children(function(d) { return d.children; })
     .sort(function(a, b) { return d3.ascending(a.name, b.name); });
 
@@ -182,10 +182,13 @@  PSTree.prototype.redraw = function(e) {
         d3.select(this).select("text.node-label").text(function(d) { return d.name; });
       });
 
-  nodeGroups.append("circle").attr({r: 3.0});
+  nodeGroups.append("circle")
+      .attr("r", 6.0)
+      .classed("lxc-circle", function(d) { return d.is_lxc; });
   nodeGroups.append("text")
       .attr(nodeLabelOffset)
-      .classed("node-label", true);
+      .classed("node-label", true)
+      .classed("lxc-label", function(d) { return d.is_lxc; });
 
   nodes
       .transition()
@@ -207,20 +210,24 @@  PSTree.prototype.redraw = function(e) {
   var links = this.linkGroup.selectAll("path.link").data(linkData, function(d) { return d.target.id; });
 
   /* Links are drawn as SVG paths using d3's svg.diagonal helper. */
-  links.enter()
-      .append("path")
-      .attr("class", "link")
-      .style("opacity", 0);
-
-  links
-      .transition()
-      .duration(200)
-      .attr("d", diagonal)
-      .style("opacity", 1);
-
-  links.exit()
-      .transition()
-      .duration(200)
-      .style("opacity", 0)
-      .remove();
+  /* Update the links between the nodes with the latest data. */
+    var linkData = tree.links(nodeData);
+      var links = this.linkGroup.selectAll("path.link").data(linkData, function(d) { return d.target.id; });
+
+        /* Links are drawn as SVG paths using d3's svg.diagonal helper. */
+        links.enter()
+             .append("path")
+             .attr("class", "link")
+             .style("opacity", 0);
+
+        links.transition()
+             .duration(200)
+             .attr("d", diagonal)
+             .style("opacity", 1);
+
+        links.exit()
+             .transition()
+             .duration(200)
+             .style("opacity", 0)
+             .remove();
 };

Comments

Cyrill Gorcunov June 30, 2016, 1:44 p.m.
On Wed, Jun 29, 2016 at 02:20:19PM -0400, Arthur Lockman wrote:
> From: Arthur Lockman <hello@rthr.me>
> 
> Adds support for migrating LXC containers with the p.haul WebUI. It uses the
> checkpoint functionality in LXC and rsync to migrate and restart the
> containers on different hosts.
> 
> Signed-off-by: Arthur Lockman <alockman@redhat.com>

Wow! That's cool stuff!!!

Adrian Reber June 30, 2016, 6:10 p.m.
On Wed, Jun 29, 2016 at 02:20:19PM -0400, Arthur Lockman wrote:
> From: Arthur Lockman <hello@rthr.me>
> 
> Adds support for migrating LXC containers with the p.haul WebUI. It uses the checkpoint functionality in LXC and rsync to migrate and restart the containers on different hosts.
> 
> Signed-off-by: Arthur Lockman <alockman@redhat.com>
> ---
>  README.md                 |  7 +++-
>  phaul/criu_cr.py          |  4 +-
>  phaul/criu_req.py         |  4 +-
>  phaul/iters.py            | 44 +++++++++++++---------
>  phaul/p_haul_vz.py        | 58 ++++++++++++++++++++++++++++
>  webgui/p_haul_web_gui.py  | 36 ++++++++++++++++++
>  webgui/procs.py           |  7 ++++
>  webgui/static/criugui.css | 18 ++++++++-
>  webgui/static/migrate.js  | 96 +++++++++++++++++++++++++++++++++++++++--------
>  webgui/static/pstree.js   | 47 +++++++++++++----------
>  10 files changed, 261 insertions(+), 60 deletions(-)

Thanks for working on this Arthur. The patch, however, needs some work.

Please try to create separate patches for the different code parts you
are changing, with detailed commit messages explaining why you are
changing them.

It is, for example, not clear why you are changing the VZ files while no
changes are made to the LXC files.

With different parts I would expect something like, first the LXC
support in p.haul, then the necessary changes in the webgui and as the
last step the CSS and JS changes (or maybe switch the last steps,
whatever makes more sense).

Thanks for bringing the README up to date, but that could also be a
separate patch.

Also, the webgui currently calls the lxc commands and rsync directly.
The whole migration should be done using p.haul. p.haul, theoretically,
already knows how to migrate LXC containers and the necessary steps to
migrate the container's file system. The webgui should only use p.haul
for the necessary steps.

So splitting this patch into smaller patches with more detailed
description should make reviewing much simpler.

		Adrian
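
A related concern with the direct command calls discussed above: the posted handler concatenates the `cname` URL parameter into strings passed to `os.system`, so the name is interpreted by a shell. The following is only a minimal sketch of how the same sender-side commands (taken from the patch) could be built as argument lists after validating the name. The helper names `build_lxc_tx_commands` and `run_commands`, and the allow-list pattern for container names, are assumptions made for illustration and are not part of p.haul. The sketch does not address the larger point that the migration should go through p.haul itself.

```python
import re
import subprocess

# Hypothetical allow-list for container names (an assumption, not an LXC rule):
# start with a letter or digit, then letters, digits, '_', '.' or '-'.
_CNAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]*")


def build_lxc_tx_commands(cname, partner):
    """Build the sender-side commands from the patch as argument lists.

    Nothing is executed here; the caller decides how to run the commands.
    `partner` is assumed to come from trusted server configuration.
    """
    if not _CNAME_RE.fullmatch(cname):
        raise ValueError("invalid container name: %r" % (cname,))

    image_dir = "/tmp/" + cname
    return [
        # Remove stale checkpoint images from an earlier attempt.
        ["rm", "-rf", image_dir],
        # Checkpoint the container and stop it (-s), as in the patch.
        ["lxc-checkpoint", "-n", cname, "-D", image_dir, "-vvvv", "-s"],
        # Copy the container directory and the checkpoint images.
        ["rsync", "-a", "/var/lib/lxc/" + cname, partner + ":/var/lib/lxc"],
        ["rsync", "-a", image_dir + "/", partner + ":" + image_dir + "/"],
        # Signal ucarp, as in the patch.
        ["killall", "-USR2", "ucarp"],
    ]


def run_commands(commands, runner=subprocess.check_call):
    """Run each command without a shell; stop at the first failure."""
    for argv in commands:
        runner(argv)
```

Because each command is a list, no shell parses the container name, and `subprocess.check_call` raises `CalledProcessError` on a non-zero exit status, so the handler could report a failure instead of always returning `{"succeeded": True}`.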