We know something is causing lvs/vgs commands to block while deleting a
volume and this is causing Tempest to timeout while waiting for the
volume to be deleted. What we don't have right now is very good
(specific) logging when this happens, unless we get messages in syslog
for lvm tasks taking more than 120 seconds, but that doesn't always
happen when we see the volume delete timeout in Tempest.
This patch adds a check for when getting logical volumes and volume
groups takes longer than 60 seconds and logs a warning if that happens.
This is helpful in production also because the default interval for
periodic tasks is 60 seconds so having these take longer than that time
could cause periodic tasks to block up on each other and you'll get
warnings from the FixedIntervalLoopingCall in oslo which is controlling
the task runs.
Related-Bug: #
1373513
Change-Id: I7ac5ddbbd9bda7744db33f0bb8d56879301f5538
import itertools
import math
import re
+import time
from cinder.brick import exception
from cinder.brick import executor
if vg_name is not None:
cmd.append(vg_name)
+ lvs_start = time.time()
(out, err) = putils.execute(*cmd,
root_helper=root_helper,
run_as_root=True)
+ total_time = time.time() - lvs_start
+ if total_time > 60:
+ LOG.warning(_('Took %s seconds to get logical volumes.'),
+ total_time)
lv_list = []
if out is not None:
if vg_name is not None:
cmd.append(vg_name)
+ start_vgs = time.time()
(out, err) = putils.execute(*cmd,
root_helper=root_helper,
run_as_root=True)
+ total_time = time.time() - start_vgs
+ if total_time > 60:
+ LOG.warning(_('Took %s seconds to get volume groups.'), total_time)
vg_list = []
if out is not None: