John Griffith [Tue, 20 Jan 2015 23:31:57 +0000 (16:31 -0700)]
Enable use of an /etc/cinder/lvm.conf file
During tempest and Rally runs we've noticed occasional
LVM command hangs (lvs, vgs and pvs). We've also gathered
enough data to show that these commands respond very slowly
almost all of the time.
It turns out that this seems to be an issue with us scanning
all devices during LVM operations, including devices that may
be Cinder Volumes that are attaching and detaching from the system.
Inspecting a run instrumented with strace shows a number of LVM
commands timing out because the device being scanned is removed
mid-scan, leaving the LVM command waiting until the in-progress
scan times out.
This patch just adds the ability to set up an lvm.conf file in
/etc/cinder. The Cinder LVM code will now specifically set
the LVM_SYSTEM_DIR environment variable to that directory for
each of the LVM scan commands in brick/local_dev/lvm.py.
If the system doesn't have the file, we use the empty string,
which tells LVM to use its defaults. This only affects LVM
commands in Cinder. The idea is to ensure we don't impact any
other LVM operations on the node outside of Cinder, and that we
behave as we always have when no lvm.conf file has been
set up in /etc/cinder. The presence of the file is auto-detected
on brick/local_dev/lvm init.
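A minimal sketch of the detection described above, assuming the prefix is
built as an "env" command list. The names CINDER_LVM_DIR, lvm_env_prefix and
build_cmd are hypothetical and are not taken from the patch:

```python
import os

# Hypothetical constant for this sketch; the commit only names /etc/cinder.
CINDER_LVM_DIR = "/etc/cinder"


def lvm_env_prefix(conf_dir=CINDER_LVM_DIR):
    """Return an env prefix pointing LVM at conf_dir if lvm.conf exists there."""
    if os.path.isfile(os.path.join(conf_dir, "lvm.conf")):
        return ["env", "LVM_SYSTEM_DIR=" + conf_dir]
    # An empty LVM_SYSTEM_DIR tells LVM to fall back to its defaults.
    return ["env", "LVM_SYSTEM_DIR="]


def build_cmd(args, conf_dir=CINDER_LVM_DIR):
    """Build an LVM command as a list: prefix first, then the command itself."""
    cmd = list(lvm_env_prefix(conf_dir))
    cmd.extend(args)
    return cmd
```

Keeping the command as a list means the prefix can be prepended without any
string splitting or quoting.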
We'll update the OpenStack Devstack deployment scripts to put this
together and fix things up there first. Until that's done and until
we auto-generate the conf (or document it well), this will be a
*partial* bugfix.
I considered adding a default lvm.conf file to cinder/etc/<sample>
that would be copied in on install, but decided against this to
avoid any possible compatibility issues between
platforms or versions.
To use, just copy the /etc/lvm/lvm.conf file to /etc/cinder and
modify the filter as appropriate, for example:
To use loopback device only:
filter = [ "a/loop/", "r/.*/" ]
If you have a physical drive like /dev/sdb1:
filter = [ "a/dev/sdb1/", "r/.*/" ]
Finally, this patch also goes through and cleans up our cmd
variables in brick/local_dev/lvm. We had a mix of cmd arrays
and strings, which causes inconsistencies and makes
it difficult to extend or modify commands. Switch everything to
using an array and use extend to provide the correct prefix.
Need to update docs to include a recommendation to create an
/etc/cinder/lvm.conf file and set device filters appropriately.
Doc-Impact
Partial-Bug: #1373513
Jay Wang [Tue, 27 Jan 2015 20:02:59 +0000 (12:02 -0800)]
Fixes attribute content checking
Use the proper way to check the volume attribute contents in display_name
and display_description. If they are not empty, translate them to
the volume backend description. Also modify other places where returning an
empty string makes more sense than None, along with the return value checking.
Rich Hagarty [Wed, 14 Jan 2015 17:11:27 +0000 (09:11 -0800)]
HP3Par: Set snapCPG when managing existing volumes
This fixes a corner case where the user did not assign a "Copy CPG"
value to the volume when it was originally created on the back-end, and
then does not assign a "Volume Type" to the volume
when it is "managed" into OpenStack.
This results in an undefined "snapCPG" value which means no
snapshots can be created for the volume.
The fix is to set the "snapCPG" value to match the "User CPG"
value associated with the volume when the volume is "managed".
TaoBai [Tue, 20 Jan 2015 12:18:39 +0000 (04:18 -0800)]
Failed to discovery when iscsi multipath and CHAP both enabled
The storage server may be configured to protect the target discovery phase with
CHAP authentication. In this case the existing discovery command fails in Nova
when iSCSI multipath is enabled. Nova needs the following discovery auth properties:
"discovery.sendtargets.auth.authmethod",
"discovery.sendtargets.auth.username",
"discovery.sendtargets.auth.password"
The Cinder storage driver needs to send the discovery auth properties to Nova in
this case. The properties are:
iscsi_properties['discovery_auth_method']
iscsi_properties['discovery_auth_username']
iscsi_properties['discovery_auth_password']
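A small sketch of how a driver might fill in these keys, and how they line up
with the iscsiadm settings listed earlier. The helper names and the pairing of
the two lists by position are assumptions made for illustration, not code from
either change:

```python
# Assumed pairing of the Cinder keys with the iscsiadm discovery settings,
# taken from the order in which the two lists appear in this message.
DISCOVERY_KEY_MAP = {
    'discovery_auth_method': 'discovery.sendtargets.auth.authmethod',
    'discovery_auth_username': 'discovery.sendtargets.auth.username',
    'discovery_auth_password': 'discovery.sendtargets.auth.password',
}


def add_discovery_auth(iscsi_properties, method, username, password):
    """Return a copy of iscsi_properties with the discovery CHAP fields set."""
    props = dict(iscsi_properties)
    props['discovery_auth_method'] = method
    props['discovery_auth_username'] = username
    props['discovery_auth_password'] = password
    return props


def to_iscsiadm_settings(iscsi_properties):
    """Translate any discovery auth keys present into iscsiadm setting names."""
    return {DISCOVERY_KEY_MAP[key]: value
            for key, value in iscsi_properties.items()
            if key in DISCOVERY_KEY_MAP}
```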
This issue affects not just IBM Storwize, but also any other storage driver that
needs CHAP authentication for iSCSI discovery.
The corresponding Nova change: https://review.openstack.org/#/c/148516/
John Griffith [Wed, 28 Jan 2015 04:59:28 +0000 (22:59 -0600)]
Add completion logging for snapshots and volumes
While trying to debug insufficient resource issues occasionally
seen in the gate, it was noted that there's no logging for
resource deletion of volumes and snapshots.
This patch adds some simple logging in cinder.volume.api
to note when and why a delete call might fail, as well as an info
message indicating when the request is successfully issued
to the scheduler/rpc layer.
In addition, this patch also adds the same sort of logging
to the LVM backend driver.
This additional logging will help troubleshoot the current
resource issues/races we're encountering in the gate.
Rich Hagarty [Sat, 24 Jan 2015 01:44:53 +0000 (17:44 -0800)]
HP 3PAR modules have bad log messages
Some log messages are incorrectly formatted, specifically those
that take multiple variables. In this case, each variable
needs to be assigned a unique placeholder.
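For illustration, a log call with one named placeholder per variable might look
like the sketch below. It uses the standard logging module, and the logger name,
function and message text are made up for the example:

```python
import logging

LOG = logging.getLogger("hp3par_sketch")  # hypothetical logger name


def log_volume_move(volume_name, host):
    # Each variable has its own named placeholder. The dict is passed after
    # the comma, so interpolation is deferred until the record is emitted.
    LOG.debug("Moving volume %(vol)s to host %(host)s",
              {'vol': volume_name, 'host': host})
```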
This defect was recently introduced when all log messages were
updated to use "," instead of "%" (see
Change-Id: I23d53d66fda47981ed8f20b618b2ced8ef6e0682),
which allowed translations to be ignored if not required by current
log levels.
John Griffith [Mon, 26 Jan 2015 22:21:30 +0000 (15:21 -0700)]
Create SolidFire Template account on init
The standard flow in the SolidFire volume is to create
accounts dynamically as needed based on project id.
Unfortunately the clone_image method "forgot" about this
detail and calls the private clone_volume method without
a valid account resulting in an AccountNotFound
Exception.
The clone_image method is enabled using a specific account
provided via the configuration file. Rather than worry
about dynamic account creation on clone_image call, this
patch just makes it part of the init phase. We check to
see if template caching is enabled and if it is, just
go ahead and check/create the account on startup.
wuyuting [Mon, 26 Jan 2015 18:50:28 +0000 (02:50 +0800)]
Fetch_to_volume_format calls copy_volume using wrong parameter
When creating a volume from an image, if qemu-img is not installed,
fetch_to_volume_format calls volume_utils.copy_volume to copy the
image to the volume. copy_volume needs the image size in megabytes,
but fetch_to_volume_format calls it with the size in bytes.
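The unit conversion involved can be sketched as follows. This assumes a
megabyte of 1024 * 1024 bytes and rounding up to a whole unit; both are
assumptions of this example rather than details confirmed by the patch, and
bytes_to_mib is a hypothetical helper:

```python
import math

MIB = 1024 * 1024  # assumed definition of "megabyte" for this sketch


def bytes_to_mib(size_bytes):
    """Round a byte count up to whole mebibytes."""
    return int(math.ceil(size_bytes / float(MIB)))
```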
Zhiteng Huang [Mon, 12 Jan 2015 05:27:15 +0000 (13:27 +0800)]
Raise correct exception when validate_connector failed
Cinder volume manager uses validate_connector() method to verify if required
information is in connector when handling initialize_connection() request.
validate_connector() is actually a pure input validation method, basically
checking if 'initiator' or 'wwpns' is in connector if storage protocol is
iSCSI or FC. However, when required information is missing, drivers currently
raise either VolumeBackendAPIException or VolumeDriverException, which would
then bubble up to API and then to user (Nova) as InternalServerError.
This change adds a new exception - InvalidConnectorException, that drivers
should raise when connector is found not valid. With that, Cinder API would
raise BadRequest instead to user, suggesting things are missing in request.
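A rough sketch of the validation described above. The exception class here is
a local stand-in defined only for this example, and the function body is an
assumption about how a driver might perform the check:

```python
class InvalidConnectorException(Exception):
    """Local stand-in for the new exception, defined only for this sketch."""

    def __init__(self, missing):
        super().__init__("Connector is missing required field: %s" % missing)
        self.missing = missing


def validate_connector(connector, protocol):
    """Raise InvalidConnectorException if the protocol's required key is absent."""
    required = {'iSCSI': 'initiator', 'FC': 'wwpns'}.get(protocol)
    if required is not None and required not in connector:
        raise InvalidConnectorException(required)
```

Because the failure is a distinct exception type, the API layer can map it to
a client error rather than an internal server error.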
Xing Yang [Fri, 16 Jan 2015 17:45:53 +0000 (12:45 -0500)]
Add provisioned_capacity
This change is needed by the over subscription patch:
https://review.openstack.org/#/c/142171/
This patch makes the following change:
Add 'provisioned_capacity' to Cinder base driver, Cinder reference
driver (LVM), and scheduler host_manager.
provisioned_capacity is the apparent allocated space indicating how
much capacity has been provisioned.
Example: User A created 2x10G volumes in Cinder from backend A, and
user B created 3x10G volumes from backend A directly, without using
Cinder. Assume those are all the volumes provisioned on backend A.
The total provisioned_capacity will be 50G and that is what the driver
should be reporting.
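The arithmetic in the example above, written out as a short sketch. The
function name is hypothetical and sizes are in GB:

```python
def provisioned_capacity_gb(volume_sizes_gb):
    """Sum the apparent sizes of every volume on the backend."""
    return sum(volume_sizes_gb)


cinder_volumes = [10, 10]       # user A, created through Cinder
direct_volumes = [10, 10, 10]   # user B, created directly on backend A
total = provisioned_capacity_gb(cinder_volumes + direct_volumes)
```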
Yusuke Hayashi [Fri, 26 Dec 2014 12:21:04 +0000 (21:21 +0900)]
Move 3 Fujitsu ETERNUS DX related file
Since there are three volume driver files relating to
Fujitsu ETERNUS DX in cinder.volume.drivers,
I created a 'fujitsu' directory under cinder.volume.drivers
and moved these files into it.
John Griffith [Thu, 22 Jan 2015 18:22:25 +0000 (11:22 -0700)]
Add retry to lvm snapshot create
We have some occasional issues with snapshot-create
failing for what looks to be conflicts with udev. It
looks like this problem is a status conflict between LVM
cache and udev, and in most cases the best way to get
around this is to retry the command a few times until
the cache and udev are back in sync.
This patch uses the newly added retry decorator and
for now we're just adding it to the snapshot create
call. We're using the default values for interval and
retry count but we can certainly adjust this as needed.
John Griffith [Thu, 22 Jan 2015 17:35:31 +0000 (10:35 -0700)]
Add a generic retry decorator to cinder/utils
Retries are something that we use in a number of
places, and could probably use in a few more. Rather
than continue writing custom retry code or silly loops
let's add a generic retry decorator class that can be
initialized and used as needed.
This patch leverages the retrying library that's already
in OpenStack Requirements and used in other places. We
just add a wrapper around it for some logging and add
a bit of error handling in case somebody sets retries to 0.
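The general shape of such a decorator can be sketched with the standard
library alone. This is a stand-in written for illustration: it does not use
the retrying library that the patch wraps, and the parameter names and
defaults are assumptions:

```python
import functools
import time


def retry(exceptions, interval=1, retries=3):
    """Retry the wrapped call on the given exceptions, up to `retries` attempts."""
    if retries < 1:
        # Guard against a retry count that would never call the function.
        raise ValueError("retries must be at least 1 (got %s)" % retries)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == retries:
                        raise  # out of attempts, surface the last error
                    time.sleep(interval)
        return wrapper
    return decorator
```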
Zhiteng Huang [Wed, 14 Jan 2015 03:54:40 +0000 (11:54 +0800)]
Cleanup unused DB APIs, part I
There are unused DB APIs. Some of them are leftovers from the removal of
other features, and some were added without ever being used. This change
removes these unused DB APIs.
Note that there are some DB APIs only used by unit tests, in other words
they are used in unit tests where they are not the unit of code being
tested. Those will be cleaned up in follow-up patch(es).
rajinir [Tue, 20 Jan 2015 19:52:32 +0000 (13:52 -0600)]
Fix the eqlx driver to retry on ssh timeout
When the ssh session is timing out, the driver
should make attempts to retry based on the value
in eqlx_cli_max_retries. Instead it was raising
the exception and bailing out on a single attempt.
Fixed the driver to raise a different exception
so the ssh sessions can be retried.
Added unit tests to ensure the max retries
happen.
Also fixed the number of attempts reported in the message.
John Griffith [Fri, 23 Jan 2015 00:21:27 +0000 (17:21 -0700)]
Add retrying lib from global requirements
A retry decorator was submitted, and it was pointed out that
the retrying library was already in global-requirements.
It would probably be good to leverage this rather
than roll our own, so let's add it to our version of requirements.
Then we can look at whether using a decorator or just
leveraging the lib is the right way to go.
Joshua Harlow [Thu, 22 Jan 2015 19:22:21 +0000 (11:22 -0800)]
Remove usage of taskflow 'utils.misc' module
The failure type at its old location in an internal
utils directory of taskflow is deprecated (in general
usage of taskflow utils code should be restricted/not
done) and it has been moved to a public location of
taskflow.types (which is ok to use and is not
deprecated).
This change updates to use the better and more supported
module/code location instead.
Jay S. Bryant [Wed, 21 Jan 2015 07:43:00 +0000 (01:43 -0600)]
Move oslo.serialization to oslo_serialization namespace
This is the fifth in a series of changes to move to using
the new oslo_<library> namespace that is being used for
oslo libraries.
There is currently a shim in place that is allowing the old
oslo.<library> imports to work, but we need to be prepared for
when the shims go away. Thus, we need patches like this one to
move to the new namespace.
This patch also updates our hacking check to ensure that no instances
of oslo.utils sneak back in.
Sean McGinnis [Tue, 20 Jan 2015 22:50:19 +0000 (16:50 -0600)]
Improve debug logging of Dell Storage Center driver
Adding debug messages and expanding some existing log statements
to make debugging and auditing easier. Some interesting calls
were not being logged while others had logging that was missing
some relevant arguments.
Ilya Tyaptin [Thu, 6 Nov 2014 12:32:33 +0000 (16:32 +0400)]
Fix _usage_from_snapshot in volume.utils
Currently this function tries to read snapshot_ref.volume to
collect 'availability_zone'. That is invalid because snapshot_ref
in this function is a __dict__. This patchset fixes
it.
Creating vCenter inventory folder for grouping volumes will
fail with AttributeError if the vCenter's datacenter doesn't
have any child folder under vmFolder (inventory folder for
grouping virtual machines). This patch fixes it.
Joshua Harlow [Thu, 22 Jan 2015 00:32:28 +0000 (16:32 -0800)]
Shrink down customized logging listener
Most of the code for this has been moved into taskflow
upstream; so now we should be able to override a single
*cinder-specific* method and retain the same functionality
that previously existed but in a more shareable manner (for
example glance will be using similar code).
Mike Perez [Thu, 22 Jan 2015 00:06:49 +0000 (16:06 -0800)]
Prevent deleting volumes in a consistency group
Currently you have to destroy the consistency group if you want to
delete these volumes. In the Kilo release, we'll have the ability to
remove a volume from a consistency group, after which it can be deleted.
wuyuting [Sun, 18 Jan 2015 22:32:34 +0000 (06:32 +0800)]
Fix bug in rbd driver: the cloned volume size is wrong
The cloned volume size is wrong when the size is different
from source volume. This is because rbd driver doesn't
resize the volume when clone has completed.
The error message shown when the parser hits a parse error
says 'file not found', which confuses the user when
they need to debug the real cause of the problem. This patch fixes
this by first testing whether the file exists and then raising a proper error
message.
Also corrects an error when appending the config_file_name to the message in
the NFS driver and fixes LOG messages according to oslo.i18n guidelines.
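A generic sketch of keeping the two failure modes apart. The exception
classes, the load_config helper and the assumption that the parser signals
bad input with ValueError are all hypothetical and not taken from the driver:

```python
import os


class ConfigFileNotFound(Exception):
    """Hypothetical error: the configuration file is missing."""


class ConfigFileInvalid(Exception):
    """Hypothetical error: the file exists but cannot be parsed."""


def load_config(path, parse):
    """Read path and pass its text to parse, reporting each failure distinctly."""
    # Check for the file first so a parse failure is never reported as missing.
    if not os.path.exists(path):
        raise ConfigFileNotFound("Configuration file %s not found" % path)
    with open(path) as handle:
        text = handle.read()
    try:
        return parse(text)
    except ValueError as exc:  # assumed: the parser raises ValueError on bad input
        raise ConfigFileInvalid("Could not parse %s: %s" % (path, exc))
```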
Jay S. Bryant [Tue, 13 Jan 2015 04:55:45 +0000 (22:55 -0600)]
Ensure lazy translation is disabled properly
Commit 894f20d9cf57b36ccf9a675c6b2b070d56c9b297 changed the way that
enable_lazy() is being configured in cinder's test cases. The changes
were required to remove the use of _lazy from oslo's i18n library.
_lazy was removed from i18n and the code that was accessing the internal
variable broke. The commit referenced above made changes to remove
the use of _lazy.
The commit, however, changed the behavior of the test cases to only
enable_lazy without disabling it, which diverges from the original
behavior.
This commit uses the new oslo.i18n ToggleLazy fixture (added in 1.3.0).
Jay S. Bryant [Fri, 16 Jan 2015 22:54:27 +0000 (16:54 -0600)]
Move oslo.utils to oslo_utils namespace
This is the fourth in a series of changes to move to using
the new oslo_<library> namespace that is being used for
oslo libraries.
There is currently a shim in place that is allowing the old
oslo.<library> imports to work, but we need to be prepared for
when the shims go away. Thus, we need patches like this one to
move to the new namespace.
This patch also updates our hacking check to ensure that no instances
of oslo.utils sneak back in.
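A hacking-style check of this kind could be sketched as below. The function
name, the regular expression and the "NXXX" code are placeholders invented for
this example, not the check added by the patch:

```python
import re

# Matches the old namespace forms: "from oslo.utils ...", "import oslo.utils"
# and "from oslo import utils".
_OLD_OSLO_UTILS = re.compile(
    r"^\s*(?:from\s+oslo\.utils\b"
    r"|import\s+oslo\.utils\b"
    r"|from\s+oslo\s+import\s+utils\b)")


def check_oslo_utils_namespace(logical_line):
    """Yield (offset, message) for imports that still use the old namespace."""
    if _OLD_OSLO_UTILS.match(logical_line):
        yield (0, "NXXX: use oslo_utils instead of oslo.utils")
```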
Eric Harney [Mon, 19 Jan 2015 19:01:45 +0000 (14:01 -0500)]
Make test_create_delete_snapshot more robust
This patch does two things to improve this test, and
help debug issues with it failing.
- Add additional checks for contents of the first two
expected notifications.
- Assert the length of the notifications list after checking
its contents so that if it contains unexpected items, we can
see what they are. This should help with the current gate
failures.
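The ordering of the checks can be sketched like this. The helper name and the
'event_type' key are assumptions about the notification structure, made only
for this example:

```python
def check_notifications(notifications, expected_event_types):
    """Check contents first, then length, so extra items show up in the error."""
    for index, expected in enumerate(expected_event_types):
        assert index < len(notifications), (
            "missing notification %d (%s)" % (index, expected))
        actual = notifications[index]['event_type']
        assert actual == expected, (
            "notification %d: expected %s, got %s" % (index, expected, actual))
    # Length is asserted last; on failure the message lists the unexpected items.
    assert len(notifications) == len(expected_event_types), (
        "unexpected extra notifications: %r"
        % (notifications[len(expected_event_types):],))
```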
Eric Harney [Mon, 19 Jan 2015 18:10:00 +0000 (13:10 -0500)]
Add policy_dirs conf fixture
Unit tests that call into the policy enforcer from openstack
common will try to load files from the default directory of
'policy.d'.
Set the policy_dirs option to [] for unit tests since tests
are not trying to use policy from a directory like this, and
this will prevent test failures.