Sean McGinnis [Thu, 12 Feb 2015 20:52:56 +0000 (14:52 -0600)]
Dell Storage Center: Add retries to API calls
In heavily loaded networks we have seen some cases
of temporary ConnectionErrors when making REST API
calls. There are usually successful calls just prior
and immediately after these failures, so it appears
to be a transient condition.
This patch utilizes the recently merged retry decorator
to add some retry handling to the REST API calls when
this condition is encountered.
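A minimal sketch of the idea, not the driver's actual code: a generic retry decorator wrapped around a call that fails once with a transient error. The decorator signature, the FlakyClient class, and the use of Python's built-in ConnectionError are all assumptions made for illustration.

```python
import functools
import time


def retry(exceptions, retries=3, interval=0.0):
    """Retry the wrapped call when one of `exceptions` is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Give up and re-raise once the last attempt has failed.
                    if attempt == retries:
                        raise
                    time.sleep(interval)
        return wrapper
    return decorator


class FlakyClient:
    """Hypothetical REST client whose first call hits a transient failure."""

    def __init__(self):
        self.calls = 0

    @retry(ConnectionError, retries=3)
    def get(self, url):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transient failure")
        return "ok"
```

Only the named exception type is retried, so unrelated errors still surface immediately.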
With a test script that repeatedly ran CI against this
first patch, I was able to validate that it addresses
the issue. Out of 20 runs, a full 2/3 of the tests
passed, showing that the retry was used. These test
runs would have failed without the retry.
The output from these test runs can be viewed here:
http://oslogs.compellent.com/?C=N;O=D
Pertinent results are dell-sc-iscsi-1554792015-12*
Once merged we should be able to enable full third
party CI testing with some expectation of reliable
results.
John Griffith [Mon, 9 Feb 2015 21:45:40 +0000 (14:45 -0700)]
Don't fail target_delete if target doesn't exist
There are cases seen in the Gate where a target delete is
called and an exception is raised because the target does
not exist. In the cinder target driver code we raise this
as an ISCSITargetRemoveFailed exception, but if we're asking
to delete the target and the target doesn't exist we can
probably safely move along.
This patch adds a check for this specific case and logs a warning
and continues rather than failing. We also add a unit test to
check this case.
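The behaviour described above can be sketched roughly as follows. The function, the exception class, and the backend_delete callable are hypothetical stand-ins, not the Cinder target driver API.

```python
import logging

LOG = logging.getLogger(__name__)


class TargetRemoveFailed(Exception):
    """Hypothetical stand-in for ISCSITargetRemoveFailed."""


def remove_target(targets, name, backend_delete):
    """Delete target `name`; a target that is already gone is not an error."""
    if name not in targets:
        # Nothing to delete: warn and move along instead of failing.
        LOG.warning("Target %s does not exist, skipping removal.", name)
        return False
    try:
        backend_delete(name)
    except OSError as exc:
        raise TargetRemoveFailed(name) from exc
    targets.discard(name)
    return True
```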
Alex Meade [Mon, 2 Feb 2015 21:18:15 +0000 (16:18 -0500)]
Limit ram and disk used by ceph backup tests
The test_discard_bytes test in
cinder/tests/test_backup_ceph.py uses a lot of
ram and disk space since it tests the ceph driver with
the default chunk size of 128Mi. This patch lowers it
to use 1024 bytes instead.
Xing Yang [Thu, 5 Feb 2015 17:09:34 +0000 (12:09 -0500)]
Fix detach volume from host problem in VMAX driver
The VMAX driver unmaps a volume from a host without checking
the host info in the connector, resulting in the wrong host being
detached. This patch looks up the host info before detach and
fixes the problem.
VMware: Delay string interpolation in log messages
As per OpenStack developer guidelines, string interpolation
should be delayed to be handled by the logging code. This
patch fixes violations of this guideline in the VMDK driver.
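As an illustration of the guideline (the logger name and the Expensive class are made up for this sketch), passing arguments to the logging call means they are only formatted when the record is actually emitted:

```python
import logging

LOG = logging.getLogger("vmdk_example")
LOG.setLevel(logging.WARNING)  # DEBUG records are dropped


class Expensive:
    """Counts how many times it is rendered to a string."""
    renders = 0

    def __str__(self):
        Expensive.renders += 1
        return "volume-1"


vol = Expensive()

# Eager: the string is built even though DEBUG is disabled.
LOG.debug("Creating backing for %s" % vol)
eager_renders = Expensive.renders

# Delayed: the logging code skips formatting for a dropped record.
LOG.debug("Creating backing for %s", vol)
delayed_renders = Expensive.renders - eager_renders
```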
The common code between various VMware drivers was moved to
oslo.vmware library during Icehouse release. The VMDK driver
should be updated to use this library. Changes are mostly
replacing import statements and removing redundant modules
and corresponding test cases.
Summary of changes:
* Replace api with oslo.vmware.api
* Replace vim with oslo.vmware.vim
* Replace pbm with oslo.vmware.pbm
* Replace io_util with oslo.vmware.image_transfer
* Replace vmware_images with oslo.vmware.image_transfer
* Replace read_write_util with oslo.vmware.rw_handles
* Remove error_util and use oslo.vmware.exceptions
* Move VMDK driver specific exceptions to a new module
'exceptions'
* Replace PBM WSDL files with the corresponding files in
oslo.vmware
* Replace PBM related methods in volumeops with the
corresponding ones in oslo.vmware.pbm
Tomoki Sekiyama [Thu, 11 Dec 2014 23:23:18 +0000 (18:23 -0500)]
Enhance iSCSI multipath support
Currently, nova-compute and brick support multipath for iSCSI volume
data path. It depends on response to targets discovery of the main
iSCSI portal, expecting multiple portal addresses to be contained.
However, some arrays only respond to discovery with a single portal
address, even if secondary portals are available. In this case,
the connector cannot learn the secondary portals and corresponding
iSCSI target IQNs, so it cannot establish multiple sessions for
the target(s). To enable the connector to log in to secondary portals,
cinder should report all the portal addresses and corresponding
target IQNs/LUNs.
With this patch initialize_connection API will return connection_info
with multiple portal addresses/iqns/luns when multipath=True is
specified in the connector info. For example:
Tom Swanson [Thu, 5 Feb 2015 19:33:03 +0000 (13:33 -0600)]
Dell Storage Center Unit Test Updates for Kilo
Expanded our unit test coverage: test_dellscapi.py.
This tests our api module. We've also added some
minor driver fixes found by unit and other testing.
Some logging statements have been updated.
All changes are in the dell storage center driver
and unit test files.
ChangBo Guo(gcb) [Wed, 26 Nov 2014 03:30:40 +0000 (11:30 +0800)]
Add extra library oslo.concurrency to oslo.config.generator.rc
Cinder uses the extra library oslo.concurrency, which provides its own
configuration options. We need to include these options in the sample
config file. This commit handles that.
Change-Id: I534539b7e87a3f5dc36722395cbe241a10b2d75e
Xing Yang [Fri, 16 Jan 2015 21:27:23 +0000 (16:27 -0500)]
Support over subscription in thin provisioning
This patch adds support for over subscription in thin provisioning.
The following changes are proposed:
* A configuration option "max_over_subscription_ratio" will be
introduced.
* Driver reports the following capacities and ratios:
* provisioned_capacity
* max_over_subscription_ratio
* Driver can use the newly added configuration option to report
this ratio or it can decide what ratio to report itself.
The value of this ratio depends on the driver implementation
and will be reported together with other capabilities and
capacities by the driver.
* reserved_percentage
* Note: This is an existing parameter reported by the driver.
* Currently it is measured against the free capacity. In this
patch, it will be changed to measure against the total
capacity in the filter scheduler.
* Driver also reports the following capabilities:
* thin_provisioning_support (True or False)
* thick_provisioning_support (True or False)
* Scheduler will use the above new capabilities reported by the
driver to make decisions when choosing a backend.
For more details, please see Cinder spec:
https://review.openstack.org/#/c/129342/12/specs/kilo/
over-subscription-in-thin-provisioning.rst
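A simplified sketch of how a scheduler might combine these reported values. This is an assumption-laden illustration, not the filter scheduler's actual logic: the function name and the exact checks are hypothetical, and only the parameter names come from the list above.

```python
def has_capacity(total_gb, provisioned_gb, free_gb, requested_gb,
                 max_over_subscription_ratio, reserved_percentage):
    """Return True if a thin-provisioned backend could take the request."""
    # reserved_percentage is measured against the total capacity.
    reserved_gb = total_gb * reserved_percentage / 100.0

    # Provisioned (promised) space after the request, relative to total.
    provisioned_ratio = (provisioned_gb + requested_gb) / total_gb
    if provisioned_ratio > max_over_subscription_ratio:
        return False

    # Some physical space must remain beyond the reserved amount.
    return free_gb - reserved_gb > 0
```

For example, a 100 GB backend with 150 GB already provisioned and a ratio of 2.0 could accept a 20 GB request, while one with 190 GB provisioned could not.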
Kurt Martin [Thu, 5 Feb 2015 00:09:44 +0000 (16:09 -0800)]
Lefthand driver fails to attach a cloned volume
The provider location was not being populated for cloned volumes.
This patch updates the provider location for cloned volumes,
resulting in successful volume attachments.
Abel Lopez [Tue, 13 Jan 2015 02:50:00 +0000 (18:50 -0800)]
Purge deleted rows
Adds the ability to clean up rows that are already marked as
deleted and are older than a specified age. Age is calculated as a
timedelta from now() in days, and the number of days is given on the
command line.
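The age calculation can be sketched like this, using plain dictionaries in place of database rows. The function name and row layout are hypothetical and do not reflect the actual command or schema.

```python
from datetime import datetime, timedelta


def purge_deleted(rows, age_in_days, now=None):
    """Return `rows` without those soft-deleted more than `age_in_days` ago."""
    if age_in_days < 0:
        raise ValueError("age_in_days must not be negative")
    now = now or datetime.now()
    cutoff = now - timedelta(days=age_in_days)
    # Rows that are not marked deleted are always kept.
    return [row for row in rows
            if not (row["deleted"] and row["deleted_at"] < cutoff)]
```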
Patrick East [Tue, 3 Feb 2015 20:09:58 +0000 (12:09 -0800)]
Make PureISCSIDriver iSCSI port discovery more flexible
While searching for reachable iSCSI ports on the target flash array,
the discovery command might fail. If this happens during driver setup,
an exception is thrown from attempting to use fields on an
uninitialized class variable. This fixes the log message and adds a
retry to the method to make it more tolerant of network timeouts.
Jeegn Chen [Sun, 14 Dec 2014 09:17:41 +0000 (17:17 +0800)]
EMC VNX Cinder Driver Update
VNX Direct Driver was contributed in Icehouse and updated in Juno.
This commit continues to improve the driver with the
following enhancements in Kilo:
* Performance improvement, especially the synchronized operations
initialize_connection and terminate_connection.
* LUN Number Threshold Support
* Initiator Auto Deregistration
* Force Deleting LUN in Storage Groups
* Code refactor to enhance the robustness
Eric Harney [Fri, 23 Jan 2015 20:41:40 +0000 (15:41 -0500)]
RemoteFS: Use nas_ip and nas_share_path options
This replaces the <x>fs_shares_config file configuration
options with the existing nas_ip option and new
nas_share_path and nas_mount_options configuration
options.
This means that RemoteFS drivers will manage a single
export rather than a handful of unrelated exports.
If the nas_ip and nas_share_path options are set, they
are used. If not, the previous configuration mechanism,
based on loading a set of shares from a file configured
by <x>fs_shares_config will be used.
Also use nas_mount_options to replace
nfs_mount_options for consistency. If nas_mount_options
is not set, nfs_mount_options will be used for
compatibility.
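The fallback rules described above might look roughly like this, with a plain dict standing in for the driver configuration. The helper names are hypothetical. Only the option names come from the text.

```python
def use_single_share(conf):
    """True when the nas_ip / nas_share_path mechanism should be used."""
    return bool(conf.get("nas_ip") and conf.get("nas_share_path"))


def resolve_mount_options(conf):
    """Prefer nas_mount_options, falling back to nfs_mount_options."""
    return conf.get("nas_mount_options") or conf.get("nfs_mount_options")
```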
Implements blueprint: remotefs-share-cfg-improvements
DocImpact: new configuration options
John Griffith [Tue, 20 Jan 2015 23:31:57 +0000 (16:31 -0700)]
Enable use of an /etc/cinder/lvm.conf file
During tempest and Rally runs we've noticed occasional
LVM command hangs (lvs, vgs and pvs). We've also gathered
enough data to show that we see very slow response times from
these commands almost all of the time.
It turns out that this seems to be an issue with us scanning
all devices during LVM operations, including devices that may
be Cinder Volumes that are attaching and detaching from the system.
Inspecting a run instrumented with strace shows a number of LVM
commands timing out due to the device being scanned being removed
during scan, and the LVM command in turn waiting until it times out
on the scan that's in process.
This patch just adds the ability to setup a lvm.conf file in
/etc/cinder. The Cinder LVM code will now specifically set
the LVM_SYSTEM_DIR environment variable to that directory for
each of the LVM scan commands in brick/local_dev/lvm.py.
If the system doesn't have the file, we use the empty string,
which tells LVM to use its defaults. This only affects LVM
commands in Cinder. The idea is to ensure we don't impact any
other LVM operations on the node outside of Cinder and that we
behave as we always have in the case of no lvm.conf file being
set up in /etc/cinder. The presence of the file is auto-detected
on brick/local_dev/lvm init.
We'll update the OpenStack Devstack deployment scripts to put this
together and fix things up there first. Until that's done and until
we auto-generate the conf (or document it well), this will be a
*partial* bugfix.
I considered adding a default lvm.conf file to cinder/etc/<sample>
that would be copied in on install, but decided against this to
avoid any possible compatibility issues between
platforms or versions.
To use, just copy the /etc/lvm/lvm.conf file to /etc/cinder and
modify the filter as appropriate, for example:
To use loopback device only:
filter = [ "a/loop/", "r/.*/" ]
If you have a physical drive like /dev/sdb1
filter = [ "a/dev/sdb1/", "r/.*/" ]
Finally, this patch also goes through and cleans up our cmd
variables in brick/local_dev/lvm. We had a mix of using a
cmd array, and strings; this causes inconsistencies and makes
it difficult to extend or modify commands. Switch everything to
using an array and use extend to provide the correct prefix.
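A rough sketch of the two ideas in this commit: detecting the optional lvm.conf and building commands as lists that extend a shared prefix. The function names and the LC_ALL=C entry are assumptions for illustration, not the code in brick/local_dev/lvm.py.

```python
import os


def lvm_cmd_prefix(conf_dir="/etc/cinder"):
    """Build the environment prefix for LVM commands as a list."""
    lvm_conf = os.path.join(conf_dir, "lvm.conf")
    # An empty LVM_SYSTEM_DIR tells LVM to fall back to its defaults.
    lvm_sys_dir = conf_dir if os.path.isfile(lvm_conf) else ""
    return ["env", "LC_ALL=C", "LVM_SYSTEM_DIR=" + lvm_sys_dir]


def build_vgs_cmd(conf_dir="/etc/cinder"):
    """Return a vgs command list with the prefix applied."""
    cmd = lvm_cmd_prefix(conf_dir)
    cmd.extend(["vgs", "--noheadings", "-o", "name"])
    return cmd
```

Keeping every command as a list avoids mixing string and list handling when a prefix or extra argument has to be added later.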
Need to update docs to include a recommendation to create an
/etc/cinder/lvm.conf file and set device filters appropriately.
Doc-Impact
Partial-Bug: #1373513
Xing Yang [Mon, 19 Jan 2015 18:27:31 +0000 (13:27 -0500)]
Roll back if VMAX masking view not created
In emc_vmax_masking.py, an exception object is created if
there is a failure in the masking view creation logic.
However, the exception is not raised. As a result,
initialize_connection will succeed in spite of a failure
but the VM will not be able to access the disk.
This patch adds rollback logic to handle failures in
creating masking view. If the masking view cannot be
created, the volume will be added back to the default
storage group and an exception will be raised.
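The rollback pattern could be sketched as below. All names are hypothetical, and a set stands in for the default storage group. This is not the code in emc_vmax_masking.py.

```python
class MaskingViewError(Exception):
    """Hypothetical error raised when the masking view cannot be created."""


def attach_with_rollback(volume, default_group, create_masking_view):
    """Move `volume` out of the default group and create its masking view."""
    default_group.discard(volume)
    try:
        create_masking_view(volume)
    except Exception as exc:
        # Roll back: return the volume to the default storage group,
        # then raise so the caller sees the failure.
        default_group.add(volume)
        raise MaskingViewError("masking view not created") from exc
```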
ChangBo Guo(gcb) [Sat, 31 Jan 2015 05:42:14 +0000 (13:42 +0800)]
Drop deprecated namespace for oslo.rootwrap
The oslo team recommends that everyone switch to the
non-namespaced versions of libraries.[1]
oslo.rootwrap suggests using oslo_rootwrap.cmd:main.[2]
Jay Wang [Tue, 27 Jan 2015 20:02:59 +0000 (12:02 -0800)]
Fixes attribute content checking
Use the proper way to check the contents of the display_name and
display_description volume attributes. If they are not empty, translate
them to the volume backend description. Also modify other places where
returning an empty string makes more sense than None, along with the
corresponding return value checks.
Aviram Bar-Haim [Sun, 25 Jan 2015 20:50:46 +0000 (22:50 +0200)]
Support iSER driver within the ISCSITarget flow
Currently the iSER driver is supported over TGT only,
and there are a couple of iSER classes that inherit
from iSCSI classes, but most of their functionality is
the same as the iSCSI classes. This code duplication causes
instability in the iSER driver code whenever new features or
changes are added to the iSCSI driver flow.
Main changes:
1. Added a new parameter to volume/driver.py in order to
set the iSCSI protocol type to 'iscsi' or 'iser', with default
to 'iscsi'.
2. Configured TGT VOLUME_CONF and VOLUME_CONF_WITH_CHAP_AUTH
with the new iSCSI protocol parameter.
3. Added support for RDMA (using iSER) to cinder-rtstool.
4. Set "driver_volume_type" to "iscsi" or "iser" value, according
to the new parameter value.
5. Added unit tests for the new iSER flow.
6. Added deprecation alert to ISERTgtAdm.
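As a hedged illustration of changes 1 and 4, a single protocol setting can drive the reported volume type. The function and the parameter name iscsi_protocol are assumptions for this sketch.

```python
def connection_type(iscsi_protocol="iscsi"):
    """Return the connection info skeleton for the configured protocol."""
    if iscsi_protocol not in ("iscsi", "iser"):
        raise ValueError("unsupported protocol: %s" % iscsi_protocol)
    # driver_volume_type follows the protocol setting, defaulting to iscsi.
    return {"driver_volume_type": iscsi_protocol, "data": {}}
```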
Rich Hagarty [Wed, 14 Jan 2015 17:11:27 +0000 (09:11 -0800)]
HP3Par: Set snapCPG when managing existing volumes
This fixes a corner case where the user did not assign a "Copy CPG"
value to the volume when originally created on the back-end, and
then the user does not assign a "Volume Type" to the volume
when it is then "managed" into OpenStack.
This results in an undefined "snapCPG" value which means no
snapshots can be created for the volume.
The fix is to set the "snapCPG" value to match the "User CPG"
value associated with the volume when the volume is "managed".
TaoBai [Tue, 20 Jan 2015 12:18:39 +0000 (04:18 -0800)]
Failed to discovery when iscsi multipath and CHAP both enabled
A storage server may be configured to protect the target discovery phase with
CHAP authentication. In this case the existing discovery command fails in Nova
when iSCSI multipath is enabled. Nova needs the following discovery auth properties:
"discovery.sendtargets.auth.authmethod",
"discovery.sendtargets.auth.username",
"discovery.sendtargets.auth.password"
The Cinder storage driver needs to send the discovery auth properties to Nova
in this case. The properties are:
iscsi_properties['discovery_auth_method']
iscsi_properties['discovery_auth_username']
iscsi_properties['discovery_auth_password']
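A small sketch of how a driver might populate these keys. The helper function and its arguments are hypothetical. Only the three dictionary keys come from the text above.

```python
def add_discovery_auth(iscsi_properties, method, username, password):
    """Attach discovery CHAP credentials when all three are provided."""
    if method and username and password:
        iscsi_properties["discovery_auth_method"] = method
        iscsi_properties["discovery_auth_username"] = username
        iscsi_properties["discovery_auth_password"] = password
    return iscsi_properties
```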
This issue affects not just IBM Storwize but also other storage drivers that
need CHAP authentication to perform iSCSI discovery.
The corresponding Nova change: https://review.openstack.org/#/c/148516/
John Griffith [Thu, 15 Jan 2015 15:56:28 +0000 (08:56 -0700)]
Add retry for tgtadm update when tgt exists
For target creation using tgtadm driver, we create a persistence
file for the target, then send a tgt-admin --update 'name' where
name is the specific persistence file we want to read in and update
from.
It turns out that we can hit race conditions where the persistence
file is written, and an update is requested but target has already
done work to make it believe that the target has already been created.
One thought was to just use "update ALL" but this still seems to have
issues, and changes the error to an account exists failure.
This patch takes the brute force approach and adds the cinder.utils
retry decorator to the tgt-admin --update command. To do this
we just break out the tgt-admin --update call into its own method
and add the decorator to it.
John Griffith [Wed, 28 Jan 2015 04:59:28 +0000 (22:59 -0600)]
Add completion logging for snapshots and volumes
While trying to debug insufficient resource issues occasionally
seen in the gate, it was noted that there's no logging for
resource deletion of volumes and snapshots.
This patch adds some simple logging in cinder.volume.api
to note when and why a delete call might fail, as well as
an info message indicating when the request is successfully
issued to the scheduler/rpc layer.
In addition, this patch also adds the same sort of logging
to the LVM backend driver.
This additional logging will help troubleshoot the current
resource issues/races we're encountering in the gate.
John Griffith [Tue, 27 Jan 2015 23:19:11 +0000 (16:19 -0700)]
Create SolidFire Template account on init
The standard flow in the SolidFire volume is to create
accounts dynamically as needed based on project id.
Unfortunately the clone_image method "forgot" about this
detail and calls the private clone_volume method without
a valid account resulting in an AccountNotFound
Exception.
The clone_image method is enabled using a specific account
provided via the configuration file. Rather than worry
about dynamic account creation on clone_image call, this
patch just makes it part of the init phase. We check to
see if template caching is enabled and if it is, just
go ahead and check/create the account on startup.
John Griffith [Wed, 28 Jan 2015 04:20:38 +0000 (22:20 -0600)]
Add debug message for lvremove after udev settle
Just add a simple debug level log message for lvremove calls that
are made and successfully completed as part of error recovery.
Currently some platforms commonly receive an "Unable to deactivate"
error message when running an lvremove. We have a retry mechanism
that performs a udevsettle and retries, which in most cases results
in a successful removal.
This patch just adds an explicit debug log message so we can track
this more easily and gather some statistical data on it.