John Griffith [Fri, 10 Oct 2014 01:22:03 +0000 (19:22 -0600)]
Move SolidFire driver from httplib to requests
The SolidFire driver has been pretty static for a number of
years now; this change moves the API calls from httplib to
requests. There are a number of advantages to this, including
performance, simplicity and the ability to add things like SSL
support easily.
In addition this change removes the confusing looping/retry mechanisms
that were in the issue_api_request method and replaces them with a
retry decorator for the exceptions we're interested in retrying.
Finally, I realize that my unit tests suck! That will be one of the
follow up items after a bit more clean up in the driver.
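A minimal sketch of the shape of this change, assuming a simple home-grown
retry decorator and an illustrative issue_api_request signature (the
endpoint handling and exception list are not the actual driver code):

    import functools
    import time

    import requests


    def retry(exceptions, retries=3, delay=1):
        # Re-run the wrapped call when one of the given exceptions is raised.
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for attempt in range(retries):
                    try:
                        return func(*args, **kwargs)
                    except exceptions:
                        if attempt == retries - 1:
                            raise
                        time.sleep(delay)
            return wrapper
        return decorator


    @retry((requests.exceptions.ConnectionError, requests.exceptions.Timeout))
    def issue_api_request(endpoint, method, params):
        # requests handles the HTTP connection, JSON body and SSL verification.
        payload = {'method': method, 'params': params}
        response = requests.post(endpoint, json=payload, timeout=30, verify=True)
        response.raise_for_status()
        return response.json()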
During create_backup failure handling, backup_update fails with
DataError ("Data too long for column") if the fail_reason is
greater than 255 characters. As a result, backup status is stuck
in 'creating' state. This patch avoids the problem by truncating
fail_reason to 255 characters before update.
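A minimal sketch of the truncation, assuming a 255-character fail_reason
column (the helper name is illustrative):

    def _sanitize_fail_reason(fail_reason, max_length=255):
        # Truncate so backup_update() does not raise DataError
        # ("Data too long for column") on long error messages.
        if fail_reason and len(fail_reason) > max_length:
            return fail_reason[:max_length]
        return fail_reason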
Sean McGinnis [Tue, 7 Oct 2014 15:29:26 +0000 (10:29 -0500)]
Fix eqlx CLI output parsing on bad input
The eqlx driver would identify CLI command completion
by looking for the system name prompt at the end of
the output ("ARRAY_NAME>"). In some cases where there
is bad input the array will print an error, then
prepopulate the command again so it can be edited,
resulting in the output:
"ARRAY_NAME> [bad command]"
The array name prompt only gets printed on command
completion, so the fix is to look for the prompt
anywhere in the CLI output.
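A rough sketch of the parsing change, assuming a paramiko-style channel and
an illustrative prompt string; the point is only that the prompt is matched
anywhere in the output rather than at the very end:

    def _get_output(chan, prompt='ARRAY_NAME> '):
        # Read CLI output until the array prompt shows up anywhere in it;
        # on bad input the array echoes "ARRAY_NAME> [bad command]" rather
        # than ending the output with a bare prompt.
        out = ''
        while prompt not in out:
            out += chan.recv(102400).decode('utf-8', 'replace')
        return out.splitlines()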
Sean McGinnis [Tue, 7 Oct 2014 15:10:57 +0000 (10:10 -0500)]
Eqlx fix NoSuchOptError for volume_name_template on clone
The eqlx driver was referencing the volume_name_template
config setting via self.configuration.volume_name_template.
This option is not imported in self.configuration.
The current preferred method for volume clone is to reference
the passed-in name, avoiding the need for the driver to
know what the naming template is altogether.
John Griffith [Tue, 7 Oct 2014 17:49:58 +0000 (11:49 -0600)]
Make sure device supports Direct before setting
We added the '-t none' option to the qemu-img convert operation
in image_utils.py a while back to accommodate a couple of
backend devices that didn't flush writes on disconnect.
(Change: I7a04f683add8c23b9125fe837c4048ccc3ac224d)
The only problem here is that some backend devices don't
support Direct mode and raise an exception and fail when
setting this option.
This patch adds a simple check using dd to see if the dest
supports the Direct flag and only sets '-t none' if the device
does in fact support it.
Additionally it was brought up that yet other backends
are using file devices, not block devices. In their case setting
Direct will still work, however it's sub-optimal, as qemu-img convert
has internal mechanisms to make sure flushing etc. is done
correctly and efficiently for those devices. So to accommodate
that particular use case I'm also adding a check for whether the dest
is a block device, which is used to determine whether to set Direct for
the qemu-img convert process.
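A sketch of the two checks described above; the helper names are
illustrative, and dd/qemu-img are invoked directly rather than through
Cinder's execute utilities:

    import os
    import stat
    import subprocess


    def _is_block_device(path):
        return stat.S_ISBLK(os.stat(path).st_mode)


    def _supports_direct_io(dest):
        # Probe with dd and oflag=direct; a device that rejects O_DIRECT makes
        # dd exit non-zero, in which case '-t none' must not be used.
        try:
            subprocess.check_call(
                ['dd', 'count=0', 'if=%s' % dest, 'of=%s' % dest,
                 'oflag=direct'],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            return True
        except subprocess.CalledProcessError:
            return False


    def convert_image(source, dest, out_format):
        cmd = ['qemu-img', 'convert', '-O', out_format]
        # Only bypass the host cache for block devices that accept O_DIRECT;
        # file-backed destinations are left to qemu-img's own flushing.
        if _is_block_device(dest) and _supports_direct_io(dest):
            cmd += ['-t', 'none']
        cmd += [source, dest]
        subprocess.check_call(cmd)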
Windows SMBFS: Handle volume_name in _qemu_img_info
The volume_name is now passed to the _qemu_img_info wrapper. As
this method is not prone to security issues, because this driver
does not support raw images (at least not yet), we don't have to
perform any checks on the backing image file path.
Thus, this method simply ignores this argument, which will be handled
by the base class methods.
Sean McGinnis [Fri, 26 Sep 2014 20:21:35 +0000 (15:21 -0500)]
Handle eqlx SSH connection close on abort.
An EqualLogic array CLI operation timeout causes the
SSH thread to be aborted. This would leave SSH
sessions orphaned and eventually hit the max connection
limit on the array. This fix catches these aborts
and makes sure the connection is closed.
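A sketch of the intent, assuming an eventlet/greenlet-based worker and an
illustrative connection-pool API; only the GreenletExit handling reflects
the fix:

    import greenlet


    def run_cli_command(sshpool, command, execute):
        # 'sshpool' and 'execute' stand in for the driver's SSH pool and
        # command runner.
        ssh = sshpool.get()
        try:
            out = execute(ssh, command)
        except greenlet.GreenletExit:
            # The CLI operation timed out and the calling thread is being
            # aborted; close the session so it is not orphaned on the array.
            ssh.close()
            raise
        sshpool.put(ssh)
        return out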
Clinton Knight [Fri, 26 Sep 2014 16:07:44 +0000 (12:07 -0400)]
Deprecate / obsolete NetApp volume extra specs
The NetApp Data ONTAP (Cluster-mode) NFS & iSCSI drivers for Juno support
the Cinder pools feature, but the drivers are reporting two qualified
extra specs that must be converted to unqualified extra specs in order to
be used by the Cinder scheduler's capability filter. Furthermore, there
are four extra specs that must be deprecated due to having the pools
feature. Warnings will be logged during volume creation if any of the
obsolete or deprecated extra specs are seen in the volume type.
Patrick East [Tue, 30 Sep 2014 18:47:42 +0000 (11:47 -0700)]
Fix race condition in ISCSIConnector _disconnect_volume_multipath_iscsi
This is a similar issue as seen in
https://bugs.launchpad.net/cinder/+bug/1375382
The list of devices returned by driver.get_all_block_devices() in
_disconnect_volume_multipath_iscsi will potentially contain broken
symlinks, as the SCSI devices have been deleted by the call to
self._linuxscsi.remove_multipath_device(device_realpath) right before
_disconnect_volume_multipath_iscsi, but the udev rule for the symlink
may not yet have completed.
Adding in a check to os.path.exists() will ensure that we will not
consider the broken symlinks as an “in use” device.
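The check itself is small; a sketch of filtering out broken symlinks before
deciding whether any devices are still in use:

    import os


    def _live_devices(devices):
        # Ignore entries whose udev symlink is already broken (the SCSI
        # device was removed but the symlink has not been cleaned up yet),
        # so they are not counted as "in use" devices.
        return [dev for dev in devices if os.path.exists(dev)]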
Juan Zuluaga [Wed, 24 Sep 2014 22:51:07 +0000 (18:51 -0400)]
ZFSSA iSCSI vol create fails with vol type option
Volume create with the volume-type option was not working, since
volume_backend_name contained the class name as a
predefined string. No matter what was specified in cinder.conf
as volume_backend_name, volume creation failed, so the
multi-backend option and using extra specs to create custom volumes
wouldn't work.
The fix is to look up volume_backend_name in the
configuration and fall back to the class name in case there is
no backend name.
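A minimal sketch of the lookup order, assuming Cinder's usual
configuration.safe_get() accessor:

    def _get_backend_name(configuration, default_name):
        # Prefer volume_backend_name from cinder.conf; fall back to the
        # driver class name only when no backend name is configured.
        backend_name = configuration.safe_get('volume_backend_name')
        return backend_name or default_name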
Patrick East [Mon, 29 Sep 2014 17:54:22 +0000 (10:54 -0700)]
Fix race condition in ISCSIConnector disconnect_volume
The list of devices returned by driver.get_all_block_devices() will
sometimes contain broken symlinks as the SCSI device has been deleted
but the udev rule for the symlink has not yet completed.
Adding in a check to os.path.exists() will ensure that we will not
consider the broken symlinks as an “in use” device.
Vincent Hou [Fri, 12 Sep 2014 08:10:02 +0000 (16:10 +0800)]
IBM Storwize driver: Retype the volume with correct empty QoS
* Currently for the Storwize driver, if the new type does not have QoS
configurations, the old QoS configurations remain on the volume after
retyping it. It should be retyped into a volume with empty QoS for the
Storwize driver.
* Refactor three dicts into one for better maintenance of the QoS keys
for the Storwize driver.
VMware: Unquote folder name for folder exists check
vCenter server escapes special characters in the folder name using URL
encoding and returns back the encoded string while querying. This causes
the check for folder existence to return false. Therefore, folder
creation is reattempted which eventually fails. This patch fixes the
problem by decoding the folder name returned by vCenter before
comparison.
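A sketch of the comparison, assuming the folder name comes back URL-encoded
from vCenter (urllib.parse is used here in place of whatever helper the
driver actually calls):

    import urllib.parse


    def _folder_matches(returned_name, requested_name):
        # vCenter URL-encodes special characters (e.g. '%2f' for '/'), so
        # decode the returned name before comparing it with the requested one.
        return urllib.parse.unquote(returned_name) == requested_name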
ArkadyKanevsky [Wed, 27 Aug 2014 21:47:37 +0000 (16:47 -0500)]
Fixing format for log messages
code_cleanup_batching for EQL driver
Follow the log message format for i18n - http://docs.openstack.org/developer/oslo.i18n/guidelines.html#adding-variables-to-log-messages
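A sketch of the preferred pattern from those guidelines; the import paths
and marker function (_LI) follow the oslo/cinder conventions of the time and
are assumptions here:

    from cinder.i18n import _LI
    from cinder.openstack.common import log as logging

    LOG = logging.getLogger(__name__)


    def report_pool_usage(pool, used_gb):
        # Pass the variables as arguments so interpolation is deferred until
        # the message is actually emitted.
        LOG.info(_LI("Pool %(pool)s is using %(used)s GB."),
                 {'pool': pool, 'used': used_gb})
        # Not: LOG.info(_LI("Pool %s is using %s GB.") % (pool, used_gb))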
Jay S. Bryant [Thu, 25 Sep 2014 20:58:41 +0000 (15:58 -0500)]
Update /etc/cinder/cinder.conf.sample for memcache
It appears that an update to keystone middleware earlier today
added options for memcache_secret_key, memcache_pool_dead_retry,
memcache_pool_maxsize, memcache_pool_socket_timeout,
memcache_pool_unused_timeout, memcache_pool_conn_get_timeout and
memcache_use_advanced_pool. The commit that added these options
was: a7beb50b38be5c3dd4c44d68ad79d1bb206dab6b - "Add an optional
advanced pool of memcached clients".
This has once again caused the check_uptodate.sh script to fail.
During attach to a nova instance, the backing VM corresponding to the
volume is relocated only if the nova instance's ESX host cannot access
the backing's current datastore. The storage profile is ignored and
the volume's virtual disk might end up in a non-compliant datastore.
This patch fixes the problem by checking storage profile compliance of
the current datastore.
Rick Chen [Wed, 24 Sep 2014 09:08:52 +0000 (17:08 +0800)]
Failed to re-detach volume when volume detached.
When the first request command detaches the volume but the back-end
storage state is in-process or busy, the next retry command will
get an error code describing the volume as already detached.
Jay S. Bryant [Fri, 19 Sep 2014 17:46:21 +0000 (12:46 -0500)]
Fix unnecessary WSGI worker warning at API startup
There was a bug in WSGIService in the way that it was
checking the osapi_volume_workers option. It was using
getattr() to see if the option was set; if not, it was supposed
to set the value to processutils.get_worker_count(). This,
however, never happened because getattr() treated the default
'None' as a set value. So, on any system with no value set,
the self.workers < 1 check would be hit and a warning would be
output.
Nova had changed their approach to this option to avoid this
problem. This patch pulls Nova's approach into Cinder for
consistency. Cinder will now use processutils.get_worker_count()
if no option is set in /etc/cinder/cinder.conf and when the user sets
osapi_volume_workers to 0. A negative value will cause an
InvalidInput exception to be thrown.
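A sketch of the resulting behaviour, using multiprocessing.cpu_count() as a
stand-in for processutils.get_worker_count() and a plain ValueError in place
of InvalidInput:

    import multiprocessing


    def resolve_osapi_workers(configured_value):
        # None (unset) and 0 both fall back to the CPU count; negative values
        # are rejected instead of merely logging a warning at startup.
        workers = configured_value or multiprocessing.cpu_count()
        if workers < 1:
            raise ValueError("osapi_volume_workers must be > 0, got %s"
                             % configured_value)
        return workers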
Mark Sturdevant [Tue, 23 Sep 2014 05:45:14 +0000 (22:45 -0700)]
Fix ssh_host_key_file default in help and config.sample.conf
The commit message and the actual default say the default value for
ssh_host_key_file is $state_path/ssh_known_hosts, but the
config.conf.sample and the config opts help say it is
"$state_path/known_hosts".
Fix the help and config.conf.sample to match the actual default.
Downgrade 'infinite' and 'unknown' capacity in weigher
When FilterScheduler was first introduced into Cinder, drivers were
required for the first time to report capacity. Some drivers preferred
to report 'infinite' or 'unknown' capacity because they were doing
thin-provisioning or the total capacity kept increasing. Now that we
have better support for thin provisioning, we find that unrealistic
capacity doesn't do us any good in making optimal scheduling decisions,
because 'infinite' and 'unknown' always have the highest weight
when the weight multiplier is positive, which in most cases it is.
Drivers are expected to stop reporting 'infinite' or 'unknown' capacity
and should instead report an actual number for total/free
capacity.
This fix doesn't change the drivers; instead a small tweak is added to
CapacityWeigher in order to downgrade those drivers that report
'infinite' or 'unknown' as free capacity. In particular, those that
report 'infinite'/'unknown' free capacity will be adjusted to have the
lowest weight, whether in 'spreading' (weight multiplier > 0)
or 'stacking' (weight multiplier < 0) mode.
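A simplified sketch of the CapacityWeigher tweak; the real weigher
normalizes across hosts, but the special-casing of 'infinite'/'unknown'
looks roughly like this:

    def weigh_free_capacity(free_capacity, multiplier=1.0):
        # Force 'infinite'/'unknown' to the worst score in both modes:
        # smallest value when spreading (multiplier > 0), largest value
        # when stacking (multiplier < 0).
        if free_capacity in ('infinite', 'unknown'):
            free = -1 if multiplier > 0 else float('inf')
        else:
            free = float(free_capacity)
        return free * multiplier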
Jeremy Stanley [Mon, 22 Sep 2014 12:40:43 +0000 (12:40 +0000)]
Remove unused py33 tox env
Based on review comments in https://review.openstack.org/118771 it's
apparent that Cinder is quite a ways from Py3K support, and
developers are not expected to run the py33 tox env. Rather than add
more envs for later Python interpreter versions which will be
equally broken, just remove it for now.
Xing Yang [Sat, 20 Sep 2014 22:23:11 +0000 (18:23 -0400)]
DB migration 25->24 failed when dropping column
"cinder-manage db sync 24" failed when dropping column cgsnapshot_id
from the snapshots table.
The reason that drop column failed was because of the foreign key
constraint. MySQL cannot drop column until the foreign key constraint
is removed. So the solution is to remove the foreign key first, and
then drop the column. This affects the cgsnapshot_id column in the
snapshots table and the consistencygroup_id column in the volumes table.
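A sketch of the downgrade step in the sqlalchemy-migrate style Cinder
migrations used at the time; the referenced table/column wiring is
illustrative:

    from migrate import ForeignKeyConstraint
    from sqlalchemy import MetaData, Table


    def downgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        snapshots = Table('snapshots', meta, autoload=True)
        cgsnapshots = Table('cgsnapshots', meta, autoload=True)
        # MySQL cannot drop a column that is still referenced by a foreign
        # key, so drop the constraint first, then the column.
        ForeignKeyConstraint(
            columns=[snapshots.c.cgsnapshot_id],
            refcolumns=[cgsnapshots.c.id]).drop()
        snapshots.c.cgsnapshot_id.drop()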
With pool support added to Cinder, we are now in an awkward
situation where we require the admin to input the exact location for
volumes to be managed (imported) or migrated, which must include pool
info, but there is no way to find out what pools exist for a backend
except by looking at the scheduler log. That causes a bad user
experience, and thus is a bug from a UX point of view.
This change simply adds a new admin-API extension to allow the admin to
fetch all the pool information from the scheduler cache (memory), which
closes the gap for end users.
This extension provides two levels of pool information: names only or
detailed information:
Pool name only:
GET http://CINDER_API_ENDPOINT/v2/TENANT_ID/scheduler-stats/get_pools
Detailed Pool info:
GET http://CINDER_API_ENDPOINT/v2/TENANT_ID/scheduler-stats/get_pools
\?detail\=True
The latest 3PAR firmware supports a hostname of 31 characters.
This patch increases the hostname on the 3PAR from 23 characters to
31.
The driver currently has a fallback mechanism in place for detecting
existing hosts. This handles the case of upgrading from
Icehouse to Juno, where Icehouse hosts have a limit of 23 characters.
Mark Sturdevant [Sun, 14 Sep 2014 00:04:27 +0000 (17:04 -0700)]
HP 3PAR drivers should not claim to have 'infinite' space
The HP 3PAR drivers report 'infinite' space when there is not a limit
set on the CPG. In this case, it would be better to at least use the
free space estimate of the array instead of 'infinite'.
Xing Yang [Thu, 18 Sep 2014 20:50:58 +0000 (16:50 -0400)]
Add tests for consistency groups DB migration
The consistency group patch added 2 new DB migration versions:
025 and 026. However, there were no unit tests to support them in
cinder/tests/test_migrations.py. This patch adds the missing tests.
John Griffith [Thu, 18 Sep 2014 03:16:18 +0000 (21:16 -0600)]
Verify requested size in volume.api create
Currently we're not checking that the input value requested
for size on volume create is valid, but the taskflow portion of
the code expects it to be, and as a result, when it receives invalid
input we litter the logs with trace messages.
This patch adds a check that verifies that if we pass in a
size to create_volume in the volume.api that it is in fact
an int or a string representation of an int.
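A minimal sketch of such a check (the exception type and helper name are
illustrative):

    def validate_volume_size(size):
        # Accept an int or a string representation of an int; reject
        # anything else before it reaches the taskflow code.
        if isinstance(size, str) and size.isdigit():
            size = int(size)
        if not isinstance(size, int) or size <= 0:
            raise ValueError("Volume size '%s' must be a positive integer"
                             % size)
        return size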
Currently the socket options socket.SO_REUSEADDR and socket.SO_KEEPALIVE
are set only if SSL is enabled.
These socket options should be set regardless of whether SSL is enabled.
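A sketch of the intended ordering, using the standard library ssl module to
stand in for whatever wrapping the WSGI server does:

    import socket
    import ssl


    def configure_listen_socket(sock, ssl_context=None):
        # SO_REUSEADDR and SO_KEEPALIVE apply to the plain TCP listener as
        # well, so set them unconditionally instead of only on the SSL branch.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        if ssl_context is not None:
            # e.g. ssl_context = ssl.create_default_context(
            #          ssl.Purpose.CLIENT_AUTH)
            sock = ssl_context.wrap_socket(sock, server_side=True)
        return sock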
Mark Sturdevant [Fri, 12 Sep 2014 21:03:28 +0000 (14:03 -0700)]
HP 3PAR: Allow retype when the old snapshot CPG (3PAR pool) is None
A common provisioning group (CPG) is a virtual pool of logical disks.
A volume is stored in a CPG. The volume's snapshots may be stored in
the same CPG or in a separate CPG. In 3PAR this is the "snapCPG".
Before the 3PAR driver supported "manage", the snapCPG was never None in
OpenStack because when we create volumes, we explicitly default snapCPG
to match the volume CPG unless otherwise specified. So, the original
retype pre-checks raised an exception if this unexpected case occurred
and a unit test was provided for that exception.
Now that we support manage_existing(), it should be a valid use case
to manage an existing volume created outside of OpenStack with a snapCPG
set to None. Unfortunately, when we applied volume-type settings to this
volume we would hit that old exception.
That exception has been removed.
Removing it required the following changes:
1. When retype pre-checks validate the domain of the new snapCPG, it
will not use the optional old snapCPG. Instead it will compare the
new snapCPG domain with the old volume CPG domain. This satisfies
the requirement that domains cannot be mixed and avoids the need
to have an old snapCPG setting.
2. Remove the no longer used old_snap_cpg parameter and remove the
code that raised an exception "if not old_snap_cpg".
3. Remove the unit test that was just to test that obsolete exception.
4. Adjust the existing test_retype_across_snap_cpg_domains to
verify that a snapCPG domain that does not match the volume
CPG domain will still raise Invalid3PARDomain