John Griffith [Tue, 7 Oct 2014 17:49:58 +0000 (11:49 -0600)]
Make sure device support Direct before setting
We added '-t none' option to the qemu-img convert operation
in image_utils.py a while back to accomodate a couple of
backend devices that didn't flush writes on disconnect.
(Change: I7a04f683add8c23b9125fe837c4048ccc3ac224d)
The only problem here is that some backend devices don't
support Direct mode and raise an exception and fail when
setting this option.
This patch adds a simple check using dd to see if the dest
supports the Direct flag and only sets '-t none' if the device
does in fact support it.
Additionally it was brought up that even yet other backends
are using file devices not blk devices. In their case setting
Direct will still work, however it's sub-optimal as qemu-convert
has internal mechanisms to make sure flushing etc are done
correctly and efficiently for those devices. So to accomodate
that particular use case I'm also adding a check if blk dev
that can be used for determining whether to set Direct for the
qemu-convert process.
Juan Zuluaga [Wed, 24 Sep 2014 22:51:07 +0000 (18:51 -0400)]
ZFSSA iSCSI vol create fails with vol type option
Vol create with volume-type option is not working since
volume_backend_name contains the class name as
predefined string. No matter what was specified in cinder.conf
as volume_backend_name, volume creation failed.
Multi-backend option and using extra specs to create custom volumes
won't work.
The fix is to look whether volume_backend_name is part of the
configuration or falls into the class name in case there is
no backend name.
Sean McGinnis [Fri, 26 Sep 2014 20:21:35 +0000 (15:21 -0500)]
Handle eqlx SSH connection close on abort.
EqualLogic array CLI operation timeout causes the
SSH thread to be aborted. This would cause SSH
sessions to be orphaned and hit a max connection
limit on the array. This fix catches these aborts
and makes sure the connection is closed.
Patrick East [Tue, 30 Sep 2014 18:47:42 +0000 (11:47 -0700)]
Fix race condition in ISCSIConnector _disconnect_volume_multipath_iscsi
This is a similar issue as seen in
https://bugs.launchpad.net/cinder/+bug/1375382
The list of devices returned by driver.get_all_block_devices() in
_disconnect_volume_multipath_iscsi will potentially contain broken
symlinks as the SCSI devices have been deleted from calling
self._linuxscsi.remove_multipath_device(device_realpath) right before
_disconnect_volume_multipath_iscsi but the udev rule for the symlink
may not yet have completed.
Adding in a check to os.path.exists() will ensure that we will not
consider the broken symlinks as an “in use” device.
Clinton Knight [Fri, 26 Sep 2014 16:07:44 +0000 (12:07 -0400)]
Deprecate / obsolete NetApp volume extra specs
The NetApp Data ONTAP (Cluster-mode) NFS & iSCSI drivers for Juno support
the Cinder pools feature, but the drivers are reporting two qualified
extra specs that must be converted to unqualified extra specs in order to
be used by the Cinder scheduler's capability filter. Furthermore, there
are four extra specs that must be deprecated due to having the pools
feature. Warnings will be logged during volume creation if any of the
obsolete or deprecated extra specs are seen in the volume type.
Vincent Hou [Fri, 12 Sep 2014 08:10:02 +0000 (16:10 +0800)]
IBM Storwize driver: Retype the volume with correct empty QoS
* Currently for Storwzie driver, if the new type does not have QoS
configurations, the old QoS configurations remain in the volume after
retyping it. It should be retyped into a volume with empty QoS for the
Storwize driver.
* Refactor three dicts into one for better maintainance of the QoS keys
for Storwize driver.
Windows SMBFS: Handle volume_name in _qemu_img_info
The volume_name is now parsed to the _qemu_img_info wrapper. As
this method is not prone to security issues because this driver
does not support raw images (at least not yet), we don't have to
perform any checks on the backing image file path.
Thus, this method simply ignores this argument that will be parsed
by the base class methods.
Patrick East [Mon, 29 Sep 2014 17:54:22 +0000 (10:54 -0700)]
Fix race condition in ISCSIConnector disconnect_volume
The list of devices returned by driver.get_all_block_devices() will
sometimes contain broken symlinks as the SCSI device has been deleted
but the udev rule for the symlink has not yet completed.
Adding in a check to os.path.exists() will ensure that we will not
consider the broken symlinks as an “in use” device.
VMware:Unquote folder name for folder exists check
vCenter server escapes special characters in the folder name using URL
encoding and returns back the encoded string while querying. This causes
the check for folder existence to return false. Therefore, folder
creation is reattempted which eventually fails. This patch fixes the
problem by decoding the folder name returned by vCenter before
comparison.
ArkadyKanevsky [Wed, 27 Aug 2014 21:47:37 +0000 (16:47 -0500)]
Fixing format for log messages
code_cleanup_batching for EQL driver
Follow log essage format for i18n - http://docs.openstack.org/developer/oslo.i18n/guidelines.html#adding-variables-to-log-messages
Jay S. Bryant [Thu, 25 Sep 2014 20:58:41 +0000 (15:58 -0500)]
Update /etc/cinder/cinder.conf.sample for memcache
It appears that an update to keystone middleware earlier today
added options for memcache_secret_key, memcache_pool_dead_retry,
memcache_pool_maxsize, memcache_pool_socket_timeout,
memcache_pool_unused_timeout, memcache_pool_conn_get_timeout and
memcache_use_advanced_pool. The commit that added these options
was: a7beb50b38be5c3dd4c44d68ad79d1bb206dab6b - "Add an optional
advanced pool of memcached clients".
This has once again caused the check_uptodate.sh script to fail.
During attach to a nova instance, the backing VM corresponding to the
volume is relocated only if the nova instance's ESX host cannot access
the backing's current datastore. The storage profile is ignored and
the volume's virtual disk might end up in a non-compliant datastore.
This patch fixes the problem by checking storage profile compliance of
the current datastore.
Rick Chen [Wed, 24 Sep 2014 09:08:52 +0000 (17:08 +0800)]
Failed to re-detach volume when volume detached.
When first request command detach the volume, but the back-end
storage state is in-processing or busy. Next retry command will
got the error code that describe the volume already detached.
Jay S. Bryant [Fri, 19 Sep 2014 17:46:21 +0000 (12:46 -0500)]
Fix unnecessary WSGI worker warning at API startup
There was a bug in WSGIService in the way that it was
checking the osapi_volume_workers option. It was using
getattr() to see if the option was set, if not it was supposed
to set the value to processutils.get_worker_count(). This,
however, never happened because getattr interpreted the default
'None' value to be a value. So, on any system with no value set
the self.workers < 1 check would be hit and a warning would be
output.
Nova had changed their approach to this option to avoid this
problem. This patch pulls Nova's approach into Cinder for
consistency. Cinder will now use processutils.get_worker_count()
if no option is set in /etc/cinder/cinder.conf and when the user sets
osapi_volume_workers to 0. A negative value will cause an
InvalidInput exception to be thrown.
Mark Sturdevant [Tue, 23 Sep 2014 05:45:14 +0000 (22:45 -0700)]
Fix ssh_host_key_file default in help and config.sample.conf
The commit message and the actual default say the default value for
ssh_host_key_file is $state_path/ssh_known_hosts, but the
config.conf.sample and the config opts help say it is
"$state_path/known_hosts".
Fix the help and config.conf.sample to match the actual default.
Downgrade 'infinite' and 'unknown' capacity in weigher
When FilterScheduler was first introduced into Cinder, drivers were
required for the first time to report capacity. Some drivers preferred
to report 'infinite' or 'unknown' capacity because they were doing
thin-provisioning or the total capacity kept increasing. Now that we
have better support for thin-provisioning and we do find unrealistic
capacity couldn't do us any good in making optimal scheduling decision,
because 'infinite' and 'unknown' would always have the highest weight
when the weight multiplier is positive, which in most cases it is.
Drivers are expected to avoid sending 'infinite' 'unknown' capacity
anymore, instead, should report an actual real number for total/free
capacity.
This fix doesn't fix the driver, instead a small tweak is added to
CapacityWeigher in order to downgrade those drivers who report
'infinite' or 'unknown' as free capacity. In particular, those who
report 'infinite'/'unknown' free capacity will be adjusted to be the
one has lowest weight, no matter in 'spreading' (weight multiplier>0)
or 'stacking' (weight multiplier<0) mode.
Jeremy Stanley [Mon, 22 Sep 2014 12:40:43 +0000 (12:40 +0000)]
Remove unused py33 tox env
Based on review comments in https://review.openstack.org/118771 it's
apparent that Cinder is quite a ways from Py3K support, and
developers are not expected to run the py33 tox env. Rather than add
more envs for later Python interpreter versions which will be
equally broken, just remove it for now.
Xing Yang [Sat, 20 Sep 2014 22:23:11 +0000 (18:23 -0400)]
DB migration 25->24 failed when dropping column
"cinder-manage db sync 24" failed when dropping column cgsnapshot_id
from the snapshots table.
The reason that drop column failed was because of the foreign key
constraint. MySQL cannot drop column until the foreign key constraint
is removed. So the solution is to remove the foreign key first, and
then drop the column. This affects the cgsnapshot_id column in the
snapshots table and the consistencygroup_id column in the volumes table.
With pool support added to Cinder, now we are kind of in an awkward
situation where we require admin to input exact location for volumes
to-be managed (imported) or migrated, which must have pool info, but
there is no way to find out what pools are there for backends except
looking at the scheduler log. That causes bad user experience, and
thus is a bug from UX POV.
This change simply adds a new admin-api extension to allow admin to
fetch all the pool information from scheduler cache (memory), which
closes the gap for end users.
This extension provides two level of pool information: names only or
detailed information:
Pool name only:
GET http://CINDER_API_ENDPOINT/v2/TENANT_ID/scheduler-stats/get_pools
Detailed Pool info:
GET http://CINDER_API_ENDPOINT/v2/TENANT_ID/scheduler-stats/get_pools
\?detail\=True
The latest 3PAR firmware supports a hostname of 31 characters.
This patch increases the hostname on the 3PAR from 23 characters to
31.
The driver currently has a fallback mechanism in place for detecting
existing hosts. This handles the case where upgrading from
Icehouse to Juno where Icehouse hosts have a limit of 23 characters.
Mark Sturdevant [Sun, 14 Sep 2014 00:04:27 +0000 (17:04 -0700)]
HP 3PAR drivers should not claim to have 'infinite' space
The HP 3PAR drivers report 'infinite' space when there is not a limit
set on the CPG. In this case, it would be better to at least use the
free space estimate of the array instead of 'infinite'.
Xing Yang [Thu, 18 Sep 2014 20:50:58 +0000 (16:50 -0400)]
Add tests for consistency groups DB migration
The consistency group patch added 2 new DB migration versions:
025 and 026. However, there were no unit tests to support them in
cinder/tests/test_migrations.py. This patch added missing tests.
John Griffith [Thu, 18 Sep 2014 03:16:18 +0000 (21:16 -0600)]
Verify requested size in volume.api create
Currently we're not checking that the input value requested
for size on volume create is valid, but taskflow portion of
the code expects it to be and as a result when it receives invalid
input we litter the logs with Trace messages.
This patch adds a check that verifies that if we pass in a
size to create_volume in the volume.api that it is in fact
an int or a string representation of an int.
Currently socket options, socket.SO_REUSEADDR and socket.SO_KEEPALIVE
are set only if SSL is enabled.
The above socket options should be set no matter SSL is enabled or not.
Mark Sturdevant [Fri, 12 Sep 2014 21:03:28 +0000 (14:03 -0700)]
HP 3PAR: Allow retype when the old snapshot CPG (3PAR pool) is None
A common provisioning group (CPG) is a virtual pool of logical disks.
A volume is stored in a CPG. The volume's snapshots may be stored in
the same CPG or in a separate CPG. In 3PAR this is the "snapCPG".
Before the 3PAR driver supported "manage", the snapCPG was never None in
OpenStack because when we create volumes, we explicitly default snapCPG
to match the volume CPG unless otherwise specified. So, the original
retype pre-checks raised an exception if this unexpected case occurred
and a unit test was provided for that exception.
Now that we support manage_existing(), it should be a valid use case
to manage an existing volume created outside of OpenStack with a snapCPG
set to None. Unfortunately, when we applied volume-type settings to this
volume we would hit that old exception.
That exception has been removed.
Removing it required the following changes:
1. When retype pre-checks validate the domain of the new snapCPG, it
will not use the optional old snapCPG. Instead it will compare the
new snapCPG domain with the old volume CPG domain. This satisfies
the requirement that domains cannot be mixed and avoids the need
to have an old snapCPG setting.
2. Remove the no longer used old_snap_cpg parameter and remove the
code that raised an exception "if not old_snap_cpg".
3. Remove the unit test that was just to test that obsolete exception.
4. Adjust the exiting test_retype_across_snap_cpg_domains to
verify that a snapCPG domain that does not match the volume
CPG domain will still raise Invalid3PARDomain
This patch adds support for retype operation in the VMDK driver.
The VMDK driver uses the extra spec keys 'vmware:vmdk_type' and
'vmware:storage_profile' to determine the disk provisioning type
and storage policy of the backing VM. Currently it is not possible
to change the disk provisioning type or storage policy after volume
creation. With retype support, user can change the disk provisioning
type and storage policy by changing the volume's type to another type
with extra spec keys set to desired values.
The current datastore selection logic is not modularized and difficult
to extend. The retype API implementation has a requirement to specify
hard anti-affinity requirement with the current backing datastore. Some
of the bug fixes also need to specify hard affinity with one or more
datastore types. To support such requirements and to enable future
extensions, this patch introduces a new module which contains datastore
selection logic. The existing code for datastore selection is reused as
much as possible. The dependency on existing datastore selection logic
will be removed in a separate patch.
The current datastore selection iterates over a list of hosts, for each
host, queries the connected valid datastores and tries to select a
suitable datastore. The filtering is based on space, storage profile,
number of connected hosts and space utilization. The space utilization
is used only for breaking ties. If a suitable datastore is found, further
processing of list of hosts is skipped, which could result in uneven space
utilization. To solve this, the new selection logic introduces a requirement
called 'preferred_utilization_threshold' which can be exposed as a driver
config option.
Ian Govett [Tue, 16 Sep 2014 11:32:49 +0000 (07:32 -0400)]
Fix a problem with 'volume list' when 'all_tenants=0'
This change checks the value of 'all_tenants' and returns a list of volumes
for all tenants when 'all_tenants' is 1 (or True). A tenant-specific volume
list is returned when 'all_tenants' is 0 (or False).
An InvalidInput exception is thrown if the 'all_tenants' value is not 0, 1,
True, or False (case insensitive).
Provided new unit tests for the get_all api which check the all_tenants, and
limits parameters.
Fixed pep8 compliance test fails with code being improperly indented, and
changed calls using str() to use six.text_type()
Nilesh Bhosale [Sun, 24 Aug 2014 07:36:29 +0000 (13:06 +0530)]
IBMNAS: Remove call to set r/w permissions to all
During cinder volume create operation from a volume snapshot or
from an existing volume (volume clone operation), the ibmnas
driver sets 'rw' permissions to all, which is unnecessary and
also poses security concerns.
Fixing this issue, removing the calls to set rw permissions to
all during these operations and adding a call to set 'rw'
permissions only to the owner to make sure even if umask is set
at the filesystem level, which might deny 'rw' access to the owner
we explicitely set the required permissions on the volume file.