Jay S. Bryant [Mon, 20 Apr 2015 23:05:47 +0000 (18:05 -0500)]
Add hacking check for str and unicode in exceptions
One of the comments we are frequently having to make
on reviews is with regards to the use of str() or
unicode() on exceptions. This hacking check pulled
from Nova will catch this problem earlier and avoid
conflicting comments being made in reviews.
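
As an illustration of what such a check looks for, here is a minimal sketch using Python's standard ast module. It is not the check from Nova or Cinder; the class and function names are hypothetical, and it only flags the simplest case, where str() or unicode() is called directly on the name bound by an except clause.

```python
import ast


class StrOnExceptionFinder(ast.NodeVisitor):
    """Collect line numbers where str()/unicode() wraps a caught exception."""

    def __init__(self):
        self.caught_names = []  # names bound by the enclosing except clauses
        self.violations = []    # line numbers of offending calls

    def visit_ExceptHandler(self, node):
        # "except Foo as e:" binds node.name == "e"; a bare except binds None.
        self.caught_names.append(node.name)
        self.generic_visit(node)
        self.caught_names.pop()

    def visit_Call(self, node):
        is_str_call = (
            isinstance(node.func, ast.Name)
            and node.func.id in ("str", "unicode")
        )
        if is_str_call and len(node.args) == 1:
            arg = node.args[0]
            if isinstance(arg, ast.Name) and arg.id in self.caught_names:
                self.violations.append(node.lineno)
        self.generic_visit(node)


def find_str_on_exceptions(source):
    """Return the line numbers of str(e)/unicode(e) calls in source."""
    finder = StrOnExceptionFinder()
    finder.visit(ast.parse(source))
    return finder.violations
```

Calls such as str(42) inside an except block are ignored, because the argument is not the caught exception.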
Joel Coffman [Thu, 26 Mar 2015 22:14:01 +0000 (18:14 -0400)]
Add test case for volume_encryption_metadata_get
This change adds unit tests for the volume_encryption_metadata_get
function. The unit tests provide protection against regressions when
refactoring this code as part of follow-up patches.
Writing the unit tests also exposed a minor issue with the existing
implementation of the volume_encryption_metadata_get function. If the
volume type is not encrypted, then the existing implementation would
raise an exception due to volume_type_encryption_get returning None.
In practice, this issue would not be encountered due to separate
checks to ensure that the volume type is encrypted, but a small
refactoring obviates the need for these checks and allows the
volume_encryption_metadata_get function to be invoked for both
encrypted and "normal" volumes.
A separate patch will clean up the unnecessary checks to ensure that
the volume type was encrypted prior to calling this function.
Deliang Fan [Fri, 24 Apr 2015 03:16:46 +0000 (11:16 +0800)]
Don't truncate osapi_volume_link prefixes
When osapi_volume_link_prefix is defined and used to update the
links returned in API responses, do not drop the path component of
the overriding link prefix.
After the 3PAR drivers were refactored to remove the local
file locks, the login mechanism was logging the common and
client version numbers on every driver entry point.
This patch removes that logging except at driver startup.
John Griffith [Thu, 23 Apr 2015 15:05:57 +0000 (09:05 -0600)]
Sync oslo service module
This does a full sync of the oslo.service module. Note
that we've cherry picked some critical bug-fix changes
to this module already, this commit just syncs the full
module properly and gets us up to date where we should be.
Current HEAD in OSLO:
-----------------------
commit: d5edda00b4eca65d57f94bd0ac1b790e6d1f732e
Date: Wed Apr 22 19:49:00 2015 +0000
Merge "service child process normal SIGTERM exit"
Changes merged with this patch:
---------------------------------
d5edda00 - Merge "service child process normal SIGTERM exit"
702bc569 - service child process normal SIGTERM exit
64b5819e - Revert "Revert "Revert "Optimization of waiting subprocesses in ProcessLauncher
f5646edc - Revert "Revert "Optimization of waiting subprocesses in ProcessLauncher
d23b6589 - Revert "Optimization of waiting subprocesses in ProcessLauncher"
593005b7 - ProcessLauncher: reload config file in parent process on SIGHUP
f29e865d - Store ProcessLauncher signal handlers on class level
bf92010c - Optimization of waiting subprocesses in ProcessLauncher
NOTE: Commit 702bc569 was actually pulled in with commit d73ac96d.
We shouldn't have merged that individual commit. I include the
commit here to document the Oslo level that the service module is at,
cumulatively, between that commit and this patch.
John Griffith [Thu, 23 Apr 2015 18:07:12 +0000 (12:07 -0600)]
Add external genconfig calls
After moving to oslo.config we still were using
incubator config generator. This was ok, but the
problem is we haven't been pulling config options
from the oslo libs.
This is a hack that just appends external lib calls
and appends those options to the sample file being built.
Alex Meade [Tue, 24 Feb 2015 21:22:58 +0000 (16:22 -0500)]
NetApp E-Series: Fix instance live-migration with attached volumes
Currently, live migrations of instances with attached volumes that live
on a NetApp E-Series backend will fail and break connectivity to the
guest. This patch adds the 'netapp_enable_multiattach' configuration
option that enables multiattach operations with the E-Series driver.
It defaults to allowing these operations but needs to be configurable
since allowing for multiple attachments imposes a limit of 256 volumes
on the backend due to how multiple attachments must be managed by
E-Series.
Multiattach operations are enabled by mapping volumes to an E-Series
host group on the backend called 'cinder-host-group'. Host groups can
only have 256 mappings at a time and so we must limit the number of
volumes in order to guarantee any volume created could then be attached.
John Griffith [Sat, 18 Apr 2015 00:19:37 +0000 (00:19 +0000)]
Add resource tag to logging in volume.manager.py
We now have resource tag support in oslo logging,
and our logging is pretty inconsistent and downright
ugly in places. Let's clean things up based on the
standard logging guidelines and use the fancy new
resource tag.
To use, set the following in cinder.conf:
logging_context_format_string = \
%(asctime)s.%(msecs)03d %(levelname)s %(name)s [%(request_id)s \
%(project_name)s] %(resource)s%(message)s
This change hits the majority of the code in manager and
should be used as an example/guide. There are some exceptions
around migration and replication where things are kinda ugly,
those should be fixed up as follow up work.
John Griffith [Wed, 22 Apr 2015 04:28:34 +0000 (04:28 +0000)]
Remove force check from copy_volume_to_image
The upload_volume_to_image method allows a force parameter that will
upload a volume even though the volume is attached/in-use. A user can
get away with this with the LVM driver because the LVM backing is
local to the Cinder worker node that's pushing the bits to Glance.
The problem is, this only works for local storage, it won't work
with any iSCSI devices because they can't do multi-attach. Also,
the reason we required that a volume NOT be in-use for this
operation is because we have no way of keeping the guest Instance
from writing to the volume while we're uploading, and corrupting the data.
This has been exposed like this for several releases, so removing it
now likely would not be a good user experience. Instead, this
patch adds a config option to enable/disable it (default is to
disable), and deployers can choose whether they would like to
allow the use of --force True or not.
DocImpact Disables the --force option to copy-volume-to-image and
introduces "allow_force_upload" boolean option to re-enable
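
The resulting decision can be sketched as follows. This is a minimal illustration, not the code from the patch: the function name and the status strings are assumptions, and only the "allow_force_upload" option name comes from the text above.

```python
def upload_allowed(volume_status, force, allow_force_upload=False):
    """Decide whether a volume may be uploaded to an image.

    volume_status and its values ("available", "in-use") are assumed here
    for illustration; allow_force_upload mirrors the config option named
    in the commit message and defaults to disabled.
    """
    if volume_status == "available":
        return True
    # An attached volume is only uploaded when the user asks for --force
    # AND the deployer has explicitly re-enabled forced uploads.
    return volume_status == "in-use" and force and allow_force_upload
```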
service.py had some code where the child process would catch the
SIGTERM from the parent just so it could exit with 1 status rather
than with an indication that it exited due to SIGTERM. When
shutting down the parent doesn't care in what way the child ended,
only that they're all gone, so this code is unnecessary.
Also, for some reason this caused the child to never exit while
there was an open connection from a client. Probably something
with eventlet and signal handling.
John Griffith [Mon, 20 Apr 2015 21:38:22 +0000 (15:38 -0600)]
Move unit tests into dedicated directory
This patch moves all of the existing cinder/tests into
cinder unit tests. This is being done to make way for
the addition of cinder/tests/functional.
Yes, this is going to cause significant pain with
any changes that haven't merged behind it in terms
of rebase, but there's no real alternative. We have
to rip the band-aid off at some point, and early in L
seems like a great time to do it.
Ivan Kolodyazhny [Mon, 20 Apr 2015 19:53:14 +0000 (22:53 +0300)]
Move RBD calls to a separate thread
RBD is a python binding for librados which isn't patched by eventlet.
Running long tasks such as removing big (~100GB, ~1TB) volumes
blocks the eventlet loop, and the whole cinder-volume service hangs
until the task finishes when rados_connect_timeout is disabled. This
makes cinder-volume services unavailable for a while.
This patch moves all rados calls to a separate python thread which
doesn't block the eventlet loop.
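
The general idea of pushing a blocking call onto a separate thread can be sketched with the standard library alone. This is an assumption-laden illustration: it uses concurrent.futures rather than eventlet, slow_remove is a hypothetical stand-in for a blocking rados call, and it does not show how the patch itself integrates with eventlet.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# A small pool of native threads that absorbs blocking calls.
_executor = ThreadPoolExecutor(max_workers=4)


def slow_remove(volume_name, seconds=0.2):
    """Hypothetical stand-in for a long, blocking library call."""
    time.sleep(seconds)
    return "removed %s" % volume_name


def remove_in_thread(volume_name):
    """Submit the blocking call to the pool and return a Future at once."""
    return _executor.submit(slow_remove, volume_name)
```

The caller gets a Future back immediately and can keep serving other requests, collecting the result later with future.result().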
RBD: Add missing Ceph customized cluster name support
It turns out '--cluster' is also needed when RBD driver talks to
ceph cluster using 'ceph' command (not via librados). This change
appends RBDDriver._ceph_args with '--cluster' when 'rbd_cluster_name'
config option is not None.
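
A minimal sketch of how such an argument list might be assembled is shown below. The function name and the user and conf parameters are hypothetical; only the rule that '--cluster' is appended when the cluster name is not None comes from the description above.

```python
def build_ceph_args(user=None, conf=None, cluster_name=None):
    """Build extra arguments for the 'ceph' command line (illustrative)."""
    args = []
    if user:
        args.extend(["--id", user])
    if conf:
        args.extend(["--conf", conf])
    # Only pass --cluster when a cluster name has been configured.
    if cluster_name is not None:
        args.extend(["--cluster", cluster_name])
    return args
```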
John Griffith [Fri, 17 Apr 2015 20:51:06 +0000 (20:51 +0000)]
Standardize logging in volume.api.py
We now have resource tag support in oslo logging,
and our logging is pretty inconsistent and downright
ugly in places. Let's clean things up based on the
standard logging guidelines and use the fancy new
resource tag.
This patch starts with the volume.api file as that's
'easy', so we can enforce things going forward and start
working out other files in future patches.
To use, set the following in cinder.conf:
logging_context_format_string = \
%(asctime)s.%(msecs)03d %(levelname)s %(name)s [%(request_id)s \
%(project_name)s] %(resource)s%(message)s
VolMgr: reschedule only when filter_properties has retry
In the task flow for the volume manager's create_volume tasks, a volume
gets rescheduled even when the scheduler doesn't indicate so. The problem
is the flow should not only check 'allow_reschedule' and 'request_specs',
but also (more importantly) filter_properties['retry'], which is
populated by the scheduler if scheduler_max_attempts is set to > 1. This
check was there before taskflow was introduced, but somehow the
migration missed the check for filter_properties['retry'].
This change adds back the check, so scheduler_max_attempts won't be
treated like scheduler_max_attempts = infinite.
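
The restored condition can be sketched as a small predicate. This is an illustration under assumptions, not the code from the patch: the function name and parameter shapes are hypothetical, and only the three inputs being checked come from the description above.

```python
def should_reschedule(allow_reschedule, request_spec, filter_properties):
    """Return True only when rescheduling was actually requested."""
    if not allow_reschedule or not request_spec:
        return False
    # 'retry' is only populated by the scheduler when more than one
    # attempt is allowed; without it, a failure must not be rescheduled.
    return bool((filter_properties or {}).get("retry"))
```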
This patch adds the fix that exists in the nova
libvirt volume code to use oslo_utils strutils
to mask passwords that might show up in debug log
messages.
The current RBD driver assumes the ceph cluster name to be 'ceph'. If
the cluster has a different name, the driver won't be able to connect
to the cluster. This change adds a new config option
'rbd_cluster_name' to address this issue.
This allows operations that do not conflict with each
other (i.e. are on different volumes) to run concurrently.
The prior locking scheme was too coarse and essentially
made the driver single-threaded.
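
Finer-grained locking of this kind can be sketched as one lock per volume. This is a generic illustration with hypothetical names, not the driver's actual locking code.

```python
import threading
from collections import defaultdict

_locks_guard = threading.Lock()               # protects the lock table itself
_volume_locks = defaultdict(threading.Lock)   # one lock per volume id


def lock_for(volume_id):
    """Return the lock dedicated to a single volume, creating it on demand."""
    with _locks_guard:
        return _volume_locks[volume_id]
```

Two operations on the same volume serialize on the same lock, while operations on different volumes take different locks and can proceed concurrently.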
This patch moves the implementation of the GlusterFS driver locking
scheme to the RemoteFS base driver so that other similar volume
drivers can use it.
Lucian Petrut [Fri, 27 Mar 2015 12:15:25 +0000 (14:15 +0200)]
Windows SMBFS: fix volume extend
The Windows SMBFS driver inherits the Linux SMBFS driver,
overriding Windows specific methods.
Change Ic89cffc93940b7b119cfcde3362f304c9f2875df added the
volume name as an extra argument to the _do_extend_volume in order
to check if differencing images are pointing to backing files other
than the corresponding volume disks.
Although this is not required on Windows, this method should accept
this extra argument in order to have the same signature as the
method it overrides. At the moment, this raises the following
exception:
Tom Swanson [Thu, 9 Apr 2015 20:24:27 +0000 (15:24 -0500)]
Reworked Dell SC iSCSI target portal return
On initialize_connection the code to determine the portal, lun and
iqn info to return could skip ports and potentially not return the
best portal choice. In the case of multipath being enabled not all
ports would be returned.
Also changed a LOG.debug to a LOG.info in initialize_connection.
Navneet Singh [Wed, 1 Oct 2014 18:31:41 +0000 (00:01 +0530)]
Fix LUN misalignment issue with NetApp iSCSI drivers
This patch fixes LUN misalignment issues that can be
experienced when provisioning Cinder volumes with the iSCSI
protocol. In the fix, two new SAN options for configuring
LUN OS and initiator OS used while attaching volume can be
specified. This gives the admin flexibility to configure
backends correctly for multiple hypervisor and guest OS
platforms.
VNX Cinder Driver should report 0 free_capacity_gb in some scenarios
When the storage pool is Initializing, Offline or Deleting, no more LUNs
can be created until the pool leaves that state.
So when a pool is in any of these three states, its free capacity is de facto 0.
This patch adds logic to report 0 free capacity accordingly.
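
The rule can be sketched in a few lines. The function and constant names are hypothetical; the three state names are taken from the description above.

```python
# Pool states in which no new LUN can be created.
UNUSABLE_POOL_STATES = ("Initializing", "Offline", "Deleting")


def reported_free_capacity_gb(pool_state, free_capacity_gb):
    """Report 0 free capacity while the pool cannot accept new LUNs."""
    if pool_state in UNUSABLE_POOL_STATES:
        return 0
    return free_capacity_gb
```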
Jon Bernard [Tue, 7 Apr 2015 17:57:36 +0000 (13:57 -0400)]
Include boot properties from glance v2 images
In order for users to take advantage of COW volumes created from
a glance image, Cinder must be configured to use Glance API version
2 (default is 1). In version 2, the required boot metadata (kernel_id
and ramdisk_id) are no longer stored in the 'properties' dict, but as
standalone fields in the GET response from glance. The existing cinder
parser for the glance request is not aware of this and the volume
created from a v2 image will lack this required metadata.
This was causing the recent Ceph CI gate failures for
test_volume_boot_pattern.
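
The difference between the two layouts can be illustrated with a small helper that folds the standalone fields back into a properties dict. This is a sketch under assumptions: the function name and the plain-dict image representation are hypothetical, and it is not the parser change from the patch.

```python
BOOT_KEYS = ("kernel_id", "ramdisk_id")


def extract_boot_properties(image):
    """Return image properties including boot metadata from either layout.

    image is assumed to be a plain dict shaped like a glance response.
    """
    properties = dict(image.get("properties") or {})
    for key in BOOT_KEYS:
        # v2 responses carry these as standalone, top-level fields.
        if image.get(key) is not None:
            properties.setdefault(key, image[key])
    return properties
```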
Sean McGinnis [Thu, 9 Apr 2015 15:32:03 +0000 (10:32 -0500)]
Logging not using oslo.i18n guidelines (brick)
Multi-patch set for easier chunks. This one addresses
the backup and common cinder directories.
Updates have already been made to the os-brick project.
There have been quite a few instances found where the
i18n guidelines are not being followed. I believe this
has helped lead to some of the confusion around how to
correctly do this. Other developers see this code and
assume it is an example of the correct usage.
This patch attempts to clean up most of those violations
in the existing codebase to hopefully help avoid some of
that confusion in reviews.
Some issues addressed:
* Correct log translation markers for different log levels
* Passing format values as arguments to call, not preformatting
* Not forcing translation via six.text_type and others
Guidelines can be found here:
http://docs.openstack.org/developer/oslo.i18n/guidelines.html
Hacking checks will not be able to identify all violations of
the guidelines, but it could be useful for catching obvious ones
such as LOG.info("No markers!").
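
The second point above, passing format values as arguments rather than preformatting, can be illustrated with the standard logging module. This sketch leaves out the translation markers, since those come from the project's i18n module; the logger name, handler class and function are hypothetical.

```python
import logging

LOG = logging.getLogger("i18n_example")


class ListHandler(logging.Handler):
    """Keep emitted records in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def log_volume_created(volume_id):
    # Values are passed as an argument, so interpolation is deferred until
    # a handler actually formats the record.
    LOG.info("Created volume %(id)s", {"id": volume_id})
    # Discouraged: LOG.info("Created volume %(id)s" % {"id": volume_id})
```

With the deferred form, the record keeps the original format string and its arguments separately, which is what allows the message to be translated or skipped without formatting cost.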
Removed sleep before 'YES' is sent to confirm an operation
Removed sleep between the time a command is sent
and 'YES' is sent to confirm the operation. We have
discussed this internally and have concluded that
the sleep was mistakenly added in the initial coding
from an example of the SSH code. We have done full
functionality testing of the zoning operations on
Brocade switches with the sleep removed and have
found no side effects from the removal.
Additionally, no other Brocade management
applications which perform zoning on our switches
require sleep intervals between execution of
commands. We are confident that the sleep is not
necessary and can be removed.
Update openstack-common reference in openstack/common/README
The README file under the openstack/common directory refers to
openstack-common, but the link points to oslo-incubator (which is
correct). Update the file, so that it makes use of oslo-incubator
instead of openstack-common.
Vincent Hou [Tue, 3 Mar 2015 08:04:41 +0000 (16:04 +0800)]
Delete the temporary volume if migration fails
Issues resolved in this patch include the following changes:
* A temporary volume is created on the destination host before migrating
the data from the source. However, if the creation of this volume fails,
its record will be left in the database as redundant information. This
patch will remove the database record if the creation fails.
* If attaching the remote dest volume fails at initialize_connection
due to timeout, we need to terminate the connection. Otherwise, the dest
volume will not be released and successfully deleted.
Jay S. Bryant [Mon, 6 Apr 2015 19:02:31 +0000 (14:02 -0500)]
Correct cinder hacking check numbering
We have a couple of hacking checks that are specific to
Cinder that were written a while back. Unfortunately, when
they were written the numbering scheme for hacking checks was
not understood. We used N3xx when we should have used C3xx.
Jay S. Bryant [Mon, 6 Apr 2015 16:22:09 +0000 (11:22 -0500)]
Add hacking check for print() statements
We are frequently having to -1 patches because people
forget print() statements that were used for debug in
their development. We can save everyone time and trouble by
adding this simple hacking check.
The check excludes the cinder/cmd directory as the files in there
legitimately need to use the print() command. Also wsgi.py and
the test_hds_nas_backend.py files make use of print, so I have
excluded those from checking as well.
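
A check along these lines can be sketched with a regular expression. This is a simplified illustration, not the check added by the patch: the function name, message text and return convention are hypothetical, and only the cinder/cmd exclusion is taken from the description above.

```python
import re

PRINT_CALL = re.compile(r"^\s*print\s*\(")
EXCLUDED_PREFIXES = ("cinder/cmd/",)  # files that legitimately print


def check_no_print(logical_line, filename):
    """Return (offset, message) for a stray print() call, else None."""
    if filename.startswith(EXCLUDED_PREFIXES):
        return None
    if PRINT_CALL.match(logical_line):
        return (0, "print() found; remove debug output before committing")
    return None
```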
wuyuting [Fri, 20 Mar 2015 10:41:11 +0000 (18:41 +0800)]
Rbd update volume stats in wrong way
Cinder volume uses a RADOS pool to store volumes and
update storage stats periodically. However, rbd driver
reports the whole cluster's stats rather than the pool's. This
is wrong. The right way is to report the pool stats, not
those of the whole cluster.