David Pineau [Thu, 6 Nov 2014 14:01:13 +0000 (15:01 +0100)]
Fix the LV NotFound situation for thin-type LVM
If the logical volume is not found, LVM displays on the error output
that the volumes could not be found. So here, we filter on this very
specific situation, and let all the other cases go through the stack.
Added a test for this new code path, which raises an exception of the
proper type to be caught by the new code.
Matt Riedemann [Fri, 7 Nov 2014 14:23:09 +0000 (06:23 -0800)]
Retry remove iscsi target
There is a race in the gate when removing an iscsi
target in the remove export flow where the target
is still active. It's about a 75% failure rate so
simply adding a configurable retry on the delete
call should clean that up.
John Griffith [Tue, 4 Nov 2014 22:34:28 +0000 (23:34 +0100)]
Remove test_barbican from keymgr tests
Unfortunately it seems we have some very poorly
written keymgr unit tests, specifically
keymgr/test_barbican.py does this:
from barbicanclient.common import auth
First problem is that from a unit test perspective
that pretty much sucks, second problem is that
barbicanclient as of version 3.0.0.0 no longer
has an "auth" module, as a result Cinderunit tests
now fail.
The test_barbican.py unit tests need to be rewritten
to mock out the client components and actually test
Cinder components where needed without relying on
the barbicanclient.
For now in order to free up the gate, remove test_barbican.py
Mike Mason [Thu, 23 Oct 2014 13:16:42 +0000 (13:16 +0000)]
Implementing the use of _L’x’/i18n markers
Placing the _Lx markers back into the code. No other cleaner solution has
has been implemented. Patches will be submitted in a series of sub
directories and in a fashion that is manageable.
ZhiQiang Fan [Sat, 1 Nov 2014 18:18:35 +0000 (02:18 +0800)]
Disable python-barbicanclient 3.0.0 version
python-barbicanclient 3.0.0 has introduced cliff module, but pins it
to 1.6.1, while global-requirements set it to >=1.7.0, now the whole
OpenStack projects depends on higher version of cliff, but
python-barbicanclient is needed by cinder, then grenade test will
fail because: pkg_resources.DistributionNotFound: cliff==1.6.1
As long as python-barbicanclient maintainers don't provide a patch
for 3.0.0 (which seems not happen FMPOV), we need to disbable this
version.
Adrien Vergé [Tue, 28 Oct 2014 21:00:09 +0000 (22:00 +0100)]
Cleanly override config in tests
CONF.set_override() is often called in tests but CONF.clear_override()
is never. Create a override_config() method in the base TestCase class
that restores previous conf value after test, as it is done in other
OpenStack projects.
Xing Yang [Thu, 9 Oct 2014 05:26:28 +0000 (01:26 -0400)]
Use look up service for auto zoning
The VMAX FC driver didn't use the look up service for auto zoning.
Instead it built initiator target map itself. However, that
requires the initiator to log into the fabric before zoning
in order to find out target WWNs.
This patch is to use the look up service to find out valid initiator
target WWNS and use that to build initiator target map. With this fix,
the initiator is no longer required to log into the fabric ahead of time.
Xing Yang [Sun, 26 Oct 2014 21:29:26 +0000 (17:29 -0400)]
CiscoFCSanLookupSerive uses extra argument in init
This patch fixed two issues with the __init__ routine in
CiscoFCSanLookupService:
1. There's an extra argument in super(CiscoFCSanLookupService,
self).__init__(self, **kwargs). It should be changed to
super(CiscoFCSanLookupService, self).__init__(**kwargs).
2. The last line 'self.fabric_configs = ""' should be removed.
self.fabric_configs was created in self.create_configuration() in the
middle of the __init__ routine. It shouldn't be cleared out at the end
of the __init__ routine.
John Griffith [Thu, 23 Oct 2014 16:37:11 +0000 (16:37 +0000)]
Fix SolidFire inaccurate model on migrated vols
The general migration impl in Cinder works
by creating a new volume, transfering the data
from the original volume to the new volume, and
then deleting the original and flipping the ID
of the new volume.
Turns out we missed the fact that this creates a
mismatch between the volume Cinder will later ask
for and what the volumes identity is on the backend
device.
This change adds a check on create_volume at the drivers
level to see if it's part of a migration and is infact
going to get renamed. If so, just use the new name
and avoid all the headaches that come later with updating
provider auth and location.
The model info won't change in this case and is accessible
independent of the ID field in the Cinder base and the
crazy change that's going to take place on that value
in the Cinder DB.
abhishekkekane [Tue, 21 Oct 2014 09:31:15 +0000 (02:31 -0700)]
Eventlet green threads not released back to pool
Presently, the wsgi server allows persist connections hence even after
the response is sent to the client, it doesn't close the client socket
connection.
Because of this problem, the green thread is not released back to the pool.
In order to close the client socket connection explicitly after the
response is sent and read successfully by the client, you simply have to
set keepalive to False when you create a wsgi server.
DocImpact:
Added wsgi_keep_alive option (default=True).
In order to maintain the backward compatibility, setting wsgi_keep_alive
as True by default. Recommended is set it to False.
John Griffith [Tue, 21 Oct 2014 23:19:22 +0000 (23:19 +0000)]
Add ability to update migration info on backend
The current migration process creates a new volume,
xfr's it's contents, then deletes the original and
modifies the new volume to have the previous ID.
All in all this is kinda troublesome, but regardless
the bigger problem is that the generic impl doesn't
provide any method to tell the backend devices that
their device names/id's have changed.
This patch provides a method to inform backends
that a migration operation has been completed on
their target volume.
It shouldn't be necessary to do anything with the originating
or source volume because it's deleted as part of the process.
John Griffith [Fri, 24 Oct 2014 14:19:19 +0000 (08:19 -0600)]
Reserve 5 migrations for backports
Reserve 5 migrations incase the need arises to backport any
fixes that require a db migration in stable/juno.
We've never set this up in the past and we did run into a case
last cycle where we had to hack some things around to make it work
without the place holder.
Why 5? Why not? For as little as we touch the DB historically
this number should be more than sufficient.
Tomoki Sekiyama [Wed, 22 Oct 2014 22:30:06 +0000 (18:30 -0400)]
LioAdm: Delete initiator from targets on terminate_connection
In current LioAdm implementation, initiators are remained even if
terminate_connection is called. This keeps volumes exported to hosts
after instances attaching the volumes are live-migrated to another
host, which is not good for security. It also causes an error on the
migration back to the original host, because cinder-rtstool doesn't
update CHAP authentication if the initiator already exists.
With this patch, initiators are deleted on terminate_conection.
'initiator-delete' operation is added to cinder-rtstool.
It makes the following live-migration succeed.
Also, this adds unit tests for initialize_connection and
terminate_connection methods in LioAdm.
This patch allows an OpenStack environment to run as a secure NAS
environment from the client and server perspective, including having
root squash enabled and not running file operations as the 'root'
user. This also sets Cinder file permissions as 660: removing
other/world file access.
The "nas_secure_file_permissions" option controls the setting of file
permissions when Cinder volumes are created. The option defaults to
"auto" to gracefully handle upgrade scenarios. When set to "auto",
a check is done during Cinder startup to determine if there are
existing Cinder volumes: no volumes will set the option to 'true',
and use secure file permissions. The detection of existing volumes will
set the option to 'false', and use the current insecure method of
handling file permissions.
The "nas_secure_file_operations" option controls whether file
operations are run as the 'root' user or the current OpenStack
'process' user. The option defaults to "auto" to gracefully handle
upgrade scenarios. When set to "auto", a check is done during Cinder
startup to determine if there are existing Cinder volumes: no volumes
will set the option to 'true', be secure and do NOT run as the 'root'
user. The detection of existing volumes will set the option to 'false',
and use the current method of running operations as the 'root' user.
For new installations, a 'marker file' is written so that subsequent
restarts of Cinder will know what the original determination had been.
This patch enables this functionality only for the NFS driver.
Other similar drivers can use this code to enable the same
functionality with the same config options.
Tomoki Sekiyama [Mon, 20 Oct 2014 18:32:55 +0000 (14:32 -0400)]
TgtAdm: Don't change CHAP username/password on live migration
As tgtd doesn't update CHAP username/password while the initiator is
connected, CHAP username/password must not be changed while a Nova
instance are performing live-migration; otherwise the compute node
which the instance migrates to cannot login to the volume and the
migration process is aborted.
This fixes TgtAdm implementation not to regenerate random
username/password every time initialize_connection is called.
Also, it enables CHAP auth in unit tests of TargetAdmin helpers.
John Griffith [Fri, 17 Oct 2014 04:43:20 +0000 (22:43 -0600)]
Turn on Flake-8 Complexity Checking
Flake8 provides the ability to measure code complexity. There are
a lot of modules in Cinder that are considered "too complex", the
worst being "cinder/tests/test_huawei_hvs.py:110:1:" with a complexity
ranking of 59.
There's some outlyers at the higher end here, but the majority of the
code checks in at under 30, so let's make that our threshold and ignore
the two offenders that are above that for now.
Granted this may or may not be valuable, but it doesn't hurt to try it
and if we all hate it or find there's no value but it makes life difficult
we can always turn it back off.
See flake8.readthedocs for more info on flake8 and McCabe complexity
checking.
Matt Riedemann [Thu, 16 Oct 2014 15:39:07 +0000 (08:39 -0700)]
Log a warning when getting lvs and vgs takes longer than 60 seconds
We know something is causing lvs/vgs commands to block while deleting a
volume and this is causing Tempest to timeout while waiting for the
volume to be deleted. What we don't have right now is very good
(specific) logging when this happens, unless we get messages in syslog
for lvm tasks taking more than 120 seconds, but that doesn't always
happen when we see the volume delete timeout in Tempest.
This patch adds a check for when getting logical volumes and volume
groups takes longer than 60 seconds and logs a warning if that happens.
This is helpful in production also because the default interval for
periodic tasks is 60 seconds so having these take longer than that time
could cause periodic tasks to block up on each other and you'll get
warnings from the FixedIntervalLoopingCall in oslo which is controlling
the task runs.
Stuart McLaren [Fri, 5 Sep 2014 12:48:04 +0000 (12:48 +0000)]
Add client_socket_timeout option
Add a parameter to take advantage of the new(ish) eventlet socket timeout
behaviour. Allows closing idle client connections after a period of
time, eg:
$ time nc localhost 8776
real 1m0.063s
Setting 'client_socket_timeout = 0' means do not timeout.
Vincent Hou [Fri, 10 Oct 2014 07:46:36 +0000 (15:46 +0800)]
IBM Storwize driver: Add local variable assignment to "ctxt"
* The method get_vdisk_params in helpers.py is missing a local variable
assignment for "ctxt", causing "UnboundLocalError: local variable
'ctxt' referenced before assignment. Adding the assignment should
resolve this issue.
* Add the unit tests coverage for get_vdisk_params.
Patrick East [Tue, 14 Oct 2014 22:45:38 +0000 (15:45 -0700)]
Multipath commands with error messages in stdout fail to parse
This change fixes an issue in find_multipath_device() where the command
output of ‘multipath -l <device>’ would sometimes fail to be parsed if
there were error messages in the stdout string in addition to the
expected output. We will now strip out the error messages before we
attempt to parse the lines.
Andrew Kerr [Thu, 29 May 2014 03:16:23 +0000 (08:46 +0530)]
NetApp fix to set non default server port in api
The non default netapp_server_port config option was not
getting set in api even if specified in cinder.conf. Its
made non mandatory and set if specified in the configuration.
Tomoki Sekiyama [Tue, 14 Oct 2014 23:09:44 +0000 (19:09 -0400)]
Fix LVM iSCSI driver tgtadm CHAP authentication
Currently CHAP Authentication in LVM iSCSI driver with tgtadm does not work.
This is because the tgtadm helper creates the target configuration file
with an 'IncomingUser' entry, which is ignored by tgtd.
This patch fixes it to 'incominguser'.
Mitsuhiro Tanino [Tue, 14 Oct 2014 16:41:41 +0000 (12:41 -0400)]
Export cinder volumes only if the status is 'in-use'
Currently, cinder volumes are exported both 'in-use' and 'available'
after restarting cinder-volume service.
This behavior was introduced following commit.
If the volumes are attached to nova instances, they should be exported
via tgtd after restarting cinder-volume.
But the volumes which are not attached to instances must not be exported
because everyone can connect these volumes.
This patch changes volume export behavior that exports a volume only if
the volume status is 'in-use'.
John Griffith [Fri, 10 Oct 2014 01:22:03 +0000 (19:22 -0600)]
Move SolidFire driver from httplib to requests
The SolidFire driver has been pretty static for a number of
years now, this change is to move from httplib for API calls
to requests. There are a number of advantages to this, including
performance, simplicity and ability to add things like ssl support
easily.
In addtion this change removes the confusing looping/retry mechanisms
that were in the issue_api_request method and replaces it with a
retry decorator for the exceptions we're interested in retrying.
Finally, I realize that my unit tests suck! That will be one of the
follow up items after a bit more clean up in the driver.
we need to to check the value of the configuration item glance_num_retries
in the code in order to ensure the " glance_num_retries " is equal or greater
than 0
The WSDL URL of storage policy service is determined and a session is
created using it in do_setup(). This session is later used to initialize
the datastore selector property (ds_sel), which uses the session for all
storage policy related API calls.
After commit a8fa3ceb1e72bac2ab67f569a2ca009f995f59fd (Integrate
OSprofiler and Cinder), the properties defined in vmdk module are called
before do_setup(). As a result, the ds_sel (datastore selector) property
is initialized with a session instance containing a 'None' PBM (storage
policy service) WSDL URL. This results in failures of all storage policy
related APIs invoked using datastore selector. This patch fixes the
problem by re-initializing the property in do_setup().
Fix exception handling on test_delete_nonexistent_volume
test_delete_nonexistent_volume wants to check the exception would be
received when nonexistent volume was specified. But currently, this
test case checks the exception would be received when nonexistent
metadata was specified.
we need to to check the value of the configuration item eqlx_cli_max_retries
in the code in order to ensure the "eqlx_cli_max_retries" is equal to or
greater than 0
DocImpact: The 'retries' is not a configured number of attempts
Change-Id: If9fadda83a855b4bbda6129d3b3a64d296eb2b54
Closes-Bug: #1372454