Alex O'Rourke [Thu, 10 Mar 2016 21:33:20 +0000 (13:33 -0800)]
Fix up failover_host exceptions to preserve states
UnableToFailOver and InvalidReplicationTarget do not save the state
of replication_status currently. This patch adds host.save() in
order to write the change to the db.
In addition, the manager should honor the current replication state
of the host when InvalidReplicationTarget is raised instead of forcing
it into 'enabled' state.
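A minimal sketch of the state handling described above. Host, failover_host and the status strings other than 'enabled' are hypothetical stand-ins for illustration, not Cinder's actual manager code:

```python
class UnableToFailOver(Exception):
    """Stand-in for the exception named in the commit message."""


class InvalidReplicationTarget(Exception):
    """Stand-in for the exception named in the commit message."""


class Host:
    """Hypothetical persisted host record; saved_status plays the db."""

    def __init__(self, replication_status):
        self.replication_status = replication_status
        self.saved_status = replication_status

    def save(self):
        # Write the in-memory state to the "db".
        self.saved_status = self.replication_status


def failover_host(host, driver_failover):
    previous = host.replication_status
    try:
        driver_failover()
    except UnableToFailOver:
        # Illustrative status value; the point is that it gets saved.
        host.replication_status = "failover-error"
        host.save()
    except InvalidReplicationTarget:
        # Honor the state the host had before, do not force 'enabled'.
        host.replication_status = previous
        host.save()
    else:
        host.replication_status = "failed-over"
        host.save()
    return host
```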
Michał Dulko [Wed, 9 Mar 2016 15:04:20 +0000 (16:04 +0100)]
Add volume_type to volume object expected_attrs
We haven't had volume_type in expected_attrs for volume object lists.
This resulted in a situation in which, although we were joining the
volume_type explicitly in the DB API, the object just dropped the data.
Volume type was then lazy loaded when needed, so every "cinder list"
call was making additional DB queries per returned volume, causing a
massive performance drop.
Actually there were two unnecessary DB calls per volume, because not
only volume_type was fetched, but also volume_type.extra_specs as a result
of passing 'extra_specs' in expected_attrs when calling
VolumeType._from_db_volume in Volume._from_db_volume (wow, that's
complicated…).
This commit sorts this out by adding volume_type to expected_attrs to
match what we join in the DB. Please note that I'm not adding
consistencygroup and volume_attachment on purpose: adding them causes some
unit test failures, and this late in the release it seems risky to try
fixing that. The changes also required a minor rework of the expected_attrs
infrastructure in the o.vo layer to be able to pass different values
when we query for just a single volume and when we fetch whole list (as
we're doing different joins in the DB layer in both cases).
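To illustrate the lazy-loading cost, here is a small sketch that counts queries. FakeDB, the simplified Volume class and list_types are hypothetical and only mimic the pattern; they are not the real o.vo classes:

```python
class FakeDB:
    """Hypothetical DB API that counts how many queries it serves."""

    def __init__(self):
        self.queries = 0
        self.types = {1: "gold", 2: "silver"}

    def volume_get_all(self):
        self.queries += 1
        # The join already returns volume_type with each row.
        return [
            {"id": "a", "volume_type_id": 1, "volume_type": "gold"},
            {"id": "b", "volume_type_id": 2, "volume_type": "silver"},
        ]

    def volume_type_get(self, type_id):
        self.queries += 1
        return self.types[type_id]


class Volume:
    """Simplified object: keeps only the attributes it was told to expect."""

    def __init__(self, db, row, expected_attrs=()):
        self._db = db
        self._type_id = row["volume_type_id"]
        self._loaded = {attr: row[attr] for attr in expected_attrs}

    @property
    def volume_type(self):
        if "volume_type" not in self._loaded:
            # Lazy load: one extra query per volume.
            self._loaded["volume_type"] = self._db.volume_type_get(
                self._type_id)
        return self._loaded["volume_type"]


def list_types(db, expected_attrs):
    rows = db.volume_get_all()
    return [Volume(db, row, expected_attrs).volume_type for row in rows]
```

With an empty expected_attrs the joined data is dropped and each volume triggers its own query; passing ("volume_type",) keeps the list call at a single query.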
Vincent Hou [Tue, 1 Mar 2016 19:26:52 +0000 (14:26 -0500)]
Storwize: Update replication to v2.1
This patch updates replication to match the v2.1 spec. This makes
it possible to replicate an entire backend, and upon failover, all
replicated volumes will be failed over together.
cinder.conf should have the replication config group:
The replication can be configured via either multi-backend on one
cinder volume node, or on separate cinder volume nodes.
Options to be put in cinder.conf, where the primary back-end is
located:
Patrick East [Wed, 9 Mar 2016 19:08:27 +0000 (11:08 -0800)]
Switch failover-host from rpc call to cast
There is some concern that with large numbers of volumes it will be
difficult for drivers to fail over the host before the rpc timeout hits.
To avoid asking admins to bump the timeout just for these cases we can
switch it to do a non-blocking cast instead of call. The difference now
being that the active_backend_id is not returned from the API call to
failover-host. An admin will have to look at the service-list output
to see when it has changed states from ‘failing-over’ and then check
what its active_backend_id is at that time.
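Since the API call no longer returns the result, an admin-side script would have to poll. A rough sketch, assuming a hypothetical get_service callable that returns one service-list entry as a dict with replication_status and active_backend_id keys (the key names are an assumption here):

```python
import time


def wait_for_failover(get_service, interval=1.0, max_polls=60,
                      sleep=time.sleep):
    """Poll until the service leaves 'failing-over', then report its backend.

    get_service is a hypothetical callable, not a real client API.
    """
    for _ in range(max_polls):
        service = get_service()
        if service["replication_status"] != "failing-over":
            return service["active_backend_id"]
        sleep(interval)
    raise TimeoutError("service is still failing over")
```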
Nate Potter [Wed, 4 Nov 2015 15:45:11 +0000 (15:45 +0000)]
Show qos_specs_id based on policy
Right now qos_specs_id is only shown to an admin user
when showing a volume type. This patch changes that to
be based on policy to allow for more flexibility. It
also adds unit tests for showing a volume type
with policy permissions for qos_specs_id as well as
extra_specs.
Danny Al-Gaaf [Tue, 8 Mar 2016 15:43:15 +0000 (16:43 +0100)]
Pass RBD order to clone call
For cloning of a RBD the rbd_store_chunk_size information from
the cinder.conf should be used to calculate and pass the correct
order information to the clone() call of the rbd library.
Added a new test to check that the order is calculated correctly from
rbd_store_chunk_size while cloning.
Change-Id: Ic5714d3e0d6961bce6ff588006661618130dca07
Co-Authored-By: Logan V <logan2211@gmail.com>
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
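For reference, the RBD order is the base-2 logarithm of the object size in bytes. A small sketch of the calculation, assuming rbd_store_chunk_size is given in MiB; rbd_order is a hypothetical helper name and the clone() call itself is not shown:

```python
def rbd_order(chunk_size_mib):
    """Return floor(log2(object size in bytes)) for a chunk size in MiB."""
    chunk_size_bytes = chunk_size_mib * 1024 * 1024
    if chunk_size_bytes <= 0:
        raise ValueError("chunk size must be positive")
    # bit_length() - 1 is an exact integer floor(log2(n)) for n > 0.
    return chunk_size_bytes.bit_length() - 1
```

A chunk size of 4 gives order 22 and a chunk size of 8 gives order 23.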
Wilson Liu [Mon, 22 Feb 2016 05:03:26 +0000 (13:03 +0800)]
Huawei: Check before delete host
Currently we delete the host without checking
whether the host already belongs to a host
group. If a host already belongs to a hostgroup,
an error will occur. So we should do the check
before deleting it.
Michal Jura [Mon, 7 Mar 2016 10:29:32 +0000 (11:29 +0100)]
Fix failure with rbd on slow ceph clusters
Make the rados connection interval and retries configurable
for the _try_remove_volume() function.
Otherwise, on slow ceph clusters, we can get the following problem:
"ImageBusy error raised while deleting rbd volume. This may have been
caused by a connection from a client that has crashed and, if so,
may be resolved by retrying the delete after 30 seconds has elapsed."
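The retry pattern can be sketched as follows. ImageBusy here is a local stand-in for the library's busy error, and remove_with_retries with its retries and interval parameters is a hypothetical helper, not the driver's _try_remove_volume():

```python
import time


class ImageBusy(Exception):
    """Local stand-in for the busy error raised while deleting."""


def remove_with_retries(remove, retries=3, interval=5, sleep=time.sleep):
    """Call remove() up to `retries` times, waiting `interval` seconds
    between attempts; re-raise ImageBusy after the last attempt."""
    for attempt in range(1, retries + 1):
        try:
            return remove()
        except ImageBusy:
            if attempt == retries:
                raise
            sleep(interval)
```

Making both values configurable lets operators of slow clusters wait longer without changing the code.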
Javeme [Wed, 18 Nov 2015 08:26:08 +0000 (16:26 +0800)]
Remove those unnecessary statements "return True"
The reasons to remove them:
* These asynchronous APIs do not need to return a value.
* We do not see the purpose of the statements, because they always
return True regardless of success or failure.
These statements are unnecessary and misleading, so this commit
removes these "return True" statements (or just removes "True" if the
statement is not at the end of a function, in order not to change the
execution path).
LisaLi [Fri, 4 Mar 2016 07:02:30 +0000 (15:02 +0800)]
Report versions in cinder-manage service list
We set rpc_current_version and object_current_version
in Mitaka.
This patch shows these two fields in the
'cinder-manage service list' command to help admins know the
versions of each service. It helps during upgrades.
Moved CORS middleware configuration into oslo-config-generator
The default values needed for cinder's implementation of cors
middleware have been moved from paste.ini into the configuration
hooks provided by oslo.config. Furthermore, these values have been
added to cinder's default configuration parsing. This ensures
that if a value remains unset in cinder.conf, it will be set
to use sane defaults, and that an operator modifying the
configuration file will be presented with a default set of
necessary sane headers.
Clinton Knight [Fri, 26 Feb 2016 21:21:59 +0000 (16:21 -0500)]
NetApp: volume resize using clone fails with QoS
Data ONTAP cannot resize a LUN past the geometry established
when the LUN was created. So to resize past that limit, the
DOT drivers create a new LUN of the needed size and clone
the original LUN into it. This process doesn't always work
if a QoS policy is in place, and the fix is to not pass the
QoS policy to the clone operation.
The fix is trivial, and I improved the surrounding code a
little, but there was no unit test coverage for the method
in question, so this commit also adds full coverage for the
LUN clone method.
stack [Fri, 4 Mar 2016 09:13:06 +0000 (04:13 -0500)]
Fixes creating volume issue for multiple management IPs
Currently there is an issue with multiple management IPs in
Storwize SVC where volume creation would fail when
storwize_san_secondary_ip switches to san_ip.
This patch adds a condition to check the sshpool.ip: if the
sshpool.ip equals storwize_san_secondary_ip, switch it back
to san_ip.
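The check can be pictured roughly like this. SSHPool and pool_for_primary are hypothetical simplifications for illustration and not the driver's real classes:

```python
class SSHPool:
    """Hypothetical pool object that only remembers which IP it targets."""

    def __init__(self, ip):
        self.ip = ip


def pool_for_primary(sshpool, san_ip, secondary_ip, make_pool=SSHPool):
    """Return a pool that points at san_ip.

    If the existing pool was left pointing at the secondary management IP,
    replace it with a pool for san_ip; otherwise keep it as it is.
    """
    if sshpool is not None and sshpool.ip == secondary_ip:
        return make_pool(san_ip)
    return sshpool
```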
wanghao [Tue, 8 Mar 2016 05:57:15 +0000 (13:57 +0800)]
Add volumes table definition when migrating to 67
When migrating the DB to 67 (readd_iscsi_targets_table),
we add a foreign key to "volumes.id", but we fail
to define the volumes table, which causes the
migration to fail with 'NoReferencedTableError'.
Fix this issue by adding the volumes table definition
before creating the iscsi_targets table. A test is
added as well.
Tom Barron [Mon, 7 Mar 2016 20:05:21 +0000 (15:05 -0500)]
Trim 5s+ from storwize unit tests
The test_run_ssh_fail_to_secondary_ip test case in the Storwize
Driver does a greenthread.sleep(random.randint()) inside a retry loop.
Its execution time often approaches 5s.
This commit mocks random.randint() in this test so that the
greenthread.sleep() duration is always zero, reducing execution time for
the test to under half a second.
The various test_storwize_consistency_group* tests trigger
FixedIntervalLoopingCalls. Mocking them with ZeroIntervalLoopingCalls
reduces their total execution time from about 8s to under 3s.
We also fix the fake user and project ids in this file so that
it no longer emits FutureWarnings from oslo.versionedobjects about
invalid uuids as documented here [1].
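The random.randint() mocking technique looks roughly like this. run_with_retry and failing_run_sleeps are hypothetical stand-ins for the driver's retry loop and its test, and the sleep function is injected so the durations can be recorded:

```python
import random
import time
from unittest import mock


def run_with_retry(action, attempts=3, sleep=time.sleep):
    """Hypothetical retry loop with a random back-off between attempts."""
    for attempt in range(attempts):
        try:
            return action()
        except IOError:
            if attempt == attempts - 1:
                raise
            sleep(random.randint(0, 5))


def failing_run_sleeps():
    """Run the loop against an action that always fails and return the
    sleep durations requested while random.randint is mocked to 0."""
    sleeps = []

    def always_fail():
        raise IOError("ssh failed")

    with mock.patch("random.randint", return_value=0):
        try:
            run_with_retry(always_fail, sleep=sleeps.append)
        except IOError:
            pass
    return sleeps
```

Every requested sleep is zero seconds, so the failing path no longer costs wall-clock time.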
scottda [Fri, 4 Mar 2016 13:45:25 +0000 (06:45 -0700)]
microversion header for legacy endpoints removed
With the current implementation, a microversion header will be returned
even if /v1 or /v2 API endpoints are used.
This is wrong, and constitutes an API change. Remove this header for
legacy endpoints /v1 and /v2.
Ryan McNair [Tue, 1 Mar 2016 18:54:37 +0000 (18:54 +0000)]
Update quotas to handle domain acting as project
The Keystone change Ib22a0f3007cb7ef6b4df6f48da5f4d018e905f55 sets
the domain_id as the top-level parent project. However, since a
domain is not a project and therefore has no effect on quota nesting,
the domain "parent" should not be considered in the nested quota code.
This patch updates the quota code to ignore the domain_id if it's
present in the parent tree.
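As a sketch of the idea, walking up the parent chain and stopping at the domain could look like this. The parent_of mapping and parents_without_domain are hypothetical and not the actual quota code:

```python
def parents_without_domain(project_id, parent_of, domain_id):
    """Return the ancestors of project_id, nearest first, excluding the
    domain that sits on top of the tree as a pseudo-parent."""
    chain = []
    current = parent_of.get(project_id)
    while current is not None and current != domain_id:
        chain.append(current)
        current = parent_of.get(current)
    return chain
```

A top-level project whose only "parent" is the domain therefore ends up with an empty parent list and is treated as a root for quota nesting.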
Philipp Marek [Mon, 7 Mar 2016 09:53:29 +0000 (10:53 +0100)]
DRBD: Policy-based waiting for completion
The more nodes there are in a cluster, the higher the chance that one or
more of them are not available. Waiting on all nodes will not work any more,
so give the user a way to define what conditions are acceptable (to the
management, etc.) to continue.
The upcoming DRBDmanage 1.0 release will have a (sample) policy plugin,
so make that available in Cinder, too.
Documentation is at
http://drbd.linbit.com/users-guide-9.0/s-drbdmanage-deployment-policy.html
note: Servers must be prepared to deal with multiple
OpenStack-API-Version headers. This could happen when a client
designed to address multiple services always sends the headers it
thinks it needs. Most Python frameworks will handle this by setting
the value of the header to the values of all matching headers,
joined by a ',' (comma). For example ``compute 2.11,identity
2.114``.
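A server-side sketch for splitting such a combined header value back into per-service versions. parse_api_versions is a hypothetical helper, shown only to illustrate the comma-joined format from the note:

```python
def parse_api_versions(header_value):
    """Split a comma-joined OpenStack-API-Version value into a dict
    mapping service type to the requested version string."""
    versions = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        service, _, version = item.partition(" ")
        versions[service] = version.strip()
    return versions
```

A service would then look up only its own service type in the result and ignore the other entries.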
Gorka Eguileor [Thu, 3 Mar 2016 13:37:29 +0000 (14:37 +0100)]
Readd iscsi_target table
Rolling upgrades were broken when the iscsi_target table was dropped in
https://review.openstack.org/268320
We cannot stop using a table and drop it in the same release for rolling
upgrades to work; we have to stop using it in one release and then drop
it in the next or in the post rolling upgrade mechanism (which is still
not in place).
So this patch fixes this by removing the drop and adding another
migration that ensures that the table is really there. That way we can
be sure that anyone using M will have the table, which then will get
dropped in N.
Patrick East [Fri, 4 Mar 2016 05:47:22 +0000 (21:47 -0800)]
Fix issue with Pure drivers delete_snapshot exception handling
We were checking for only one of the possible errors that can occur when
the snapshot is missing. We now check for both, which helps prevent
snapshots or volumes deleted out from underneath Cinder from getting
things stuck in an error state.
This also adjusts the warning message to be a little more descriptive.
Patrick East [Fri, 4 Mar 2016 00:15:49 +0000 (16:15 -0800)]
Add backend id to Pure Volume Driver trace logs
With multi-backend deployments it was very hard to follow which backend
was making which calls. When there was an error it was often unhelpful
to look at the tracing to know which backend had the problem.
This will now print out the active backend id with the tracing log
messages.
Kurt Martin [Wed, 2 Mar 2016 22:57:06 +0000 (14:57 -0800)]
Don't fail on clearing 3PAR object volume key
The 3PAR drivers write a key-value pair on the 3PAR backend volumes
to track the instance that the volume is exported to. However, in
certain cases the key is not present, and we would throw an
exception and not allow the detach to continue. If the key is not
present then we do not need to clear it. This patch just
logs a warning that it wasn't present and continues detaching
the volume.
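The tolerant behaviour can be sketched as below. KeyNotFound, FakeClient and clear_export_key are hypothetical names for illustration and do not reflect the real 3PAR client API:

```python
import logging

LOG = logging.getLogger(__name__)


class KeyNotFound(Exception):
    """Stand-in for the backend's 'key does not exist' error."""


class FakeClient:
    """Hypothetical backend client holding the keys set on one volume."""

    def __init__(self, keys):
        self.keys = set(keys)

    def remove_key(self, volume_name, key):
        if key not in self.keys:
            raise KeyNotFound(key)
        self.keys.remove(key)


def clear_export_key(client, volume_name, key):
    """Clear the key if present; warn and carry on if it is missing."""
    try:
        client.remove_key(volume_name, key)
        return True
    except KeyNotFound:
        LOG.warning("Key %s was not present on volume %s; continuing "
                    "with detach.", key, volume_name)
        return False
```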