With l2pop enabled, a race exists in delete_port_postcommit
when create/update_port and delete_port deal with
different ports on the same host, where each such port is
either the first or the last on the same network for that host.
This race happens outside the DB locking zones in
the respective methods of ML2 plugin.
To fix this, we have moved determination of
fdb_entries back to delete_port_postcommit and removed
delete_port_precommit altogether from l2pop mechanism
driver. In order to accommodate dvr interfaces, we
are storing and re-using the mechanism-driver context
which holds dvr-port-binding information while
invoking delete_port_postcommit. We loop through
dvr interface bindings invoking delete_port_postcommit
similar to delete_port_precommit.
Arista L3 Ops is success if it is successful on one peer
This fix checks to see if Arista HW is
configured in MLAG (redundant) mode. If so,
the operation is considered successful as long as
it succeeds on one of the paired switches.
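The MLAG rule above can be sketched as follows. This is only an illustration: run_on_switches, its arguments, and the way failures are caught are hypothetical and are not taken from the Arista driver.

```python
def run_on_switches(switches, operation, mlag_configured):
    """Apply operation to every switch and report overall success.

    Hypothetical helper: in MLAG mode one successful peer is enough,
    otherwise every switch must succeed.
    """
    successes = 0
    for switch in switches:
        try:
            operation(switch)
            successes += 1
        except Exception:
            # A real driver would log the failure for this switch here.
            pass
    if mlag_configured:
        return successes >= 1
    return successes == len(switches)
```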
The pairs (first_ip, allocation_pool_id) and (last_ip, allocation_pool_id)
should each be unique in the table.
These constraints are essential to detect concurrent modifications
of the IpAvailabilityRange table if the SELECT ... FOR UPDATE
lock is removed.
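A minimal sketch of how such constraints surface a conflicting write, using the standard-library sqlite3 module rather than Neutron's SQLAlchemy models. The table layout and the add_range helper are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ipavailabilityranges (
        allocation_pool_id TEXT NOT NULL,
        first_ip TEXT NOT NULL,
        last_ip TEXT NOT NULL,
        PRIMARY KEY (allocation_pool_id, first_ip, last_ip),
        UNIQUE (first_ip, allocation_pool_id),
        UNIQUE (last_ip, allocation_pool_id)
    )
    """
)


def add_range(pool_id, first_ip, last_ip):
    """Insert a range; return False if a unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ipavailabilityranges VALUES (?, ?, ?)",
                (pool_id, first_ip, last_ip),
            )
        return True
    except sqlite3.IntegrityError:
        # Another writer already stored a range with the same first_ip
        # (or last_ip) for this pool, so the caller should retry.
        return False
```

With only the three-column primary key, two writers storing ranges that share first_ip but differ in last_ip would both succeed; the extra unique constraints turn the second write into an error the caller can detect.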
Andrew Boik [Fri, 10 Oct 2014 17:13:45 +0000 (13:13 -0400)]
Update VPN logging to use new i18n functions
For log messages in neutron/services/vpn and neutron/db/vpn, replace
_() marker functions with log-level-specific marker functions: _LI(),
_LW(), _LE() from oslo.i18n.
Also, remove _() functions for debug log messages as debug level log
messages should not be translated.
NSX: drop support to deprecated dist-router extension
The NSX plugin originally implemented its own 'dist-router' extension.
During the Juno timeframe the DVR extension was introduced and the NSX
plugin was ported to support both. At the same time 'dist-router' was
marked for removal in Kilo.
Now that Kilo has opened, we can drop the deprecated one.
Carl Baldwin [Thu, 9 Oct 2014 20:33:23 +0000 (20:33 +0000)]
Remove an argument that is never used
This code was creating a dict with a gw_exists key. I was curious to
know who was interested in receiving this information down the line.
However, no one is ever interested in that key. It took me some time
to follow this through wondering what was going on and found only dead
ends.
Carl Baldwin [Thu, 2 Oct 2014 20:35:21 +0000 (20:35 +0000)]
Refactor _process_routers to handle a single router
The method _process_routers no longer handles multiple routers. The
only caller of this method would construct a list of exactly one
router in order to make the call. This made the for loop unnecessary.
The method's logic is too heavy for its current purpose. This commit
removes much of the weight.
The use of the sets in this method is also no longer necessary. It
became clear that all of it boiled down to "if the router is not
compatible with this agent but it is known in router_info from
before then we need to remove it." This is an exceptional condition
that shouldn't be handled in this method so I raise an exception and
handle it in process_router_update where other router removal is
handled. Logging was added for this exceptional condition.
The eventlet pool was also obsolete. It was used to spawn two methods
and there was a waitall at the end. The other refactoring made it
clear that the two spawns were mutually exclusive. There was only one
thread spawned for any given invocation of the method and the eventlet
pool is overkill.
Mark McClain [Wed, 8 Oct 2014 18:49:20 +0000 (18:49 +0000)]
Add database relationship between router and ports
Add an explicit schema relationship between a router and its ports. This
change ensures referential integrity among the entities and prevents orphaned
ports.
Henry Gessau [Wed, 8 Oct 2014 00:38:38 +0000 (20:38 -0400)]
Disable PUT for IPv6 subnet attributes
In Juno we are not ready for allowing the IPv6 attributes on a subnet
to be updated after the subnet is created, because:
- The implementation for supporting updates is incomplete.
- Perceived lack of usefulness, no good use cases known yet.
- Allowing updates causes more complexity in the code.
- Have not tested that radvd, dhcp, etc. behave OK after update.
Therefore, for now, we set 'allow_put' to False for the two IPv6
attributes, ipv6_ra_mode and ipv6_address_mode. This prevents the
modes from being updated via the PUT:subnets API.
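As a rough sketch of the effect of 'allow_put', assuming a simplified attribute map and a hypothetical rejected_update_fields helper (the real API layer performs this check differently):

```python
# Simplified stand-in for the subnet entry of an attribute map.
SUBNET_ATTRIBUTES = {
    "name": {"allow_post": True, "allow_put": True},
    "ipv6_ra_mode": {"allow_post": True, "allow_put": False},
    "ipv6_address_mode": {"allow_post": True, "allow_put": False},
}


def rejected_update_fields(body):
    """Return the attributes in a PUT body that may not be updated."""
    return sorted(
        key for key in body
        if not SUBNET_ATTRIBUTES.get(key, {}).get("allow_put", False)
    )
```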
Note: There are several other unrelated unit tests that also break with a
randomized PYTHONHASHSEED, but they are not addressed here. They will be
addressed in separate patches.
Assaf Muller [Tue, 7 Oct 2014 19:45:41 +0000 (22:45 +0300)]
Forbid update of HA property of routers
While the HA property is update-able, and resulting router-get
invocations suggest that the router is HA, the migration
itself fails on the agent. This is deceiving and confusing
and should be blocked until the migration itself is fixed
in a future patch.
Brian Haley [Fri, 3 Oct 2014 21:32:01 +0000 (17:32 -0400)]
Teach DHCP Agent about DVR router interfaces
When DVR is enabled and enable_isolated_metadata=True,
the DHCP agent should only inject a metadata host route
when there is no port with the gateway IP address configured
on the subnet. Add a check for DEVICE_OWNER_DVR_INTERFACE
when we look at each port's device_owner field, otherwise
it will always add this route to the opts file when DVR
is enabled.
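The check described above might look roughly like this. The sketch uses plain dicts instead of the agent's network model, the function name is hypothetical, and the device_owner strings are assumed values for the two constants.

```python
# Assumed values of the device_owner constants.
DEVICE_OWNER_ROUTER_INTF = "network:router_interface"
DEVICE_OWNER_DVR_INTERFACE = "network:router_interface_distributed"
ROUTER_INTERFACE_OWNERS = (DEVICE_OWNER_ROUTER_INTF, DEVICE_OWNER_DVR_INTERFACE)


def needs_metadata_host_route(subnet, ports):
    """Return True if no router interface holds the subnet's gateway IP."""
    for port in ports:
        # Skip ports that are not router interfaces (legacy or DVR).
        if port["device_owner"] not in ROUTER_INTERFACE_OWNERS:
            continue
        for fixed_ip in port["fixed_ips"]:
            if (fixed_ip["subnet_id"] == subnet["id"]
                    and fixed_ip["ip_address"] == subnet["gateway_ip"]):
                return False
    return True
```

Without DEVICE_OWNER_DVR_INTERFACE in the tuple, a DVR router port would be skipped and the function would always report that the route is needed.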
Miguel Angel Ajo [Fri, 19 Sep 2014 16:59:58 +0000 (18:59 +0200)]
Modify the ProcessMonitor class to have one less config parameter
This is a follow-up patch, as agreed in the ProcessMonitor review,
that coalesces the check_child_processes parameter into
check_child_process_interval.
When this parameter is set to 0, the feature is disabled.
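A minimal sketch of the single-parameter behavior. The function name and the spawn callable are hypothetical and do not reflect the ProcessMonitor class itself.

```python
def start_child_process_monitor(check_child_process_interval, spawn):
    """Start periodic checking only when the interval is non-zero.

    spawn is a hypothetical callable that schedules the periodic check.
    """
    if check_child_process_interval <= 0:
        # An interval of 0 means the feature is disabled.
        return False
    spawn(check_child_process_interval)
    return True
```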
Kevin Benton [Tue, 7 Oct 2014 11:34:41 +0000 (04:34 -0700)]
Big Switch: Don't clear hash before sync
This patch removes the step of clearing the consistency
hash from the DB before a topology sync. This will ensure
that inconsistency will be detected if the topology sync
fails.
This logic was originally there to make sure the hash header
was not present on the topology sync call to the backend.
However, the hash header is ignored by the backend in a sync
call so it wasn't necessary.
John Schwarz [Tue, 23 Sep 2014 12:24:47 +0000 (15:24 +0300)]
Divide _cleanup_namespaces for easy extensibility
Dividing the function into two separate functions allows for
easier overwriting in the l3 test agent used by the HA functional
tests, and later by the integration tests.
John Schwarz [Tue, 23 Sep 2014 11:41:54 +0000 (14:41 +0300)]
L3 Agent should generate ns_name in a single place
Currently the l3 agent has 2 places where it allows generating ns_name
of specific router_ids (i.e. qrouter-<router_id>): in the RouterInfo's
constructor, and in _cleanup_namespaces. This patch proposes a
unification of this creation code with a property which lives in
RouterInfo's namespace. A simpler fix was also made for snat_ns_name.
This patch also offers a single way to initialize a new RouterInfo.
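The idea of deriving the name in one place can be sketched as below. RouterInfoSketch is a hypothetical stand-in for RouterInfo, and the NS_PREFIX constant name is assumed; only the qrouter-<router_id> format comes from the text above.

```python
NS_PREFIX = "qrouter-"  # assumed constant name


class RouterInfoSketch:
    """Hypothetical stand-in that owns the namespace name."""

    def __init__(self, router_id):
        self.router_id = router_id

    @property
    def ns_name(self):
        # Single place where the namespace name is built.
        return NS_PREFIX + self.router_id
```

Callers such as a namespace cleanup routine would then read ri.ns_name instead of rebuilding the string themselves.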
John Schwarz [Mon, 29 Sep 2014 13:28:18 +0000 (16:28 +0300)]
L3 agent should always use a unique CONF object
The l3 agent accepts an oslo configuration in its constructor and uses
it throughout the code, but there are some references to the global
configuration object held by the oslo library. Since HA functional
tests need to create two agents, the configuration should be consistent
throughout the code.
The important difference between the agents is their 'host' value so
that they create different namespaces, and 'state_path' value so the
agents get their own filesystem root.
John Schwarz [Thu, 18 Sep 2014 08:24:53 +0000 (11:24 +0300)]
Iterate over same port_id if more than one exists
In certain cases where multiple ports can have the same
external_ids:iface_id property, the ovs agent will arbitrarily choose
one and ignore the rest. In case the chosen port isn't on the
integration bridge the ovs agent is managing, an error is returned to
the calling function. This is faulty since one of the other ports may
belong to the correct bridge and it should be chosen instead.
This is interesting for future L3 HA integration tests, where 2
different instances of l3 agents are needed to run on the same machine.
In this case, both agents will register ports which have the same
iface_id property, but obviously only one of the ports is relevant for
each agent.
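A sketch of the selection logic, assuming candidate ports are dicts with a "name" key and that the set of port names on the integration bridge is already known. Both assumptions and the function name are illustrative only.

```python
def find_port_on_bridge(candidates, bridge_port_names):
    """Return the first candidate attached to the managed bridge.

    candidates: ports sharing the same external_ids:iface_id.
    bridge_port_names: names of the ports on the integration bridge.
    """
    for port in candidates:
        if port["name"] in bridge_port_names:
            return port
    # None of the candidates belongs to this bridge.
    return None
```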
Mark McClain [Wed, 24 Sep 2014 04:00:54 +0000 (04:00 +0000)]
remove openvswitch plugin
This changeset removes the openvswitch plugin, but retains the agent for ML2.
The database models were not removed since operators will need to migrate the
data.
Fix pid file location to avoid I->J changes that break metadata
Changes in commit 7f8ae630b87392193974dd9cb198c1165cdec93b moved
pid files handled by agent/linux/external_process.py from
$state_path/external/<uuid>.pid to $state_path/external/<uuid>/pid.
That breaks the neutron-ns-metadata-proxy respawn after upgrades
because the l3 or dhcp agent can't find the old pid file, so
it tries to start a new neutron-ns-metadata-proxy, which won't
succeed because the old one is already holding the port.
Ed Bak [Mon, 29 Sep 2014 20:15:52 +0000 (14:15 -0600)]
Don't fail when trying to unbind a router
If a router is already unbound from an l3 agent, don't fail. Log
the condition and go on. This is harmless since it can happen
due to a delete race condition between multiple neutron-server
processes. One delete request can determine that it needs to
unbind the router. A second process may also determine that it
needs to unbind the router. The exception thrown will result
in a port delete failure and cause nova to mark a deleted instance
as ERROR.
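The tolerant unbind can be sketched like this. The exception class, the in-memory set of bindings, and both function names are hypothetical; the real code works against the database.

```python
import logging

LOG = logging.getLogger(__name__)


class RouterNotBound(Exception):
    """Hypothetical error raised when no binding exists."""


def remove_binding(bindings, router_id, agent_id):
    key = (router_id, agent_id)
    if key not in bindings:
        raise RouterNotBound("router %s not bound to agent %s" % key)
    bindings.remove(key)


def unbind_router_safely(bindings, router_id, agent_id):
    """Unbind the router, treating an already-removed binding as harmless."""
    try:
        remove_binding(bindings, router_id, agent_id)
        return True
    except RouterNotBound:
        # Another neutron-server process won the race; log and go on.
        LOG.debug("Router %s already unbound from agent %s",
                  router_id, agent_id)
        return False
```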
Mark McClain [Wed, 24 Sep 2014 01:50:06 +0000 (01:50 +0000)]
remove linuxbridge plugin
This changeset removes the linuxbridge plugin, but retains the agent for ML2.
The database models were not removed since operators will need to migrate the
data.
Additionally, the ml2 migration script was altered to support Juno. For
testing, a user must either run the migration against the icehouse
schema or run the update, manually change alembic_version to juno and
then run the migration script. Once the juno migration is added, this
manual step will not be required.
Erik Colnick [Wed, 18 Jun 2014 17:31:52 +0000 (11:31 -0600)]
Add admin tenant name to nova notifier
This change introduces the ability to use the nova admin
tenant name with the nova notifier in place of the nova
admin tenant id, which may not be available when the neutron
service is being configured. This is the case with tripleo
installations, where the neutron service is configured and
started before the nova admin tenant has been configured in
keystone and thus does not have a known id.
DocImpact
Introduces the nova_admin_tenant_name configuration entry as
an optional configurable value in neutron.conf. If the
nova_admin_tenant_name is configured and the nova_admin_tenant_id
is not configured, a performance impact may be seen because
keystone will become involved in communication between neutron
and nova.
Kevin Benton [Tue, 30 Sep 2014 03:21:23 +0000 (20:21 -0700)]
ML2: move L3 cleanup out of network transaction
Move _process_l3_delete out of the delete_network
transaction to eliminate the semaphore deadlock that
occurs when it tries to delete the ports associated
with existing floating IPs.
It makes more sense for it to live outside of the transaction
anyway because the operations it performs cannot be
undone by a database rollback alone if the L3 plugin makes
external calls for floating IP creation/deletion.
e.g. if delete_floatingip is successful, it may have
deleted external resources and restoring the DB records
would make things inconsistent.
If a failure to delete the network does occur, any cleanup
done by _process_l3_delete will not be reversed.
Robert Pothier [Wed, 3 Sep 2014 15:09:15 +0000 (11:09 -0400)]
ML2 Cisco Nexus MD: Fix UT to send one create vlan message
With the commit of https://review.openstack.org/#/c/113009,
test_nexus_add_trunk needs to have the device_id set to
a unique value. This patch combines test_nexus_add_trunk and
test_nexus_enable_vlan_cmd to reduce duplicate code,
which fixes the original issue.
Ann Kamyshnikova [Wed, 19 Feb 2014 11:39:19 +0000 (15:39 +0400)]
Implement ModelsMigrationsSync test from oslo.db
Add tests to verify that database migrations produce
the same schema as the database models.
Also for MySQL, check that all tables are configured to use InnoDB
as the storage engine.
These tests make use of the ModelsMigrationsSync test class from
oslo.db and the load_tests protocol from Python unittest.
John Kasperski [Thu, 25 Sep 2014 15:38:45 +0000 (10:38 -0500)]
Update migration scripts to support DB2
Three of the migration scripts are causing failures with DB2.
- DB2 doesn't support nullable columns in a primary key.
- Hardcoded SQL statements which use False/True as Boolean arguments
are not compatible with DB2. In DB2, Boolean columns are created as
small integers with a constraint to allow only 0 and 1.
- Hardcoded SQL that updates rows from another table is not compatible with DB2.
Note: There are several other unrelated unit tests that also break with a
randomized PYTHONHASHSEED, but they are not addressed here. They will be
addressed in separate patches.
When admin policy p1 is shared and is used by firewall f1 of a different tenant,
updating p1 with shared=False should not be allowed because it is in use.
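A sketch of the validation described above, using plain dicts. PolicyInUse and validate_policy_update are hypothetical names and not the plugin's actual API.

```python
class PolicyInUse(Exception):
    """Hypothetical error for a policy still used by another tenant."""


def validate_policy_update(policy, update, firewalls):
    """Reject shared=False while another tenant's firewall uses the policy."""
    if not (policy["shared"] and update.get("shared") is False):
        return
    for firewall in firewalls:
        if (firewall["firewall_policy_id"] == policy["id"]
                and firewall["tenant_id"] != policy["tenant_id"]):
            raise PolicyInUse(policy["id"])
```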