Brian Haley [Thu, 25 Sep 2014 01:45:06 +0000 (21:45 -0400)]
Make L2 DVR Agent start successfully without an active neutron server
If the L2 Agent is started before the neutron controller
is available, it will fail to obtain its unique DVR MAC
address and fall back to operating in non-DVR mode
permanently.
This fix does two things:
1. Makes the L2 Agent retry obtaining a DVR MAC address
up to five times on initialization, which should allow
enough time for RPC to succeed. On failure, it will
fall back to non-DVR mode, ensuring that basic switching
continues to be functional.
2. Correctly obtains the current operating mode of the
L2 Agent in _report_state(), instead of only reporting
the configured state. This operating mode is carried
in the 'in_distributed_mode' attribute of the agent state,
and is separate from the existing enable_distributed_routing
static config that is already sent.
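
A minimal sketch of the two behaviours described above, not the actual
agent code. It assumes "up to five times" means five attempts in total;
fetch_mac, get_dvr_mac_with_retry and build_agent_state are hypothetical
names used only for illustration. Only the 'in_distributed_mode' and
'enable_distributed_routing' keys come from the commit message.

```python
import time

MAX_ATTEMPTS = 5  # assumption: five tries in total


def get_dvr_mac_with_retry(fetch_mac, attempts=MAX_ATTEMPTS, delay=0.0):
    """Return (mac, in_distributed_mode) after at most `attempts` tries.

    `fetch_mac` is a hypothetical callable standing in for the RPC call
    that asks the server for this host's DVR MAC address.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch_mac(), True
        except Exception:
            # Server not reachable yet: wait before the next attempt.
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: operate in non-DVR mode.
    return None, False


def build_agent_state(enable_distributed_routing, in_distributed_mode):
    """Report the configured flag and the actual operating mode separately."""
    return {
        'enable_distributed_routing': enable_distributed_routing,
        'in_distributed_mode': in_distributed_mode,
    }
```

Keeping the two keys separate lets a reader of the agent state tell
"configured for DVR but currently running without it" apart from
"DVR not configured".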
Kevin Benton [Sat, 18 Oct 2014 07:38:57 +0000 (00:38 -0700)]
Rename constant to a more appropriate name
The name DB_MAX_RETRIES implies that a query will be
retried that many times. 'retry' means it happened
once before. In the current code, if DB_MAX_RETRIES
is set to 1, the query won't be retried at all.
If it's set to 0, the query won't even be run.
This constant should actually be called DB_MAX_ATTEMPTS
to indicate that the variable includes the first try.
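
The difference can be illustrated with a small loop. This is a sketch
under assumptions, not the Neutron code: run_query and its arguments are
hypothetical, and only the constant name DB_MAX_ATTEMPTS comes from the
commit message.

```python
DB_MAX_ATTEMPTS = 1  # counts the first try, so 1 means "no retry"


def run_query(query, max_attempts=DB_MAX_ATTEMPTS):
    """Call `query` up to `max_attempts` times in total."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return query()
        except RuntimeError as exc:
            last_error = exc
    if last_error is None:
        # range(0) never ran the loop body: the query was never issued.
        raise ValueError("max_attempts must be at least 1")
    raise last_error
```

With max_attempts=1 the query runs once and is never retried; with 0 it
does not run at all, which is why "attempts" describes the loop bound
better than "retries".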
Kevin Benton [Wed, 29 Oct 2014 04:39:04 +0000 (21:39 -0700)]
Big Switch: Fix SSL version on get_server_cert
The ssl.get_server_certificate method uses SSLv3 by default.
Support for SSLv3 was dropped on the backend controller in
response to the POODLE vulnerability. This patch fixes it
to use TLSv1 like the wrap_socket method.
Check for concurrent port binding deletion before binding the port
When an agent tries to update a port binding (DVR or regular), the
port might have already been deleted via an API call.
This is not an error condition but should be handled to avoid traces
in the logs.
Kevin Benton [Fri, 26 Sep 2014 16:40:44 +0000 (09:40 -0700)]
Batch ports from security groups RPC handler
The security groups RPC handler calls get_port_from_device
individually for each device in a list it receives. Each
one of these results in a separate SQL query for the security
groups and port details. This becomes very inefficient as the
number of devices on a single node increases.
This patch adds logic to the RPC handler to see if the core
plugin has a method to look up all of the device IDs at once.
If so, it uses that method; otherwise it continues as normal.
The ML2 plugin is modified to include the batch function, which
uses one SQL query regardless of the number of devices.
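
The capability check described above might look roughly like the sketch
below. The plugin classes and the batch method name
get_ports_from_devices are assumptions made for illustration; only
get_port_from_device is named in the commit message. The query counter
stands in for SQL round trips.

```python
class PerDevicePlugin:
    """Hypothetical plugin that only supports one lookup per device."""

    def __init__(self):
        self.queries = 0

    def get_port_from_device(self, device):
        self.queries += 1  # one query per device
        return {'device': device}


class BatchPlugin(PerDevicePlugin):
    """Hypothetical plugin that also offers a batch lookup."""

    def get_ports_from_devices(self, devices):
        self.queries += 1  # one query for the whole list
        return [{'device': d} for d in devices]


def lookup_ports(plugin, devices):
    # Use the batch method when the plugin provides one.
    batch = getattr(plugin, 'get_ports_from_devices', None)
    if callable(batch):
        return batch(devices)
    # Otherwise continue as before, one call per device.
    return [plugin.get_port_from_device(d) for d in devices]
```

Probing with getattr keeps plugins that lack the batch method working
unchanged.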
ML2 Cisco Nexus MD - not overwriting existing config
The Cisco Nexus ML2 MD overwrites any existing switchport
VLAN config such as management VLANs that are preconfigured
in the compute node ToR interfaces.
This patch addresses that issue and ensures the config is not
wiped out.
marios [Wed, 22 Oct 2014 10:11:02 +0000 (13:11 +0300)]
Reorder operations in (l3_dvr) update floating ip
This review overrides update_floatingip (L3_NAT_dbonly_mixin)
in l3_dvr_db (L3_NAT_with_dvr_db_mixin) to reorder the garbage
collection to after the floating ip is updated and rpc called.
This was previously being called in the (already) overridden
_update_fip_assoc.
Since this call is moved, the _update_fip_assoc for l3_dvr_db
is exactly the same as l3_db and is thus removed completely.
This tidy up was created whilst looking at bug 1381617. The
intention was to mitigate the timing issues exposed by [1]
and discussed in the bug report. It seems the problem persists
with more discussion around 'properly fixing' this at [2].
Oleg Bondarev [Wed, 27 Aug 2014 11:19:18 +0000 (15:19 +0400)]
Use RPC instead of neutron client in metadata agent
RPC is a standard way of interacting between Neutron server and agents.
Using neutron client is also inefficient as it results in unneeded
keystone load and may become a bottleneck at scale.
DocImpact
When upgrading, one should upgrade neutron server first,
then metadata agent. However there is a fallback in case
metadata agent fails to get info from server by rpc -
it will return to using neutron client.
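
The fallback described above can be sketched as follows. This is an
illustration under assumptions, not the metadata agent's code: get_ports,
rpc_call and client_call are hypothetical names, and the real agent may
choose when to fall back differently.

```python
def get_ports(rpc_call, client_call, filters):
    """Prefer RPC; fall back to the client if the RPC call fails.

    `rpc_call` and `client_call` are hypothetical callables standing in
    for the RPC request and the neutron client request respectively.
    """
    try:
        return rpc_call(filters)
    except Exception:
        # For example, an older server that does not expose the RPC method.
        return client_call(filters)
```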
Itzik Brown [Tue, 2 Sep 2014 07:02:22 +0000 (10:02 +0300)]
Adds an option to enable broadcast replies to Dnsmasq
Adds a DHCP agent configuration flag that passes the
dhcp-broadcast flag to the Dnsmasq process.
In order to support virtual networks on top of an Infiniband
fabric, DHCP responses must be received via broadcast
messages (according to the IB spec).
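
A sketch of how such a flag could feed into the Dnsmasq command line. The
function name and the dhcp_broadcast_reply parameter are assumptions for
illustration; the commit message only states that a dhcp-broadcast flag
is added to the Dnsmasq process.

```python
def build_dnsmasq_cmd(base_cmd, dhcp_broadcast_reply=False):
    """Return a copy of `base_cmd`, adding --dhcp-broadcast when enabled.

    `dhcp_broadcast_reply` is a hypothetical name for the agent option.
    """
    cmd = list(base_cmd)  # do not mutate the caller's list
    if dhcp_broadcast_reply:
        cmd.append('--dhcp-broadcast')
    return cmd
```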
Kyle Mestery [Wed, 18 Jun 2014 11:04:52 +0000 (11:04 +0000)]
Add advsvc role to neutron policy file
Add in a default "advsvc" user and the logic in the Neutron policy
infrastructure which will allow this user to create/get/update/delete
ports on other tenants' networks, as well as view other tenants'
networks. This is for the use case of letting advanced services have
a user to put ports on other tenants' networks. By default, we do not
define any roles for the policy "context_is_advsvc", but rely on
operators to specify the likely value of "role advsvc".
NSX: allow multiple networks with same vlan on different phy_net
Previously, the NSX plugin prevented one from creating multiple networks on
the same vlan even if they were being created on different physical_networks.
This patch corrects this issue and allows this to now occur.
Mark McClain [Mon, 13 Oct 2014 20:38:43 +0000 (20:38 +0000)]
Remove XML support
XML support in Neutron has always been a second class feature to the
JSON API and broken for many extensions and outputs. The XML API has been
marked as deprecated for the Icehouse and Juno releases and is ready for
removal in Kilo.
Kevin Benton [Thu, 16 Oct 2014 08:49:19 +0000 (01:49 -0700)]
_update_router_db: don't hold open transactions
This patch prevents the L3 _update_router_db method from
starting a transaction before calling the gateway interface
removal functions. With these port changes now occurring
outside of the L3 DB transaction, a failure to update the
router DB information will not rollback the port deletion
operation.
The 'VPN in use' check had to be moved inside of the DB deletion
transaction now that there isn't an enclosing transaction to undo
the delete when an 'in use' error is raised.
===Details===
The router update db method starts a transaction and calls
the gateway update method with the transaction held open.
This becomes a problem when the update results in an
interface removal which uses a port table lock.
Because the delete_port caller is still holding open a
transaction, other sessions are blocked from getting an
SQL lock on the same tables when delete_port starts
performing RPC notifications, external controller calls,
etc. During those external calls, eventlet will
yield and another thread may try to get a lock on the
port table, causing the infamous mysql/eventlet deadlock.
This separation of L2/L3 transactions is similar to change
I3ae7bb269df9b9dcef94f48f13f1bde1e4106a80 in nature. Even
though there is a loss in the atomic behavior of the interface
removal operation, it was arguably incorrect to begin with.
The restoration of port DB records during a rollback after some
other failure doesn't undo the backend operations (e.g. REST calls)
that happened during the original deletion. So, having a delete
rollback without corresponding 'create_port' calls to the backend
causes a loss in consistency.
Terry Wilson [Thu, 16 Oct 2014 01:56:17 +0000 (20:56 -0500)]
Only resync DHCP for a particular network when there is a failure
The previous implementation loops through and restarts the dhcp
process for all active networks any time there is an exception calling
a dhcp driver function. This allows a tenant who can create an exception
to cause every dhcp process to restart. On systems with lots of networks
this can easily take longer than the default resync timeout, leading to a
system that becomes unresponsive because of the load that continual
restarting causes.
This patch restarts only dhcp processes related to the network on which
operations are failing. It should be noted that if there was some kind
of missed notification for a subnet update, the previous implementation
may have incidentally fixed it by restarting everything on the off
chance that something else caused an exception, but obviously relying
on that would be a bad idea as exceptions should be, well, exceptional.
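
The per-network bookkeeping can be sketched as below. This is a
simplified illustration, not the DHCP agent itself: DhcpAgentSketch,
driver_call and the action names are hypothetical.

```python
class DhcpAgentSketch:
    """Track which networks need a resync instead of restarting all."""

    def __init__(self, driver_call):
        # `driver_call(action, network_id)` is a hypothetical stand-in
        # for invoking a dhcp driver function.
        self.driver_call = driver_call
        self.needs_resync = set()

    def call_driver(self, action, network_id):
        try:
            self.driver_call(action, network_id)
            return True
        except Exception:
            # Remember only the network on which the operation failed.
            self.needs_resync.add(network_id)
            return False

    def resync(self):
        """Restart dhcp only for networks that recorded a failure."""
        pending, self.needs_resync = self.needs_resync, set()
        for network_id in sorted(pending):
            self.call_driver('restart', network_id)
        return pending
```

A failure on one network leaves the dhcp processes of every other
network untouched.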
Angus Lees [Thu, 21 Aug 2014 04:08:07 +0000 (14:08 +1000)]
Hyper-V: Remove useless use of "else" clause on for loop
"else" on for loops is only important if the loop contains a "break"
statement. Without a "break", the else block is _always_ executed and
it is clearer just to omit "else".
This change also enables the corresponding pylint warning, now that the
only offending case has been fixed.
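
A short, generic illustration of the point about "else" on for loops.
These functions are examples only and are unrelated to the Hyper-V code.

```python
def loop_without_break(items):
    """No `break` in the loop: the else block runs after the loop ends."""
    log = []
    for item in items:
        log.append(item)
    else:
        log.append('else ran')
    return log


def find(items, wanted):
    """With `break`: the else block runs only if the loop never broke."""
    for item in items:
        if item == wanted:
            break
    else:
        return 'not found'
    return 'found'
```

In loop_without_break the "else" can be dropped and its body dedented
without changing the result, which is the cleanup this change makes.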
Angus Lees [Tue, 21 Oct 2014 22:24:21 +0000 (09:24 +1100)]
Enable no-name-in-module pylint check
Add _MovedItems (from six.moves) to pylintrc ignored-modules, and adjust
one import of sqlalchemy.orm.properties.RelationshipProperty.
s.o.p.RelationshipProperty is created at import-time in a rather
exciting manner - rearranging the import in this way forces the
import-time code to be executed and seems sufficient to satisfy the
pylint static check.
Carl Baldwin [Mon, 20 Oct 2014 21:48:42 +0000 (21:48 +0000)]
Move disabling of metadata and ipv6_ra to _destroy_router_namespace
I noticed that disable_ipv6_ra is called from the wrong place and that
in some cases it was called with a bogus router_id because the code
made an incorrect assumption about the context. In other cases, it was
never called because _destroy_router_namespace was being called
directly. This patch moves the disabling of metadata and ipv6_ra into
_destroy_router_namespace to ensure they get called correctly and
avoid duplication.
YAMAMOTO Takashi [Fri, 17 Oct 2014 03:30:38 +0000 (12:30 +0900)]
tox.ini: Avoid using bash where unnecessary
Switch to sh, which is hopefully more ubiquitously available than bash.
A recent change (commit 085a35d657cf0fa41a402f2af66c4beaa0f60db2)
introduced bash dependency for "tox -e pep8". It broke my environment,
where bash is not available. This change aims to restore it.
As far as I understand, the change in question doesn't actually need
the specific shell dialect. So switching to sh, which is expected to be
available on any POSIX-like system, would improve the situation.
rajeev [Mon, 13 Oct 2014 20:25:36 +0000 (16:25 -0400)]
Fix race condition on processing DVR floating IPs
The FIP namespace and agent gateway port can be shared by multiple DVR
routers. This change uses a set as the control variable for these shared
resources and ensures that test-and-set operations on the control
variable are performed atomically so that race conditions do not occur
among multiple threads processing floating IPs.
Limitation: The scope of this change is limited to addressing the race
condition described in the bug report. It may not address other issues
such as the pre-existing issue with handling of DVR floating IPs on
agent restart.
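
An atomic test-and-set on a set can be sketched with a lock, as below.
This is an illustration under assumptions, not the agent's code:
SharedResourceTracker is a hypothetical name, and the real agent may rely
on a different synchronization primitive.

```python
import threading


class SharedResourceTracker:
    """Record which shared resources have already been set up."""

    def __init__(self):
        self._lock = threading.Lock()
        self._created = set()

    def test_and_set(self, key):
        """Return True only for the first caller that claims `key`.

        The membership test and the add happen under one lock, so two
        threads can never both see the key as missing.
        """
        with self._lock:
            if key in self._created:
                return False
            self._created.add(key)
            return True
```

Only the caller that gets True would go on to create the shared
namespace or gateway port; every other caller skips that step.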
Jakub Libosvar [Tue, 14 Oct 2014 14:36:02 +0000 (16:36 +0200)]
neutron-db-manage automatically finds config file
This patch lets oslo.config find the config file containing the
database connection string by itself. The config file can be overridden
by the --config-file CLI parameter as before, so it is backward
compatible.