Michael Smith [Mon, 10 Nov 2014 23:49:14 +0000 (15:49 -0800)]
Fix for FIPs duplicated across hosts for DVR
For DVR, FIPs should be hosted on the single node
which hosts the VM assigned with the fixed_ip of the FIP.
The l3_agent should only take action on the correct FIP per
host by filtering the FIPs based on the 'host' value
of the FIP.
A recent refactor on the l3_agent moved the host filtering logic
from process_router_floating_ip_addresses() to
_get_external_device_interface_name(). As a result, the local
floating_ips variable was no longer filtered as it was before
the refactor.
This resulted in network disruption across multiple hosts
since more than one namespace contained the FIP. This problem
would only be seen in a multi-host environment where the same
router hosting FIPs was present on more than one node.
The fix is to restore the host filtering logic by adding a
call to get_floating_ips(). In addition, the unit test
test_process_router_dist_floating_ip_add() was modified to
pass two FIPs instead of one. One FIP matches the host
of the agent, one does not. Only one should be processed,
not two.
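The host filtering described above can be sketched roughly as follows.
This is only an illustration: the helper name, the dict layout and the
host names are assumptions, not the actual l3_agent code.

```python
def filter_fips_for_host(floating_ips, host):
    """Keep only the FIPs whose 'host' value matches the agent's host.

    Hypothetical helper for illustration; the real agent performs this
    filtering in its own get_floating_ips() call.
    """
    return [fip for fip in floating_ips if fip.get('host') == host]


# Two FIPs on the same router, hosted on different nodes (made-up data).
fips = [
    {'floating_ip_address': '203.0.113.10', 'host': 'compute-1'},
    {'floating_ip_address': '203.0.113.11', 'host': 'compute-2'},
]

# The agent on compute-1 should only act on the first FIP.
local_fips = filter_fips_for_host(fips, 'compute-1')
```

This mirrors the modified unit test: two FIPs go in, and only the one
matching the agent's host is processed.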
lzklibj [Tue, 21 Oct 2014 06:55:53 +0000 (23:55 -0700)]
fix event_send for re-assign floating ip
Neutron can associate a floating IP with a new port
without first disassociating it from the original instance port.
In that case a network-changed event is sent only
for the new instance port, and that event object contains
only the new instance's ID.
Nova then updates the new instance's info, but not the
original instance's, in its database table
instance_info_caches, because the event gives it only the new
instance's ID. As a result, the records for both the original
and the new instance in instance_info_caches hold the same
floating IP. In most cases, running "nova list" after
re-assigning a floating IP will therefore return incorrect info,
showing multiple instances with the same floating IP, which may
confuse users.
Nova does eventually sync the data in instance_info_caches, but it
may take dozens of seconds.
The newly added code also sends a network-changed event for the
original instance, so that nova updates the instance_info_caches
table within a few seconds.
Sachi King [Sun, 2 Nov 2014 13:35:51 +0000 (00:35 +1100)]
Fix L3 HA network creation to allow user to create router
Update HA Network creation to use an admin context to allow Neutron
to create the tenant-less network required for the HA router when
it does not yet exist and is being created by a non-admin user.
Neutron creates these resources without a tenant so users cannot see
or modify the HA network, ports, etc. Port creation and association
already use elevated admin contexts so that they work when
a user attempts to create an HA L3 router.
Samer Deeb [Tue, 4 Nov 2014 11:58:19 +0000 (13:58 +0200)]
SRIOV: Fix Wrong Product ID for Intel NIC example
Some examples and default values of the product ID for the Intel NIC
contain a wrong product ID that belongs to the PF product;
it should be replaced with the product ID of the VF.
Since python2.6, python has a proper ternary construct "A if PRED else
B". The older idiom "PRED and A or B" has a hidden trap - when A is
itself false, the result is (unexpectedly) B.
This change removes all cases of the older construct found using a
trivial git grep " and .* or " - except one case in oslo common
code (fixed in oslo upstream).
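The trap can be reproduced with a small sketch. The function names are
made up for illustration and do not come from the change itself.

```python
def old_idiom(pred, a, b):
    # "PRED and A or B": falls through to B whenever A is falsy.
    return pred and a or b


def ternary(pred, a, b):
    # Proper conditional expression: the result depends only on pred.
    return a if pred else b


# With a falsy A (here 0), the two forms disagree.
old_result = old_idiom(True, 0, 1)
new_result = ternary(True, 0, 1)
```

Here old_result is 1 while new_result is 0, which is the unexpected
behaviour described above.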
Elena Ezhova [Wed, 5 Nov 2014 10:01:51 +0000 (13:01 +0300)]
Replace "nova" entries in iptables_manager with "neutron"
In iptables_manager docstrings there are still some references to
nova left from nova/network/linux_net.py.
Remove these references and update the docstrings.
Brian Haley [Thu, 25 Sep 2014 01:45:06 +0000 (21:45 -0400)]
Make L2 DVR Agent start successfully without an active neutron server
If the L2 Agent is started before the neutron controller
is available, it will fail to obtain its unique DVR MAC
address, and fall back to operating in non-DVR mode
permanently.
This fix does two things:
1. Makes the L2 Agent attempt to retry obtaining a DVR MAC
address up to five times on initialization, which should be
enough time for RPC to be successful. On failure, it will
fall back to non-DVR mode, ensuring that basic switching
continues to be functional.
2. Correctly obtains the current operating mode of the
L2 Agent in _report_state(), instead of only reporting
the configured state. This operating mode is carried
in 'in_distributed_mode' attribute of agent state, and
is separate from the existing enable_distributed_routing
static config that is already sent.
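A minimal sketch of the two points above, assuming a hypothetical
fetch_mac callable and exception type (neither is the real agent API).
It assumes five attempts in total and returns the MAC together with the
mode the agent would report as 'in_distributed_mode'.

```python
import time


class RpcUnavailable(Exception):
    """Hypothetical error raised while the server cannot be reached."""


def obtain_dvr_mac(fetch_mac, max_attempts=5, delay=0.0):
    """Try to get a DVR MAC, falling back to non-DVR mode on failure.

    Returns (mac, in_distributed_mode).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_mac(), True
        except RpcUnavailable:
            # Wait before the next attempt, but not after the last one.
            if attempt < max_attempts:
                time.sleep(delay)
    # All attempts failed: keep basic switching working in non-DVR mode.
    return None, False
```

The second value is the actual operating mode, which may differ from the
configured enable_distributed_routing setting.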
Kevin Benton [Sat, 18 Oct 2014 07:38:57 +0000 (00:38 -0700)]
Rename constant to a more appropriate name
The name DB_MAX_RETRIES implies that a query will be
retried that many times. 'retry' means it happened
once before. In the current code, if DB_MAX_RETRIES
is set to 1, the query won't be retried at all.
If it's set to 0, the query won't even be run.
This constant should actually be called DB_MAX_ATTEMPTS
to indicate that the variable includes the first try.
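The difference can be seen in a small loop of the same shape. This is a
sketch only: run_with_attempts and the failing query are illustrative
names, not the code touched by this change.

```python
def run_with_attempts(query, max_attempts):
    """Call query() up to max_attempts times.

    Returns (result, calls_made); result is None if every call failed.
    """
    calls = 0
    for _ in range(max_attempts):
        calls += 1
        try:
            return query(), calls
        except RuntimeError:
            continue
    return None, calls


def failing_query():
    raise RuntimeError('transient DB error')


# A limit of 1 runs the query once and never retries it.
one = run_with_attempts(failing_query, 1)
# A limit of 0 never runs the query at all.
zero = run_with_attempts(failing_query, 0)
```

Because the limit counts the first try, "attempts" describes it more
accurately than "retries".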
Kevin Benton [Wed, 29 Oct 2014 04:39:04 +0000 (21:39 -0700)]
Big Switch: Fix SSL version on get_server_cert
The ssl.get_server_certificate method uses SSLv3 by default.
Support for SSLv3 was dropped on the backend controller in
response to the POODLE vulnerability. This patch fixes it
to use TLSv1 like the wrap_socket method.
Check for concurrent port binding deletion before binding the port
When an agent tries to update a port binding (DVR or regular), the
port might have already been deleted via an API call.
This is not an error condition, but it should be handled to avoid
traces in the logs.
Kevin Benton [Fri, 26 Sep 2014 16:40:44 +0000 (09:40 -0700)]
Batch ports from security groups RPC handler
The security groups RPC handler calls get_port_from_device
individually for each device in a list it receives. Each
one of these results in a separate SQL query for the security
groups and port details. This becomes very inefficient as the
number of devices on a single node increases.
This patch adds logic to the RPC handler to see if the core
plugin has a method to lookup all of the device IDs at once.
If so, it uses that method, otherwise it continues as normal.
The ML2 plugin is modified to include the batch function, which
uses one SQL query regardless of the number of devices.
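The dispatch logic might look roughly like this. It is a sketch: the
batch method name get_ports_from_devices and both plugin classes are
assumptions made for illustration; only get_port_from_device is named
in the text above.

```python
def ports_for_devices(plugin, device_ids):
    """Use the plugin's batch lookup if it has one, else go one by one."""
    batch_lookup = getattr(plugin, 'get_ports_from_devices', None)
    if callable(batch_lookup):
        return batch_lookup(device_ids)
    return [plugin.get_port_from_device(dev) for dev in device_ids]


class PerDevicePlugin(object):
    """Hypothetical plugin without a batch method."""

    def __init__(self):
        self.queries = 0

    def get_port_from_device(self, device):
        self.queries += 1  # stands in for one SQL query per device
        return {'device': device}


class BatchPlugin(PerDevicePlugin):
    """Hypothetical plugin that can look up all devices at once."""

    def get_ports_from_devices(self, devices):
        self.queries += 1  # stands in for a single SQL query
        return [{'device': dev} for dev in devices]
```

With three devices, the first plugin counts three queries and the second
counts one, while both return the same port list.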
ML2 Cisco Nexus MD - not overwriting existing config
The Cisco Nexus ML2 MD overwrites any existing switchport
VLAN config such as management VLANs that are preconfigured
in the compute node ToR interfaces.
This patch addresses that issue and ensures the config is not
wiped out.
marios [Wed, 22 Oct 2014 10:11:02 +0000 (13:11 +0300)]
Reorder operations in (l3_dvr) update floating ip
This review overrides update_floatingip (L3_NAT_dbonly_mixin)
in l3_dvr_db (L3_NAT_with_dvr_db_mixin) to reorder the garbage
collection to after the floating ip is updated and rpc called.
This was previously being called in the (already) overridden
_update_fip_assoc.
Since this call is moved, the _update_fip_assoc for l3_dvr_db
is exactly the same as l3_db and is thus removed completely.
This tidy up was created whilst looking at bug 1381617. The
intention was to mitigate the timing issues exposed by [1]
and discussed in the bug report. It seems the problem persists
with more discussion around 'properly fixing' this at [2].
Oleg Bondarev [Wed, 27 Aug 2014 11:19:18 +0000 (15:19 +0400)]
Use RPC instead of neutron client in metadata agent
RPC is a standard way for the Neutron server and agents to interact.
Using the neutron client is also inefficient, as it results in unneeded
keystone load and may become a bottleneck at scale.
DocImpact
When upgrading, one should upgrade the neutron server first,
then the metadata agent. However, there is a fallback in case
the metadata agent fails to get info from the server via RPC:
it will return to using the neutron client.
In neutron/tests/unit/test_api_v2.APIv2TestCase.test_page_reverse
there is no second call to the test case with 'page_reverse': 'False',
because assert_called_once_with is used.
In the proposed change, reset_mock is called before the second test case.
The second 'instance' initialization is also removed.
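The interaction between assert_called_once_with and reset_mock can be
sketched with the standard mock library. The mock and method names here
are illustrative and are not taken from the test itself.

```python
from unittest import mock

instance = mock.Mock()

# First case: one call, so assert_called_once_with passes.
instance.get_networks(page_reverse=True)
instance.get_networks.assert_called_once_with(page_reverse=True)

# Without this reset, the next call would bring call_count to 2 and
# assert_called_once_with would raise AssertionError.
instance.get_networks.reset_mock()

# Second case: the call count starts from zero again.
instance.get_networks(page_reverse=False)
instance.get_networks.assert_called_once_with(page_reverse=False)
second_call_count = instance.get_networks.call_count
```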
Itzik Brown [Tue, 2 Sep 2014 07:02:22 +0000 (10:02 +0300)]
Adds an option to enable broadcast replies to Dnsmasq
Adds a flag to the DHCP agent configuration
that passes the dhcp-broadcast flag to the Dnsmasq process.
In order to support virtual networks on top of an InfiniBand
fabric, there is a requirement to receive DHCP responses
via broadcast message (according to the IB spec).
Kyle Mestery [Wed, 18 Jun 2014 11:04:52 +0000 (11:04 +0000)]
Add advsvc role to neutron policy file
Add in a default "advsvc" user and the logic in the Neutron policy
infrastructure which will allow this user to create/get/update/delete
ports on other tenants' networks, as well as view other tenants'
networks. This is for the use case of letting advanced services have
a user to put ports on other tenants' networks. By default, we do not
define any roles for the policy "context_is_advsvc", but rely on
operators to specify the likely value of "role advsvc".
NSX: allow multiple networks with same vlan on different phy_net
Previously, the NSX plugin prevented one from creating multiple networks on
the same VLAN even if they were being created on different physical_networks.
This patch removes that restriction.
Mark McClain [Mon, 13 Oct 2014 20:38:43 +0000 (20:38 +0000)]
Remove XML support
XML support in Neutron has always been a second class feature to the
JSON API and broken for many extensions and outputs. The XML API has been
marked as deprecated for the Icehouse and Juno releases and is ready for
removal in
Kilo.
NEC plugin: Allow to apply Packet filter on OFC router interface
The config parameter support_packet_filter_on_ofc_router is added
only to make the plugin work with the old version of PFC v5,
which has no support for packet filters on vrouter interfaces.
Kevin Benton [Thu, 16 Oct 2014 08:49:19 +0000 (01:49 -0700)]
_update_router_db: don't hold open transactions
This patch prevents the L3 _update_router_db method from
starting a transaction before calling the gateway interface
removal functions. With these port changes now occurring
outside of the L3 DB transaction, a failure to update the
router DB information will not roll back the port deletion
operation.
The 'VPN in use' check had to be moved inside of the DB deletion
transaction now that there isn't an enclosing transaction to undo
the delete when an 'in use' error is raised.
===Details===
The router update db method starts a transaction and calls
the gateway update method with the transaction held open.
This becomes a problem when the update results in an
interface removal which uses a port table lock.
Because the delete_port caller is still holding open a
transaction, other sessions are blocked from getting an
SQL lock on the same tables when delete_port starts
performing RPC notifications, external controller calls,
etc. During those external calls, eventlet will
yield and another thread may try to get a lock on the
port table, causing the infamous mysql/eventlet deadlock.
This separation of L2/L3 transactions is similar to change
I3ae7bb269df9b9dcef94f48f13f1bde1e4106a80 in nature. Even
though there is a loss in the atomic behavior of the interface
removal operation, it was arguably incorrect to begin with.
The restoration of port DB records during a rollback after some
other failure doesn't undo the backend operations (e.g. REST calls)
that happened during the original deletion. So, having a delete
rollback without corresponding 'create_port' calls to the backend
causes a loss in consistency.