blueprint ipv6-router (ChangeID:Iaefa95f788053ded9fc9c7ff6845c3030c6fd6df),
supports an IPv6 Router where the router gateway port has no subnet.
The BP implements the following. If an external network (without any subnet)
is attached to the Neutron router, it reads the ipv6_gateway config parameter
(LLA of upstream router) from l3_agent.ini file and adds a default route that
points to this LLA. If the ipv6_gateway config value is not configured, the
gateway interface is configured to accept router advertisements (RAs) from the
upstream router to build the default route. For an HA router, keepalived has to
be configured to perform this operation.
This patch is a bug fix for the broken feature in kilo.
Ryan Tidwell [Mon, 4 May 2015 22:56:41 +0000 (15:56 -0700)]
Block subnet create when a network hosts subnets allocated from different pools
This change will ensure that all subnets with the same ip_version on a given
network have been allocated from the same subnet pool or no pool. This
provides cleaner subnet overlap detection.
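The rule described above can be sketched as follows. This is only an illustration, not the Neutron implementation: the function name check_same_pool and the plain-dict subnet representation are assumptions made for the example.

```python
def check_same_pool(existing_subnets, new_subnet):
    """Reject a subnet whose pool differs from same-version subnets on the network.

    Hypothetical helper: subnets are plain dicts with 'ip_version' and an
    optional 'subnetpool_id' (None or missing means "no pool").
    """
    for subnet in existing_subnets:
        # Only subnets with the same IP version have to share a pool.
        if subnet['ip_version'] != new_subnet['ip_version']:
            continue
        if subnet.get('subnetpool_id') != new_subnet.get('subnetpool_id'):
            raise ValueError('subnets on this network must come from the '
                             'same subnet pool (or from no pool)')
```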
Support multiple IPv6 prefixes on internal router ports for an HA Router
As part of BP multiple IPv6 prefixes, we can have multiple IPv6 prefixes on
router internal ports. Patch, I7d4e8194815e626f1cfa267f77a3f2475fdfa3d1, adds
the necessary support for a legacy router.
For an HA router, instead of configuring the addresses on the router internal
ports, we should update the keepalived config file and let keepalived
configure the addresses depending on the state of the router.
Following are the observations with the current code for an HA router.
1. IPv6 addresses are configured on the router internal ports (i.e., qr-xxx)
irrespective of the state of the router. As the same IP is configured on multiple
ports you will notice dadfailed status on the ports.
2. Keepalived configuration is not updated with the new IPv6 addresses.
This patch addresses the above issues for an HA Router.
OVS-agent: Ignore IPv6 addresses for ARP spoofing prevention
The flow rules to match on ARP headers for spoofing prevention
fail to install when an IPv6 address is used. These should be
skipped since the ARP spoofing prevention doesn't apply to IPv6.
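A minimal sketch of the filtering idea, assuming the addresses arrive as strings (bare IPs or CIDRs). The function name arp_protected_addresses is hypothetical and is not the agent's actual code.

```python
import ipaddress


def arp_protected_addresses(addresses):
    """Keep only IPv4 entries, since ARP does not exist for IPv6."""
    return [
        addr for addr in addresses
        # strict=False lets host addresses with a prefix length parse as well.
        if ipaddress.ip_network(addr, strict=False).version == 4
    ]
```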
Kevin Benton [Thu, 28 May 2015 00:38:32 +0000 (17:38 -0700)]
Process port IP requests before subnet requests
When a port requests multiple fixed IPs, process the requests
for specific IP addresses before the ones asking for a subnet.
This prevents an error where the IP that was requested happens
to be the next up for allocation so the subnet request takes it
and causes a DBDuplicateEntry.
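The ordering can be illustrated with a stable sort. This is a sketch under the assumption that each fixed IP request is a dict that carries 'ip_address' only when a specific address is requested; order_fixed_ip_requests is a hypothetical name.

```python
def order_fixed_ip_requests(fixed_ips):
    """Return requests for specific IPs first, subnet-only requests last."""
    # sorted() is stable, so requests keep their relative order in each group.
    return sorted(fixed_ips, key=lambda req: 'ip_address' not in req)


requests = [
    {'subnet_id': 'subnet-a'},
    {'subnet_id': 'subnet-a', 'ip_address': '10.0.0.3'},
]
ordered = order_fixed_ip_requests(requests)
```

With this ordering, 10.0.0.3 is reserved before the subnet-only request asks the allocator for the next free address.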
Kevin Benton [Tue, 26 May 2015 01:55:44 +0000 (18:55 -0700)]
Persist DHCP leases to a local database
Due to issues caused by dnsmasq restarts sending DHCPNAKs,
change Ieff0236670c1403b5d79ad8e50d7574c1b694e34 passed the
'dhcp-authoritative' option to dnsmasq. While this solved the
restart issue, it broke the multi-DHCP server scenario because
the dnsmasq instances will NAK requests to a server ID that
isn't their own.
Problem DHCP Request Lifecycle:
Client: DHCPDISCOVER(broadcast)
Server1: DHCPOFFER
Server2: DHCPOFFER
Client: DHCPREQUEST(broadcast with Server-ID=Server1)
Server1: DHCPACK
Server2: DHCPNAK(in response to observed DHCPREQUEST with other Server-ID)
^---Causes issues
This change removes the authoritative option so NAKs are not
sent in response to DHCPREQUESTs addressed to other servers. To handle
the original issue that Ieff0236670c1403b5d79ad8e50d7574c1b694e34
was intended to address, this patch also allows changes to be persisted
to a local lease file.
In order to handle the issue where a DHCP server may be scheduled
to another agent, a fake lease file is generated for dnsmasq to start
with. The contents are populated based on all of the known ports for
a network. This should prevent dnsmasq from NAKing clients renewing
leases issued before it was restarted/rescheduled.
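Generating such a lease file could look roughly like the sketch below. The five-field line layout (expiry, MAC, IP, hostname, client ID, with "*" for unknown values) is an assumption about the dnsmasq lease file format, and build_initial_leases and the port dict layout are hypothetical.

```python
def build_initial_leases(ports, lease_duration, now):
    """Return lease-file text covering every known address on a network."""
    expiry = int(now) + lease_duration
    lines = []
    for port in ports:
        for ip in port['fixed_ips']:
            # "*" marks the hostname and client-id fields as unknown.
            lines.append('%d %s %s * *\n' % (expiry, port['mac_address'], ip))
    return ''.join(lines)
```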
Kevin Benton [Sat, 16 May 2015 02:44:16 +0000 (19:44 -0700)]
Match order of iptables arguments to iptables-save
The way we were forming our iptables rules was not matching
the output of iptables-save. This caused the logic that preserves
counters to miss many of the rules.
This patch corrects the order for the comments and the allowed address
pairs to match the output order of iptables-save.
This fix lets flows in br-tun recover automatically from an Exception,
which is the ideal behavior.
Simply setting a flag that was previously missed ensures that an OVS restart
is handled properly after the agent leaves the exception loop.
Kevin Benton [Tue, 21 Apr 2015 09:01:39 +0000 (02:01 -0700)]
Block allowed address pairs on other tenants' net
Don't allow tenants to use the allowed address pairs extension
when they are attaching a port to a network that does not belong
to them.
This is done because allowed address pairs can allow things like
ARP spoofing and all tenants attached to a shared network might not
implicitly trust each other.
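The check amounts to comparing the owner of the port with the owner of the network. The sketch below is illustrative only: the function name is hypothetical, and the admin bypass is an assumption that the text above does not state.

```python
def may_set_allowed_address_pairs(port_tenant_id, network_tenant_id,
                                  is_admin=False):
    """Allow address pairs only on networks the port's tenant owns."""
    # Assumption: admins are exempt from the restriction.
    return is_admin or port_tenant_id == network_tenant_id
```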
Currently radvd is spawned in all the HA routers irrespective of the
state of the router. This approach has the following issues.
1. While processing the internal router ports (i.e., qr-xxx), ha_router
removes the LLA of the interface and adds it as a VIP to Keepalived conf.
Radvd daemon is spawned after this operation in the router namespace
(if the port is associated with any IPv6 subnets). Radvd notices that
qr-xxx interface does not have the LLA, so does not transmit any Router
Advts. In this state, VMs fail to acquire IPv6 addresses because of the
missing RAs. Radvd does not recover even after keepalived configures the
LLA of the interface. The only solution is to restart/reload radvd daemon.
Currently keepalived-state-change monitor does not do any radvd related
operations when a state transition happens. So we end up in this state
forever.
2. For all the routers in Backup state, qr-xxx interface does not have LLA
as it is managed by keepalived and configured only on the Master HA router.
In such agents syslog is flooded with the messages [1] and this can cause
loss of other useful info.
[1] - resetting ipv6-allrouters membership on qr-2e373555-97
This patch implements the following.
1. If the router is already in the Master state, we configure the LLA as a VIP
in keepalived conf but do not delete the LLA of the internal interface.
2. We spawn radvd only if the router is in the Master State.
3. Keepalived-state-change monitor takes care of enabling/disabling radvd upon
state transitions.
Stephen Ma [Tue, 24 Feb 2015 23:31:33 +0000 (23:31 +0000)]
Router is not unscheduled when the last port is deleted
When checking for ports that are still in use on a DVR router,
the L3 agent scheduler makes the assumption that a port's
network must be owned by the same tenant. This isn't always
true as the admin could have created a shared network that
other tenants may use. The result of this assumption is that
the router associated with the shared network may not be
unscheduled from a VM host when the last VM (created by a
non-admin tenant) using the shared network is deleted from
the compute node.
The owner of a VM may not own all the ports of a shared
network. Other tenants may have VMs using the same shared
network running on the same compute node. Also the VM owner
may not own the router ports. To check whether a router can be
unscheduled from a node, the queries have to be run with
admin context so that all the ports associated with the router
are returned from the database.
This patch fixes this problem by using the admin context to
make the queries needed for the DVR scheduler to make the
correct unschedule decision.
Kevin Benton [Fri, 24 Apr 2015 07:35:31 +0000 (00:35 -0700)]
Don't resync on DHCP agent setup failure
There are various cases where the DHCP agent will try to
create a DHCP port for a network and there will be a failure.
This has primarily been caused by a lack of available IP addresses
in the allocation pool. Trying to fix all availability corner cases
on the server side will be very difficult due to race conditions between
multiple ports being created, the dhcp_agents_per_network parameter, etc.
This patch just stops the resync attempt on the agent side if a failure
is caused by an IP address generation problem. Future updates to the subnet
will cause another attempt so if the tenant does fix the issue they will
get DHCP service.
Kevin Benton [Tue, 31 Mar 2015 03:29:51 +0000 (20:29 -0700)]
Handle no ofport in get_vif_port_to_ofport_map
Ports newly added to OVSDB might not yet have an
ofport number assigned to them. In that case the
DB query returns a list instead of a port number
for the ofport field.
This patch handles that by attempting to convert
each result into an integer and then catching the
exception and continuing through the iteration to
ignore uninitialized ports like these.
It also adds a unit test based on data from a
failure observed in the gate.
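The conversion-and-skip approach can be sketched like this, assuming the query results arrive as (interface ID, ofport) pairs. The function name and the row layout are hypothetical; the empty-set value in the usage below mimics an unassigned ofport.

```python
def vif_port_to_ofport_map(rows):
    """Map interface IDs to ofport numbers, skipping uninitialized ports."""
    result = {}
    for iface_id, ofport in rows:
        try:
            result[iface_id] = int(ofport)
        except (TypeError, ValueError):
            # The ofport is not assigned yet (e.g. a list instead of a number).
            continue
    return result


rows = [('port-1', 5), ('port-2', ['set', []]), ('port-3', '7')]
ofports = vif_port_to_ofport_map(rows)
```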
tests: confirm that _output_hosts_file does not log too often
I3ad7864eeb2f959549ed356a1e34fa18804395cc didn't include any regression unit
tests to validate that the method won't ever log too often again, so later
patches could reintroduce the performance drop. This also didn't play well
with stable backports of the fix: context was lost when doing the
backport, which left the bug unfixed in stable/juno even though the patch
was merged there [1].
The patch adds an explicit note in the code that suggests not to add new
log messages inside the loop to avoid regression, and a unit test was
added to capture it.
Once the test is merged in master, it will be proposed for stable/juno
inclusion, with additional changes that would fix the regression again.
The increase in ovs testing is resulting in job failure due to
timeouts in test_killed_monitor_respawns. Giving the test more
time to complete should reduce the failure rate.
Restrict subnet create/update to avoid DHCP resync
As we know, IPs in subnet CIDR are used for
1) Broadcast port
2) Gateway port
3) DHCP port if enable_dhcp is True, or update to True
4) Others go into allocation_pools
Items 1) to 3) above are created by default, which means that if the CIDR
doesn't have enough IPs, subnet create/update will cause a DHCP resync.
This fix adds some restrictions to avoid the issue:
A) On subnet create, if enable_dhcp is True, /31 and /32
CIDRs are forbidden for IPv4 subnets while /127 and /128 CIDRs are
forbidden for IPv6 subnets.
B) On subnet update, if enable_dhcp is changing to True and there are no
more IPs in allocation_pools, the request is denied.
Change-Id: I2e4a4d5841b9ad908f02b7d0795cba07596c023d
Co-authored-by: Andrew Boik <dboik@cisco.com>
Closes-Bug: #1443798
(cherry picked from commit 0c1f96ad5a6606c1205bd50ea944c3a383892cde)
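Restriction A) from the message above can be expressed as a prefix-length check. This is a sketch, not the validation code from the patch; dhcp_subnet_allowed is a hypothetical name.

```python
import ipaddress


def dhcp_subnet_allowed(cidr, enable_dhcp):
    """Return False for CIDRs too small to host DHCP (/31, /32, /127, /128)."""
    if not enable_dhcp:
        return True
    net = ipaddress.ip_network(cidr)
    # max_prefixlen is 32 for IPv4 and 128 for IPv6, so this rejects the
    # two longest prefixes of either family.
    return net.prefixlen < net.max_prefixlen - 1
```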
Kevin Benton [Tue, 21 Apr 2015 05:26:22 +0000 (22:26 -0700)]
Only update MTU in update code for MTU
The ML2 create_network_db was re-passing in the entire network,
including extensions like vlan_transparency, which was causing
issues in the base update function it was calling.
This corrects the behavior by having it only update the MTU, which
is the only thing it was intending to update in the first place.
As DVR routers use a different type of interface, this patch
amends the DHCP agent code ensuring that a metadata proxy is
spawned when the metadata network feature is enabled on the
DHCP agent.
Kevin Benton [Fri, 17 Apr 2015 10:53:45 +0000 (03:53 -0700)]
Defer creation of router JSON in get_routers RPC
The get_routers method in the l3 RPC code has a log.debug
statement that formats all of the router data as indented
JSON. This method can be expensive if there are hundreds
of routers being synced and it happens even if debugging
is disabled since the function call result is the parameter
to the debug statement.
This patch adds and leverages a small helper class that takes a
callable and its args and defers calling it until its __str__ method
is called, which is when it actually needs to be rendered to a string.
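A helper of this kind might look like the sketch below. The class name DeferredCall and the demo logger are assumptions for illustration; the actual helper in the patch may differ.

```python
import json
import logging


class DeferredCall:
    """Hold a callable and its arguments; call it only when rendered."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return str(self.func(*self.args, **self.kwargs))


calls = []


def dump_routers(routers):
    calls.append(1)  # record every (expensive) serialization
    return json.dumps(routers, indent=2)


log = logging.getLogger('deferred_demo')
log.setLevel(logging.INFO)
# Debug is disabled, so the message is never formatted and
# dump_routers is never called.
log.debug('Routers returned: %s', DeferredCall(dump_routers, [{'id': 'r1'}]))
```

Because the logging module formats the message lazily, the serialization cost is only paid when the debug level is actually enabled.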
Kevin Benton [Fri, 17 Apr 2015 11:28:58 +0000 (04:28 -0700)]
Remove double queries in l3 DB get methods
Two frequently called functions were querying the routerport table
and the corresponding ports just to get the port ID. Then they were
calling get_ports again with those port IDs, resulting in two queries
to the port table when there should have only been one.
This eliminates the second call to get_ports since all of the necessary
data has been retrieved from the port table.
Kevin Benton [Fri, 17 Apr 2015 10:36:50 +0000 (03:36 -0700)]
Set loading strategy to joined for Routerport/Port
The RouterPort model has a relationship to the ports model which
is frequently relied on to get the port IDs of interfaces attached
to a router. However, the loading strategy defaults to
'select', which meant a new query to the ports table was being
emitted for every interface just to get the ID.
This patch adjusts the relationship to be 'joined' by default so
one query will fetch the related ports.
Another option would have been not to use the port object at all since
the ID is all that the callers were usually interested in. However,
they would end up using the ID to do a port lookup, which is being
optimized away in another patch anyway so the full port object from
the relationship will end up getting used.
mathieu-rohon [Sat, 7 Mar 2015 12:30:49 +0000 (13:30 +0100)]
ML2: Change port status only when it's bound to the host
Currently, nothing prevents the port status from being changed to BUILD
state when get_device_details() is sent by a host that doesn't own
the port.
In some cases the port might stay in BUILD state.
This could happen during a live-migration, or for multi-hosted ports
such as HA ports.
This commit allows the port status modification only if the port
is bound to the host that is asking for it.
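The guard can be sketched as a comparison between the port's binding host and the host that issued the request. The function name is hypothetical, and the port is assumed to be a dict exposing the 'binding:host_id' attribute.

```python
def may_update_port_status(port, requesting_host):
    """Allow a status change only for the host the port is bound to."""
    return port.get('binding:host_id') == requesting_host
```

During a live migration, for instance, a request coming from the destination host would leave the status untouched while the port is still bound to the source host.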
Andreas Jaeger [Mon, 20 Apr 2015 09:07:37 +0000 (11:07 +0200)]
Release Import of Translations from Transifex
Manual import of Translations from Transifex. This change also removes
all po files that are less than 66 per cent translated since such
partially translated files will not help users.
This update also recreates all pot (translation source) files to
reflect the state of the repository.
This change needs to be done manually since the automatic import does
not handle the proposed branches and we need to sync with latest
translations.
Fixes race condition and boosts the scheduling performance
This patch fixes a race-condition that occurs when the
scheduler tries to check for dvr serviceable ports before
it schedules a router when a subnet is associated with
a router.
Sometimes the dhcp port creation is delayed and so the
router is not scheduled to the l3-agent.
Also it boosts the scheduling performance on dvr-snat
node for scheduling a router.
This patch will provide a work around to fix this race
condition and to boost the scheduling performance
by scheduling a router on a dvr-snat when
dhcp is enabled on the provided subnet, instead of checking
all the available ports on the subnet.
Kevin Benton [Tue, 31 Mar 2015 06:52:56 +0000 (23:52 -0700)]
Set IPset hash type to 'net' instead of 'ip'
The previous hash type was 'ip' and this caused a major
issue with the allowed address pairs extension since it
results in CIDRs being passed to ipset. When the hash type
is 'ip', a CIDR is completely enumerated into all of its
addresses so 10.100.0.0/16 results in ~65k entries. This
meant a single allowed_address_pairs entry could easily
exhaust an entire set.
This patch changes the hash type to 'net', which is designed
to handle a CIDR as a single entry.
This patch also changes the names of the ipsets because
creating an ipset with different parameters will cause an
error and our ipset manager code isn't robust enough to handle
that at this time. There is another ongoing patch to fix
that but it won't be ready in time.[1]
The related bug was closed by increasing the set limit, which
did alleviate the problem. However, this change would also
address the issue because the gate tests run an allowed address
pairs extension test with the CIDR mentioned above.
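The difference in set usage can be checked with a little arithmetic. The variable names below are illustrative; the entry counts follow the description above (full enumeration for 'ip', one entry for 'net').

```python
import ipaddress

cidr = ipaddress.ip_network('10.100.0.0/16')

# With hash type 'ip', the CIDR is expanded into every address it contains.
entries_with_hash_ip = cidr.num_addresses

# With hash type 'net', the CIDR is stored as a single entry.
entries_with_hash_net = 1
```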
The current ipset manager code isn't robust enough to handle
ipsets that already exist with different parameters. This reverts
the ability to change the parameters so we don't break upgrades
to Kilo.
Dane LeBlanc [Thu, 9 Apr 2015 14:32:33 +0000 (10:32 -0400)]
IPv6 SLAAC subnet create should update ports on net
If ports are first created on a network, and then an IPv6 SLAAC
or DHCPv6-stateless subnet is created on that network, then the
ports created prior to the subnet create are not getting
automatically updated (associated) with addresses for the
SLAAC/DHCPv6-stateless subnet, as required.
Dane LeBlanc [Sat, 4 Apr 2015 22:50:36 +0000 (18:50 -0400)]
Re-use context session in ML2 DB get_port_binding_host
This patch modifies ML2 DB get_port_binding_host method so that it
reuses the existing context session to do the database query
rather than creating a new database session.
Note that there are other methods in ML2 DB that do not re-use
the caller's session (get_port_from_device_mac() and
get_sg_ids_grouped_by_port()). These will be modified using
a separate bug (https://bugs.launchpad.net/neutron/+bug/1441205).
Change-Id: I8aafb0a70f40f9306ccc366e5db6860c92c48cce
Closes-Bug: #1440183
Change eba4c2941ee introduced these tests. However, they are not that useful as
they simply mimic the code without really ensuring that the behavior is as
expected, so they provide negative value ([1]). They also fail randomly.
This patch removes them in favor of a more useful functional check.
Maru Newby [Tue, 24 Mar 2015 19:45:46 +0000 (19:45 +0000)]
Enhance TESTING.rst
Add detail about api testing and provide better visual separation
between the different types of testing.
The current testing guidelines are mainly about running tests, and
this change does little to fix that. The intention is to add detail
about writing tests in subsequent changes.
Arbitrarily restricting ourselves from using bash because developers on
platforms like netbsd don't want to install bash from ports doesn't
make sense. Any non-trivial shell script is likely to use features
like arrays or string manipulation that are poorly supported (if at
all) by sh, and the continued bumping of the number of expected bash
scripts is an indication that the check is not serving its purpose
anyway.
Along with removing the check, all shebang references to /bin/bash
have been replaced with /usr/bin/env bash in an attempt to be more
compatible across different hosts.
Ed Bak [Mon, 9 Feb 2015 23:13:18 +0000 (23:13 +0000)]
Return from check_ports_exist_on_l3agent if no subnet found
The call to get_subnet_ids_on_router can return an empty list.
If the subnet_ids list is empty, the subsequent call to get
the ports on a subnet returns all ports. If this occurs
when doing a remove_router_interface, the performance
of a remove_router_interface degrades significantly. This change
returns immediately from check_ports_exist_on_l3agents if no
subnet is found. A new unit test has been added to cover
the specific case of returning immediately without calling
get_ports when a remove_router_interface operation is performed.
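The early return can be sketched as follows. The function signature, the injected get_ports callable, and the filter layout are assumptions for the example rather than the scheduler's real interface.

```python
def ports_exist_on_subnets(subnet_ids, get_ports):
    """Return True if any port uses one of the given subnets."""
    if not subnet_ids:
        # An empty filter would match every port, so skip the query entirely.
        return False
    filters = {'fixed_ips': {'subnet_id': subnet_ids}}
    return bool(get_ports(filters))
```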
YAMAMOTO Takashi [Wed, 18 Mar 2015 04:27:15 +0000 (13:27 +0900)]
linuxbridge UT: Fix a regression of the recent ip_lib change
A recently merged change, I07d1d297f07857d216649cccf717896574aac301,
changed IPWrapper.get_devices to use /sys instead of executing ip command.
Unfortunately it broke linuxbridge unit tests, which seems to assume that
mocking utils.execute is enough in some places. This commit fixes the
regression.
A recent refactor of the L3 agent introduced
this problem. When we create a VM after we
attach an interface to a router, or when we add
an interface with an existing VM to a router,
the ARP entries for the DVR-serviced
ports are not populated in the router
namespace.
This change moves plugin test modules to conform to the new rules on
unit test tree structure (see TESTING.rst).
Vendor plugin paths continue to be ignored, and unit test modules that
test features instead of modules are also ignored pending their
removal to the functional test tree.
The unit test reorg is about moving files around so a test module is
clearly associated with the code module it targets, but the test
modules in this change needed to be manually merged because they both
targeted the same module.
This change ensures that the structure of the unit test tree matches
that of the code tree to make it obvious where to find tests for a
given module. A check is added to the pep8 job to protect against
regressions.
The plugin test paths are relocated to neutron/tests/unit/plugins
but are otherwise ignored for now.
Brian Haley [Fri, 3 Apr 2015 01:11:06 +0000 (21:11 -0400)]
Add ipset element and hashsize tunables
Recently, these messages have been noticed in tempest
logs and reported in downstream users' syslogs:
Set IPv4915d358d-2c5b-43b5-9862 is full, maxelem 65536 reached
So the default of 64K is not sufficient.
This change adds two config options to control both the number
of elements as well as the hashsize, since they should be
tuned together for best performance. Slightly different
formats were required for 'ipset create' and 'ipset restore'.
The default values for these are now set to 131072 (maxelem) and
2048 (hashsize), which is an increase over their typical default values
of 65536/1024 (respectively), in order to fix the errors seen in
the tempest tests.
Cedric Brandily [Tue, 17 Mar 2015 15:20:07 +0000 (15:20 +0000)]
Allow metadata proxy running with nobody user/group
Currently the metadata proxy cannot run with the nobody user/group because
it needs to connect to metadata_proxy_socket when queried.
This change allows running the metadata proxy with the nobody user/group by
letting the operator choose the metadata_proxy_socket mode with the new
option metadata_proxy_socket_mode (4 choices) in order to adapt socket
permissions to the metadata proxy user/group.
This change also refactors where options are defined to enable
metadata_proxy_user/group options in the metadata agent.
In practice:
* if metadata_proxy_user is agent effective user or root, then:
  * metadata proxy is allowed to use rootwrap (unsecure)
  * set metadata_proxy_socket_mode = user (0o644)
* else if metadata_proxy_group is agent effective group, then:
  * metadata proxy is not allowed to use rootwrap (secure)
  * set metadata_proxy_socket_mode = group (0o664)
  * set metadata_proxy_log_watch = false
* else:
  * metadata proxy has lowest permissions (securest) but metadata proxy
    socket can be opened by everyone
  * set metadata_proxy_socket_mode = all (0o666)
  * set metadata_proxy_log_watch = false
An alternative is to set metadata_proxy_socket_mode = deduce, in which
case the metadata agent uses the rules above to choose the correct mode.
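The deduce rules listed above can be sketched as a small function. The function name and its parameters are hypothetical; only the three modes and the order of the checks come from the text.

```python
USER_MODE = 0o644
GROUP_MODE = 0o664
ALL_MODE = 0o666


def deduce_socket_mode(proxy_user, proxy_group, agent_user, agent_group):
    """Pick the socket mode from the proxy and agent user/group."""
    if proxy_user in (agent_user, 'root'):
        return USER_MODE
    if proxy_group == agent_group:
        return GROUP_MODE
    # Neither user nor group matches: the socket must be open to everyone.
    return ALL_MODE
```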