Kevin Benton [Fri, 24 Apr 2015 07:35:31 +0000 (00:35 -0700)]
Don't resync on DHCP agent setup failure
There are various cases where the DHCP agent will try to
create a DHCP port for a network and there will be a failure.
This has primarily been caused by a lack of available IP addresses
in the allocation pool. Trying to fix all availability corner cases
on the server side will be very difficult due to race conditions between
multiple ports being created, the dhcp_agents_per_network parameter, etc.
This patch just stops the resync attempt on the agent side if a failure
is caused by an IP address generation problem. Future updates to the subnet
will cause another attempt so if the tenant does fix the issue they will
get DHCP service.
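The agent-side behavior described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: the exception class, the function name, and the resync queue are all hypothetical stand-ins.

```python
class IpAddressGenerationFailure(Exception):
    """Hypothetical stand-in for an 'IP address could not be allocated' error."""


def configure_dhcp(network_id, setup, resync_queue):
    """Run DHCP setup for one network; queue a resync only for unexpected errors.

    `setup` is any callable taking the network id (hypothetical interface).
    """
    try:
        setup(network_id)
    except IpAddressGenerationFailure:
        # Retrying cannot succeed until the subnet changes, so skip the resync.
        # A later subnet update would trigger a fresh attempt.
        return False
    except Exception:
        # Any other failure is still retried through the resync queue.
        resync_queue.append(network_id)
        return False
    return True
```

The point of the split is that only failures a retry could plausibly fix end up in the resync queue.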
tests: confirm that _output_hosts_file does not log too often
Change I3ad7864eeb2f959549ed356a1e34fa18804395cc didn't include any regression
unit tests to validate that the method never logs too often again, so later
patches reintroduced the performance drop. This also hurt the stable
backport of the fix: context was lost during the backport, which left the
bug unfixed in stable/juno even though the patch was merged there [1].
The patch adds an explicit note in the code that suggests not to add new
log messages inside the loop to avoid regression, and a unit test was
added to capture it.
Once the test is merged in master, it will be proposed for stable/juno
inclusion, with additional changes that would fix the regression again.
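A regression test of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: output_hosts_file here is a simplified, hypothetical stand-in for the real _output_hosts_file, and the logger name is made up.

```python
import logging
from unittest import mock

LOG = logging.getLogger("hosts_file_sketch")  # hypothetical logger name


def output_hosts_file(ports):
    """Simplified stand-in: build host entries, logging once outside the loop."""
    lines = []
    for mac, ip in ports:
        # NOTE: do not add log calls inside this loop; it runs once per port.
        lines.append("%s,%s" % (mac, ip))
    LOG.debug("Built %d host entries", len(lines))
    return lines


def count_debug_calls(num_ports):
    """Return how many debug messages are emitted for `num_ports` ports."""
    ports = [("fa:16:3e:00:00:%02x" % i, "10.0.0.%d" % (i + 2))
             for i in range(num_ports)]
    with mock.patch.object(LOG, "debug") as debug:
        output_hosts_file(ports)
    return debug.call_count
```

A test then asserts that the number of log calls stays constant as the number of ports grows.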
The increase in ovs testing is resulting in job failure due to
timeouts in test_killed_monitor_respawns. Giving the test more
time to complete should reduce the failure rate.
Restrict subnet create/update to avoid DHCP resync
As we know, IPs in subnet CIDR are used for
1) Broadcast port
2) Gateway port
3) DHCP port if enable_dhcp is True, or update to True
4) Others go into allocation_pools
Items 1) to 3) are created by default, which means that if the CIDR doesn't
have that many IPs, subnet create/update will cause a DHCP resync.
This fix adds some restrictions to avoid the issue:
A) On subnet create, if enable_dhcp is True, /31 and /32
CIDRs are forbidden for IPv4 subnets, while /127 and /128 CIDRs are
forbidden for IPv6 subnets.
B) On subnet update, if enable_dhcp is changing to True and there are no
more IPs in allocation_pools, the request should be denied.
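Rule A can be illustrated with Python's standard ipaddress module. This is a minimal sketch, not the validation code from the patch; both function names are hypothetical.

```python
import ipaddress


def dhcp_allowed_for_cidr(cidr):
    """Return True if the CIDR is large enough to enable DHCP under rule A."""
    net = ipaddress.ip_network(cidr)
    # max_prefixlen is 32 for IPv4 and 128 for IPv6, so this rejects
    # /31 and /32 (IPv4) as well as /127 and /128 (IPv6).
    return net.prefixlen <= net.max_prefixlen - 2


def validate_subnet_create(cidr, enable_dhcp):
    """Raise ValueError when DHCP is requested on a too-small subnet."""
    if enable_dhcp and not dhcp_allowed_for_cidr(cidr):
        raise ValueError("CIDR %s is too small to enable DHCP" % cidr)
```

Subnets with DHCP disabled are not restricted by this check.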
Change-Id: I2e4a4d5841b9ad908f02b7d0795cba07596c023d
Co-authored-by: Andrew Boik <dboik@cisco.com>
Closes-Bug: #1443798
(cherry picked from commit 0c1f96ad5a6606c1205bd50ea944c3a383892cde)
Kevin Benton [Tue, 21 Apr 2015 05:26:22 +0000 (22:26 -0700)]
Only update MTU in update code for MTU
The ML2 create_network_db was re-passing the entire network, with
extensions like vlan_transparency present, to the base update function
it was calling, which caused issues there.
This corrects the behavior by having it only update the MTU, which
is the only thing it was intending to update in the first place.
mathieu-rohon [Sat, 7 Mar 2015 12:30:49 +0000 (13:30 +0100)]
ML2: Change port status only when it's bound to the host
Currently, nothing prevents the port status from being changed to BUILD
when get_device_details() is sent by a host that doesn't own
the port.
In some cases the port might stay in BUILD state.
This could happen during a live-migration, or for multi-hosted ports
such as HA ports.
This commit allows the port status modification only if the port
is bound to the host that is asking for it.
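The check described above can be sketched as follows. This is a simplified illustration using plain dictionaries; the field names and the function name are hypothetical and do not reflect the actual ML2 data model.

```python
def maybe_mark_port_build(port, requesting_host):
    """Move the port to BUILD only if it is bound to the requesting host.

    `port` is a dict with hypothetical 'binding_host' and 'status' keys.
    Returns True if the requester owns the port, False otherwise.
    """
    if port.get("binding_host") != requesting_host:
        # A host that does not own the port must not touch its status.
        return False
    if port["status"] != "BUILD":
        port["status"] = "BUILD"
    return True
```

During a live migration, for example, the destination host may ask for device details before the port is bound to it; with this check the status set by the owning host is left alone.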
Andreas Jaeger [Mon, 20 Apr 2015 09:07:37 +0000 (11:07 +0200)]
Release Import of Translations from Transifex
Manual import of Translations from Transifex. This change also removes
all po files that are less than 66 per cent translated since such
partially translated files will not help users.
This update also recreates all pot files (translation source files) to
reflect the state of the repository.
This change needs to be done manually since the automatic import does
not handle the proposed branches and we need to sync with latest
translations.
Fixes race condition and boosts the scheduling performance
This patch fixes a race-condition that occurs when the
scheduler tries to check for dvr serviceable ports before
it schedules a router when a subnet is associated with
a router.
Sometimes the dhcp port creation is delayed and so the
router is not scheduled to the l3-agent.
Also it boosts the scheduling performance on dvr-snat
node for scheduling a router.
This patch provides a workaround for this race
condition and boosts the scheduling performance
by scheduling a router on a dvr-snat when
dhcp is enabled on the provided subnet, instead of checking
all the available ports on the subnet.
Kevin Benton [Tue, 31 Mar 2015 06:52:56 +0000 (23:52 -0700)]
Set IPset hash type to 'net' instead of 'ip'
The previous hash type was 'ip' and this caused a major
issue with the allowed address pairs extension since it
results in CIDRs being passed to ipset. When the hash type
is 'ip', a CIDR is completely enumerated into all of its
addresses so 10.100.0.0/16 results in ~65k entries. This
meant a single allowed_address_pairs entry could easily
exhaust an entire set.
This patch changes the hash type to 'net', which is designed
to handle a CIDR as a single entry.
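The difference in set usage can be worked out with Python's standard ipaddress module. This sketch only models the entry counts described above; the function name is hypothetical and nothing here talks to ipset itself.

```python
import ipaddress


def entries_needed(cidr, hash_type):
    """Estimate how many set entries one CIDR consumes for a given hash type."""
    net = ipaddress.ip_network(cidr)
    if hash_type == "ip":
        # With 'ip', the CIDR is enumerated into each of its addresses.
        return net.num_addresses
    if hash_type == "net":
        # With 'net', the CIDR is stored as a single entry.
        return 1
    raise ValueError("unsupported hash type: %s" % hash_type)
```

For 10.100.0.0/16 this gives 65536 entries with 'ip' and 1 entry with 'net'.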
This patch also changes the names of the ipsets because
creating an ipset with different parameters will cause an
error and our ipset manager code isn't robust enough to handle
that at this time. There is another ongoing patch to fix
that but it won't be ready in time.[1]
The related bug was closed by increasing the set limit, which
did alleviate the problem. However, this change would also
address the issue because the gate tests run an allowed address
pairs extension test with the CIDR mentioned above.
The current ipset manager code isn't robust enough to handle
ipsets that already exist with different parameters. This reverts
the ability to change the parameters so we don't break upgrades
to Kilo.
Dane LeBlanc [Thu, 9 Apr 2015 14:32:33 +0000 (10:32 -0400)]
IPv6 SLAAC subnet create should update ports on net
If ports are first created on a network, and then an IPv6 SLAAC
or DHCPv6-stateless subnet is created on that network, then the
ports created prior to the subnet create are not getting
automatically updated (associated) with addresses for the
SLAAC/DHCPv6-stateless subnet, as required.
Dane LeBlanc [Sat, 4 Apr 2015 22:50:36 +0000 (18:50 -0400)]
Re-use context session in ML2 DB get_port_binding_host
This patch modifies ML2 DB get_port_binding_host method so that it
reuses the existing context session to do the database query
rather than creating a new database session.
Note that there are other methods in ML2 DB that do not re-use
the caller's session (get_port_from_device_mac() and
get_sg_ids_grouped_by_port()). These will be modified using
a separate bug (https://bugs.launchpad.net/neutron/+bug/1441205).
Change-Id: I8aafb0a70f40f9306ccc366e5db6860c92c48cce
Closes-Bug: #1440183
Change eba4c2941ee introduced these tests. However, they are not that useful, as they
simply mimic the code without really ensuring that the behavior is as expected, so
they provide negative value ([1]); in addition, they fail randomly.
This patch removes them in favor of a more useful functional check.
Maru Newby [Tue, 24 Mar 2015 19:45:46 +0000 (19:45 +0000)]
Enhance TESTING.rst
Add detail about api testing and provide better visual separation
between the different types of testing.
The current testing guidelines are mainly about running tests, and
this change does little to fix that. The intention is to add detail
about writing tests in subsequent changes.
Arbitrarily restricting ourselves from using bash because developers on
platforms like netbsd don't want to install bash from ports doesn't
make sense. Any non-trivial shell script is likely to use features
like arrays or string manipulation that are poorly supported (if at
all) by sh, and the continued bumping of the number of expected bash
scripts is an indication that the check is not serving its purpose
anyway.
Along with removing the check, all shebang references to /bin/bash
have been replaced with /usr/bin/env bash in an attempt to be more
compatible across different hosts.
Ed Bak [Mon, 9 Feb 2015 23:13:18 +0000 (23:13 +0000)]
Return from check_ports_exist_on_l3agent if no subnet found
The call to get_subnet_ids_on_router can return an empty list.
If the subnet_ids list is empty, the subsequent call to get
the ports on a subnet returns all ports. If this occurs
when doing a remove_router_interface, the performance
of a remove_router_interface degrades significantly. This change
returns immediately from check_ports_exist_on_l3agent if no
subnet is found. A new unit test has been added to cover
the specific case of returning immediately without calling
get_ports when a remove_router_interface operation is performed.
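The pitfall and the early return can be sketched as follows. Both functions are simplified, hypothetical stand-ins working on plain lists and dicts, not the plugin's real query code.

```python
def get_ports(all_ports, subnet_ids=None):
    """Hypothetical port query: an empty filter matches every port."""
    if not subnet_ids:
        return list(all_ports)
    return [p for p in all_ports if p["subnet_id"] in subnet_ids]


def ports_exist_on_subnets(all_ports, subnet_ids):
    """Return True if any port sits on one of the given subnets."""
    if not subnet_ids:
        # Return immediately: querying with an empty filter would
        # fetch (and then scan) every port in the deployment.
        return False
    return bool(get_ports(all_ports, subnet_ids))
```

Without the guard, an empty subnet list silently turns a narrow lookup into a full scan of all ports.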
YAMAMOTO Takashi [Wed, 18 Mar 2015 04:27:15 +0000 (13:27 +0900)]
linuxbridge UT: Fix a regression of the recent ip_lib change
A recently merged change, I07d1d297f07857d216649cccf717896574aac301,
changed IPWrapper.get_devices to use /sys instead of executing ip command.
Unfortunately it broke linuxbridge unit tests, which seems to assume that
mocking utils.execute is enough in some places. This commit fixes the
regression.
A recent refactor of the L3 agent introduced
this problem. When we create a VM after we
attach an interface to a router, or when we add
an interface with an existing VM to a router, the
ARP entries for the DVR-serviced
ports are not populated in the router
namespace.
This change moves plugin test modules to conform to the new rules on
unit test tree structure (see TESTING.rst).
Vendor plugin paths continue to be ignored, and unit test modules that
test features instead of modules are also ignored pending their
removal to the functional test tree.
The unit test reorg is about moving files around so a test module is
clearly associated with the code module it targets, but the test
modules in this change needed to be manually merged because they both
targeted the same module.
This change ensures that the structure of the unit test tree matches
that of the code tree to make it obvious where to find tests for a
given module. A check is added to the pep8 job to protect against
regressions.
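The kind of structural check described above can be sketched in Python. This is not the check added to the pep8 job; the function names and the path-mapping rule (a test_ prefix under neutron/tests/unit mirroring the neutron package) are assumptions made for illustration.

```python
from pathlib import PurePosixPath

UNIT_ROOT = PurePosixPath("neutron/tests/unit")
CODE_ROOT = PurePosixPath("neutron")


def expected_module_for(test_path):
    """Map a unit test path to the code module it is assumed to target."""
    rel = PurePosixPath(test_path).relative_to(UNIT_ROOT)
    if not rel.name.startswith("test_"):
        raise ValueError("not a test module: %s" % test_path)
    # Strip the 'test_' prefix and mirror the directory under the code root.
    return str(CODE_ROOT / rel.parent / rel.name[len("test_"):])


def misplaced_tests(test_paths, code_paths):
    """Return the test modules whose expected code module does not exist."""
    code = set(code_paths)
    return [t for t in test_paths if expected_module_for(t) not in code]
```

A real check would also need an ignore list for paths that are exempt for now.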
The plugin test paths are relocated to neutron/tests/unit/plugins
but are otherwise ignored for now.
Brian Haley [Fri, 3 Apr 2015 01:11:06 +0000 (21:11 -0400)]
Add ipset element and hashsize tunables
Recently, these messages have been noticed in tempest
logs and reported in downstream users' syslog:
Set IPv4915d358d-2c5b-43b5-9862 is full, maxelem 65536 reached
So the default of 64K is not sufficient.
This change adds two config options to control both the number
of elements as well as the hashsize, since they should be
tuned together for best performance. Slightly different
formats were required for 'ipset create' and 'ipset restore'.
The default values for these are now set to 131072 (maxelem) and
2048 (hashsize), which is an increase over their typical default values
of 65536/1024 (respectively), in order to fix the errors seen in
the tempest tests.
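How the two tunables end up in the two command formats can be sketched as follows. This is an illustration only: the helper names and the set_type parameter are hypothetical, and the defaults simply mirror the values quoted above. The create-option keywords (family, hashsize, maxelem) are standard ipset options.

```python
def ipset_create_args(name, set_type, family, maxelem=131072, hashsize=2048):
    """Build the argument list for an 'ipset create' invocation."""
    return ["ipset", "create", name, set_type, "family", family,
            "hashsize", str(hashsize), "maxelem", str(maxelem)]


def ipset_restore_line(name, set_type, family, maxelem=131072, hashsize=2048):
    """Build the matching 'create' line for input fed to 'ipset restore'."""
    # The restore format is one text line without the leading 'ipset' word.
    return "create %s %s family %s hashsize %d maxelem %d" % (
        name, set_type, family, hashsize, maxelem)
```

Keeping both builders next to each other makes it harder for the two formats to drift apart when a tunable is added.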
Cedric Brandily [Tue, 17 Mar 2015 15:20:07 +0000 (15:20 +0000)]
Allow metadata proxy running with nobody user/group
Currently the metadata proxy cannot run with the nobody user/group because it
needs to connect to metadata_proxy_socket when queried.
This change allows the metadata proxy to run with the nobody user/group by
letting the socket mode be chosen with the new option
metadata_proxy_socket_mode (4 choices), so that socket
permissions can be adapted to the metadata proxy user/group.
This change also refactors where options are defined, to enable the
metadata_proxy_user/group options in the metadata agent.
In practice:
* if metadata_proxy_user is agent effective user or root, then:
  * metadata proxy is allowed to use rootwrap (unsecure)
  * set metadata_proxy_socket_mode = user (0o644)
* else if metadata_proxy_group is agent effective group, then:
  * metadata proxy is not allowed to use rootwrap (secure)
  * set metadata_proxy_socket_mode = group (0o664)
  * set metadata_proxy_log_watch = false
* else:
  * metadata proxy has lowest permissions (securest) but metadata proxy
    socket can be opened by everyone
  * set metadata_proxy_socket_mode = all (0o666)
  * set metadata_proxy_log_watch = false
An alternative is to set metadata_proxy_socket_mode = deduce, in which
case the metadata agent uses the rules above to choose the correct mode.
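The deduction rules listed above can be sketched as a small function. This is an illustration, not the agent's implementation: the function name is hypothetical, and users and groups are compared by name here purely for simplicity.

```python
# Socket permissions for each mode, as listed in the rules above.
SOCKET_MODES = {"user": 0o644, "group": 0o664, "all": 0o666}


def deduce_socket_mode(proxy_user, proxy_group, agent_user, agent_group):
    """Pick the socket mode name and permission bits for the metadata proxy."""
    if proxy_user in (agent_user, "root"):
        mode = "user"
    elif proxy_group == agent_group:
        mode = "group"
    else:
        # Lowest proxy privileges, but the socket is open to everyone.
        mode = "all"
    return mode, SOCKET_MODES[mode]
```

For a proxy running as nobody/nobody under an agent running as neutron/neutron, the sketch falls through to the "all" mode.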
The example retargetable test that previously ran as part of the
functional suite is now skipped due to the fullstack example's db
fixture usage causing the test to fail if the fullstack example
runs first on the same worker.
The unit test reorg is about moving files around so a test module is
clearly associated with the code module it targets, but the test
modules in this change needed to be manually merged because they both
targeted the same module.
test_api_v2 is also updated to use the path of neutron/tests/base.py
as the root of the path to test implementations of extensions.