Miguel Angel Ajo [Thu, 11 Jun 2015 11:15:17 +0000 (13:15 +0200)]
Fix hostname roaming for ml2 tunnel endpoints.
Change I75c6581fcc9f47a68bde29cbefcaa1a2a082344e introduced
a bug where host name changes broke tunneling endpoint updates.
Tunneling endpoint updates roaming a hostname from IP to IP
are a common method for active/passive HA with pacemaker and
should happen automatically without the need for API/CLI calls [1].
delete_endpoint_by_host_or_ip is introduced to allow cleanup of
endpoints that potentially belonged to the newly registered agent,
while preventing the race condition found when deleting ip1 & ip2
in the following situation at step 4:
1) we have hostA: ip1
2) hostA goes offline
3) hostB goes online, with ip1, and registers
4) hostA goes online, with ip2, and registers
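The four steps above can be traced with a minimal in-memory sketch. It is
only an illustration: EndpointRegistry, register() and endpoints() are
hypothetical names, and the real delete_endpoint_by_host_or_ip works against
the ml2 database rather than a dict.

```python
class EndpointRegistry:
    """In-memory stand-in for the tunnel endpoint table (hypothetical)."""

    def __init__(self):
        self._endpoints = {}  # ip -> host

    def delete_endpoint_by_host_or_ip(self, host, ip):
        # Drop, in one pass, every entry matching either the host or the IP.
        self._endpoints = {
            e_ip: e_host
            for e_ip, e_host in self._endpoints.items()
            if e_host != host and e_ip != ip
        }

    def register(self, host, ip):
        # Clean up stale entries for this agent, then record the new endpoint.
        self.delete_endpoint_by_host_or_ip(host, ip)
        self._endpoints[ip] = host

    def endpoints(self):
        return dict(self._endpoints)


registry = EndpointRegistry()
registry.register("hostA", "ip1")  # step 1: hostA owns ip1
# step 2: hostA goes offline; its entry is still in the table
registry.register("hostB", "ip1")  # step 3: hostB takes over ip1
registry.register("hostA", "ip2")  # step 4: hostA returns with ip2
```

In this sketch, step 4 leaves hostB's ip1 entry alone because neither the
host nor the IP of the new registration matches it.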
Jakub Libosvar [Thu, 17 Sep 2015 13:26:05 +0000 (13:26 +0000)]
Introduce kill_signal parameter to AsyncProcess.stop()
All stop() calls on AsyncProcess instances sent a hardcoded SIGKILL
signal to the process. This patch keeps SIGKILL as the default
behavior but allows any signal number to be passed to the kill command.
Note: Internal private methods also got a new parameter which is not
appended. Given that those methods are private and thus not used
outside of the class, we can afford it.
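A rough sketch of the idea, assuming a POSIX system. SimpleProcess is a
hypothetical stand-in, not neutron's AsyncProcess, and it signals the child
through subprocess instead of running the kill command.

```python
import signal
import subprocess
import sys


class SimpleProcess:
    """Hypothetical, simplified stand-in for a managed child process."""

    def __init__(self, cmd):
        self._cmd = cmd
        self._process = None

    def start(self):
        self._process = subprocess.Popen(self._cmd)

    def stop(self, kill_signal=signal.SIGKILL):
        # SIGKILL stays the default; callers may pick a gentler signal.
        if self._process is None:
            raise RuntimeError("process is not running")
        self._process.send_signal(kill_signal)
        returncode = self._process.wait()
        self._process = None
        return returncode


proc = SimpleProcess([sys.executable, "-c", "import time; time.sleep(30)"])
proc.start()
exit_status = proc.stop(kill_signal=signal.SIGTERM)
```

Callers that do not pass kill_signal keep the previous SIGKILL behavior.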
Previously, it was possible for None to be passed to context.session.delete()
if a port was not found (usually a result of a concurrent delete). This
resulted in an UnmappedInstanceError. This is avoided now by calling
query.delete() directly which does not raise any exceptions.
tempest-lib now provides the token_client modules as a library with a
stable interface, so the neutron repository doesn't need to contain
these modules.
This patch makes neutron use tempest-lib's token_client and removes
neutron's own copies to reduce maintenance.
Jakub Libosvar [Thu, 13 Aug 2015 09:08:20 +0000 (09:08 +0000)]
Fix establishing UDP connection
Previously, in establish_connection() for the UDP protocol, data was sent
but never read on the peer socket. That led to a successful read on the
peer side even if this connection was filtered. Using a constant test
string masked this issue, as we couldn't tell which connectivity test the
data belonged to.
This patch uses a unique data string per test_connectivity() call and
also makes establish_connection() create an ASSURED entry in the
conntrack table. Finally, in the last test, after the firewall filter is
removed, the connection is re-established in order to avoid trouble with
terminated processes or TCP continuing to send packets which weren't
successfully delivered.
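The unique-payload idea can be sketched over loopback UDP. This is only an
illustration: make_test_payload() and udp_round_trip() are hypothetical
helpers, not the functions used by the neutron tests, and the sketch does
not touch conntrack at all.

```python
import socket
import uuid


def make_test_payload():
    # A fresh random string per check, so stale data from an earlier
    # check cannot be mistaken for a successful delivery.
    return uuid.uuid4().hex.encode("ascii")


def udp_round_trip(payload, timeout=2.0):
    """Send payload to a loopback peer, read it there and echo it back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))
        server.settimeout(timeout)
        client.settimeout(timeout)
        client.sendto(payload, server.getsockname())
        data, peer = server.recvfrom(1024)  # the peer actually reads the data
        server.sendto(data, peer)
        echoed, _ = client.recvfrom(1024)
        return data, echoed


payload = make_test_payload()
received, echoed = udp_round_trip(payload)
```

Comparing the received bytes with the payload generated for this specific
check is what distinguishes it from data left over from a previous check.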
Moshe Levi [Mon, 10 Aug 2015 09:25:59 +0000 (12:25 +0300)]
QoS agent extension and driver refactoring
Moved some code common to all drivers into the base
qos driver abstract class, so related bugfixes all go in one
place and we simplify the logic for every qos driver.
Port/Policy mapping moved out to a separate class.
Similar to IPv4 arp protection support, this patch adds the necessary OVS
rules to prevent ports attached to agent from sending any icmpv6 neighbor
advertisement messages that contain an IPv6 address not belonging to the port.
For details please refer to "Figure 3. Attack against IPv6 Address Resolution"
http://www.cisco.com/web/about/security/intelligence/ipv6_first_hop.html
Cedric Brandily [Mon, 24 Aug 2015 20:24:10 +0000 (22:24 +0200)]
Remove out-of-tree vendor AGENT_TYPE_* constant
AGENT_TYPE_* constants[1] define all agent types, but the only vendor
one (AGENT_TYPE_NEC) is only used in the out-of-tree networking-nec repo.
This change removes the out-of-tree AGENT_TYPE_NEC constant (a dependent
change defines it in the networking-nec repo).
Jakub Libosvar [Mon, 14 Sep 2015 14:54:34 +0000 (14:54 +0000)]
func: Don't use private method of AsyncProcess
In the functional test we simulate a crash of AsyncProcess by calling
_kill_process(). This is a private method, and such usage
introduced a race where the process was respawned prior to calling wait()
on the killed process, leading to an infinite wait on the newly spawned
process.
This patch sends the kill manually and then actively waits for the process
to be respawned, similar to what was done in the recent keepalived patch [1].
Per [1] we are using a better way to keep tunnel connectivity,
so reset_bridge isn't used anymore. The bug in [2] was caused by
the reset_bridge method, which deletes and recreates the bridge.
Since [1] deprecates reset_bridge, it makes sense to
remove this method, so that [2] can no longer occur.
Kevin Benton [Thu, 3 Sep 2015 17:01:40 +0000 (10:01 -0700)]
Add utility function for checking trusted port
Ports that have a device_owner that starts with 'network:'
are trusted in several places throughout the codebase. Each
of these did a startswith check on each field and it's not
immediately obvious why it's done.
This patch adds a utility function called 'is_port_trusted'
that performs the same check and makes it obvious what is
being done.
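A minimal sketch of such a helper, assuming ports are plain dicts with a
'device_owner' key. The constant name and the handling of a missing or
empty device_owner are assumptions of this sketch, not taken from the patch.

```python
NETWORK_OWNER_PREFIX = "network:"  # assumed constant name, for illustration


def is_port_trusted(port):
    """Return True if the port is owned by a 'network:' device."""
    # A missing or None device_owner is treated as untrusted (assumption).
    device_owner = port.get("device_owner") or ""
    return device_owner.startswith(NETWORK_OWNER_PREFIX)


dhcp_port = {"device_owner": "network:dhcp"}
vm_port = {"device_owner": "compute:nova"}
```

Call sites can then read is_port_trusted(port) instead of repeating the
startswith check.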
Currently, the vip of lbaasV2 will not have l3 network with DVR.
This prevents the use case of lbaasV2 + DVR. This patch aims to
enable servicing the lbaasV2 vip by DVR.
Clouds deployed at scale most likely will use these scheduler
drivers because they allow a fairer resource allocation compared
to chance schedulers (which randomly place resources on the hosts).
Because of their importance, it's only wise to test them in
the gate on a continuous basis, so that we do not get surprised
by accidental regressions.
Rather than pushing this down through devstack-gate/project-config
patches, this change alters the default of the scheduler
drivers, so that users can also pick these up out of the box.
This means that after an upgrade they would observe a change in
the scheduling behavior, if they relied on the default config.
Fix BadRequest error on add_router_interface for DVR
This operation for DVR is made of multiple steps, some of
which are not within the same DB transaction. For this
reason, if a failure occurs, the rollback will be partial.
This inconsistent state leads the retry logic to fail with
BadRequest, because the router is believed to be already
connected to the subnet.
To fix this condition, it is necessary to delete the port
should the DB deadlock occur.
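The cleanup can be sketched as follows. Everything here is hypothetical: the
DBDeadlock class is a local stand-in, ports is a plain list instead of a
database table, and attach() represents the later step that runs in its own
transaction.

```python
class DBDeadlock(Exception):
    """Local stand-in for a database deadlock error (hypothetical)."""


def add_router_interface(ports, router_id, subnet_id, attach):
    """Create an interface port, then attach it in a separate step."""
    port = {"router_id": router_id, "subnet_id": subnet_id}
    ports.append(port)      # step 1: committed on its own
    try:
        attach(port)        # step 2: separate transaction, may deadlock
    except DBDeadlock:
        ports.remove(port)  # undo step 1 so that a retry starts clean
        raise
    return port


def flaky_attach_factory(failures):
    """Return an attach callable that deadlocks `failures` times."""
    remaining = [failures]

    def attach(port):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise DBDeadlock()

    return attach


ports = []
attach = flaky_attach_factory(failures=1)
try:
    add_router_interface(ports, "router-1", "subnet-1", attach)
except DBDeadlock:
    pass
ports_after_failure = list(ports)
port = add_router_interface(ports, "router-1", "subnet-1", attach)
```

Without the removal in the except branch, the retry would find the leftover
port and conclude that the router is already connected to the subnet.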
This test's initial design is problematic: it spawns keepalived,
it asserts the process is up, then it attempts to kill it.
However, this is when problems may arise:
a) it does so by using the disable method on the process - we
should be more rude than that if we want to simulate a crash!
b) keepalived may be forking while it is starting and it is
possible that for a moment the ppid changes and the process
owner invoking the kill has no rights to kill the spawned
process. This is the most plausible explanation I could find
as to why kill returns 1 with no standard error
c) it does not verify that the process has indeed disappeared
(what if the pm.disable didn't work?) - this means that the
test can pass, and yet the monitor may not work.
Bottom line: this test relied on the correctness of the very code
that it was meant to validate...and that's not cool. To this aim, we
wait for the process to be active, kill the process with a kill -9
and verify that the process after the kill is indeed different.
Reservations: Don't count usage if resource is unlimited
If a resource is unlimited (i.e. limit < 0) then there is no need
to verify headroom for it. This also means that there is no need for
counting it; therefore it is possible to save some DB operations
by skipping the count phase.
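As a rough illustration of skipping the count phase, assuming limits and
requested deltas are plain dicts keyed by resource name. make_reservation()
and count_usage() here are hypothetical and much simpler than neutron's
quota code.

```python
def make_reservation(deltas, limits, count_usage):
    """Check headroom for each requested resource; return counted usage."""
    counted = {}
    for name, delta in deltas.items():
        limit = limits[name]
        if limit < 0:
            # Unlimited: no headroom check, so no (expensive) count either.
            continue
        used = count_usage(name)
        counted[name] = used
        if used + delta > limit:
            raise ValueError("quota exceeded for %s" % name)
    return counted


calls = []


def count_usage(name):
    # Stand-in for a DB count query; records which resources were counted.
    calls.append(name)
    return {"network": 3, "port": 7}[name]


counted = make_reservation(
    {"network": 1, "port": 2},
    {"network": 10, "port": -1},
    count_usage,
)
```

Here 'port' is unlimited, so count_usage() is only called for 'network'.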
ARP does not support IPv6 addresses, so when we try to apply the flow, it
fails, with all other flows deferred for the same transaction. It results in
random flow breakages, depending on the order of the bad flow in the
transaction.
Change-Id: I0ecf167653e5a7d0916e091e05050406a026a1e2
Co-Authored-By: Thomas Carroll <Thomas.Carroll@pnnl.gov>
Closes-Bug: #1477253
Configure gw_iface for RAs only in Master HA Router
For an HA Router which does not have any IPv6 subnets in the external network
and when ipv6_gateway is not set, Neutron configures the gateway interface of
the router to receive Router Advertisements for the default route. In an HA router, only
the Master instance has the IP addresses while the Backup instance does not
have any addresses (including LLA). In Kernel version 3.10, when the last
IPv6 address is removed from the interface, IPv6 proc entries corresponding
to the iface are also deleted. This is however reverted in the later versions
of kernel code.
This patch addresses this issue by configuring the proc entry only for the
Master HA Router instance instead of doing it unconditionally.
Ryan Moats [Fri, 11 Sep 2015 12:41:38 +0000 (07:41 -0500)]
Remove useless log from periodic_sync_routers_task
Logging that periodic_sync_routers_task is starting with fullsync
False just adds noise to devstack logs. Reposition the log
statement to indicate that the task is starting if it is going
to be doing real processing.
Change-Id: I73def1e20218b01c135769d0b8fbce449dad17ea
Signed-off-by: Ryan Moats <rmoats@us.ibm.com>