Currently radvd is spawned in all the HA routers irrespective of the
state of the router. This approach has the following issues.
1. While processing the internal router ports (i.e., qr-xxx), ha_router
removes the LLA of the interface and adds it as a VIP to Keepalived conf.
Radvd daemon is spawned after this operation in the router namespace
(if the port is associated with any IPv6 subnets). Radvd notices that
qr-xxx interface does not have the LLA, so does not transmit any Router
Advts. In this state, VMs fail to acquire IPv6 addresses because of the
missing RAs. Radvd does not recover even after keepalived configures the
LLA of the interface. The only solution is to restart/reload radvd daemon.
Currently the keepalived-state-change monitor does not perform any
radvd-related operations when a state transition happens, so we end up in
this state forever.
2. For all the routers in Backup state, qr-xxx interface does not have LLA
as it is managed by keepalived and configured only on the Master HA router.
In such agents syslog is flooded with the messages [1] and this can cause
loss of other useful info.
[1] - resetting ipv6-allrouters membership on qr-2e373555-97
This patch implements the following.
1. If the router is already in the Master state, we configure the LLA as a VIP
in keepalived conf but do not delete the LLA of the internal interface.
2. We spawn radvd only if the router is in the Master State.
3. Keepalived-state-change monitor takes care of enabling/disabling radvd upon
state transitions.
network.external is only present if one is using the external_net_db
mixin. This patch just adds a check to see whether the network has the
external attribute, to avoid an AttributeError.
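A minimal sketch of the kind of guard described above. The helper name
is_external and both stand-in classes are hypothetical and are not taken
from the patch, which may structure the check differently.

```python
class PlainNetwork:
    """Stand-in for a network model without the external_net_db mixin."""


class ExternalNetwork:
    """Stand-in for a network model that carries the 'external' attribute."""

    def __init__(self, external):
        self.external = external


def is_external(network):
    # getattr with a default avoids AttributeError when the mixin is absent.
    return bool(getattr(network, "external", None))
```

With this guard, a network lacking the attribute is simply treated as
non-external instead of raising.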
Cedric Brandily [Sun, 1 Mar 2015 22:08:58 +0000 (22:08 +0000)]
Define FakeMachine helper for functional/fullstack tests
The change defines the FakeMachine fixture/helper which emulates a
machine through a namespace with:
* a port bound to a bridge,
* an ip on the port,
* a gateway (if requested).
The FakeMachine class can be used to emulate:
* a VM for testing network features (ex: metadata service),
* an external machine for testing "external" network features (ex:
routing/natting),
* a server for low level tests of network features (ex: iptables).
The change also defines PeerMachines fixture/helper to create some fake
machines bound to a bridge.
Terry Wilson [Fri, 17 Apr 2015 21:13:09 +0000 (16:13 -0500)]
Correct typo for matching non-dict ovsdb rows
As can be seen just above, the correct operator for the equality
test is '=' and not '=='. This match isn't currently being used
in the neutron code, but will be used by the OVN driver.
The previous code would also raise NotImplemented when there was
no match.
Fixes race condition and boosts the scheduling performance
This patch fixes a race-condition that occurs when the
scheduler tries to check for dvr serviceable ports before
it schedules a router when a subnet is associated with
a router.
Sometimes the dhcp port creation is delayed and so the
router is not scheduled to the l3-agent.
Also it boosts the scheduling performance on dvr-snat
node for scheduling a router.
This patch provides a workaround for this race condition and
boosts the scheduling performance by scheduling a router
on a dvr-snat node when DHCP is enabled on the provided
subnet, instead of checking all the available ports on
the subnet.
mathieu-rohon [Sat, 7 Mar 2015 12:30:49 +0000 (13:30 +0100)]
ML2: Change port status only when it's bound to the host
Currently, nothing prevents the port status from being changed to the
BUILD state when get_device_details() is sent by a host that doesn't own
the port.
In some cases the port might stay in BUILD state.
This could happen during a live-migration, or for multi-hosted ports
such as HA ports.
This commit allows the port status modification only if the port
is bound to the host that is asking for it.
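The rule above can be sketched as follows. This is an illustration only:
the port is modelled as a plain dict with hypothetical 'host' and 'status'
keys, and the function name is invented for the example rather than taken
from the ML2 code.

```python
BUILD = "BUILD"
ACTIVE = "ACTIVE"


def status_after_get_device_details(port, requesting_host):
    """Return the port status after a get_device_details() request.

    'port' is a plain dict with hypothetical 'host' and 'status' keys.
    """
    if port["host"] != requesting_host:
        # The requester does not own the port; leave the status untouched.
        return port["status"]
    # The port is bound to the requesting host, so it may move to BUILD.
    return BUILD
```

For a port bound to host-a with status ACTIVE, a request from host-b
leaves the status unchanged, while a request from host-a moves it to BUILD.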
Kevin Benton [Fri, 17 Apr 2015 11:28:58 +0000 (04:28 -0700)]
Remove double queries in l3 DB get methods
Two frequently called functions were querying the routerport table
and the corresponding ports just to get the port ID. Then they were
calling get_ports again with those port IDs, resulting in two queries
to the port table when there should have only been one.
This eliminates the second call to get_ports since all of the necessary
data has been retrieved from the port table.
Kevin Benton [Fri, 17 Apr 2015 11:18:56 +0000 (04:18 -0700)]
Strip unnecessary overrides in extraroute_db mixin
The extra route DB mixin seemed to be overriding the
get_router and get_routers method for no reason. They
both just called the super version of themselves with
the same arguments.
This patch just pulls those functions out. Found in
tracebacks while working on a related bug.
Kevin Benton [Fri, 17 Apr 2015 10:36:50 +0000 (03:36 -0700)]
Set loading strategy to joined for Routerport/Port
The RouterPort model has a relationship to the ports model which
is frequently relied on to get the port IDs of interfaces attached
to a router. However, the relationship defaults to the 'select'
loading strategy, which meant a new query was being emitted to the
ports table for every interface just to get the ID.
This patch adjusts the relationship to be 'joined' by default so
one query will fetch the related ports.
Another option would have been not to use the port object at all since
the ID is all that the callers were usually interested in. However,
they would end up using the ID to do a port lookup, which is being
optimized away in another patch anyway so the full port object from
the relationship will end up getting used.
Kevin Benton [Tue, 31 Mar 2015 06:52:56 +0000 (23:52 -0700)]
Set IPset hash type to 'net' instead of 'ip'
The previous hash type was 'ip' and this caused a major
issue with the allowed address pairs extension since it
results in CIDRs being passed to ipset. When the hash type
is 'ip', a CIDR is completely enumerated into all of its
addresses so 10.100.0.0/16 results in ~65k entries. This
meant a single allowed_address_pairs entry could easily
exhaust an entire set.
This patch changes the hash type to 'net', which is designed
to handle a CIDR as a single entry.
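The arithmetic behind the ~65k figure can be checked with the standard
ipaddress module. The snippet below only counts addresses; it does not
call ipset, and the variable names are chosen for this example.

```python
import ipaddress

cidr = ipaddress.ip_network("10.100.0.0/16")

# With hash type 'ip', each address in the CIDR becomes its own entry.
entries_as_ip = cidr.num_addresses

# With hash type 'net', the whole CIDR is stored as a single entry.
entries_as_net = 1
```

Here entries_as_ip is 65536, so one such allowed_address_pairs entry costs
65536 set slots under 'ip' and a single slot under 'net'.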
This patch also changes the names of the ipsets because
creating an ipset with different parameters will cause an
error and our ipset manager code isn't robust enough to handle
that at this time. There is another ongoing patch to fix
that but it won't be ready in time.[1]
The related bug was closed by increasing the set limit, which
did alleviate the problem. However, this change would also
address the issue because the gate tests run an allowed address
pairs extension test with the CIDR mentioned above.
This change simply changes the Quota model class to obtain
the tenant_id from the mixin class. As the attribute in the
mixin is identical to that in the model there is no need for
a migration.
This patch also removes a reference to quota classes in the
docstring, as Neutron does not implement those. It is good
to be careful when copying and pasting code.
This patch cleans up the init logic for the plugin so that
we better separate the tasks required for establishing
the integration with DHCP and RPC layers.
In other words: some bikeshedding whilst dealing with bug #1444112
Dane LeBlanc [Thu, 9 Apr 2015 14:32:33 +0000 (10:32 -0400)]
IPv6 SLAAC subnet create should update ports on net
If ports are first created on a network, and then an IPv6 SLAAC
or DHCPv6-stateless subnet is created on that network, then the
ports created prior to the subnet create are not getting
automatically updated (associated) with addresses for the
SLAAC/DHCPv6-stateless subnet, as required.
Gal Sagie [Wed, 15 Apr 2015 06:26:54 +0000 (09:26 +0300)]
Enhance OVSDB Transaction timeout configuration
OVSDB Transaction currently takes the timeout parameter from a
context object that is assumed to have a vsctl_timeout attribute.
This doesn't fit well for other users of this class (like OVN).
This fix configures the transaction timeout in a more common way.
Aman Kumar [Fri, 23 Jan 2015 09:34:00 +0000 (01:34 -0800)]
Added config variable for External Network type in ML2
Description:
With the ML2 Plugin, every network created has segments with
provider:network_types being tenant_network_types.
When applied to external networks, the types that could be in
tenant_network_types parameter (like vxlan or gre) are not appropriate.
Implementation:
Added a new config variable 'external_network_type' in ml2_conf.ini
which contains the default network type for external networks
when no provider attributes are specified. By default it is None.
It also includes small code re-factoring/renaming of import statement.
This patch updates the progress chart, now that the first cycle after the
decomp has started. For the fully decomposed plugins/drivers and for known
projects that integrate with Neutron, this patch proposes a new
summary table that provides a go-to reference for everything Neutron related.
Assaf Muller [Fri, 27 Mar 2015 23:31:51 +0000 (19:31 -0400)]
Stop running L3 functional tests with both OVSDB interfaces
Running the L3 functional tests with both OVSDB interfaces doubles
the run time and may discourage developers from running them
frequently during development. Since the OVSDB interfaces
are tested explicitly, I don't think the trade off is worth it
here. The L3 functional tests use OVS in a *really* trivial way
and won't catch any issues that the explicit tests won't.
Added an OVSInterfaceDriverTestCase plug functional test that runs with
both OVS interfaces to make it harder to introduce regressions.
Kevin Benton [Mon, 30 Mar 2015 18:49:40 +0000 (11:49 -0700)]
Pass correct port ID back to RPC caller
The previous response to get_device_details calls was returning
whatever the caller requested as the port_id in the response.
This was only correct in the case where the port_id was used
directly. In cases where device names were passed in, there was
no way to retrieve the full port ID.
This corrects that behavior by using the port ID from the database
and adds tests to ensure the behavior remains correct.
Brian Haley [Thu, 9 Apr 2015 21:48:40 +0000 (17:48 -0400)]
Fix intermittent ipset_manager test failure
Change ipset_manager _refresh_set() to make a copy of the list of
IPs when creating a set, instead of using a reference, else any
change to the set could update the caller's data.
Also made the IpsetManagerTestCase classes always pass maxelem and
hashsize to the parent class.
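The difference between keeping a reference and keeping a copy can be seen
in a small stand-alone example. The two refresh functions and the dict
store are hypothetical stand-ins, not the ipset_manager code itself.

```python
def refresh_by_reference(store, name, ips):
    # Stores the caller's list object itself; later changes are shared.
    store[name] = ips


def refresh_by_copy(store, name, ips):
    # Stores an independent copy, so the caller's list stays untouched.
    store[name] = list(ips)


caller_ips = ["10.0.0.1", "10.0.0.2"]
shared, copied = {}, {}
refresh_by_reference(shared, "set1", caller_ips)
refresh_by_copy(copied, "set1", caller_ips)

shared["set1"].append("10.0.0.3")  # also mutates caller_ips
copied["set1"].append("10.0.0.4")  # leaves caller_ips alone
```

After this runs, caller_ips has gained 10.0.0.3 through the shared
reference but not 10.0.0.4, which is the kind of aliasing the copy avoids.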
John Schwarz [Tue, 14 Oct 2014 11:12:35 +0000 (14:12 +0300)]
Add full-stack test
Currently, the full-stack framework has only one test which only uses
the neutron-server. This patch adds an actual test which makes sure that
once a router is created, an actual namespace is created for it. Since
this test requires 3 processes (neutron-server, l3-agent, ovs-agent),
existing full-stack code is modified to add more streamlined support for
such code.
John Schwarz [Thu, 2 Apr 2015 15:17:03 +0000 (18:17 +0300)]
create_resource should return maximum length str
Previously, get_rand_name(max_length, prefix) returned a randomized
suffix integer which was concatenated to the end of the given prefix.
Effectively, the suffix was any decimal number between 1 and
0x7fffffff, so multiple calls to the function could return strings with
different lengths. This is unexpected since passing an already
randomized name to the same function shouldn't return a different
string.
The suggested solution is to actually fill all the space needed until
the string is 'max_length' in size. Also, a check is added to
create_resource so that it only generates a new port name if the input
prefix is shorter than the maximum device name length; if the prefix is
already long enough, no random port suffix is generated.
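A rough sketch of the fixed-length behaviour described above. The function
bodies, the create_resource_name helper, and the default of 15 for the
maximum device name length are assumptions made for illustration and are
not copied from the patch.

```python
import random
import string


def get_rand_name(max_length, prefix="test"):
    """Return prefix plus random characters, always max_length long.

    Hypothetical sketch; raises if the prefix leaves no room for a suffix.
    """
    suffix_length = max_length - len(prefix)
    if suffix_length <= 0:
        raise ValueError("prefix leaves no room for a random suffix")
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(suffix_length))
    return prefix + suffix


def create_resource_name(prefix, max_device_name_length=15):
    # Only randomize when the prefix is shorter than the maximum length
    # (15 is an assumed value for this example).
    if len(prefix) < max_device_name_length:
        return get_rand_name(max_device_name_length, prefix)
    return prefix
```

Because the result always has the maximum length, feeding it back into
create_resource_name returns it unchanged.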
Sudipta Biswas [Wed, 18 Mar 2015 18:05:57 +0000 (23:35 +0530)]
Add clock sync error detection on agent registration
For the server to determine if an agent is alive or not,
it depends on the agent's clock being mostly in sync with the server
clock. The neutron-server may reject and return the request if
there's a timestamp difference between the two nodes. Currently
there's no good way to detect this condition from the agent code.
This fix improves the error handling logic by writing an
appropriate message to the neutron server's log file so that
the problem can be detected early.
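The kind of comparison involved can be sketched as below. The function
name and the threshold parameter are hypothetical; the source does not
state which limit the server uses or how the timestamps are obtained.

```python
from datetime import datetime, timedelta


def clock_skew_exceeds(agent_time, server_time, max_skew_seconds):
    """Return True when the two timestamps differ by more than the limit."""
    skew = abs((server_time - agent_time).total_seconds())
    return skew > max_skew_seconds


server_now = datetime(2015, 3, 18, 12, 0, 0)
agent_in_sync = server_now - timedelta(seconds=2)
agent_drifted = server_now + timedelta(minutes=10)
```

With an assumed limit of 60 seconds, agent_in_sync passes the check and
agent_drifted is flagged, which is the point at which a log entry would
be useful.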
This fix targets a quite rare race condition between
port creation and subnet deletion. This usually happens
during API tests that do things quickly.
A DHCP port is created after delete_subnet checks for
DHCP ports, but before it checks for IPAllocations on the subnet.
The solution is to apply retrying logic, which is really necessary
as we can't fetch new IPAllocations with the same query and within
the active transaction in mysql because of REPEATABLE READ
transaction isolation.
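A generic retry wrapper of the kind alluded to above might look like this.
The decorator, the SubnetInUse exception, and delete_subnet are all
hypothetical stand-ins; in the real code each attempt would need to run
in a fresh transaction so that newly committed IPAllocations are visible.

```python
import time


class SubnetInUse(Exception):
    """Hypothetical error raised when allocations appear mid-delete."""


def retry(times, delay=0.0, exceptions=(Exception,)):
    """Re-run the wrapped function up to 'times' attempts."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        raise  # out of attempts, surface the error
                    time.sleep(delay)
        return wrapper
    return decorator


attempts = []


@retry(times=3, exceptions=(SubnetInUse,))
def delete_subnet():
    # Simulates a delete that loses the race once, then succeeds.
    attempts.append(1)
    if len(attempts) < 2:
        raise SubnetInUse()
    return "deleted"


result = delete_subnet()
```

In this simulation the first attempt fails and the second succeeds, so the
caller never sees the transient error.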
Romil Gupta [Mon, 23 Mar 2015 15:05:41 +0000 (08:05 -0700)]
Move values for network_type to plugins.common.constants.py
It is quite confusing to have values for network type in common.constants.py
instead of in plugins.common.constants.py.
Currently, plugins/common/constants.py contains network_type constants
like VLAN, VXLAN, GRE etc., but values for network type, such as ranges,
are defined in common.constants.py. It is better to have both in the
same place.
This patch set addresses that.
Moved a few methods which are predominantly used in plugins
from common.utils.py to plugins.common.utils.py.
Removed constants which were used in neutron-fwaas from
plugins.common.constants.py: https://review.openstack.org/#/c/168709/
Gal Sagie [Mon, 6 Apr 2015 05:36:01 +0000 (08:36 +0300)]
Add OVSDB connection as a parameter to the transaction
This adds the OVSDB connection as a parameter to the transaction
in the IDL implementation.
This allows other users to use it with a different connection.
Adds DVR functional test for multi-external networks
This patch adds DVR functional test for multiple
external networks related to FIP namespace.
This test validates that FIP namespaces are created
based on the external networks associated with the
router.
Ihar Hrachyshka [Sat, 28 Feb 2015 12:48:18 +0000 (13:48 +0100)]
context: reuse base oslo.context class for to_dict()
This is needed to conform to the expectations of consumers that rely on
oslo.context behaviour (e.g. oslo.log, which relies [1] on the
user_identity field being set for context objects).