Carl Baldwin [Tue, 1 Sep 2015 16:58:22 +0000 (16:58 +0000)]
Make ip address optional to add_route and delete_route
The add_route and delete_route methods require that the ip (actually
"via" in ip route terms) be passed. Some routes don't require this.
This patch makes it optional while maintaining the position for those
callers who do pass it by position.
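A minimal sketch of the signature change, assuming a hypothetical stand-in function that only builds the `ip route` argument list (the real methods are not reproduced here). Keeping the address as the second positional parameter, with a default of None, lets existing positional callers work unchanged:

```python
def add_route(cidr, via=None, table=None):
    """Build `ip route replace` arguments (illustrative stand-in only)."""
    args = ['replace', cidr]
    if via is not None:
        # Only emit "via" when a next hop was actually supplied.
        args += ['via', via]
    if table is not None:
        args += ['table', str(table)]
    return args


# Existing caller that passes the address by position.
with_via = add_route('10.0.0.0/24', '192.168.0.1')
# New-style caller for a route that has no next hop.
without_via = add_route('10.0.0.0/24', table=16)
```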
Carl Baldwin [Fri, 28 Aug 2015 21:28:39 +0000 (21:28 +0000)]
Add list routes
This adds list routes while refactoring list_onlink_routes to share
implementation. It changes test_onlink_routes to be consistent in the
type of data that it returns with the new list_routes.
Carl Baldwin [Fri, 28 Aug 2015 21:19:40 +0000 (21:19 +0000)]
Make ip rule comparison more robust
I found that ip rules would be added multiple times in new address
scopes code because the _exists method was unable to reliably
determine if the rule already existed. This commit improves this by
more robustly canonicalizing what it reads from the ip rule command so
that like rules always compare the same.
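As an illustration of the idea, here is a sketch that assumes the mismatch comes from equivalent spellings of the same rule, such as a host address with or without "/32" or a table given as an int versus a string. The helper name is hypothetical and is not the code in this commit:

```python
import ipaddress


def canonicalize_rule(ip, table, priority):
    """Return a comparable tuple for an ip rule (hypothetical helper)."""
    # ip_network() renders a bare host address with an explicit prefix
    # length, so "10.0.0.1" and "10.0.0.1/32" normalize identically.
    network = str(ipaddress.ip_network(ip, strict=False))
    return (network, str(table), str(priority))


requested = canonicalize_rule('10.0.0.1/32', 16, 100)
read_back = canonicalize_rule('10.0.0.1', '16', '100')
```

With both sides reduced to the same tuple, an existence check becomes a plain equality test.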
Mike Bayer [Fri, 14 Aug 2015 18:44:28 +0000 (14:44 -0400)]
Add non-model index names to autogen exclude filters
The SQLAlchemy MySQL dialect generates implicit indexes
in the less-common case of an integer column within a composite
primary key where autoincrement is not set to False.
Add a rule to ignore these indexes when performing
autogenerate against a target database.
Mike Bayer [Mon, 20 Jul 2015 22:34:15 +0000 (18:34 -0400)]
Implement expand/contract autogenerate extension
Makes use of new Alembic 0.8 features to allow
altering of the "alembic revision" stream such
that operations for expand and contract are
directed into separate branches.
Delete FIP agent gateway port with external gw port
FIP agent gateway ports are associated with external
networks and a specific host.
Today FIP agent gateway ports are deleted for
every floatingip associate and disassociate. This
introduces race conditions in the port delete and also
unnecessary access to the db.
This patch will delete the FIP agent gateway port when
the last gateway port of the external network is deleted.
The child patch linked to this parent patch will remove the
FIP agent gateway port deletion that happens on floatingip
associate, disassociate and delete.
This should also cover the case where an agent was unable
to request the agent gw port delete for some reason
(e.g. the agent died).
Previous changes[1] have been merged as enablers[2] to fix bug 1274034,
but an alternative solution has been chosen, so the introduced code
can now be considered dead code.
This change removes [2], the associated tests and rootwrap filters.
Kevin Benton [Wed, 26 Aug 2015 05:03:27 +0000 (22:03 -0700)]
Stop device_owner from being set to 'network:*'
This patch adjusts the FieldCheck class in the policy engine to
allow a regex rule. It then leverages that to prevent users from
setting the device_owner field to anything that starts with
'network:' on networks which they do not own.
This policy adjustment is necessary because any ports with a
device_owner that starts with 'network:' will not have any security
group rules applied because it is assumed they are trusted network
devices (e.g. router ports, DHCP ports, etc). These security rules
include the anti-spoofing protection for DHCP, IPv6 ICMP messages,
and IP headers.
Without this policy adjustment, tenants can abuse this trust when
connected to a shared network with other tenants by setting their
VM port's device_owner field to 'network:<anything>' and hijack other
tenants' traffic via DHCP spoofing or MAC/IP spoofing.
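The effect of the regex rule can be sketched as follows. This is not the policy engine's FieldCheck code; the function and constant names are hypothetical and only show the decision being made:

```python
import re

# Matches any device_owner that starts with "network:".
NETWORK_OWNER = re.compile(r'^network:')


def may_set_device_owner(device_owner, user_owns_network):
    """Return True if the requested device_owner should be allowed."""
    if user_owns_network:
        return True
    # Non-owners may not claim to be a trusted network device.
    return NETWORK_OWNER.match(device_owner or '') is None


tenant_spoof = may_set_device_owner('network:dhcp', user_owns_network=False)
tenant_vm = may_set_device_owner('compute:nova', user_owns_network=False)
owner_dhcp = may_set_device_owner('network:dhcp', user_owns_network=True)
```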
Aman Kumar [Tue, 17 Mar 2015 10:41:54 +0000 (03:41 -0700)]
ovs agent resync may miss port remove event
In the OVS agent's rpc_loop(), the resync mechanism clears the
registered ports and rescans them, which can cause some "port removed"
events to be missed, so treat_devices_removed is never called for them.
This fix rescans the newly updated ports when the resync mechanism is
triggered, without clearing the currently registered ports.
The registered ports are cleared only after too many consecutive
resyncs, to avoid resyncing forever because of the same faulty port.
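A rough sketch of that decision, assuming a hypothetical threshold constant and helper name (neither is taken from the patch). Keeping the registered set across a normal resync is what lets a later comparison still detect removed ports:

```python
MAX_CONSECUTIVE_RESYNCS = 3  # assumed threshold, not taken from the patch


def ports_to_keep(registered_ports, consecutive_resyncs):
    """Decide which registered ports survive a resync (illustrative)."""
    if consecutive_resyncs > MAX_CONSECUTIVE_RESYNCS:
        # Too many resyncs in a row: forget everything and start over.
        return set()
    # Normal resync: keep what is registered so removals are still seen.
    return set(registered_ports)


registered = {'port-a', 'port-b'}
kept = ports_to_keep(registered, consecutive_resyncs=1)
reset = ports_to_keep(registered, consecutive_resyncs=4)

# A port that was registered but is no longer present is reported removed.
currently_present = {'port-a'}
removed = kept - currently_present
```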
Retry metadata request on connection refused error
This testcase may fail intermittently on 'Connection refused' error.
This could be due to the fact that the metadata proxy setup is not exactly
complete at the time the request is issued; in fact there is no
synchronization between the router being up and the metadata request being
issued, and this may well be the cause of the rare, intermittent failures.
In order to rule out this possibility and stabilize the test, let's retry
on connection refused only. If we continue to fail, then the next step would
be to dump the content of iptables to figure out why the error occurs.
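The retry can be sketched like this, assuming a hypothetical helper and a fake request function in place of the real test code. Only ConnectionRefusedError is retried; any other error propagates immediately:

```python
import time


def retry_on_connection_refused(func, attempts=5, delay=0.0):
    """Call func, retrying only when the connection is refused."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionRefusedError:
            if attempt == attempts - 1:
                # Out of attempts: let the caller see the failure.
                raise
            time.sleep(delay)


calls = []


def fake_metadata_request():
    # Simulates a proxy that is not ready for the first two requests.
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionRefusedError('metadata proxy not ready')
    return 'ok'


result = retry_on_connection_refused(fake_metadata_request)
```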
This patch doesn't change the behaviour of dhcp-agent
but adds the ability to use a user-defined config,
which makes dhcp-agent more flexible
and allows functional tests to run correctly
(without changing the global oslo.config CONF).
This patch deals with the lock wait timeout and the deadlock errors
observed under high concurrency (api_workers >= 4) with the pymysql
driver. It includes the following changes:
- Stop setting dirty status for resource usage when creating
reservation, as usage of reserved resources is not tracked anymore;
- Add a variable, increasing delay when retrying make_reservation
upon a DBDeadlock error in order to reduce the chances of further
collisions;
- Enable transaction retry upon DBDeadlock errors for set_quota_usage;
- Do not resync quota usage while making reservation. This puts a lot
of stress on the database and is also wasteful since resource usage
is very likely to change again once the transaction is committed;
- Use autonested_transaction to simplify logic around when the
nested flag should be used.
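The "variable, increasing delay" item can be illustrated with the sketch below. It assumes a stand-in DBDeadlock class and a hypothetical retry helper, so it is not the make_reservation code itself. The delay is random, with an upper bound that grows on each attempt:

```python
import random
import time


class DBDeadlock(Exception):
    """Stand-in for the real deadlock exception (assumption)."""


def retry_with_increasing_delay(func, max_retries=5, base_delay=0.01,
                                sleep=time.sleep):
    """Retry func on DBDeadlock, waiting longer on average each time."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except DBDeadlock:
            if attempt == max_retries:
                raise
            # Random delay in [0, base_delay * (attempt + 1)] so that
            # colliding workers are unlikely to retry in lockstep.
            sleep(random.uniform(0, base_delay * (attempt + 1)))


delays = []
attempts = []


def flaky_reservation():
    attempts.append(1)
    if len(attempts) < 3:
        raise DBDeadlock()
    return 'reserved'


# Record the delays instead of sleeping, to keep the example fast.
outcome = retry_with_increasing_delay(flaky_reservation, sleep=delays.append)
```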
Moshe Levi [Tue, 18 Aug 2015 05:48:24 +0000 (08:48 +0300)]
Qos SR-IOV: Refactor extension delete to get mac and pci slot
When calling delete we need the pci slot details to reset the VF rate. The problem
is that when the VM is deleted, libvirt returns the VF to the hypervisor and the eswitch
manager marks the pci_slot as unassigned, so we can't tell from the mac which pci slot (VF)
to reset. Also, newer libvirt versions reset the mac when deleting the VM, so then it is
not possible at all.
The solution is to keep pci slot details locally in the agent, since upon a removal event
you cannot get the pci_slot from the neutron server as you can for create/update, because
the port is already removed from neutron.
This patch pairs the mac and pci_slot for a device (VF) so when calling the extension
port delete api we can have the pci_slot and reset the VF rate.
It also adds a mapping from mac to port_id so we can pass the port_id
when calling the extension port delete api.
OVS agent: handle deleted ports on each rpc_loop iteration
Currently the rpc loop processes ports only when polling is required
(message from ovsdb monitor) or when there are port_updated notifications
from the server or security group notifications.
When there are only port_deleted notifications, port processing is not
triggered during the rpc loop.
This may lead to the agent accumulating a large number of deleted ports
and processing all of them at once during the next iteration in which
polling is required or a notification arrives from the server, which can
be quite tough for the agent: it will be unresponsive while processing
the deleted ports.
The patch makes port deletion processing more gradual.
Shweta P [Thu, 27 Aug 2015 20:53:13 +0000 (16:53 -0400)]
Final decomposition of Cisco plugin
This patch follows the previous patch (listed as a dependency) and moves
the remaining cisco db models from neutron to networking-cisco.
The patch deletes l3_model and cisco_router_plugin and their associated
config and helper files from neutron.
Fixed functional test that validates graceful ovs agent restart
The async_ping function returns a callable that returns True when all ping
futures are done. Since those futures are running for 10 secs, there was no
chance that the result of the callable was True.
The test was bailing out without calling bridge reset even a single time,
effectively leaving the feature untested in gate.
Another thing to note is that for some reason the patch fixed oslo rootwrap
errors in the test when executed locally. Since I still don't understand how
it's possible that it fixes the issue for me, I mark the bug as related only,
and will track logstash after it's merged to see whether it applies unknown
magic to gate jobs too.
rossella [Wed, 26 Aug 2015 16:06:25 +0000 (16:06 +0000)]
_bind_devices query only existing ports
If a port is deleted right before _bind_devices is called,
get_ports_attributes will throw an exception since the row
corresponding to the port doesn't exist in the OVS DB.
Avoid that by setting if_exists to True. The port will be
processed as deleted by the agent in the following iteration.
OVS agent: flush firewall rules for all deleted ports at once
In some cases, under high load, the OVS agent has to delete a large number
of ports during rpc_loop. remove_devices_filter() does iptables-save/restore
for IPv4 and IPv6 which is 4 system calls. It is very expensive and
inefficient to call it for each port individually.
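The saving can be shown with a toy model. The class below is hypothetical, and its counter merely stands in for the iptables-save/restore round trip that the real firewall driver performs:

```python
class CountingFirewall:
    """Toy firewall that counts how often rules are applied."""

    def __init__(self, ports):
        self.filtered = set(ports)
        self.applies = 0

    def remove_ports(self, port_ids):
        self.filtered -= set(port_ids)
        # One expensive apply per call, however many ports were passed.
        self.applies += 1


deleted = ['p1', 'p2', 'p3']

# One call per port: the expensive step runs three times.
per_port = CountingFirewall(deleted)
for port_id in deleted:
    per_port.remove_ports([port_id])

# One call for all deleted ports: the expensive step runs once.
batched = CountingFirewall(deleted)
batched.remove_ports(deleted)
```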
* Skip TestWSGIServerWithSSL[1] for Python 3 since it seems wsgi + ssl +
eventlet setup does not behave correctly now,
* Skip test_json_with_utf8[2] until we solve unicode/utf8 encode/decode,
* Fix some more tests to pass for py3,
* Replace print by print() in docs/docstrings.
Tu Hong Jun [Thu, 20 Aug 2015 06:08:07 +0000 (14:08 +0800)]
Changed filter field to router_id
The get_sync_interfaces query always returns all router ports
from the database, even when it is supposed to return only those
that belong to a certain router. In large-scale L3 environments
with many router ports, this slows the response time for adding
a router interface and for router L3 agent binding.
Sergey Vilgelm [Mon, 31 Aug 2015 14:06:48 +0000 (17:06 +0300)]
Fix a wrong condition for the _purge_metering_info function
Fix a case in the _purge_metering_info function
where items would never be deleted from metering_info.
Delete the metering_info dict and use the metering_infos instead.
Fix the problem with changing a dictionary during iteration.
Add the unit tests for the _purge_metering_info and
_add_metering_info functions.
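The dictionary problem mentioned above is the usual one: deleting keys while iterating over the live dict raises RuntimeError in Python 3. A minimal sketch of the safe pattern, using a hypothetical function and field names rather than the agent's actual code:

```python
def purge_stale(metering_infos, now, max_age):
    """Delete entries older than max_age (illustrative helper)."""
    # list() takes a snapshot, so deleting from the dict is safe.
    for router_id, info in list(metering_infos.items()):
        if now - info['last_update'] > max_age:
            del metering_infos[router_id]
    return metering_infos


infos = {
    'router-old': {'last_update': 10},
    'router-new': {'last_update': 95},
}
remaining = purge_stale(infos, now=100, max_age=30)
```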
Make sure service providers can be loaded correctly
This patch fixes a regression where, if neutron was loaded using
--config-dir, the service_providers option was no longer available.
We bring the logic back (removed by 61121c5f2af), alongside the ability
to load the option auto-magically. This is especially required for DevStack
deployments as of today, because neutron-server is only loaded by passing
--config-file (...)neutron.conf and --config-file (...)ml2_conf.ini
Some SRIOV drivers/devices don't support link state setting,
meaning that 'ip link' fails like this when trying to set state:
# ip l set dev p2p1 vf 6 state disable
RTNETLINK answers: Operation not supported
The sriov-nic-agent tries to do that in
SriovNicSwitchAgent.treat_device() and fails because of the non-zero
exit status from 'ip link'. It therefore never reaches the code
that updates the actual port status, so the port can hang in a BUILD
state even if binding was successful.
This patch fixes the problem of nova not being able to successfully
bind or clean up such a port. It does not fix the case where a user
manually updates admin_state_up for a port via the API; that is
subject to a separate fix.
Also, replace LOG.exception with LOG.warning for set_device_state()
as the exception would be logged by PciDeviceIPWrapper.set_vf_state()
anyway.
Terry Wilson [Tue, 16 Jun 2015 03:52:28 +0000 (22:52 -0500)]
Add support for PluginWorker and Process creation notification
There are several cases where plugin initialization should be
handled after neutron-server forks API/RPC workers. For example,
starting a client connection to an SDN controller before forking
copies the fd of the socket to the child process, but then you have
multiple processes trying to read/write the same socket connection.
It is also useful for a plugin to be able to do something in only
one process, regardless of how many workers are forked. One example
would be handling syncing from an external system to the neutron
database.
This patch does 3 things:
1) Treats rpc_workers=0 as = 1. This simplifies the code for
handling notification that forking has completed. In the
existing code, calling the notification in the Worker object's
start() method would happen twice in the case where both api
and rpc workers were 0, despite there being only one process.
An earlier patch already changed the default api_workers to be
the number of processors.
2) Adds notification of forking via the callbacks mechanism.
Plugins can subscribe to resources.PROCESS, event.AFTER_CREATE
and do any post-fork initialization that needs to be done for
every spawned process.
3) Adds core/service plugin calls to get_workers() which defaults
to returning (). Plugins that need additional processes to spawn
should just return an iterable of NeutronWorkers that will be
spawned in their own process.
Nick [Sun, 19 Jul 2015 14:41:27 +0000 (22:41 +0800)]
Implement external physical bridge mapping in linuxbridge
In some deployment scenarios, IT regulations do not allow neutron to
move the system ethernet configuration from the physical interface
to a newly created physical bridge.
End users need to use a pre-existing (user-defined) physical bridge
to connect tap devices for neutron.
Oleg Bondarev [Wed, 12 Aug 2015 17:02:01 +0000 (20:02 +0300)]
Avoid DB errors when deleting network's ports and subnets
DB errors may occur when accessing query results
after the transaction was closed (like ObjectDeletedError).
Hence it's better to avoid DB object access especially
when it's not needed.
This patch changes _delete_ports() and _delete_subnets() to accept
only ids. Indeed, there is no need to pass db objects to these methods.
Kevin Benton [Fri, 28 Aug 2015 05:12:48 +0000 (22:12 -0700)]
Better message on allowed address pairs error
Neutron was throwing a 500 error when a non-iterable was passed
into allowed address pairs. This patch just catches that and
converts it into a regular badrequest message.
Closes-Bug: #1477829
Change-Id: I3c6f55df4912c7a9480fa097988f910b254572fd
Signed-off-by: Kevin Benton <blak111@gmail.com>
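For the allowed address pairs fix above, the general pattern is to catch the TypeError raised by a non-iterable value and turn it into a client error. In this sketch the exception class and validator name are stand-ins, not Neutron's actual ones:

```python
class BadRequest(Exception):
    """Stand-in for an HTTP 400 style error (assumption)."""


def validate_allowed_address_pairs(address_pairs):
    """Return the pairs as a list, or raise BadRequest."""
    try:
        # list() raises TypeError for non-iterables such as an int.
        return list(address_pairs)
    except TypeError:
        raise BadRequest('Allowed address pairs must be a list.')


valid = validate_allowed_address_pairs([{'ip_address': '10.0.0.5'}])
```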
Assaf Muller [Sat, 29 Aug 2015 15:32:19 +0000 (11:32 -0400)]
Add info to debug test_keepalived_respawns gate failure
Current theory is that there's a bug in external_process.active,
it returns True when it shouldn't, then kill -15 on the process
pid fails because the process isn't up. Added ps -p output to
see if the process is up or not.
James Arendt [Fri, 28 Aug 2015 23:33:44 +0000 (16:33 -0700)]
Make Neutron service flavor save service_type
While the service_type exists in the resource attributes and as
a database field for a Flavor, the creation dictionary did not
pass the value so the service_type was not being persisted
in the database nor returned.
Enhanced the unit test to show the problem. The test fails on the old
code, which does not save or return the input service_type.