When the first attempt to fetch the default security group fails
and the subsequent attempt to add it also fails due to a concurrent
insertion, a later attempt to fetch the same default security group
may still fail because of the REPEATABLE READ transaction isolation
level.
In this case a RetryRequest should be raised to restart the whole
transaction so that the default group becomes visible.
The patch also removes the 'while True' logic, as it is unsafe.
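A minimal sketch of the resulting pattern, assuming hypothetical
helper names (not the actual security group mixin code):

    from oslo_db import exception as db_exc
    from sqlalchemy.orm import exc as orm_exc

    def _ensure_default_security_group(self, context, tenant_id):
        try:
            return self._get_default_sg_id(context, tenant_id)  # hypothetical helper
        except orm_exc.NoResultFound:
            try:
                return self._create_default_sg(context, tenant_id)  # hypothetical helper
            except db_exc.DBDuplicateEntry as exc:
                # Another worker inserted the default group concurrently.
                # Under REPEATABLE READ this transaction cannot see that
                # row, so ask the retry decorator to restart the whole
                # transaction instead of looping here.
                raise db_exc.RetryRequest(exc)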
Assaf Muller [Fri, 12 Jun 2015 19:07:17 +0000 (15:07 -0400)]
Add a fullstack fake VM, basic connectivity test
* Full stack tests' fake VMs are represented via a namespace,
MAC, IP address and default gateway. They're plugged into an OVS
bridge via an OVS internal port (see the sketch after this list).
As opposed to the current
fake machine class used in functional testing, this new fake
machine also creates a Neutron port via the API and sets the
IP and MAC according to it. It also sets additional attributes
on the OVS port to allow the OVS agent to do its magic.
* The functional fake machine and the full stack fake machine
should continue to share commonalities.
* The fullstack fake machine currently takes the IP address
from the port and statically assigns it to the namespace
device. Later, when I add support for the DHCP agent in full
stack testing, this assignment will look at the dhcp attribute
of the subnet and either assign the IP address via 'ip' or
invoke a DHCP client.
* Added a basic L2 connectivity test between two such machines
on the same Neutron network.
* OVSPortFixture now uses OVSInterfaceDriver to plug the port
instead of replicating a lot of the code. I had to make a
small change to _setup_arp_spoof_for_port since all OVS ports
are now created with their external-ids set.
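A rough sketch of what the plugging boils down to, expressed with
plain 'ip'/'ovs-vsctl' invocations (the fixture itself uses
Neutron's ip_lib/ovs_lib helpers; the function name, the /24
prefix and the bridge argument are illustrative):

    import subprocess

    def plug_fake_machine(namespace, port_name, bridge, port, gateway_ip):
        # 'port' is the Neutron port dict created via the API.
        subprocess.check_call(['ip', 'netns', 'add', namespace])
        subprocess.check_call([
            'ovs-vsctl', 'add-port', bridge, port_name, '--',
            'set', 'Interface', port_name, 'type=internal',
            'external-ids:iface-id=%s' % port['id'],
            'external-ids:attached-mac=%s' % port['mac_address'],
            'external-ids:iface-status=active'])
        subprocess.check_call(['ip', 'link', 'set', port_name,
                               'netns', namespace])
        ip_addr = port['fixed_ips'][0]['ip_address']
        for cmd in (['ip', 'link', 'set', port_name, 'address',
                     port['mac_address']],
                    ['ip', 'link', 'set', port_name, 'up'],
                    ['ip', 'addr', 'add', '%s/24' % ip_addr,
                     'dev', port_name],
                    ['ip', 'route', 'add', 'default', 'via', gateway_ip]):
            subprocess.check_call(['ip', 'netns', 'exec', namespace] + cmd)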
Sandhya Dasu [Mon, 17 Aug 2015 10:26:53 +0000 (06:26 -0400)]
Final decomposition of ML2 Cisco UCSM driver
The ML2 Cisco UCSM driver's entry point is being switched to the
networking-cisco vendor repo. The definition of the driver's db
file and all references to it in the neutron branch are removed.
Ann Kamyshnikova [Wed, 19 Aug 2015 11:19:11 +0000 (14:19 +0300)]
Fix query in get_reservations_for_resources
For PostgreSQL, when GROUP BY is used, every column in the SELECT
list must either be wrapped in an aggregate such as SUM(...) or
appear in the GROUP BY clause (see the sketch below).
For reference:
http://www.postgresql.org/message-id/200402271700.28133.dev@archonet.com
Closes-bug: #1486467
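A minimal SQLAlchemy sketch of the constraint the fix has to
respect, assuming an existing session and reservation/delta models
similar to Neutron's quota models (names are illustrative):

    from sqlalchemy import func

    # Every non-aggregated column in the SELECT list must also appear in
    # group_by(); PostgreSQL rejects the query otherwise, while MySQL is
    # traditionally more permissive.
    query = (session.query(ResourceDelta.resource,
                           func.sum(ResourceDelta.amount))
             .join(Reservation)
             .filter(Reservation.tenant_id == tenant_id)
             .group_by(ResourceDelta.resource))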
Miguel Angel Ajo [Tue, 18 Aug 2015 06:35:00 +0000 (08:35 +0200)]
Fix tenant access to qos policies
Fix policy.json so that, by default, tenants cannot create
policies or rules but can attach ports and networks to policies.
Please note that in the attach case access to the policy is
checked in the QoSPolicy neutron object.
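For illustration, a fragment of the kind of defaults intended in
policy.json (rule names and values here are indicative only, not a
verbatim copy of the shipped file):

    "create_policy": "rule:admin_only",
    "update_policy": "rule:admin_only",
    "delete_policy": "rule:admin_only",
    "create_policy_bandwidth_limit_rule": "rule:admin_only",
    "get_policy": "rule:regular_user",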
The reservation engine is subject to failures due to concurrency;
the switch to pymysql likely also plays a part in the observed
failures. While no gate failures have been observed so far, this
is a time bomb waiting to explode and must be addressed.
For this reason this patch acts conservatively by ensuring the
API controllers no longer use reservations. The code for
reservation management is preserved, and will be wired into the
controller again once these issues are sorted out.
The devref for neutron quotas is updated accordingly as a part
of this patch.
The patch makes the L3 agent aware of a possible SNAT role
rescheduling to or from it.
The gist is to compare the gw_port host for changes.
If the host changed and the agent is not on the target host, the
agent needs to clear the SNAT namespace if one exists. If the
agent is on the target host, it needs to create the SNAT
namespace from scratch if it does not exist (see the sketch
below).
The host field was excluded from the gw_port comparison on the
agent side as part of the HA Router feature implementation; that
code was moved to the corresponding module.
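A minimal sketch of the decision, with hypothetical helper names
standing in for the actual agent code:

    def _handle_snat_rescheduling(self, router_info, old_gw_host, new_gw_host):
        if old_gw_host == new_gw_host:
            return
        if self.conf.host == new_gw_host:
            # The SNAT role moved to this node: build the namespace from
            # scratch if it is not there yet.
            if not self._snat_namespace_exists(router_info):   # hypothetical
                self._create_snat_namespace(router_info)        # hypothetical
        else:
            # The SNAT role moved away from this node: clean up any
            # leftover namespace.
            if self._snat_namespace_exists(router_info):
                self._delete_snat_namespace(router_info)        # hypothetical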
Doug Hellmann [Fri, 14 Aug 2015 22:30:46 +0000 (22:30 +0000)]
Add logging to debug oslo.messaging failure
It looks like recent changes on oslo.messaging master conflict
with changes in neutron master in the way RPC services are
started when the rpc_workers value == 0.
This patch does a simple fix to the quota DB driver in order
to ensure its compatibility with python3 and adds the quota
enforcement unit tests to the list of those executed as a part
of the py34 test environment.
Add the concept of resource reservation in neutron.
Usage tracking logic is also updated to support reservations.
Reservations are not however available with the now deprecated
configuration-based quota driver.
The base API controller will now use reservations to perform
quota checks rather than counting resource usage and then
invoking the limit_check routine.
The limit_check routine, however, has not been removed or
deprecated as part of this patch. In order to ensure all quota
drivers expose a consistent interface, a make_reservation method
has been added to the configuration-based driver as well. This
method simply performs "old-style" limit checks by counting
resource usage and then invoking limit_check, roughly as
sketched below.
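A rough sketch of what that looks like for the configuration-based
driver (signatures are illustrative, not the exact neutron.quota
API):

    def make_reservation(self, context, tenant_id, resources, deltas, plugin):
        # No usage tracking here: emulate a reservation by counting current
        # usage and running the traditional limit check on the projected
        # totals.
        values = {}
        for resource, delta in deltas.items():
            count = resources[resource].count(context, plugin, tenant_id)
            values[resource] = count + delta
        # No reservation record is actually created; going over quota
        # raises the usual OverQuota exception.
        self.limit_check(context, tenant_id, resources, values)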
Doug Wiegley [Mon, 17 Aug 2015 15:17:46 +0000 (09:17 -0600)]
Don't fatal error during initialization for missing service providers
Sometime during the split, code was added to fixup driver paths,
which imports service providers even for plugins which are not
in use. That, combined with neutron including default service
providers for VPN and LOADBALANCER, made it quite messy to
remove VPN from the main neutron test suites.
This change stops the imports, so that if one of the services is
missing, neutron server can still start. It likely breaks the driver
path fixup, which can be fixed outside of this gate blockage.
This merge commit introduces QoS feature into Liberty release of
Neutron.
The feature is documented in: doc/source/devref/quality_of_service.rst
included with the merge patch.
It includes:
- QoS API service plugin with QoS policy and QoS bandwidth limit
(egress) rule support;
- core plugin mechanism to determine supported rule types, with its ML2
implementation;
- new agent extension manager;
- QoS agent extension with pluggable backend QoS drivers (Open vSwitch
and SR-IOV support is included).
To extend the network and port core resources with the qos_policy_id attribute,
a new ML2 extension driver (qos) was introduced that relies on the QoS
core resource extension (the idea is that eventually we'll get a core
resource extension manager that can be directly reused by core plugins).
Agent-server interaction is based on:
- get_device_details() method that is extended with qos_policy_id;
- a new push/pull mechanism that allows agents and servers to
communicate using oslo.versionedobjects based objects sent on the
wire.
The merge includes the following types of test coverage:
- unit tests;
- functional tests for OVS agent, QoS agent extension, and low level
ovs_lib changes;
- API tests to cover port/network qos_policy_id attribute and new QoS
resources.
This merge also disables qos extension API tests until the service is
enabled in master gate.
Local changes apart from conflicts:
- updated down_revision for qos migration to reflect master expand head;
- disabled qos API tests with gate_hook.sh until we have it enabled in
master gate;
- bumped oslo.versionedobjects requirement to reflect what is in
openstack/requirements' global-requirements.txt
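For illustration, a hedged sketch of how the new API described
above can be exercised over REST (the endpoint, token and port id
are placeholders):

    import requests

    BASE = 'http://controller:9696/v2.0'
    TOKEN = '<token>'      # placeholder
    PORT_ID = '<port-id>'  # placeholder
    HEADERS = {'X-Auth-Token': TOKEN, 'Content-Type': 'application/json'}

    # 1. Create a QoS policy.
    policy = requests.post(BASE + '/qos/policies',
                           json={'policy': {'name': 'bw-limited'}},
                           headers=HEADERS).json()['policy']

    # 2. Add an egress bandwidth limit rule to the policy.
    requests.post(BASE + '/qos/policies/%s/bandwidth_limit_rules' % policy['id'],
                  json={'bandwidth_limit_rule': {'max_kbps': 10000,
                                                 'max_burst_kbps': 1000}},
                  headers=HEADERS)

    # 3. Attach the policy to a port through the new qos_policy_id attribute.
    requests.put(BASE + '/ports/%s' % PORT_ID,
                 json={'port': {'qos_policy_id': policy['id']}},
                 headers=HEADERS)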
shihanzhang [Wed, 12 Aug 2015 09:12:27 +0000 (17:12 +0800)]
Rename function '_update_port_down'
The function _update_port_down is renamed to _get_agent_fdb
because it generates the fdb entries which are sent to the
related l2 agents; the old name was hard to understand.
The idea here was to remove redundant unit tests.
The approach has been that if the function being tested does not
implement any custom logic (apart from calling ovsdb), the unit test
does not help.
Refer to the bug description for more details of the specific tests
removed.
Kevin Benton [Sun, 16 Aug 2015 09:32:39 +0000 (02:32 -0700)]
Get rid of exception converter in db/api.py
The exception converter was necessary because the exceptions the
oslo db decorator looked for before were statically defined. The
retry decorator now accepts an exception_checker argument that takes
a function to call on exceptions to determine if they should be
caught.
This patch gets rid of the converter and replaces the one use
case with the new exception_checker argument (sketched below).
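A hedged sketch of the new usage (the decorator arguments shown
are illustrative; see oslo.db's wrap_db_retry for the
authoritative signature):

    from oslo_db import api as oslo_db_api
    from oslo_db import exception as db_exc

    def is_retriable(exc):
        # Arbitrary per-exception logic can live here now, instead of
        # having to convert custom exceptions into ones the decorator
        # knows about statically.
        return isinstance(exc, db_exc.DBDeadlock)

    @oslo_db_api.wrap_db_retry(max_retries=5, retry_interval=0.5,
                               inc_retry_interval=True,
                               exception_checker=is_retriable)
    def create_something(context, values):
        ...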
Kevin Benton [Fri, 31 Jul 2015 01:07:03 +0000 (18:07 -0700)]
Use a conntrack zone per port in OVS
Conntrack zones per network are not adequate because VMs
on the same host communicating with each other cross iptables
twice. If conntrack shares the same zone for both crossings,
the first one can remove the connection from the table on a RST
and then the second one marks the RST as invalid.
This patch adjusts the logic to use a conntrack zone per port
instead of per network. In order to avoid interrupting upgrades
or restarts, the initial zone map is built from the existing
iptables rules so existing port->zone mappings are maintained.
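A rough sketch of rebuilding the port->zone map from rules that
are already installed (the regex and rule format are illustrative,
not the agent's exact parsing code):

    import re

    ZONE_RULE = re.compile(
        r'--physdev-in (?P<port>\S+).* -j CT --zone (?P<zone>\d+)')

    def build_zone_map(existing_rules):
        zone_map = {}
        for rule in existing_rules:
            match = ZONE_RULE.search(rule)
            if match:
                # Preserve the zone a port was already using before the
                # restart or upgrade.
                zone_map[match.group('port')] = int(match.group('zone'))
        return zone_map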
Matthew Treinish [Fri, 14 Aug 2015 15:41:49 +0000 (11:41 -0400)]
Fix some issues around tempest in fullstack testing doc
The why section in the fullstack testing doc gives a good explanation
of the rationale behind the testing and where it fits in the testing
pyramid. However, some of the drawbacks of tempest mentioned aren't
accurate or are misleading. This commit attempts to reword that
piece to clear up any potential sources of confusion.
The difficulty in running tempest doesn't change depending on the
nature of the deployment, since tempest is an external test suite that
interacts with any deployment only through the api. The configuration
and run mechanics do not change whether your cloud has one node
or many. The real difficulty lies in setting up a multinode
deployment.
For the failure reporting, if you can't figure out why something
failed from a tempest run it's the same for any end user of the API.
It should be treated as a bug in the project if an end user
can't figure out why something failed from logs and what gets
returned by the API. But since the fullstack tests operate at a
somewhat lower level, they are not necessarily trying to catch
bugs like that. This commit attempts to reword the doc to make
that distinction clear.
Ryan Moats [Fri, 14 Aug 2015 13:25:17 +0000 (08:25 -0500)]
Add dashboard folder and graphite dashboard to doc
Create a dashboard folder to hold HTML files that provide
dashboard views into various parts of neutron. This allows
the dashboards to be "living code" rather than frozen in
amber via shortened URLs.
The first dashboard example is a simple HTML file that
shows thumbnails of graphite plots of all neutron jobs
in the check pipeline. Clicking a thumbnail brings up
the larger graphite plot page.
Change-Id: I47e7718c2aae41c8308fd331377984e47a892294
Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Kyle Mestery [Thu, 13 Aug 2015 16:33:18 +0000 (16:33 +0000)]
lieutenants: Add Neutron infra lieutenants
It's become clear we need to have a centralized contact point
(or points) for Neutron interactions with infra. Let's start out
by making that Doug and Armando for now. Note this list is
alphabetized by last name for those curious on the ordering.
DVR: do not reschedule router for down agents on compute nodes
Scheduling/unscheduling of DVR routers to/from l3 agents in 'dvr'
mode running on compute nodes is done according to DVR-serviced
ports created/deleted on those compute nodes. It doesn't make
sense to reschedule a router away from an l3 agent on a compute
node even if the agent is down - no other l3 agent can handle the
VMs running on that compute node.
Isaku Yamahata [Tue, 21 Oct 2014 02:30:32 +0000 (11:30 +0900)]
Replace internal calls of create_{network, subnet, port}
When the API controller calls the create_{network, subnet, port}
methods, it makes sure that the necessary default values for
attributes are filled in properly according to the attribute map.
However, internal calls to these methods do not follow that
convention, so when extension code misses these values,
exceptions are thrown.
This patch introduces helper functions to fix up the arguments
and replaces the direct callers of those methods (sketched
below).
Co-Authored-By: gong yong sheng <gong.yongsheng@99cloud.net>
Co-Authored-By: yalei wang <yalei.wang@intel.com>
Change-Id: Ibc6ff897a1a00665a403981a218100a698eb1c33
Closes-Bug: #1383546
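A hedged sketch of the shape of such a helper (names and locations
are illustrative, not necessarily the exact functions added by the
patch):

    from neutron.api.v2 import attributes

    def _fixup_res_dict(resource, res_dict):
        # Apply the defaults and conversions from the attribute map, just
        # like the API controller does for external requests.
        attr_info = attributes.RESOURCE_ATTRIBUTE_MAP[resource]
        attributes.fill_default_value(attr_info, res_dict)
        attributes.convert_value(attr_info, res_dict)
        return res_dict

    def create_port(core_plugin, context, port):
        port_data = _fixup_res_dict('ports', port.get('port', {}))
        return core_plugin.create_port(context, {'port': port_data})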
This patch is a cleanup to prevent future breakage by eliminating
potentially dangerous code.
l3_db and related code unnecessarily use the L2 plugin's
_get_subnet and related methods instead of get_subnet.
That is dangerous because _get_subnet returns an ORM db object
which allows the caller to update db rows directly, so the caller
of _get_subnet may unintentionally update the subnet db without
notifying the L2 plugin. In that case, the L2 plugin or an ML2
mechanism driver will be confused.
This patch replaces _get_subnet and _get_subnets_by_network with
get_subnet and get_subnets_by_network where possible, as the
illustration below shows.
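The difference, sketched with illustrative method bodies (not the
actual l3_db code):

    def _dangerous(self, context, subnet_id):
        # _get_subnet() returns the ORM object: assigning to it mutates
        # the database row directly, and the L2 plugin / ML2 mechanism
        # drivers are never told about the change.
        subnet_db = self._core_plugin._get_subnet(context, subnet_id)
        subnet_db.gateway_ip = None

    def _safe(self, context, subnet_id):
        # get_subnet() returns a plain dict; a real change has to go
        # through update_subnet(), which notifies the mechanism drivers.
        subnet = self._core_plugin.get_subnet(context, subnet_id)
        return subnet['gateway_ip']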
This patch is a cleanup to prevent future breakage by eliminating
potentially dangerous code.
l3_db unnecessarily uses the L2 plugin's _get_port method instead
of get_port.
That is dangerous because _get_port returns an ORM db object
which allows the caller to update db rows directly, so the caller
of _get_port may unintentionally update the port db without
notifying the L2 plugin. In that case, the L2 plugin or an ML2
mechanism driver will be confused.
This patch replaces _get_port with the get_port method where
possible.
Kevin Benton [Thu, 13 Aug 2015 23:58:02 +0000 (16:58 -0700)]
Break down _bind_port_if_needed in ML2
Separate the looping and retry logic in _bind_port_if_needed
from the actual binding attempts. This also replaces the
'while True' loop with a regular for loop and counter to make it
a little easier to reason about (roughly as sketched below).
A suggestion to do this came up in a code review for
I437290affd8eb87177d0626bf7935a165859cbdd because the function
was difficult to reason about.
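A generic sketch of the resulting shape (not the actual ML2 code;
names and the retry count are illustrative):

    import logging

    LOG = logging.getLogger(__name__)
    MAX_BIND_ATTEMPTS = 10

    def _bind_port_if_needed(context):
        for attempt in range(1, MAX_BIND_ATTEMPTS + 1):
            # The single-shot binding logic lives in a separate helper now.
            context, did_commit = _attempt_binding(context)  # hypothetical helper
            if did_commit:
                return context
            LOG.debug("Binding attempt %d did not commit, retrying", attempt)
        LOG.error("Failed to commit binding after %d attempts",
                  MAX_BIND_ATTEMPTS)
        return context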
Brian Haley [Thu, 13 Aug 2015 20:57:59 +0000 (16:57 -0400)]
Remove 'action' argument from _handle_fip_nat_rules()
There's only one caller of _handle_fip_nat_rules(), and they
always specify 'add_rules' as the argument, so it's not
necessary any more. Also, the interface passed must be valid
since the caller has already used it, and would have thrown
an exception before this call was made. Found during another
code review.