Carl Baldwin [Mon, 18 Nov 2013 23:32:19 +0000 (23:32 +0000)]
Simplify ip allocation/recycling to relieve db pressure
I found that multiple calls to delete_port can pile up on the
_recycle_ip operation. This patch simplifies this operation. It
reduces the _recycle_ip operation to a single row delete in the ip
allocations table and doesn't touch the availability table.
To achieve the recycling of ips in a pool, this code runs a more
complex operation of rebuilding the availability table when it is
exhausted. Only one API process will perform this more expensive
operation and others waiting for allocation will immediately benefit.
The amortized cost of this operation is much less than the cumulative
cost of running the more expensive _recycle_ip operation for every
port delete.
IP allocation behaves a bit differently with this patch. Instead of
giving out the first IP available in a pool, the entire pool will be
allocated before wrapping around and recycling ip addresses that have
been released. This is a desirable feature as it puts ip addresses in
a sort of quarantine after they are released. It is easier to
distinguish newly allocated ips from old ones.
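The scheme above can be sketched in memory as follows. This is only an
illustration under stated assumptions: the class IpPool and its attribute
names are hypothetical, and plain Python containers stand in for the
allocations and availability tables that the real patch manipulates.

```python
import ipaddress


class IpPool:
    """Hypothetical in-memory stand-in for the two database tables."""

    def __init__(self, cidr):
        self.all_ips = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self.available = list(self.all_ips)  # stands in for the availability table
        self.allocated = set()               # stands in for the allocations table

    def allocate(self):
        if not self.available:
            # Expensive path, taken only once availability is exhausted.
            self._rebuild_availability()
        if not self.available:
            raise RuntimeError("pool exhausted")
        ip = self.available.pop(0)
        self.allocated.add(ip)
        return ip

    def recycle(self, ip):
        # Cheap path: a single delete from the allocations;
        # the availability data is not touched.
        self.allocated.discard(ip)

    def _rebuild_availability(self):
        # Everything not currently allocated becomes available again.
        self.available = [ip for ip in self.all_ips if ip not in self.allocated]
```

In this sketch a released address is not handed out again until the rest
of the pool has been used, which models the quarantine behaviour
described in the message.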
Carl Baldwin [Fri, 24 Jan 2014 22:35:48 +0000 (22:35 +0000)]
Reduce severity of log messages in validation methods
I noticed this while reviewing Ic2c87174. When I read through log
files, I don't want to see errors like this that come from validating
bad user input. Info severity is more appropriate.
Stephen Ma [Wed, 29 May 2013 01:52:27 +0000 (18:52 -0700)]
L3 Agent restart causes network outage
When an L3 agent controlling multiple qrouter namespaces
restarts, it destroys all qrouter namespaces even if
some of them are still in use. As a result, network
traffic could be stopped on the VMs that use the
networks associated with these namespaces.
What is needed is for the L3 agent to preserve the qrouter
namespaces that the agent instance recognizes and to destroy
only those it does not know about.
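The selection described above amounts to a set difference. A minimal
sketch, assuming namespaces are named with a "qrouter-" prefix followed
by the router id; the function name and its arguments are hypothetical
and not taken from the patch.

```python
def namespaces_to_destroy(existing_namespaces, known_router_ids,
                          prefix="qrouter-"):
    """Return the qrouter namespaces the agent does not recognize."""
    known = {prefix + router_id for router_id in known_router_ids}
    return sorted(
        ns for ns in existing_namespaces
        if ns.startswith(prefix) and ns not in known
    )
```

Namespaces that do not carry the prefix are left alone in this sketch,
since they do not belong to the L3 agent.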
Maru Newby [Mon, 20 Jan 2014 19:28:03 +0000 (19:28 +0000)]
Minimize the cost of checking for api worker exit
A recent change to oslo allows the configuration of the interval
that ProcessLauncher waits between checks of child exit. The
default interval of 0.01s resulted in the neutron service consuming
unnecessary cpu cycles checking whether api workers had exited (5%
cpu on idle in a VM). This patch extends the interval to 1s to
minimize the cost of the checks.
Aaron Rosen [Mon, 13 Jan 2014 21:57:04 +0000 (13:57 -0800)]
Remove and recreate interface if already exists
If the dhcp-agent machine restarts, openvswitch logs the following
warning message when it comes up, for every tap interface that does not exist:
bridge|WARN|could not open network device tap2cf7dbad-9d (No such device)
Once the dhcp-agent starts, it recreates the interfaces and re-adds them to the
ovs-bridge. Unfortunately, ovs does not reinitialize the interfaces, as they
are already in ovsdb, and does not assign them an ofport number.
This situation corrects itself the next time a port is added to the
ovs-bridge, which is probably why no one has noticed this issue until now.
To correct this, we should first remove the interfaces that exist and
then re-add them.
Carl Baldwin [Fri, 17 Jan 2014 19:28:10 +0000 (19:28 +0000)]
Use an independent iptables lock per namespace
Since iptables is independent from namespace to namespace, it makes
sense to use an independent lock per namespace. This change is
aimed at improving parallel performance in the L3 agent.
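A minimal sketch of per-namespace locking, assuming in-process threads
and the standard threading module. The helper lock_for_namespace is
hypothetical; the actual patch may well rely on a different locking
primitive.

```python
import threading
from collections import defaultdict

# Guards the registry itself so two threads cannot create
# two different locks for the same namespace.
_registry_guard = threading.Lock()
_namespace_locks = defaultdict(threading.Lock)


def lock_for_namespace(namespace):
    """Return the lock dedicated to one namespace (None = root namespace)."""
    with _registry_guard:
        return _namespace_locks[namespace or ""]
```

Usage: wrap each iptables update in "with lock_for_namespace(ns):" so
that updates in different namespaces no longer wait on each other.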
In the NVP plugin, metadata processing performs several plugin operations
with the same context (and db session). The first operation might leave
persisted objects in the session instance which then conflict with objects
created in the second operation.
Roman Podoliaka [Thu, 12 Dec 2013 06:20:15 +0000 (08:20 +0200)]
Fix the migration adding a UC to agents table
The migration script mistakenly assumes that all core
plugins use the agents extension, which is not true (e.g.
plumgrid and bigswitch don't).
Apply this migration script only for the plugins that are
listed in the original migration script adding the agents
table (511471cc46b_agent_ext_model_supp.py).
Eugene Nikanorov [Mon, 13 Jan 2014 14:58:59 +0000 (18:58 +0400)]
Fix race condition in delete_port method. Fix update_port method
The port can disappear between the call to
l3plugin.prevent_l3_port_deletion(context, id)
and the port query. This case needs to be handled properly.
A non-existent port also needs to be handled properly in update_port.
Carl Baldwin [Tue, 12 Nov 2013 22:52:47 +0000 (22:52 +0000)]
Use information from the dnsmasq hosts file to call dhcp_release
Certain situations can cause the DHCP agent's local cache to get out
of sync with the leases held internally by dnsmasq. This method of
detecting when to call dhcp_release is idempotent and not dependent on
the cache. It is more robust.
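One way to picture the idea is a set difference between the entries in
the previously written hosts file and the ports that should exist now.
The sketch below assumes each hosts-file line has the form
"mac,hostname,ip"; the helper names are hypothetical, and the command
list follows the dhcp_release usage "dhcp_release <interface> <address>
<hw-address>".

```python
def parse_hosts(text):
    """Return a set of (ip, mac) pairs from hosts-file text."""
    entries = set()
    for line in text.splitlines():
        parts = line.strip().split(",")
        if len(parts) >= 3:
            entries.add((parts[2], parts[0]))
    return entries


def release_commands(interface, old_hosts_text, current_entries):
    """Build dhcp_release commands for entries no longer wanted."""
    stale = parse_hosts(old_hosts_text) - set(current_entries)
    return [["dhcp_release", interface, ip, mac] for ip, mac in sorted(stale)]
```

Because the result depends only on the file contents and the current
port set, computing it twice with the same inputs gives the same
commands, and no local cache is involved.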
Oleg Bondarev [Mon, 16 Dec 2013 09:17:48 +0000 (13:17 +0400)]
LBaaS: handle NotFound exceptions in update_status callback
The LBaaS agent may send update_status requests to the server for
objects that have already been deleted from the db. This happens when
objects are created and deleted at a high rate (as tempest API tests do).
As a result, errors and stacktraces appear in the server and agent logs.
The proposed solution is to catch NotFound exceptions and log a warning.
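A minimal sketch of the pattern, with stand-ins throughout: the NotFound
class, the dict-based store and the function names below are
hypothetical and only illustrate catching the exception and logging a
warning instead of letting a stacktrace reach the logs.

```python
import logging

LOG = logging.getLogger(__name__)


class NotFound(Exception):
    """Stand-in for the server's NotFound exception."""


def get_object(store, obj_type, obj_id):
    try:
        return store[(obj_type, obj_id)]
    except KeyError:
        raise NotFound("%s %s" % (obj_type, obj_id))


def update_status(store, obj_type, obj_id, status):
    """Update the status, tolerating objects deleted in the meantime."""
    try:
        obj = get_object(store, obj_type, obj_id)
    except NotFound:
        LOG.warning("Cannot update status: %s %s not found, "
                    "it was probably deleted concurrently",
                    obj_type, obj_id)
        return False
    obj["status"] = status
    return True
```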
Édouard Thuleau [Tue, 24 Dec 2013 10:48:46 +0000 (11:48 +0100)]
[ML2] l2-pop MD handle multi create/delete ports
If more than one port is added or removed simultaneously, the port db
entries have status BUILD or DOWN and move to ACTIVE once the agent has
finished configuring them.
The l2-pop mechanism driver uses the events of a port moving to ACTIVE
or DOWN to send fdb entries. If the port is the first or the last port
of a network on an agent, the flooding entry needs to be added or removed.
This patch fixes the method that determines how many ports are active on
an agent by adding a filter that requires the port status to be ACTIVE.
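The effect of the status filter can be sketched with plain
dictionaries. The field names "host", "network_id" and "status" and the
function name are assumptions made for illustration; the real method
queries the database.

```python
def count_active_ports(ports, agent_host, network_id):
    """Count ports of one network on one agent that are ACTIVE."""
    return sum(
        1 for port in ports
        if port["host"] == agent_host
        and port["network_id"] == network_id
        and port["status"] == "ACTIVE"
    )
```

Without the status check, ports still in BUILD or DOWN would be counted
too, so the driver could miss that a port is the first or the last
active one on the agent.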
Sylvain Afchain [Wed, 4 Dec 2013 20:01:06 +0000 (21:01 +0100)]
Dnsmasq uses all agent IPs as nameservers
Add a dhcp option that provides all agent IPs,
which are used as nameserver entries when
neutron uses multiple dhcp agents per network and
no dns nameserver is provided by the
neutron server.
Oleg Bondarev [Thu, 12 Dec 2013 08:13:22 +0000 (12:13 +0400)]
LBaaS: fix handling pending create/update members and health monitors
When the agent requests the loadbalancer logical config from the server,
the server returns only active pool members and health_monitors.
The server needs to also return members and monitors that are in pending states.
Also includes a small refactoring that moves the ACTIVE_PENDING set to a common place.
Fix pip install failure due to missing nvp.ini file
It looks like sdist does not support symlinks, therefore
letting nvp.ini point to nsx.ini is not a good solution.
Since nvp.ini is going away, leave a copy for now, but
add a warning so that users are aware of the switch,
whilst preserving full backward-compatibility.
This patch adds a new configuration variable for the timeout on
ovs-vsctl commands, and sets the default timeout to 10 seconds.
This is aimed at allowing users to tune the agents in order to avoid
timeout errors on their deployments.
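A minimal sketch of how such a timeout could be applied when building
the command line. The function name and the constant are hypothetical
and the name of the configuration variable is not shown here; the
"--timeout" option itself is a standard ovs-vsctl option.

```python
DEFAULT_VSCTL_TIMEOUT = 10  # seconds, the default stated in the message


def build_vsctl_command(args, timeout=DEFAULT_VSCTL_TIMEOUT):
    """Prefix ovs-vsctl arguments with the configured timeout."""
    return ["ovs-vsctl", "--timeout=%d" % timeout] + list(args)
```

A deployment that sees timeout errors would pass a larger value, for
example build_vsctl_command(["add-br", "br-int"], timeout=30).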
Fix race condition on ml2 delete and update port methods
Synchronize access to ports table when deleting and updating
a port. Otherwise concurrent update/delete request for the same port
may cause neutron server to throw an exception and return
'500 Internal server error' for such requests.
When the rename of quantum->neutron occurred in ee3fe4e8, it also renamed
the table creation from quantum_nvp_port_mapping to
neutron_nvp_port_mapping. This went undetected for a long time because
when neutron-server starts up it pushes down the schema for tables that
are not there, so the table would be created.
Because of this, the following migration 50e86cb2637a called
op.rename_table('neutron_nvp_port_mapping', 'neutron_nsx_port_mappings')
even though the table name in use was quantum_nvp_port_mapping. As a result,
the quantum_id->nvp_id mapping was never migrated over to the new table and
a quantum_nvp_port_mapping table was left hanging around.
In addition, the downgrade would rename the table to neutron_nvp_port_mapping
instead of quantum_nvp_port_mapping. This patch addresses these issues.
Ann Kamyshnikova [Fri, 20 Sep 2013 11:48:37 +0000 (15:48 +0400)]
Update lockutils and fixture in openstack.common
lockutils: included commits:
  8b2b0b7 Use hacking import_exceptions for gettextutils._
  6d0a6c3 Correct invalid docstrings
  12bcdb7 Remove vim header
  79e6bc6 fix lockutils.lock() to make it thread-safe
  ace5120 Add main() to lockutils that creates temp dir for locks
  537d8e2 Allow lockutils to get lock_path conf from envvar
  371fa42 Move LockFixture into a fixtures module
  d498c42 Fix to properly log when we release a semaphore
  29d387c Add LockFixture to lockutils
  3e3ac0c Modify lockutils.py due to dispose of eventlet
  90b6a65 Fix locking bug
  27d4b41 Move synchronized body to a first-class function
  15c17fb Make lock_file_prefix optional
  1a2df89 Enable H302 hacking check
fixture: created, included commits:
  45658e2 Fix violations of H302:import only modules
  12bcdb7 Remove vim header
  3970d46 Fix typos in oslo
  371fa42 Move LockFixture into a fixtures module
  f4a4855 Consolidate the use of stubs
  6111131 Make openstack.common.fixture.config Py3 compliant
  3906979 Add a fixture for dealing with config
  d332cca Add a fixture for dealing with mock patching.
  1bc3ecf Start adding reusable test fixtures.
Also tox.ini was corrected to let lockutils work in tests.
This change is needed for work on bp: db-sync-models-with-migrations
Aaron Rosen [Wed, 8 Jan 2014 21:03:29 +0000 (13:03 -0800)]
Add test to port_security to test with security_groups
This patch adds a missing testcase to the port_security tests to test
for creating a port with port_security_enabled=False and passing in
a security group.
VMware NSX: Fix db integrity error on dhcp port operations
If the dhcp port and network disappear, ensure that
the integrity constraint violation that results from
inserting the neutron/nsx port mapping to the DB does
not propagate the exception all the way through, but
instead is caught and handled correctly.
Akihiro MOTOKI [Sat, 26 Oct 2013 12:53:21 +0000 (21:53 +0900)]
Remove plugin_name_v2 and extension_manager in test_config
There are two ways to specify a core plugin and an extension manager
in the unit tests: test_config and arguments of the constructor.
Both are used and it sometimes makes it a bit difficult to debug.
This patch removes the way via test_config and makes constructor
arguments the only way to do it.
Also removes the default entries in test_config because they are
not used anywhere.
Rename nicira configuration elements to match new naming structure
- Every config item prefixed with nvp is prefixed with nsx
- deprecation qualifiers are added to preserve bw compatibility
- nicira/nvp.ini is renamed to vmware/nsx.ini
- symlink nicira/nvp.ini is created to point to vmware/nsx.ini
- UT added to verify that nvp.ini and old config items can still
be parsed correctly; bw-compat will be dropped in Juno
Eugene Nikanorov [Tue, 24 Dec 2013 11:08:22 +0000 (15:08 +0400)]
Fix race in get_network(s) in OVS plugin
Load network bindings eagerly with networks.
Otherwise a different db query could try to fetch network bindings
for already deleted networks. The issue is reproducible with
concurrent tempest network API tests.