May 21, 2013

Cloudscaling Engineering

Stacker Voices: Mark McLoughlin, Red Hat

Mark McLoughlin is a principal engineer at Red Hat who’s also the company’s OpenStack technical lead. He serves on the OpenStack Technical Committee and as an individual director on the OpenStack Foundation Board of Directors. More importantly, he’s the top committer to the Grizzly release.

We spoke at the OpenStack Summit in Portland about the often overlooked Oslo (openstack-common) project within OpenStack, which Mark leads. The Oslo project produces a set of Python libraries containing code shared by various OpenStack projects. The goal is to provide a common set of high-quality API libraries for the project, to follow a Don’t Repeat Yourself (DRY) model across projects, and to create a model for cross-project collaboration.

In the video, we discuss:

  • an overview of Oslo (openstack-common) and how it enables DRY and cross-project collaboration

  • addressing technical debt to help OpenStack move more quickly and keep up with the six-month release cycles

  • how the governance model for OpenStack provides a balance among the interests of users, operators and developers

  • brief comparison of different governance models (Gnome Foundation vs. OpenStack Foundation)

  • the technical meritocracy nature of OpenStack

 

Check out the video, below. Or, watch on YouTube.

by Randy Bias at May 21, 2013 02:30 PM

Dell TechCenter

Interview with Anne Gentle, OpenStack Technical Writer

Dell: Tell us about yourself and what you are doing? Anne Gentle: I’m Anne Gentle, I work at...

May 21, 2013 02:16 PM

Sean Dague

How an Idea becomes a Commit in OpenStack

My talk from the OpenStack Summit is now up on YouTube. In it, I walk people through the process of getting an idea into OpenStack. A big part of the explanation is what’s going on behind the scenes with code reviews and our continuous integration system.

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="360" src="http://www.youtube.com/embed/3zH6yL0js1M" width="640"></iframe>

I’m hoping it pulls away some of the mystery of the process, and provides a gentler on-ramp to everything for new contributors. I’ll probably be giving some version of this again at future events, so feedback (here or on YouTube) is appreciated.

by Sean Dague at May 21, 2013 12:13 AM

May 20, 2013

Jay Pipes

Working with the OpenStack Code Review and CI system – Chef Edition

For too long, the state of the OpenStack Chef world had been one of duplicative effort, endless forks of Chef cookbooks, and little integration with how many of the OpenStack projects choose to control source code and integration testing. Recently, however, the Chef + OpenStack community has been getting its proverbial act together. Folks from lots of companies have come together and pushed to align efforts to produce a set of well-documented, flexible, but focused Chef cookbooks that install and configure OpenStack services.

My sincere thanks go out to the individuals who have helped to make progress in the last couple weeks, the individuals on the upstream openstack.org continuous integration team, and of course, the many authors of cookbooks whose code and work is being merged together.

OK, so what’s happened?

StackForge now hosting a set of Chef cookbooks for OpenStack

Individual cookbooks for each integrated OpenStack project have been created in the StackForge GitHub organization. Each cookbook name is prefixed with cookbook-openstack- followed by the OpenStack service name (not the project code name):

Note that we have not yet created the cookbook for Heat, but that will be coming in the Havana timeframe, for sure. Also note that the Ceilometer (metering) cookbook is empty right now. We’re in the process of pulling the ceilometer recipes out of the compute cookbook into a separate cookbook.

In addition to the OpenStack project cookbooks listed above, there are three other related cookbooks:

Finally, there will be another repository called openstack-chef-repo that will contain example Chef roles, databags and documentation showing how all the OpenStack and supporting cookbooks are tied together to create an OpenStack deployment.

Code in cookbooks gated by Gerrit like any other OpenStack project

The biggest advantage of hosting all these Chef cookbooks on the StackForge GitHub repository is the easy integration with the upstream continuous integration system. The upstream CI team has built a metric crap-ton (technical term) of automation code that enabled us to quickly have Gerrit managing the patch queues and code reviews for all these cookbook repositories as well as have each repository guarded by a set of gate jobs that run linter and unit tests against the cookbooks.

The rest of this blog post explains how to use the development and continuous integration systems when working on the OpenStack Chef cookbooks housed in Stackforge.

Prepare to develop on a cookbook

OK, so you want to start working on one of the OpenStack Chef cookbooks? Great! The first thing you need to do is clone the appropriate Git repository containing the cookbook code and set up your Gerrit credentials. Here is the code to do that:

git clone git@github.com:stackforge/cookbook-openstack-$SERVICE
cd cookbook-openstack-$SERVICE
git review -s

Of course, replace $SERVICE above with one of common, compute, identity, image, block-storage, object-storage, network, metering, or dashboard. What that will do is clone the upstream Stackforge repository for the corresponding cookbook to your local machine, change directory into that clone’d repository, and set up a git remote called “gerrit” pointing to the review.openstack.org Gerrit system.

If everything was successful, you should see something like this:

jpipes@uberbox:~/gerrit-tut$ git clone git@github.com:stackforge/cookbook-openstack-common
Cloning into 'cookbook-openstack-common'...
remote: Counting objects: 506, done.
remote: Compressing objects: 100% (168/168), done.
remote: Total 506 (delta 246), reused 503 (delta 243)
Receiving objects: 100% (506/506), 81.97 KiB, done.
Resolving deltas: 100% (246/246), done.
jpipes@uberbox:~/gerrit-tut$ cd cookbook-openstack-common/
jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git review -s
Creating a git remote called "gerrit" that maps to:
	ssh://jaypipes@review.openstack.org:29418/stackforge/cookbook-openstack-common.git

Repeat the above for each cookbook you wish to clone and work on locally, or simply execute this to clone them all:

for i in common compute identity image block-storage object-storage network metering dashboard;\
do git clone git@github.com:stackforge/cookbook-openstack-$i; cd cookbook-openstack-$i; git review -s; cd ../;\
done
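Because the repositories are named after the service rather than the project code name, it is easy to mistype a name (for example, nova instead of compute). A minimal sketch of a guard for that, using only the service names listed above; the helper name `cookbook_repo_url` is hypothetical and not part of any OpenStack tooling:

```shell
#!/bin/sh
# Service names listed in this post; each maps to one StackForge cookbook repository.
SERVICES="common compute identity image block-storage object-storage network metering dashboard"

# cookbook_repo_url SERVICE
# Prints the clone URL for a known service name.
# Returns 1 (and prints a message on stderr) for anything else.
cookbook_repo_url() {
    for s in $SERVICES; do
        if [ "$s" = "$1" ]; then
            printf 'git@github.com:stackforge/cookbook-openstack-%s\n' "$1"
            return 0
        fi
    done
    printf 'unknown service: %s\n' "$1" >&2
    return 1
}

cookbook_repo_url block-storage
```

The printed URL can then be handed to git clone; an unknown name fails before any network access is attempted.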

Start to develop on a cookbook

Now that you have git clone’d the upstream cookbook repository and set up your Gerrit remote properly, you can begin coding on the cookbook. Remember, however, that you should never make changes in your local “master” branch. Always work in a local topic branch. This allows you to work on a branch of code separately from the local master branch you will use to bring in changes from other developers.

Create a new topic branch like so:

git checkout -b <TOPIC_NAME>

Here is an example of what you can expect to see:

jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git checkout -b tut-example
Switched to a new branch 'tut-example'

Once you are checked out into your topic branch, you can now add, edit, delete, and move files around as you wish. When you have made the changes you want to make, you then need to commit your changes to the working tree in source control.

IMPORTANT NOTE: If you created any new files while working in your branch, you will need to tell Git about those new files before you commit. An easy way to check whether you’ve created any new files that should be added to Git source control is to always call git status before doing your commit. git status will tell you if there are any untracked files in your working tree that you may need to add to Git:

jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ touch something_new.txt
jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git status
# On branch tut-example
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	something_new.txt
nothing added to commit but untracked files present (use "git add" to track)

As the note shows, you use the git add command to add the untracked file to source control:

jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git add something_new.txt
jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git status
# On branch tut-example
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   something_new.txt
#
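For scripted checks, the untracked-file test described above could be sketched as follows. This assumes the machine-readable output of git status --porcelain, in which untracked paths are prefixed with "?? "; the helper name `untracked_files` is hypothetical:

```shell
#!/bin/sh
# untracked_files
# Reads `git status --porcelain` output on stdin and prints only the
# untracked paths (lines that start with "?? ").
untracked_files() {
    sed -n 's/^?? //p'
}

# Sample porcelain output: one untracked file and one modified file.
printf '?? something_new.txt\n M README.md\n' | untracked_files
```

In a real working tree the input would come from `git status --porcelain | untracked_files`; empty output means there is nothing left to git add.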

If you make changes to files, they will show up in git status as changed files, as shown here:

jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ vi README.md 
jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git status
# On branch tut-example
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   something_new.txt
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   README.md
#

As you can see, I edited the README.md file, and the call to git status shows that file as modified. If you want to review the changes you made, use the git diff command:

jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git diff
diff --git a/README.md b/README.md
index 4fbbd57..e197d3b 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,8 @@ of all the settable attributes for this cookbook.
 
 Note that all attributes are in the `default["openstack"]` "namespace"
 
+TODO(jaypipes): Should we list all the attributes in the README?
+
 Libraries
 =========

If you are happy with the changes, you’re now ready to commit those changes to source control. Call git commit, like so:

git commit -a

This will open up your text editor and present you with an area to write your commit message describing the contents of your patch. Commit messages should be properly formatted and abide by the upstream conventions. Feel free to read that link, but here is a brief rundown of stuff to keep in mind:

  • Make the first line of the commit message 50 chars or less
  • Separate the first line from the rest of the commit message with a blank newline
  • Make the commit message descriptive of what the patch is and what the motivation for the patch was
  • Do NOT make the commit message into a list of the things in the patch you changed in each revision — we can already see what is contained in the patch
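The two mechanical rules in the list above (subject length and the blank second line) can be checked automatically. A minimal sketch, assuming the message is stored in a file; the function name `check_commit_msg` is hypothetical, and the other two rules still need human judgment:

```shell
#!/bin/sh
# check_commit_msg FILE
# Returns 0 when the first line is at most 50 characters and the second
# line (if any) is blank; otherwise prints the reason on stderr and returns 1.
check_commit_msg() {
    subject=$(sed -n '1p' "$1")
    second=$(sed -n '2p' "$1")
    if [ "${#subject}" -gt 50 ]; then
        echo "subject line is ${#subject} characters (limit 50)" >&2
        return 1
    fi
    if [ -n "$second" ]; then
        echo "second line must be blank" >&2
        return 1
    fi
    return 0
}

# Example: a well-formed message passes the check.
msg=$(mktemp)
printf 'Add metering cookbook skeleton\n\nExplain why the change is needed.\n' > "$msg"
check_commit_msg "$msg" && echo "commit message OK"
rm -f "$msg"
```

Git passes the path of the message file to a commit-msg hook, so a check like this could be called from one. Be careful not to overwrite a hook that is already installed in the repository.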

Save and close your editor to finalize the commit. Once successfully committed, you now need to push your changes to the Gerrit code review and patch management system on review.openstack.org. You do this using a call to git review.

When you issue a call to git review for any of the cookbooks, a patch review is created in Gerrit. Behind the scenes, the git review plugin is simply doing the call for you to git push to the Gerrit remote. You can always see what git review is doing by passing the -v flag, like so:

jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git commit -a
[tut-example 66f844a] Simple patch for tutorial -- please IGNORE
 1 file changed, 2 insertions(+)
 create mode 100644 something_new.txt
jpipes@uberbox:~/gerrit-tut/cookbook-openstack-common$ git review -v
2013-05-20 12:48:54.333823 Running: git log --color=never --oneline HEAD^1..HEAD
2013-05-20 12:48:54.337044 Running: git remote
2013-05-20 12:48:54.339673 Running: git branch -a --color=never
2013-05-20 12:48:54.342580 Running: git rev-parse --show-toplevel --git-dir
2013-05-20 12:48:54.345139 Running: git remote update gerrit
Fetching gerrit
2013-05-20 12:48:55.475616 Running: git rebase -i remotes/gerrit/master
2013-05-20 12:48:55.616412 Running: git reset --hard ORIG_HEAD
2013-05-20 12:48:55.620552 Running: git config --get color.ui
2013-05-20 12:48:55.623098 Running: git log --color=always --decorate --oneline HEAD --not remotes/gerrit/master --
2013-05-20 12:48:55.626745 Running: git branch --color=never
2013-05-20 12:48:55.629703 Running: git log HEAD^1..HEAD
Using local branch name "tut-example" for the topic of the change submitted
2013-05-20 12:48:55.634665 Running: git push gerrit HEAD:refs/publish/master/tut-example
remote: Resolving deltas: 100% (2/2)
remote: Processing changes: new: 1, done    
remote: 
remote: New Changes:
remote:   https://review.openstack.org/29797
remote: 
To ssh://jaypipes@review.openstack.org:29418/stackforge/cookbook-openstack-common.git
 * [new branch]      HEAD -> refs/publish/master/tut-example
2013-05-20 12:48:56.775939 Running: git rev-parse --show-toplevel --git-dir

As you can see in the above output, a new patch review was created at the location https://review.openstack.org/29797. You can go to that URL and see the patch review, shown here before any comments or reviews have been made on the patch.

gerrit-review-29797

Reviewing patches with Gerrit

Gerrit has three separate levels of reviews: Verify, Code-Review, and Approve.

The Verify (V) review level

The Verify level is limited to the Jenkins Gerrit user, which runs the automated tests that protect each cookbook repository’s master branch. These automated tests are known as gate tests.

When you push code to Gerrit, Jenkins automatically runs a set of tests against it. Jenkins is a continuous integration job system that the upstream OpenStack CI team has integrated with Gerrit so that projects managed by Gerrit may have a series of automated check jobs run against proposed patches. Below, you can see that the Jenkins user in Gerrit has already executed two jobs, gate-cookbook-openstack-common-chef-lint and gate-cookbook-openstack-common-chef-unit, against the proposed code changes. As expected, both jobs pass, since I haven’t actually changed anything in the code, only added a blank file and a line to the README file.

jenkins-jobs-29797

Curious about how those gate jobs are set up? Check out the github.com openstack-infra/config project. Hint: look at these two files.

If the Jenkins jobs fail, you will see Jenkins issue a -1 in the V column in the patch review. Any -1 from Jenkins as a result of a failed gate test will prevent the patch from being merged into the target branch, regardless of the reviews of any human.

The Code-Review (R) review level

Anyone who is logged in to Gerrit can review any proposed patch in Gerrit. To log in to Gerrit, click the “Sign In” link in the top right corner and log in using the Launchpad Single-Signon service. Note: This requires you to have an account on Launchpad.

Once logged in, you will see a “Review” button on each patch in the patchset. You can see this Review button in the images above. If you were the one that pushed the commit, you will also see buttons for “Abandon”, “Work in Progress”, and “Rebase Change”. The “Abandon” button simply lets you mark the patchset as abandoned and gets the patch off of the review radar. “Work in Progress” lets you mark the patchset as not ready for reviews, and “Rebase Change” is generally best left alone unless you know what you’re doing. ;)

Each file (including the commit message itself) has a corresponding link where you can view the diff of that file and add inline comments, similar to how GitHub pull requests allow inline commenting. Simply double-click directly below the line you wish to comment on, and a box for your comments will appear, as shown below:

inline-review-29797

IMPORTANT NOTE: Unlike GitHub inline commenting on pull requests, your inline comments on Gerrit reviews are NOT viewable by others until you finalize your review by clicking the “Review” button. Your comments will appear in red as “Draft” comments on the main page of the patch review, as shown below:

inline-draft-29797

To put in a review for the patch, click the “Review” button. You will see options for:

  • +1 Looks good to me, but someone else must approve
  • +0 No score
  • -1 I would prefer you didn’t merge this

If you are a core reviewer, in addition to the above three options, you will also see:

  • +2 Looks good to me (core reviewer)
  • -2 Do Not Merge

There is also a comment area for your review comments, directly above an area that shows all the inline comments you have made:

review-29797

After selecting the review +/- that matches your overall thoughts on the patch, and entering any comment, click the “Publish Comments” button, and your review will show in the comments of the patch, as shown below:

reviewed-29797

The Approve (A) review level

Members of the core review team also see a separate area in the review screen for the Approve (A) review level. This level tells Gerrit to either proceed with the merge of the patch into the target branch (+1 Approve) or to wait (0 No Score).

The general rule of thumb is that core reviewers should not hit +1 Approve until 2 or more core reviewers (can include the individual doing the +1 Approve) have added a +2 (R) to the patch in reviews. This rule is subject to the discretion of the core reviewer for trivial changes like typo fixes, etc.
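The interaction between the Verify and Code-Review levels and the rule of thumb above can be sketched as a small predicate. This is an illustration of the convention described in this post, not Gerrit's actual submit logic; the function name `ready_to_approve` is hypothetical, and treating any -2 as blocking is an assumption based on its "Do Not Merge" label:

```shell
#!/bin/sh
# ready_to_approve VERIFY_VOTE CORE_VOTE...
# VERIFY_VOTE is the Jenkins score in the V column (+1 or -1).
# The remaining arguments are Code-Review scores from core reviewers.
# Returns 0 only when Jenkins did not vote -1, nobody voted -2,
# and at least two core reviewers voted +2.
ready_to_approve() {
    verify=$1
    shift
    if [ "$verify" = "-1" ]; then
        return 1    # a failed gate job blocks the merge regardless of human reviews
    fi
    plus_two=0
    for vote in "$@"; do
        case $vote in
            +2) plus_two=$((plus_two + 1)) ;;
            -2) return 1 ;;    # assumption: "Do Not Merge" blocks approval
        esac
    done
    [ "$plus_two" -ge 2 ]
}

if ready_to_approve +1 +2 +2; then
    echo "ok to +1 Approve"
fi
```

As noted above, core reviewers may use their discretion for trivial changes, so a check like this is a guideline rather than a hard gate.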

Summary

I hope this tutorial has been a help for those new to the Gerrit and Jenkins integration used by the OpenStack upstream projects. Contributing to the Chef OpenStack cookbooks should be no different than contributing to the upstream OpenStack projects now, and additional gate tests — including full integration test runs using Vagrant or even a multi-node deployment — are on our TODO radar. Please sign up on the OpenStack Chef mailing list if you haven’t already. We look forward to your contributions!

by jaypipes at May 20, 2013 05:37 PM

eNovance

How to create specific Nova flavors for tenants

In the case of a private Cloud, we recently had to create specific Nova flavors for one tenant, and didn’t want to expose this flavor to all tenants.

 

First of all, you need to know that by default only the “admin” tenant can manage flavors, because of the default policy in Nova:

"compute_extension:flavormanage": "rule:admin_api"

 

If you want to allow all tenants to create flavors, you can remove the rule so that the policy reads:

"compute_extension:flavormanage": ""

Now we are going to create a flavor:

nova flavor-create flavor-name flavor-ID RAM-in-MB root-disk-in-GB VCPUs-number \
  --ephemeral ephemeral-disk-in-GB --swap swap-in-MB --is-public False

 

Example:

nova flavor-create enocloud-xxl 50 32 200 8 --is-public False

The next step is to associate the flavor with the tenant:

nova flavor-access-add <flavor-id> <tenant-id>

 

Example:

nova flavor-access-add 50 4f1b0b9ce3354a439db8ef10cf456d6f
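The two steps are linked by the flavor ID: the ID given to flavor-create is the one passed to flavor-access-add. A minimal sketch that prints (but does not run) both commands from a single set of arguments, so they can be reviewed first; the helper name `private_flavor_cmds` is hypothetical, and the values are the ones from the examples above:

```shell
#!/bin/sh
# private_flavor_cmds NAME ID RAM_MB DISK_GB VCPUS TENANT_ID
# Prints the flavor-create and flavor-access-add commands for a
# non-public flavor. Nothing is executed.
private_flavor_cmds() {
    printf 'nova flavor-create %s %s %s %s %s --is-public False\n' \
        "$1" "$2" "$3" "$4" "$5"
    printf 'nova flavor-access-add %s %s\n' "$2" "$6"
}

private_flavor_cmds enocloud-xxl 50 32 200 8 4f1b0b9ce3354a439db8ef10cf456d6f
```

Once the printed commands look right, they can be pasted into a shell that has admin credentials loaded.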

 

Hope that helps!

by Emilien at May 20, 2013 02:44 PM

Mirantis

OpenStack Project Technical Lead Interview Series #2: Monty Taylor, OpenStack CI(Continuous Integration) Project

Here’s the second in our series of interviews with OpenStack Project Technical Leads on the Mirantis blog. Our goal is to educate the broader tech community and help people understand how they can contribute to and benefit from OpenStack. Naturally, these are the opinions of the interviewee, not of Mirantis. We’ve edited the interview for [...]

The post OpenStack Project Technical Lead Interview Series #2: Monty Taylor, OpenStack CI(Continuous Integration) Project appeared first on Mirantis.

by Admin at May 20, 2013 07:01 AM

May 19, 2013

Loïc Dachary

Virtualizing legacy hardware in OpenStack

A five-year-old machine hosting fourteen vservers is being decommissioned. It runs Debian GNU/Linux lenny with a 2.6.26-2-vserver-686-bigmem Linux kernel. The April non-profit relies on these services (mediawiki, pad, mumble, etc.) for the benefit of its 5,000 members and many working groups. Instead of migrating each vserver individually to an OpenStack instance, it was decided that the vserver host would be copied over to an OpenStack instance.
The old hardware has 8GB of RAM, a 150GB disk and dual Xeons totaling 8 cores. The munin statistics show that no additional memory is needed, the disk is half full and an average of one core is used at all times. An 8GB RAM, 150GB disk, dual-core OpenStack instance is prepared. The instance will be booted from a 150GB volume placed on the same hardware to get maximum disk I/O speed.
After the volume is created, it is mounted from the OpenStack node and the disk of the old machine is rsync’ed to it. It is then booted after modifying a few files such as fstab. The OpenStack node is in the same rack and on the same switch as the old hardware. The IP is removed from the interface of the old hardware and bound to the OpenStack instance. Because it is running on nova-network with multi-host activated, it is bound to the interface of the OpenStack node, which can take over immediately. The public interface of the node is set as an ARP proxy to advertise the bridge where the instance is connected. The security groups of the instance are effectively disabled (by opening all protocols and ports) because a firewall is running in the instance.

Collocated hardware

The OpenStack cluster used to migrate the legacy hardware is configured to allow the collocation of instances and volumes on the same hardware. One OpenStack availability zone groups hardware located in the same rack and uses the same switch as the legacy hardware. This allows for a migration that does not involve changing the IP of the machine. If the OpenStack nodes were located in a different autonomous system, a DNS change would be necessary and require additional preparations.

Maintenance LAN connection

The primary IP address used by the legacy hardware is also used by a number of services provided by the vservers it hosts. Moving this IP address to the OpenStack instance would mean losing access to the legacy hardware, with no hope of falling back should something unexpected happen. Because both machines involved in the migration are connected to the same switch and use the same VLAN, an additional IP address is manually added to each to preserve communications:

ns1 : ip addr add 10.222.222.1/24 dev eth0
yopo : ip addr add 10.222.222.2/24 dev eth0

Preparations

In the following, desktop is any machine on which there are enough credentials to either connect to the legacy machine using ssh, run nova commands or EC2 commands targeting the OpenStack cluster, yopo is the OpenStack node, ns1 is the legacy hardware.

desktop: euca-create-volume --zone bm0008 --size 150
+----+-----------+--------------+------+-------------+-------------+
| ID |   Status  | Display Name | Size | Volume Type | Attached to |
+----+-----------+--------------+------+-------------+-------------+
| 3f | available | None         | 150  | None        |             |
+----+-----------+--------------+------+-------------+-------------+
desktop: nova volume-list | grep " $(printf "%d" 0x3f) "
| 63 | available      | None             | 150  | None        |                                      |

bm0008 is the availability zone matching the OpenStack node known as yopo. Note that euca-create-volume, which is an EC2 command, reports the volume id as a hexadecimal number, but nova volume-list shows it as a decimal number. The hexadecimal form is used to name the LV volumes of the LVM backend. A partition table is then created on the 150GB volume and configured to have a single primary partition taking all the space.
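The conversion between the two id forms can be sketched with printf alone. The helper names below are hypothetical; the zero-padded, eight-digit `volume-` naming is taken from the LV path used in the commands that follow:

```shell
#!/bin/sh
# ec2_to_nova_id HEX
# Converts the hexadecimal id reported by euca-create-volume (e.g. 3f)
# to the decimal id shown by nova volume-list.
ec2_to_nova_id() {
    printf '%d\n' "0x$1"
}

# nova_id_to_lv DECIMAL
# Builds the LV name used by the LVM backend from the decimal id.
nova_id_to_lv() {
    printf 'volume-%08x\n' "$1"
}

ec2_to_nova_id 3f    # prints 63
nova_id_to_lv 63     # prints volume-0000003f
```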

yopo: kpartx -av /dev/vg/volume-0000003f
yopo: mkfs.ext3 /dev/mapper/vg-volume--0000003f1
yopo: mount /dev/mapper/vg-volume--0000003f1 /mnt
yopo: rsync -i --exclude=/etc/fstab --exclude=70-persistent-net.rules \
 --exclude=/boot/grub \
 --exclude=/srv/backup \
  --exclude=/var/cache \
  --exclude=/var/lib/backuppc \
  --exclude=/var/tmp \
  --exclude=/proc \
  --exclude=/sys -avHS --delete --numeric-ids 10.222.222.1:/ /mnt/

The partition is formatted with ext3 instead of ext4 to avoid any issues, because the lenny installation on ns1 only uses ext3. A copy of the ns1 disk is made, excluding files that will either be replaced or contain data that is not worth replicating.

yopo: echo 'proc /proc proc defaults 0 0' > /mnt/etc/fstab
yopo: echo '/dev/vda1 / ext3 defaults,errors=remount-ro 0 1' >> /mnt/etc/fstab

The fstab is rewritten entirely to take into account the presence of a single partition ( as opposed to seven on ns1 ) and a device name starting with /dev/vd instead of /dev/hd or /dev/sd.

yopo: cp /mnt/boot/vmlinuz-2.6.26-2-vserver-686-bigmem /tmp
yopo: cp /mnt/boot/initrd.img-2.6.26-2-vserver-686-bigmem /tmp
yopo: umount /mnt
yopo: kpartx -dv /dev/vg/volume-0000003f
yopo: sed -i -e 's:kopt=.*:kopt=root=/dev/vda1:' \
 -e 's/default=.*/default=0/' \
 -e 's/groot=.*/groot=(hd0,0)/' /boot/grub/menu.lst
yopo: echo '(hd0) /dev/vda' > /mnt/boot/grub/device.map
yopo: kvm -m 1024 -drive file=/dev/mapper/vg-volume--0000003f,if=virtio,index=0 \
  -boot c -initrd /tmp/initrd.img-2.6.26-2-vserver-686-bigmem\
   -kernel /tmp/vmlinuz-2.6.26-2-vserver-686-bigmem -append 'root=/dev/vda1' \
  -net nic -net user -nographic -curses -monitor unix:/tmp/file.mon,server,nowait
curses: grub-install /dev/vda
curses: update-grub
curses: halt

Grub is installed on the disk by using kvm to actually boot the instance, with a curses-based console instead of a VGA console. The menu.lst and device.map files are edited to reflect the changes to the disk and the partition table. The kernel and initrd are copied out of the file system imported from ns1 and given as arguments to kvm, so that it boots under conditions close to those on the legacy hardware. Once the machine has booted successfully, grub-install and update-grub are called to allow kvm to boot without an external kernel. This can be verified with:

yopo: kvm -m 1024 -drive file=/dev/mapper/vg-volume--0000003f,if=virtio,index=0 \
  -boot c -net nic -net user -nographic \
  -curses -monitor unix:/tmp/file.mon,server,nowait

Routing the public IP

The legacy installation for ns1 does not obtain its IP address from DHCP, and the IP may occur in a number of configuration files. The OpenStack node is configured to add a route dedicated to this IP by adding the following to /etc/rc.local.

brctl addbr br2004
ip link set br2004 up
ip r add 88.191.240.4/32 dev br2004

The br2004 bridge is dedicated to the tenant used to run the OpenStack instance, as shown by the 2004 in the network list below:

desktop: keystone tenant-list | grep ' april '
| 7c918c873280465da3785f5699d48316 | april           | True    |
desktop: nova-manage network list | grep 7c918c873280465da3785f5699d48316
5 10.145.4.0/24 None 10.145.4.3 None None 2004 7c918c873280465da3785f5699d48316 20941588-2c35-40b3-9ecb-af87cadae446

The bridge can be created before OpenStack runs so that the public IP can be routed to it. The existing router will be used by OpenStack.
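The bridge name can be derived from the network list instead of being read off by eye. A minimal sketch, assuming the column layout of the sample line above (the 2004 in column 7, the tenant id in column 8) and the br plus number naming seen in br2004; the helper name `bridge_for_tenant` is hypothetical:

```shell
#!/bin/sh
# bridge_for_tenant TENANT_ID
# Reads `nova-manage network list` output on stdin and prints the bridge
# name for the given tenant. Assumes column 7 holds the number used in the
# bridge name and column 8 holds the tenant id, as in the sample line.
bridge_for_tenant() {
    # Concatenating "" forces a string comparison even for all-digit ids.
    awk -v tenant="$1" '($8 "") == (tenant "") { print "br" $7 }'
}

sample='5 10.145.4.0/24 None 10.145.4.3 None None 2004 7c918c873280465da3785f5699d48316 20941588-2c35-40b3-9ecb-af87cadae446'
printf '%s\n' "$sample" | bridge_for_tenant 7c918c873280465da3785f5699d48316
```

On the controller the input would come from nova-manage network list directly; if the column layout differs on another release, the field numbers need adjusting.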

Migrating

The rsync command shown above is run to update the copy, without stopping any service.

yopo: ssh 10.222.222.1 ip addr del 88.191.250.4/27 dev eth0
yopo: ssh 10.222.222.1 /etc/init.d/util-vserver stop

The rsync command is run again after stopping all vservers on ns1 and removing the IP from the interface.

yopo: umount /mnt
yopo: kpartx -dv /dev/vg/volume-0000003f
desktop: ssh controller.vm.april-int nova boot \
 --image 'CirrOS 0.3' \
 --block_device_mapping vda=63::0:0 \
 --flavor e.1-cpu.0GB-disk.8GB-ram \
 --key_name loic --availability_zone=bm0008 ns1 --poll

The partition is unmounted and the instance booted from the volume. It should recover as if a power failure happened.

yopo: ip r add 88.191.240.4/32 dev br2004

After the public IP is routed to the bridge br2004, to which the newly created instance is connected, the services should be up and communicating properly.

Set up the ARP proxy

The interface of the OpenStack node that is used for floating IPs must be configured as an arp proxy.

echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp
echo 1 > /proc/sys/net/ipv4/conf/br2004/proxy_arp

These lines are appended to /etc/rc.local so that they are run at boot time. The switch to which both machines were connected has an arp cache. It needs to be cleared so that it notices that packets must be sent to another MAC.

ip addr add 88.191.250.4/32 dev eth0
arping -U 88.191.240.4 -I eth0
ip addr del 88.191.250.4/32 dev eth0

by Loic Dachary at May 19, 2013 10:43 PM

May 18, 2013

Matthias Runge

OpenStack testing session @FLOCK

As you might have heard, we at Fedora used to hold FUDCons (Fedora Users and Developers Conferences), which have now been replaced by a conference named Flock. The first one will be held in Charleston, South Carolina, from August 9th to 12th, 2013. It is a unique chance this year for Fedora users and developers to come together, discuss new ideas, work to make those ideas a reality, and continue to promote the core values of the Fedora Community: Freedom, Friends, Features, and First.

OpenStack is somewhat complex to set up and to integrate into Linux distributions. Thus, I proposed an OpenStack testing hackfest at Flock, to test the latest build for Fedora and also to bring users and developers together in one room. It has not yet been decided whether this session will be accepted, so please stay tuned.

by mrunge at May 18, 2013 12:17 PM

May 17, 2013

OpenStack Blog

OpenStack Community Weekly Newsletter (May 10-17)

OpenStack Compute (Nova) Roadmap for Havana

The Havana design summit was held mid-April.  Since then the Nova team has been documenting the Havana roadmap and going full speed ahead on development of these features.  The list of features that developers have committed to completing for the Havana release is tracked using blueprints on Launchpad. At the time of writing, there are 74 blueprints listed that cover a wide range of development efforts. Russell Bryant, Nova Tech Lead, highlights some of them.

Stacker Voices: Thierry Carrez, OpenStack Foundation

Thierry Carrez handles release management for the OpenStack Foundation and is chair of the project’s Technical Committee. Thierry was involved with the earliest incarnations of OpenStack while at Rackspace. Cloudscaling’s team caught up with him at the OpenStack Summit in Portland to get Thierry’s insights into the release cycle, governance and his wish list for the project.

Swiftsync – A way to synchronize two swift clusters

eNovance was asked to migrate and synchronize two swift clusters in order to give a customer an easy way to handle a swift migration. For that, they started a project called swiftsync, hosted on GitHub.

Tips ‘n Tricks

Security Issues

OpenStack In The Wild

A new section of the weekly newsletter dedicated to users of OpenStack. If you want to showcase how OpenStack helps you (or you know somebody who uses OpenStack), please let us know: email, twitter, reddit or avian carrier will do.

Upcoming Events

Other News

Welcome New Developers

  • Hugh Saunders
  • Bruno Semperlotti
  • YAMAMOTO Takashi, VALinux
  • Fujioka Yuuichi, NEC

Got answers?

Ask OpenStack is the go-to destination for OpenStack users. Interesting questions waiting for answers:

The weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at May 17, 2013 09:53 PM

Rob Hirschfeld

Crowbar cuts OpenStack Grizzly (“pebbles”) branch & seeks community testing

The Crowbar team (I work for Dell) continues to drive towards “zero day” deployment readiness. Our Hadoop deployments are tracking Dell | Cloudera Hadoop-powered releases within a month and our OpenStack releases harden within three months.

During the OpenStack summit, we cut our Grizzly branch (aka “pebbles”) and switched over to the release packages. Just a reminder, we basically skipped Folsom. While we’re still tuning out issues on OpenStack Networking (OVS+GRE) setup, we’re also looking for community to start testing and tuning the Chef deployment recipes.

We’re just sprints from release; consequently, it’s time for the Crowbar/OpenStack community to come and play! You can learn Grizzly and help tune the open source Ops scripts.

While the Crowbar team has been generating a lot of noise around our Crowbar 2.0 work, we have not neglected progress on OpenStack Grizzly.  We’ve been building Grizzly deploys on the 1.x code base using pull-from-source to ensure that we’d be ready for the release. For continuity, these same cookbooks will be the foundation of our CB2 deployment.

Features of Crowbar’s OpenStack Grizzly Deployments

  • We’ve had Nova Compute, Glance Image, Keystone Identity, Horizon Dashboard, Swift Object and Tempest for a long time. Those, of course, have been updated to Grizzly.
  • Added Block Storage
    • importable Ceph Barclamp & OpenStack Block Plug-in
    • EqualLogic OpenStack Block Plug-in
  • Added Quantum OpenStack Network Barclamp
    • Uses OVS + GRE for deployment
  • 10 Gb networking configuration
  • Rabbit MQ as its own barclamp
  • Swift Object Barclamps made a lot of progress in Folsom that translates to Grizzly
    • Apache Web Service
    • Rack awareness
    • HA configuration
    • Distribution Report
  • “Under the covers” improvements for Crowbar 1.x
    • Substantial improvements in how we configure host networking
    • Numerous bug fixes and tweaks
  • Pull from Source via the Git barclamp
    • Grizzly branch was switched to use Ubuntu & SUSE packages

We’ve made substantial progress, but there are still gaps. We do not have upgrade paths from Essex or Folsom. While we’ve been adding fault-tolerance features, full automatic HA deployments are not included.

Please build your own Crowbar ISO or check our new SourceForge download site, then join the Crowbar List and IRC to collaborate with us on OpenStack (or Hadoop or Crowbar 2). Together, we will make this awesome.


by Rob H at May 17, 2013 03:56 PM

Dell TechCenter

OpenStack Rocks in Poland

Join the 1st OpenStack User Group Meetup in Szczecin (Poland) 06-06-2013, Technopark Pomerania,...

May 17, 2013 01:04 PM

May 16, 2013

Mirantis

OpenStack Grizzly Webinar Preview: Availability Zones vs. Host Aggregates

You've probably been wondering what to make of the new features that have surfaced in the OpenStack Grizzly release – the list is substantial. That's why on Thursday, May 23, I'll be presenting a live Mirantis webinar about this and many of OpenStack Grizzly’s other new capabilities. It's free – you can sign up on [...]

The post OpenStack Grizzly Webinar Preview: Availability Zones vs. Host Aggregates appeared first on Mirantis.

by Nick Chase at May 16, 2013 05:13 PM

Kashyap Chamarthy

Nested Virtualization — KVM, Intel, with VMCS Shadowing

[Previous installments on Nested Virtualization with KVM and Intel.]

This is part of some recent testing that I’ve been doing with upstream KVM (for 3.10.1). The threads linked here have initial tests benchmarking kernel compile times (with make defconfig, a default config file) in L2, and some minimal guestfish appliance start-up timings in L1.

Some details:

  • Setup information to test with VMCS (Virtual Machine Control Structure) Shadowing. In brief, VMCS Shadowing — a processor specific feature — as described upstream, can reduce the overhead of nested virtualization by reducing the number of VMExits from L1 to L0.
  • Simple scripts used to create L1 and L2.
  • Libvirt XMLs of L1, L2 guests, for reference.

The gritty details of reasons for VMExits are described in Intel architecture manuals, Volume 3b, APPENDIX 1.


by kashyapc at May 16, 2013 07:13 AM

Rob Hirschfeld

Thanks! I’m enjoying my conversation with you

I write because I love to tell stories and to think about how actions we take today will impact tomorrow.  Ultimately, everything here is about a dialog with you because you are my sounding board and my critic.  I appreciate when people engage me about posts here and extend the conversation into other dimensions.  Feel free to call me on points and question my position – that’s what this is all about.

Thank you for being a part of my blog and joining in.  I’m looking forward to hearing more from you.

During the OpenStack Summit, I got to lead and participate in some excellent presentations and panels.  While my theme for this summit was interoperability, there were many other items discussed.

I hope you enjoy them.

Did one of these topics stand out?  Is there something I missed?  Please let me know!


by Rob H at May 16, 2013 03:48 AM

May 14, 2013

Loïc Dachary

OpenStack Upstream University training

Upstream University training for OpenStack contributors includes a live session where students contribute to a Lego town. They have to comply with the coding standards imposed by the existing buildings. More than fifteen participants created an impressive city within a few hours during the session held in May 2013. The images speak for themselves. The next sessions will be in Paris in June and Portland in July.







by Loic Dachary at May 14, 2013 07:49 PM

Cloudscaling Engineering

Stacker Voices: Thierry Carrez, OpenStack Foundation

Thierry Carrez handles release management for the OpenStack Foundation and is chair of the project’s Technical Committee. Thierry was involved with the earliest incarnations of OpenStack while at Rackspace. We caught up with him at the OpenStack Summit in Portland to get Thierry’s insights into the release cycle, governance and his wish list for the project.

 

In the video, we discuss:

  • drivers behind the shift from a 3-month to a 6-month release cycle for OpenStack

  • managing the release cycle as OpenStack has grown from two to nine projects

  • the logic behind aligning the release cycle with the semi-annual Summits

  • the role of CI in improving interoperability and quality across all the projects

  • complementary roles of the board (resources, brand, trademark) and the technical committee (meritocracy of developers and code quality)

  • importance of motivating corporate contributors to invest more in long-term, strategic projects like documentation, security, QA, and test suite

 

 Check out the video, below. Or, watch on YouTube.

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="315" src="http://www.youtube.com/embed/YQTRSfY6Auw" width="560"></iframe>

by Randy Bias at May 14, 2013 02:30 PM

Ceph

Incremental Snapshots with RBD

While Ceph has a wide range of use cases, the most frequent application that we are seeing is that of block devices as a data store for public and private clouds managed by OpenStack, CloudStack, Eucalyptus, and OpenNebula. This means that we frequently get questions about things like geographic replication, backup, and disaster recovery (or some combination thereof, given the amount of overlap on these topics). While a full-featured, robust solution to geo-replication is currently being hammered out, there are a number of different approaches already being tinkered with (like Sebastien Han’s setup with DRBD or the upcoming work using RGW).

However, since one of the primary focuses in managing a cloud is the manipulation of images, the solution to disaster recovery and general backup can often be quite simplistic. Incremental snapshots can fill this, and several other, roles quite well. To that end I wanted to share a few thoughts from RBD developer Josh Durgin for those of you who may have missed his great talk at the OpenStack Developer Summit a few weeks ago.

For the purposes of disaster recovery, the idea is that you could run two simultaneous Ceph clusters in different geographic locations and instead of copying a new snapshot each time, you could simply generate and transfer a delta. The incantation would look something like this:

rbd export-diff --from-snap snap1 pool/image@snap2 pool_image_snap1_to_snap2.diff

This creates a simple binary file that stores the following information:

  • original snapshot name (if applicable)
  • end snapshot name
  • size of the image at ending snapshot
  • the diff between snapshots

The format of this file can be seen in the RBD doc.

After exporting a diff you could either simply back up the file somewhere offsite or import the diff on top of the existing image on a remote Ceph cluster.

rbd import-diff /path/to/diff backup_image

This will write the contents of the differential to the backup image and create a snapshot with the same name as the original ending snapshot. It will fail and do nothing if a snapshot with this name already exists. Since overwriting the same data is idempotent, it’s safe to have an import-diff interrupted in the middle.

These commands can work with stdin and stdout as well, so you could do something like:

rbd export-diff --from-snap snap1 pool/image@snap2 - | ssh user@second_cluster rbd import-diff - pool2/image

You can see which extents changed (in plain text, json, or xml) via:

rbd diff --from-snap snap1 pool/image@snap2 --format plain

There are a couple of limitations in the current implementation, however.

  1. There’s no guarantee you’re importing a diff onto an image in the right state (i.e. the same image at the same snapshot as the diff was exported from).
  2. There’s no way to inspect the diff files to see what snapshots they refer to, so you’d have to depend on the filename containing that information.
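Since the second limitation leaves the filename as the only record of which snapshots a diff covers, it can help to generate the name and the command together. What follows is only a minimal sketch: the helper names are hypothetical, and the naming convention simply mirrors the pool_image_snap1_to_snap2.diff example above rather than anything rbd enforces. It uses only the export-diff arguments already shown in this post.

```python
def diff_filename(pool, image, from_snap, to_snap):
    """Name a diff file after the snapshots it spans (our own convention)."""
    return f"{pool}_{image}_{from_snap}_to_{to_snap}.diff"


def export_diff_argv(pool, image, from_snap, to_snap):
    """Build the argument list for the export-diff invocation shown above."""
    return [
        "rbd", "export-diff",
        "--from-snap", from_snap,
        f"{pool}/{image}@{to_snap}",
        diff_filename(pool, image, from_snap, to_snap),
    ]


def export_chain(pool, image, snaps):
    """One export command per consecutive pair of snapshots, oldest first."""
    return [
        export_diff_argv(pool, image, older, newer)
        for older, newer in zip(snaps, snaps[1:])
    ]
```

Each returned list could be handed to subprocess.run on a host with access to the cluster. Note that this convention becomes ambiguous if pool, image or snapshot names themselves contain underscores.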

While the implementation is still relatively simple, you can see how this could be quite useful in managing not only cloud images, but any of your Ceph block devices. This functionality hit the streets with the recent ‘cuttlefish’ stable release, but if you have questions or enhancement requests please let us know.

To learn more about some of the new things coming in future versions of Ceph you can check out the current published roadmap of work Inktank is planning on contributing. Also if you missed the virtual Ceph Developer Summit, the videos have been posted for review. In the meantime, if you have questions, comments, or anything for the good of the cause feel free to stop by our irc channel or drop a note to one of the mailing lists.

scuttlemonkey out

by scuttlemonkey at May 14, 2013 02:02 PM

eNovance

Swiftsync – A way to synchronize two swift clusters

We were faced with the challenge of migrating and synchronizing two swift clusters in order to give a customer an easy way to handle a swift migration.

 

For that we have created a project called swiftsync, hosted on GitHub: https://github.com/enovance/swiftsync

The swiftsync project provides two binaries:

 

  • swsync (The synchronizer)
  • swfiller (A swift filler)

 

swfiller is a tool designed to ease the process of filling a swift cluster. This tool can be useful for testing the swsync capabilities on a test platform before using it on a production platform.

 

The second tool, swsync, is the synchronization tool. It migrates all swift content from an origin server to a destination server using only the API of both proxy servers. The content that will be synchronized is the following:

  • account and account metadata
  • container and container metadata
  • object and object metadata

 

swsync will take into account data (account metadata, containers/objects, containers/objects metadata) that is already stored on the destination cluster in order to speed up and optimize data copying. Indeed, only data that is more recent on the origin will be synchronized. The first run of swsync will migrate all data from origin to destination, but subsequent runs will only migrate modified data. The process is as follows:

  • will synchronize account metadata if it has changed on the origin
  • will delete a container on the destination if it no longer exists on the origin
  • will create a container on the destination if it does not exist
  • will synchronize the destination container metadata if it differs from the origin container
  • will remove an object from the destination container if it no longer exists in the origin container
  • will synchronize an object and its metadata if the last-modified header is more recent on the origin
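The object-level part of that process can be illustrated with a small sketch. This is not swsync's actual code: the function name and the input shape (one mapping of object name to last-modified timestamp per container) are assumptions made for illustration, and a real run would obtain those listings through the proxy servers' API.

```python
def plan_container_sync(origin, destination):
    """Compare two {object_name: last_modified} mappings for one container.

    Returns (to_copy, to_delete) as sorted lists of object names:
    - to_copy: objects missing on the destination, or newer on the origin
    - to_delete: objects on the destination that no longer exist on the origin
    """
    to_copy = sorted(
        name
        for name, modified in origin.items()
        if name not in destination or modified > destination[name]
    )
    to_delete = sorted(name for name in destination if name not in origin)
    return to_copy, to_delete


if __name__ == "__main__":
    origin = {"a": 10.0, "b": 20.0, "c": 5.0}
    destination = {"a": 10.0, "b": 15.0, "d": 1.0}
    print(plan_container_sync(origin, destination))
```

Because the plan is recomputed from the current state of both sides, running it again after an interrupted pass simply picks up whatever is still out of date.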

 

swsync has been designed to be run again and again rather than to ensure that the first pass goes well: if, for example, there is a network failure, swsync will just skip the affected item and hope to handle it on the next run. So the tool can, for instance, be launched by a cron job to perform a differential synchronization each night.

 

swsync needs to use a user that owns the ResellerAdmin role. This role lets the user perform all kinds of API operations on all swift accounts, so swsync is able to explore all origin and destination accounts to evaluate which data it needs to synchronize.

 

Both tools come with unit and functional tests. Unit tests are managed by tox, as in all OpenStack projects.

An important thing to mention is that the synchronization tool is currently a work in progress and has not really been tested on a cluster that holds a huge amount of data.

 

Have a look at the project README file for further information about the project.

by Fabien at May 14, 2013 07:40 AM

May 13, 2013

Daniel P. Berrangé

A new (configurable) cgroups layout for libvirt with QEMU, KVM & LXC

Several years ago I wrote a bit about libvirt and cgroups in Fedora 12. Since that time, much has changed, and we’ve learnt a lot about the use of cgroups, not all of it good.

Perhaps the biggest change has been the arrival of systemd, which has brought cgroups to the attention of a much wider audience. One of the biggest positive impacts of systemd on cgroups has been a formalization of how to integrate with cgroups as an application developer. Libvirt of course follows these cgroups guidelines, has had input into their definition & continues to work with the systemd community to improve them.

One of the things we’ve learnt the hard way is that the kernel implementation of control groups is not without cost, and the way applications use cgroups can have a direct impact on the performance of the system. The kernel developers have done a great deal of work to improve the performance and scalability of cgroups but there will always be a cost to their usage which application developers need to be aware of. In broad terms, the performance impact is related to the number of cgroups directories created and particularly to their depth.

To cut a long story short, it became clear that the directory hierarchy layout libvirt used with cgroups was seriously sub-optimal, or even outright harmful. Thus in libvirt 1.0.5, we introduced some radical changes to the layout created.

Historically libvirt would create a cgroup directory for each virtual machine or container, at a path $LOCATION-OF-LIBVIRTD/libvirt/$DRIVER-NAME/$VMNAME. For example, if libvirtd was placed in /system/libvirtd.service, then a QEMU guest named “web1” would live at /system/libvirtd.service/libvirt/qemu/web1. That’s 5 levels deep already, which is not good.

As of libvirt 1.0.5, libvirt will create a cgroup directory for each virtual machine or container, at a path /machine/$VMNAME.libvirt-$DRIVER-NAME. First, notice how this is now completely disassociated from the location of libvirtd itself. This allows the administrator greater flexibility in controlling resources for virtual machines independently of system services. Second, notice that the directory hierarchy is only 2 levels deep by default, so a QEMU guest named “web1” would live at /machine/web1.libvirt-qemu.

The final important change is that the location of virtual machine / container can now be configured on a per-guest basis in the XML configuration, to override the default of /machine. So if the guest config says

  <resource>
    <partition>/virtualmachines/production</partition>
  </resource>

then libvirt will create the guest cgroup directory /virtualmachines.partition/production.partition/web1.libvirt-qemu. Notice that there will always be a .partition suffix on these user defined directories. Only the default top level directories /machine, /system and /user will be without a suffix. The suffix ensures that user defined directories can never clash with anything the kernel will create. The systemd PaxControlGroups will be updated with this & a few escaping rules soon.
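The naming rules described above can be summed up in a short sketch. This is not libvirt code: the function name is hypothetical, and it implements only what this post states (the .partition suffix on user-defined directories, no suffix on the default top-level /machine, /system and /user, and the $VMNAME.libvirt-$DRIVER-NAME leaf). The escaping rules mentioned above are not handled.

```python
# Default top-level directories that, per the text above, carry no suffix.
DEFAULT_TOP_LEVEL = {"machine", "system", "user"}


def guest_cgroup_path(vm_name, driver, partition="/machine"):
    """Return the cgroup directory for a guest placed in the given partition."""
    parts = [part for part in partition.split("/") if part]
    dirs = []
    for index, part in enumerate(parts):
        if index == 0 and part in DEFAULT_TOP_LEVEL:
            dirs.append(part)
        else:
            # User-defined directories always get the .partition suffix.
            dirs.append(part + ".partition")
    dirs.append(f"{vm_name}.libvirt-{driver}")
    return "/" + "/".join(dirs)


if __name__ == "__main__":
    print(guest_cgroup_path("web1", "qemu"))
    print(guest_cgroup_path("web1", "qemu", "/virtualmachines/production"))
```

With the default partition this yields the two-level path from the earlier example, and with /virtualmachines/production it yields the three-level path shown just above.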

There is still more we intend to do with cgroups in libvirt, in particular adding APIs for creating & managing these partitions for grouping VMs, so you don’t need to go to a tool outside libvirt to create the directories.

One final thing, libvirt now has a bit of documentation about its cgroups usage which will serve as the base for future documentation in this area.

by Daniel Berrange at May 13, 2013 08:59 PM

Russell Bryant

OpenStack Compute (Nova) Roadmap for Havana

The Havana design summit was held mid-April.  Since then we have been documenting the Havana roadmap and going full speed ahead on development of these features.  The list of features that developers have committed to completing for the Havana release is tracked using blueprints on Launchpad. At the time of writing, we have 74 blueprints listed that cover a wide range of development efforts.  Here are some highlights in no particular order:

Database Handling

Vish Ishaya made a change at the very beginning of the development cycle that will allow us to backport database migrations to the Grizzly release if needed. This is needed in case we need to backport a bug fix that requires a migration.

Dan Smith and Chris Behrens are working on a unified object model. One of the things that has been in the way of rolling upgrades of a Nova deployment is that the code and the database schema are very tightly coupled. The primary goal of this effort is to decouple these things. This effort is bringing in some other improvements, as well, including better object serialization handling for rpc, as well as object versioning.

Boris Pavlovic continues to do a lot of cleanup of database support in Nova.  He’s adding tests (and more tests), adding unique constraints, improving session handling, and improving archiving.

Chris Behrens has been working on a native MySQL database driver that performs much better than the SQLAlchemy driver for use in large scale deployments.

Mike Wilson is working on supporting read-only database slaves. This will allow distributing some queries to other database servers to help scaling in large scale deployments.

Bare Metal

The Grizzly release of Nova included the bare metal provisioning driver. Interest in this functionality has been rapidly increasing. Devananda van der Veen proposed that the bare metal provisioning code be split out into a new project called Ironic. The new project was approved for incubation by the OpenStack Technical Committee last week. Once this has been completed, there will be a driver in Nova that talks to the Ironic API. The Ironic API will present some additional functionality that doesn’t make sense to present in the Compute API in Nova.

Prior to the focus shift to Ironic, some new features were added to the bare metal driver. USC-ISI added support for Tilera and Devananda added a feature that allows you to request a specific bare metal node when provisioning a server.

Version 3 (v3) of the Compute API

The Havana release will include a new revision of the compute REST API in Nova. This effort is being led by Christopher Yeoh, with help from others. The v3 API will include a new framework for implementing extensions, extension versioning, and a whole bunch of cleanup.

Networking

The OpenStack community has been maintaining two network stacks for some time. Nova includes the nova-network service. Meanwhile, the OpenStack Networking project has been developed from scratch to support much more than nova-network does. Nova currently supports both. OpenStack Networking is expected to reach and surpass feature parity with nova-network in the Havana cycle. As a result, it’s time to deprecate nova-network. Vish Ishaya (from the Nova side) and Gary Kotton (from the OpenStack Networking side) have agreed to take on the challenging task of figuring out how to migrate existing deployments using nova-network to an updated environment that includes OpenStack Networking.

Scheduling

The Havana roadmap includes a mixed bag of scheduler features.

Andrew Laski is going to make the changes required so that the scheduler becomes exclusively a resource that gets queried. Currently, when starting an instance, the request is handed off to the scheduler, which then hands it off to the compute node that is selected. This change will make it so proxying through nova-scheduler is no longer done. This will mean that every operation that uses the scheduler will interact with it the same way, as opposed to some operations querying and others proxying.

Phil Day will be adding an API extension that allows you to discover which scheduler hints are supported.  Phil is also looking at adding a way to allocate an entire host to a single tenant.

Inbar Shapira is looking at allowing multiple scheduling policies to be in effect at the same time.  This will allow you to have different sets of scheduler filters activated depending on some type of criteria (perhaps the requested availability zone).

Rerngvit Yanggratoke is implementing support for weighting scheduling decisions based on the CPU utilization of existing instances on a host.

Migrations

Nova includes support for different types of migrations. We have cold migrations (migrate) and live migrations (live-migrate). We also have resize and evacuate, which are closely related functions. The code paths for all of these features have evolved separately. It turns out that we can rework all of these things to share a lot of code. While we’re at it, we are restructuring the way these operations work to be primarily driven by the nova-conductor service.  This will allow the tasks to be tracked in a single place, as opposed to the flow of control being passed around between compute nodes. Having compute nodes tell each other what to do is also a very bad thing from a security perspective. These efforts are well underway. Tiago Rodrigues de Mello is working on moving cold migrations to nova-conductor and John Garbutt is working on moving live migrations. All of this is tracked under the parent blueprint for unified migrations.

And More!

This post doesn’t include every feature on the roadmap. You can find that here. I fully expect that more will be added to this list as Havana progresses. We don’t always know what features are being worked on in advance. If you have another feature you would like to propose, let’s talk about it on the openstack-dev list!


by russellbryant at May 13, 2013 07:02 PM

May 11, 2013

Loïc Dachary

Disaster recovery on host failure in OpenStack

The host bm0002.the.re becomes unavailable because of a partial disk failure on an Essex-based OpenStack cluster using LVM-based volumes and multi-host nova-network. The host had daily backups using rsync / and each LV was copied and compressed. Although the disk is failing badly, the host is not down and some reads can still be done. The nova services are shut down, the host is disabled using nova-manage, and an attempt is made to recover from the partially damaged disks and LVs when that leads to better results than reverting to yesterday’s backup.

restoring an instance from backup

The host is marked as unavailable

nova-manage service disable --host=bm0002.the.re --service=nova-compute
nova-manage service disable --host=bm0002.the.re --service=nova-network
nova-manage service disable --host=bm0002.the.re --service=nova-volume

and shows as such when listed

# nova-manage service list --host=bm0002.the.re
Binary           Host    Zone Status     State Updated_At
nova-compute     bm0002.the.re  bm0002  disabled   XXX   2013-05-11 09:18:25
nova-network     bm0002.the.re  bm0002  disabled   XXX   2013-05-11 09:18:30
nova-volume      bm0002.the.re  bm0002  disabled   XXX   2013-05-11 09:18:33

It can be removed completely later by modifying the mysql database directly. The april-ci instance was running on bm0002.the.re:

# nova list --name april-ci
+--------------------------------------+----------+---------+--------------------------------------+
|                  ID                  |   Name   |  Status |               Networks               |
+--------------------------------------+----------+---------+--------------------------------------+
| 4e8a8126-b27d-4c9e-abeb-4dc574c54254 | april-ci | SHUTOFF | novanetwork=10.145.9.5, 176.31.18.26 |
+--------------------------------------+----------+---------+--------------------------------------+

It is artificially moved to a host that is enabled:

mysql -e "update instances set host = 'bm0001.the.re', availability_zone = 'bm0001' where hostname = 'april-ci'" nova

and deleted

nova delete april-ci

Assuming the content of the failed host was backed up entirely (i.e. rsync /), the april-ci disk is located using the id shown above in the output of nova list

# grep 4dc574c54254 /var/lib/nova/instances/*/*.xml
/var/lib/nova/instances/instance-000001de/libvirt.xml:    <uuid>4e8a8126-b27d-4c9e-abeb-4dc574c54254</uuid>

and the corresponding disk is turned into a minimal file system

chroot /backup/bm0002.the.re
mount -t proc none /proc
qemu-nbd --port 20000 /var/lib/nova/instances/instance-000001de/disk &
nbd-client localhost 20000 /dev/nbd0
pv /dev/nbd0 > april-ci.april-ci.img
fsck -fy $(pwd)/april-ci.april-ci.img
resize2fs -M april-ci.april-ci.img
exit

and uploaded to glance, using the same kernel and initrd, as shown with nova image-show original-image-of-april-ci

glance add name="april-ci-2013-05-11" disk_format=ami container_format=ami \
 kernel_id=2e714ea3-45e5-4bb8-ab5d-92bfff64ad28 \
 ramdisk_id=6458acca-24ef-4568-bb2b-e52322a5a11c < /backup/bm0002.the.re/april-ci.april-ci.img

it is then rebooted using the same flavor

nova boot --image 'april-ci-2013-05-11' \
  --flavor e.1-cpu.10GB-disk.1GB-ram \
  --key_name loic --availability_zone=bm0001 --poll april-ci
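The grep lookup used earlier in this section to find the instance directory could also be scripted. The following is only a sketch under stated assumptions: the function name is hypothetical, and it does a plain substring search of every *.xml file one level below the instances directory, just as the grep does, rather than parsing the libvirt XML.

```python
import glob
import os


def find_instance_dirs(instances_root, uuid):
    """Return the instance directories whose XML files mention the given uuid."""
    matches = []
    pattern = os.path.join(instances_root, "*", "*.xml")
    for path in sorted(glob.glob(pattern)):
        # Plain text search, like grep; unreadable bytes are replaced, not fatal.
        with open(path, encoding="utf-8", errors="replace") as handle:
            if uuid in handle.read():
                directory = os.path.dirname(path)
                if directory not in matches:
                    matches.append(directory)
    return matches
```

Run against the backup tree (for example /backup/bm0002.the.re/var/lib/nova/instances), it returns the directory that holds the disk file to feed to qemu-nbd.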

recovering from a partially damaged logical volume

A 30GB volume contains bad blocks toward the end ( after 26GB ) but it was not full. A fsck is run on a copy of the disk to check how much the recovery process would lose. It turns out to be less than a hundred files in a non-critical area. A new disk of the same size is allocated on another machine with

# euca-create-volume --zone bm0001 --size 30
VOLUME  vol-0000005b    30      bm0001  creating        2013-05-11T11:22:19.889Z

and the content of the damaged volume is copied over, until it fails with an I/O error.

ssh -A root@bm0001.the.re
ssh bm0002.the.re pv /dev/nova-volumes/volume-00000143 | \
 pv > /dev/nova-volumes/volume-0000005b

and it is repaired

fsck -fy /dev/nova-volumes/volume-0000005b

The volume residing on the failed host is removed directly from the database

mysql -e "update volumes set deleted = 1 where id = 30" nova

recovering from a partially damaged instance disk

An instance disk has a few failed blocks and may be recovered if the others are copied over. Because rsync is more resilient to I/O errors than dd or pv, it is used to recover as much as possible with:

# ssh -A root@bm0002.the.re
# rsync --inplace --progress /var/lib/nova/instances/instance-00000089/disk root@bm0001.the.re:/backup/bm0002.the.re/var/lib/nova/instances/instance-00000089/disk
  1843396608 100%    8.41MB/s    0:03:28 (xfer#1, to-check=0/1)
rsync: read errors mapping "/mnt/var/lib/nova/instances/instance-00000089/disk": Input/output error (5)
WARNING: disk failed verification -- update retained (will try again).
disk
  1843396608 100%   37.37MB/s    0:00:47 (xfer#2, to-check=0/1)
rsync: read errors mapping "/var/lib/nova/instances/instance-00000089/disk": Input/output error (5)
ERROR: disk failed verification -- update retained.
sent 1843836447 bytes  received 858892 bytes  7000741.32 bytes/sec
total size is 1843396608  speedup is 1.00
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1070) [sender=3.0.9]

It is then turned into a file using nbd as shown above and checked for errors:

# fsck -fy $(pwd)/openstack.jenkins.img
fsck from util-linux 2.20.1
e2fsck 1.42.5 (29-Jul-2012)
/openstack.jenkins.img: recovering journal
Clearing orphaned inode 117551 (uid=0, gid=0, mode=0100644, size=0)
Clearing orphaned inode 9764 (uid=0, gid=0, mode=0100644, size=1393052)
Clearing orphaned inode 9765 (uid=0, gid=0, mode=0100644, size=302040)
Clearing orphaned inode 7050 (uid=105, gid=109, mode=0100644, size=0)
Clearing orphaned inode 8841 (uid=0, gid=0, mode=0100644, size=81800)
Clearing orphaned inode 10235 (uid=0, gid=0, mode=0100644, size=253328)
Clearing orphaned inode 10240 (uid=0, gid=0, mode=0100644, size=180624)
Clearing orphaned inode 8840 (uid=0, gid=0, mode=0100644, size=874608)
Clearing orphaned inode 6469 (uid=0, gid=0, mode=0100755, size=1245180)
Clearing orphaned inode 10739 (uid=0, gid=0, mode=0100644, size=18192)
Clearing orphaned inode 10927 (uid=0, gid=0, mode=0100644, size=19908)
Clearing orphaned inode 10754 (uid=0, gid=0, mode=0100644, size=100820)
Clearing orphaned inode 10738 (uid=0, gid=0, mode=0100644, size=11468)
Clearing orphaned inode 10926 (uid=0, gid=0, mode=0100644, size=31568)
Clearing orphaned inode 10956 (uid=0, gid=0, mode=0100644, size=18780)
Clearing orphaned inode 10958 (uid=0, gid=0, mode=0100644, size=22312)
Clearing orphaned inode 10723 (uid=0, gid=0, mode=0100644, size=13976)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (2299561, counted=2283092).
Fix? yes
Free inodes count wrong (538192, counted=534536).
Fix? yes
/openstack.jenkins.img: ***** FILE SYSTEM WAS MODIFIED *****
/openstack.jenkins.img: 52984/587520 files (0.3% non-contiguous), 338348/2621440 blocks

If the loss is preferable to reverting to yesterday's backup, the instance is rebooted using this copy.
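When rsync is not an option, the same idea of salvaging the readable blocks can be approximated by hand. This is a different technique from the rsync run above, shown only as a sketch: the function name is hypothetical, it works on regular files (not block devices, whose size is not reported by stat), and it fills every block it cannot read in full with zeros, so the result still needs the fsck pass described above.

```python
import os


def copy_skipping_bad_blocks(src_path, dst_path, block_size=1024 * 1024):
    """Copy a file block by block, zero-filling blocks that fail to read.

    Returns the list of byte offsets of the blocks that could not be read
    in full.
    """
    bad_offsets = []
    # buffering=0 so that each read maps to one attempt on the failing disk.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        for offset in range(0, size, block_size):
            length = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(length)
            except OSError:
                data = b""
            if len(data) < length:
                # Failed or short read: record it and pad with zeros.
                bad_offsets.append(offset)
                data = data.ljust(length, b"\0")
            dst.write(data)
    return bad_offsets
```

A smaller block_size loses less data around each bad sector at the cost of more read attempts.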

by Loic Dachary at May 11, 2013 02:09 PM

May 10, 2013

OpenStack Blog

OpenStack Community Weekly Newsletter (May 3 – 10)

OpenStack 2013.1.1 released

The 2013.1.1 release is the latest in the series of stable releases. These releases are bugfix updates to Grizzly and are intended to be relatively risk free with no intentional regressions or API changes. A total of 85 bugs have been fixed in this release.

OpenStack Grizzly documentation released

We have released a version of the OpenStack official documentation for grizzly and it is now available at http://docs.openstack.org/grizzly. We continue to update docs through our continuous publishing process so feedback is always welcome. If you have questions about how OpenStack documentation is maintained or would like to get involved, see http://wiki.openstack.org/Documentation/HowTo. We had nearly 80 contributors to the documentation for the Grizzly release. Thanks to everyone who helped create and maintain accurate information for OpenStack.

Guidelines for answering questions on Ask

It’s time we start collecting guidelines for the moderators so we keep having a very informative tool, with consistently good questions and answers. This wiki page hosts the draft of the Guidelines for Moderators https://wiki.openstack.org/wiki/Community/AskModerators. Comments welcome before they’re moved to their natural home on Ask OpenStack.

Discussions at Breakfast with the Board – OpenStack April 2013 Summit

The summary written by the OpenStack Foundation’s Board of Directors of the things discussed during the Breakfast with the Board in Portland, ranging from marketing to wifi, transparency to elections to what makes a contribution. A must read.

Use existing RBD images and put it into Glance

What if Glance, the OpenStack Image Service, was capable of converting images within its store, say from QCOW2 image to a RAW? Waiting for this capability to be added, Sébastien Han plays with a scenario where you have a KVM cluster backed by a Ceph Cluster and your CTO wants you to migrate the whole environment to OpenStack. Science fiction in action.

Security Issues

OpenStack In The Wild

A new section of the weekly newsletter dedicated to users of OpenStack. If you want to showcase how OpenStack helps you (or you know somebody that uses OpenStack) please let us know: email, twitter, reddit or avian carrier will do. More content from Portland Summit:

Upcoming Events

Other News

Welcome New Developers

  • Matt Wagner, Red Hat

Got answers?

Ask OpenStack is the go-to destination for OpenStack users. Interesting questions waiting for answers:

The weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

 

by Stefano Maffulli at May 10, 2013 11:44 PM

Julien Danjou

Rant about Github pull-request workflow implementation

One of my recent innocent tweets about Gerrit vs Github triggered many more responses and much more debate than I expected. I realize that it might be worth explaining a bit what I meant, in a text longer than 140 characters.


The problems with Github pull-requests

I always looked at Github from a distance, mainly because I always disliked their pull-request handling, and saw no value in the social hype it brings. Why?

One click away isn't one click effort

The pull-request system looks like an incredibly easy way to contribute to any project hosted on Github. You're one click away from sending your contribution to any software. But the problem is that any worthy contribution isn't the effort of a single click.

A proper and useful contribution to a piece of software is never right the first time. There's a dance you will have to perform: a slow-paced back and forth between you and the software maintainer or team. You'll have to dance it until your contribution is correct and can be merged.

But as a software maintainer, you'll find that not everybody is going to follow you in this choreography, and you'll end up with pull-requests that never get finished unless you wrap things up yourself. So in most cases the gain from a pull-request isn't really bigger than that from a good bug report.

This is where the social argument for Github no longer holds. As soon as you're talking about projects bigger than a color theme for your favorite text editor, this feature is overrated.

Contribution rework

If you're lucky enough, your contributor will play along and follow you on this pull-request review process. You'll make suggestions, he will listen and will modify his pull-request to follow your advice.

At this point, there are two techniques he can use to please you.

Technique #1: the Topping

Github's pull-requests invite you to send an entire branch, eclipsing the fact that it is composed of several commits. The problem is that a lot of one-click-away contributors have not mastered Git and/or make no effort to build a logical patchset, and nothing warns them that their branch history is wrong. So they tend to change stuff around, commit, make a mistake, commit, fix this mistake, commit, etc. This kind of branch reflects the whole thought process of your contributor, and is a real pain to review. To the point that I quite often give up.

<figure> <figcaption> A typical case: 3 commits to build a 4 lines long file. </figcaption> </figure>

Without Github, the old method that all software used, and that many projects still use (e.g. Linux), is to send a patch set over e-mail (or any other medium like Gerrit). This method has one positive effect: it forces the contributor to acknowledge the list of commits he is going to publish. So, if the contributor has fixup commits in his history, they are going to be seen as first-class citizens. And nobody is going to want to see that, neither your contributor nor the software maintainers. Therefore, such a system tends to push contributors to write atomic, logical and self-contained patchsets that can be more easily reviewed.

Technique #2: the History Rewriter

This is actually the good way to build a working and logical patchset using Git: rewriting history and amending problematic patches using the famous git rebase --interactive trick.

The problem is that if your contributor does this and then re-pushes the branch composing your pull-request to Github, you will both lose the previous review, each time. There's no history of the different versions of the branch that have been pushed.

In an old alternative system like e-mail, no information is lost when reworked patches are resent, obviously. This is far better because it makes it easier to follow the iterative discussions that the patch triggered.

Of course, it would be possible for Github to enhance this and fix it, but currently it doesn't handle this use case correctly.

<figure> <figcaption> Exercise for the doubtful readers: good luck finding all revisions of my patch in the pull-request #157 of Hy.</figcaption> </figure>

A quick look at OpenStack workflow

It's no secret that I've been contributing to OpenStack daily for the last 18 months. The more I contribute, the more I like the contribution workflow and process. It's already described well and at length on the wiki, so I'll summarize here my view and what I like about it.

Gerrit

To send a contribution to any OpenStack project, you need to go through Gerrit. This is actually way simpler than doing a pull-request on Github: all you have to do is make your commit(s) and type git review. That's it. Your patch will be pushed to Gerrit and available for review.

Gerrit allows other developers to review your patch, add comments anywhere on it, and score your patch up or down. You can build any rule you want for the score needed for a patch to be merged; OpenStack requires a positive score from two core developers before the patch is merged.
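As a rough illustration of the kind of merge rule described above, here is a small Python model. It is neither Gerrit's implementation nor its configuration syntax: the `Review` type, the `is_approved` function and the choice to ignore negative votes are assumptions made purely for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    reviewer: str
    score: int      # positive scores the patch up, negative scores it down
    is_core: bool   # True when the reviewer is a core developer


def is_approved(reviews, required_core_approvals=2):
    """Return True once enough distinct core developers scored the patch up."""
    approving_cores = {r.reviewer for r in reviews if r.is_core and r.score > 0}
    return len(approving_cores) >= required_core_approvals


reviews = [
    Review("alice", +2, is_core=True),
    Review("bob", +1, is_core=False),
]
print(is_approved(reviews))  # only one core developer has approved so far
reviews.append(Review("carol", +2, is_core=True))
print(is_approved(reviews))  # two distinct core developers have approved
```

Counting distinct reviewers rather than votes means that a core developer voting twice does not get a patch merged alone.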

Until a patch is validated, it can be reworked and amended locally using Git, and then resent using git review again. That simple. The history and the different versions of the patches remain available, with all the comments. Gerrit doesn't lose any historical information about your workflow.

Finally, you'll notice that this is actually the same kind of workflow projects use when they work with patches sent over e-mail. Gerrit just builds a single place to group and keep track of patchsets, which is really handy. It's also much easier for people to send a patch using a command line tool than with their MUA or git send-email.

Gate testing

Testing is mandatory for any patch sent to OpenStack. Unit tests and functional tests are run for each version of each patch of the patchset sent. And until your patch passes all tests, it will be impossible to merge it. Yes, this implies that all patches in a patchset must be working commits and can be merged on their own, without the entire patchset going in! With such a restriction, it's impossible to have "fixup commits" merged in your project and pollute the history and the testability of the project.

Once your patch is validated by core developers, the system checks that there are no merge conflicts. If there are none, tests are re-run, since the branch you are pushing to might have changed, and if everything's fine, the patch is merged.

This is an incredible force for the quality of the project. It implies that no broken patchset can ever sneak in, and that the project always passes all tests.
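The sequence above can be sketched as a small Python function. This only illustrates the order of the checks; it is not the actual OpenStack gate. The names `gate`, `merges_cleanly` and `run_tests` are hypothetical, and the two callables stand in for a real merge attempt and a real test run.

```python
def gate(patch, target_branch, merges_cleanly, run_tests):
    """Decide whether an already approved patch may be merged.

    merges_cleanly(patch, branch) -> bool and run_tests(patch, branch) -> bool
    are supplied by the caller (hypothetical hooks for this sketch).
    """
    if not merges_cleanly(patch, target_branch):
        return "rejected: merge conflict"
    # The target branch may have moved since the patch was first tested,
    # so the tests run again against its current state.
    if not run_tests(patch, target_branch):
        return "rejected: tests failed"
    return "merged"


result = gate(
    patch="fix-typo",
    target_branch="master",
    merges_cleanly=lambda patch, branch: True,
    run_tests=lambda patch, branch: True,
)
print(result)
```

Passing the checks in as callables keeps the sketch independent of any particular CI system.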

Conclusion: accessibility vs code review

In the end, I think that one of the keys to any development process, code review, is not well covered by Github's pull-request system. It is, along with history integrity, damaged by the goal of making contributions easier.

Choosing between these features is probably a trade-off that each project should make carefully, considering what its core goals are and the quality of code it wants to reach.

I tend to find that OpenStack found one of the best trade-offs available by using Gerrit and plugging test automation via Jenkins into it, and I would probably recommend it for any project that takes code reviews and testing seriously.

by Julien Danjou at May 10, 2013 05:55 PM

May 08, 2013

OpenStack Blog

Discussions at Breakfast with the Board – OpenStack April 2013 Summit

It is an exciting time to be part of the OpenStack community.  It was a great conference with lots of momentum around OpenStack.  The speed and growth of the community is amazing.

Tuesday morning during the Summit, we continued the tradition of Breakfast With The Board (BwtB).  We wish to thank all who participated.  As board members we very much appreciated your comments of support, feedback and ideas.  We heard many positive and encouraging comments  and participated in many lively discussions.

Through this writeup we would like to share what we heard.   There was a wide variety of topics discussed, including:

Summit Design Session Growing Pains
Despite a variety of changes tested and introduced over the past Summits, accommodating all who wish to participate in the Summit design sessions continues to cause growing pains. The design sessions are “intended to be small, focused developer working sessions where the roadmap is set by active contributors on the project.” With such a description it is easy to see why so many business people, users, and developers want to participate or listen in. Yet the fear is that a large, varied audience will decrease session output.

Many ideas were voiced at the BwtB as to how to address the issue, including room moderators, attendee prioritization, seating arrangements and session segregation.

“Scotty we need more power er WIFI”
While the conference survey will determine which items are most important to improve for the next Summit, one of the most vocal suggestions at the BwtB was the never-ending need for more WIFI. We techies live on WIFI.

Who the heck is…
Leading the list of reasons to attend the Summit is simply to meet the people we work with on IRC and other community channels. A simple suggestion was made that we add IRC nicks in a nice big font to the front and back of the conference badges. 50% of the time you see the back of someone’s badge and don’t know who they are.

Traveling to the Fall Summit
For those traveling to the fall Summit from North America, concerns over prohibitive travel costs were raised. Determining a Summit location involves many different factors, cost of travel being one. A Summit location affects attendance, whether it be in Portland or Hong Kong, and balancing that cost can be tricky. The planning committee’s investigations concluded that attendees will find that travel rates are not as prohibitive as feared if they do some research and book early.

Driving Priorities
Several discussions revolved around how customer priorities are injected into each project’s focus and features. Typically, in a corporate development model, such interests are captured and fed into development through Product Owners (PO) or Product Managers (PM). How does this map to the OpenStack model? The question easily generalizes: how does this map to the open source world?

At the BwtB, several of the discussions converged on the notion of contribution, whether in the form of code, leadership or voice. One company simply cannot pretend to make choices for resources in another company. At most you can find other resources from a company that shares the problem you are helping describe and therefore solve.

A familiar saying in the open source world is “scratch the itch”, and it has driven open source developers for years. If you find a need that nothing out there can meet, write a solution yourself or, better yet, voice the need to help find those who share it and write a solution together, contributing in ways that leverage your experience and expertise or providing support to those who can contribute for you.

Big Vision
Also discussed at the BwtB was the notion of having the TC play more of a role across the various projects, for things like security and API versioning, aligning and setting direction across the groups. Participants cited the need for the TC (or someone at least) to give more cross-project consideration to:

  • API compatibility and consistency
  • architectural consistency
  • security
  • input from users to guide our path

Align the Doc
Some attendees voiced concerns that the documentation lags behind the implementations. So how do we make the OpenStack documentation more up-to-date and improve its quality and timeliness? That was the question raised by attendees at the BwtB. Suggestions included a requirement for documentation changes to be checked in concurrently with the code, rather than just setting a flag that the docs might be affected.

What comprises OpenStack?
A couple of tables discussed the current progress around the current Core/Integrated/Incubated framework with input on moving forward; people seem to prefer the kernel/drivers analogy. There is confusion regarding the new approach to core-integrated-incubation,  what the differences are, who gets seats on the TC, etc.  Early and continued discussions at Technical Committee and Board on this are important for next phases of the effort. It is important  to ensure that the TC and Board sign off on all steps with formal statements by the foundation when we arrive at any and all conclusions.

Interoperability
There is a lot of interop interest. Folks at the BwtB seemed to be mostly happy with the refstack approach. They voiced opinions about whether API-based interop or same-codebase interop is appropriate in various projects and for having verification teams for plug-ins.

Marketing OpenStack
Where does OpenStack as the data center operating system model go? How to support that? Marketing discussions ranged across several of the tables, including a conversation at one table on how best to explain OpenStack to CIOs and IT Directors. Participants in the discussion felt that the video overviews available on the OpenStack website as well as the user stories presented at the Summit Keynotes were of great help.

Others wondered why open source competitors generate FUD against one another, seemingly without a sense of who their real competition should be (proprietary software).

Still others voiced concern over perceptions around OpenStack. These perceptions include complexity, talent shortages, security gaps and the belief that it takes too many people to run OpenStack. Such perceptions create a barrier to adoption.

Transparency
At its February meeting, the Board launched a committee to improve transparency and foster collaboration between the foundation members and members of the board, technical committee, user committee and other committees. Members of the committee took the opportunity to discuss, at their tables, the committee’s ideas and efforts. Everyone is all for transparency, and seeking a balance between transparency and compromising the strategic position of the project was accepted as an important consideration. The ombudsman and staggered release were seen as valid solutions.

Attendees also voiced the importance of direct participation within project processes. It is important that the TC and board listen to what the projects have to say.

Elections
The Board at its February meeting also launched an effort to improve the Individual member election process.  The board members engaged in this effort took the opportunity to gather feedback at the BwtB on the ideas and efforts underway.  Many were pleased that a schedule for implementation of changes is being set and were pleased with the efforts so far.

Conclusion
As you can see, there was a wide range of topics raised and discussed, each of which could be worthy of a full writeup on its own. As a board we appreciate the input. We will delve into the issues further and will use this input to guide the prioritization of our efforts. So again, thank you for your participation. We look forward to the next BwtB at the fall Summit.

Regards,

OpenStack Board of Directors

by OpenStack Board at May 08, 2013 04:09 PM

May 07, 2013

Sébastien Han

Use existing RBD images and put it into Glance

The title of the article is not that explicit; actually, I had trouble finding a proper one, so let me clarify a bit. Here is the context: I was wondering if Glance was capable of converting images within its store. The quick answer is no, but I think such a feature is worth implementing. Glance could, for instance, convert a QCOW2 image to RAW format. Usually, if you already have an image within, let’s say, a Ceph cluster (RBD), you have to download the image (since you probably don’t have the source image file anymore), then manually convert it with qemu-img (QCOW2 –> RAW) and finally import it into Glance. Enough talk about this; I’ll address it in a future article.

For now let’s stick to the first matter. Imagine that you have a KVM cluster backed by a Ceph cluster and your CTO wants you to migrate the whole environment to OpenStack because it’s trendy (joking, OpenStack just rocks!). You’re not going to back up all your images and then build a new cluster or something like that; you want OpenStack (Glance) to be aware of your Ceph cluster. Generally speaking, you just have to connect Glance to one of your image pools. After this, the only thing left to do is to create the images in Glance (it’s more a matter of registering the image IDs and metadata than of creating new images). No worries, here’s the explanation. Longest introduction ever.

In this article, I’m assuming that Glance is already connected to Ceph and to the proper RBD pool. Before starting anything, please understand that within the current Grizzly stable branch, the RBD backend is not implemented. That’s funny because we don’t need that much to implement it. The bug report is on launchpad and the proposed feature is under review on Gerrit.

However if you want to enable the fix now:

  • Go to line 278 of /opt/stack/glance/glance/api/v1/images.py
  • Then simply edit the line like so:
<figure class="code"><figcaption></figcaption>
for scheme in ['s3', 'swift', 'http', 'rbd']:
</figure>

Let’s test this!

Get the image size from the rbd client:

<figure class="code"><figcaption></figcaption>
$ rbd -p images info ubuntu-raw
rbd image 'ubuntu-raw':
size 2048 MB in 512 objects
order 22 (4096 KB objects)
block_name_prefix: rb.0.3ded.2eb141f2
format: 1
</figure>

Finally, create/register the new image:

<figure class="code"><figcaption></figcaption>
$ glance image-create --size 2147483648 --name ubuntu-rbd-hack --store rbd --disk-format raw --container-format ovf --location rbd://ubuntu-raw
+------------------+--------------------------------------+
| Property         | Value                                |
+------------------+--------------------------------------+
| checksum         | None                                 |
| container_format | ovf                                  |
| created_at       | 2013-05-06T15:29:26                  |
| deleted          | False                                |
| deleted_at       | None                                 |
| disk_format      | raw                                  |
| id               | 0d47c421-b079-44ff-bcc5-ee711d500512 |
| is_public        | False                                |
| min_disk         | 0                                    |
| min_ram          | 0                                    |
| name             | ubuntu-rbd-hack                      |
| owner            | 19292b3b597b4ecc9a41103cc312a42f     |
| protected        | False                                |
| size             | 2147483648                           |
| status           | active                               |
| updated_at       | 2013-05-06T15:29:26                  |
+------------------+--------------------------------------+
</figure>
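The value given to --size above is the size reported by rbd, expressed in bytes. Here is a quick check of the arithmetic in Python, assuming (as the 4096 KB object size for order 22 suggests) that the MB and KB printed by rbd are binary units:

```python
size_mb = 2048            # "size 2048 MB" from the rbd info output
objects = 512             # "in 512 objects"
object_size_kb = 4096     # "order 22 (4096 KB objects)", i.e. 2**22 bytes

size_bytes = size_mb * 1024 * 1024
print(size_bytes)  # 2147483648, the value passed to --size

# Cross-check: 512 objects of 2**22 bytes add up to the same total.
assert objects * object_size_kb * 1024 == size_bytes
assert 2 ** 22 == object_size_kb * 1024
```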

A note about the URI passed to the --location option: there are two ways to build it. It can be:

  • rbd://<fsid>/<pool>/<image>/<snapshot>
  • rbd://<image-name> ; Glance will figure out the pool since you put it into the Glance configuration.

It has either 1 or 4 fields.
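To make the two accepted forms concrete, here is a minimal Python sketch of how such a location could be split. It is not Glance’s own code: `parse_rbd_location` is a hypothetical helper, and the fsid, pool and snapshot values in the second call are placeholders invented for the example.

```python
def parse_rbd_location(uri):
    """Split an rbd:// location into its fields.

    Returns a dict with 'fsid', 'pool', 'image' and 'snapshot'; fields that
    the short form does not carry are set to None.
    """
    prefix = "rbd://"
    if not uri.startswith(prefix):
        raise ValueError("not an rbd location: %r" % uri)
    pieces = uri[len(prefix):].split("/")
    if any(piece == "" for piece in pieces):
        raise ValueError("empty field in %r" % uri)
    if len(pieces) == 1:
        # Short form: only the image name, the pool comes from configuration.
        fsid, pool, image, snapshot = None, None, pieces[0], None
    elif len(pieces) == 4:
        fsid, pool, image, snapshot = pieces
    else:
        raise ValueError("expected 1 or 4 fields, got %d" % len(pieces))
    return {"fsid": fsid, "pool": pool, "image": image, "snapshot": snapshot}


print(parse_rbd_location("rbd://ubuntu-raw"))
print(parse_rbd_location("rbd://my-fsid/images/ubuntu-raw/snap"))
```

Any other number of fields raises a ValueError, which matches the "1 or 4" rule stated above.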


Of course the example was only with one image but the method will definitely work for a whole Ceph cluster with tons of images!

May 07, 2013 03:15 PM

May 06, 2013

Sébastien Han

HA from DevOops side: OpenStack summit video

It’s a bit late, but I’m happy to share our talk (with Emilien Macchi) at the OpenStack Summit. OK, that was my first talk, so please be gentle. In the meantime, here is the video. In this presentation, we shared 2 HA reference architectures for OpenStack.

<iframe src="http://www.youtube.com/embed/HJaLvid0X9U"></iframe>

Slides are available on Slideshare

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="446" marginheight="0" marginwidth="0" scrolling="no" src="http://www.slideshare.net/slideshow/embed_code/19054325" style="border:1px solid #CCC;border-width:1px 1px 0;margin-bottom:5px" width="595"></iframe>


See you in Hong Kong! Cheers!

May 06, 2013 02:52 PM

May 03, 2013

OpenStack Blog

OpenStack Community Weekly Newsletter (Apr 25 – May 3)

Introducing Murano: Bringing Windows Environments to OpenStack

In response to growing demand for deploying and running Windows based applications on OpenStack cloud, the team at Mirantis started Murano: a native OpenStack component that enables fast provisioning and operation of Windows Environments on demand.

Who Wrote OpenStack Grizzly Docs?

Sneaking a peek at the numbers for documentation along with the code should show us pointers about docs keeping up with code. Anne Gentle dives into the documentation with data and insights.

“I” release cycle naming

The next OpenStack summit will happen in Hong Kong. That creates a pretty challenging naming problem, since there is no word starting with “i” in classic transliteration of Chinese words. So the Technical Committee is willing to bend the rules a little to extend the range of candidates… Feel free to add suggestions to the list on the wiki.

Stacker Voices: Monty Taylor, HP

Cloudscaling Engineering talked with Monty Taylor of HP (reaching rockstar status also with a wired.com profile this week) at the OpenStack Summit in Portland. Monty leads the CI (continuous integration) project for OpenStack. In that role, he and his group have built testing systems that have made it possible for the OpenStack project to scale from a few dozen contributors for the Bexar release to more than 700 developers now pushing hundreds of patches daily to OpenStack. Watch the video on YouTube.

A little tracing hack

Timothy Daly at Yahoo! added metrics and tracing for OpenStack and released tomograph: a tool to see what and how OpenStack is doing behind the curtains.

Contribute to OpenStack Activity Board

We’ve released the complete documentation for OpenStack Insights, with binaries and source code downloadable from Sourceforge while the OpenStack Dash tools are the vanilla MetricsGrimoire set hosted on github. The code is free as in freedom so you’re welcome to play with it.

How to run pylint with few false positives

Testing your python code can get complex, and with pylint you will see false positives, meaning it will flag as bugs some lines that are actually correct. lintstack is designed to address this problem: reduce false positives from pylint as much as possible without sacrificing accuracy. Yun Mao describes how lintstack works.

Report from Previous Events

Tips and Tricks

OpenStack In The Wild

A new section of the weekly newsletter dedicated to users of OpenStack. If you want to showcase how OpenStack helps you (or you know somebody that uses OpenStack) please let us know: email, twitter, reddit or avian carrier will do. Meanwhile watch the keynotes from Portland Summit:

Upcoming Events

Other News

Welcome New Developers

  • Shawn Hartsock, VMware
  • David Martin, redbrick health

Got answers?

Ask OpenStack is the go-to destination for OpenStack users. Interesting questions waiting for answers:

The weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at May 03, 2013 11:48 PM

Rob Hirschfeld

We need better Gold Member criteria to help building OpenStack culture

During the last OpenStack board meeting, we started a dialog that will be continued over the rest of the year. It concerns how/if we should apply our criteria to measure the contributions of companies that are applying to become Gold members.

I believe that we should see many contribution “footprints” for companies in Foundation leadership positions.  These footprints do not have to be code in github: there are many visible ways to contribute to OpenStack including internal installs, delivered product, community meetups, open source support around code, service to the community through speaking and sponsoring and, of course, code too.

At this point in the OpenStack evolution, there is so much going on that it is easy to leave footprints because there are so many ways to engage.  Footprints are tangible evidence of community leadership and the currency of collaboration.  OpenStack thrives because we are committed to working together, being transparent in our actions and providing service to the project beyond our own needs.

I believe the OpenStack Foundation’s new gold members are great additions to our growing community; however, we need to be increasingly deliberate in accepting new Gold members to make sure that they have a history of demonstrating a culture of open source leadership and contribution.

These applications deserve careful consideration for several reasons:

  1. there are a limited number of gold level positions (16 of the 24 are now occupied)
  2. there is no practical way to remove a gold member (but only 8 are elected to the board)
  3. there is a perception (by the applicants) that they gain additional credibility through gold membership
  4. gold and platinum members are the leaders of our community so everyone will model their behavior

It is important to remember that there is no limit or barrier (beyond $) to joining at the corporate sponsors level. So, being a gold member means that companies are seeking a broader leadership role in the project.

Over the next months, Simon Anderson (committee chair, Dreamhost) will be leading me and several other board members in an effort to refine our Gold member review criteria. I’ll post my own list shortly and I’m interested in hearing from you about what type of “footprints” should be considered in this process.


by Rob H at May 03, 2013 03:00 PM

May 02, 2013

Alessio Ababilov

OpenStack Summit April 2013: a First Experience

In April 2013, I visited the OpenStack summit in Portland, Oregon. It was my first OpenStack summit and my first trip to the USA, so it was more than just impressive.

The summit was devoted to Havana being the next OpenStack release. As you can see, it would have been well-grounded to locate the summit somewhere in Florida, but Oregon has been ok, too, – just look at that forest in the vicinity of Pacific Ocean.

Somebody may find Portland a grey and distressing city, but I liked it. Portland is a green and tidy city very different from my native Kharkov.

A poster reminded how many companies take part in OpenStack.

You can imagine how exciting it was to see the people who have reviewed my code in all the years I have taken part in OpenStack; people who have written tons of commits themselves and whose everyday job is taking care of OpenStack and making global architecture decisions: Devin Carlen, Joe Heck, Doug Hellmann, Thierry Carrez, Brian Waldon, Mark Washenberger, Dean Troyer, and many others – it’s difficult to mention them all.

OpenStack reports could be roughly divided in two parts. Reports of the first kind were full of prominent slides – you could occasionally notice a unicorn.

But I preferred other reports -

…and I was not alone -

One of the main OpenStack-oriented directions in Grid Dynamics (my company) is Altai - a private cloud for developers. It consists of several parts:

  • OpenStack services (nova, glance, keystone);
  • Focus – the Altai Dashboard;
  • Nova DNS – a service giving DNS names to instances;
  • Nova Billing – a lightweight service for billing instances and images;
  • notifier daemons sending mail about instance state changes and hardware nodes state.

Altai services use a dedicated client library to make OpenStack API calls. This library contains the common parts of novaclient and keystoneclient with several enhancements. I wrote it in June 2012, but for a long time I had no support for this library in the community. Thanks to the summit, I took part in the openstackclient discussion and learned that such a library is desirable, so I recently rewrote my old code, releasing openstack.common.apiclient – a common client library – and making novaclient and keystoneclient use it. I would like to thank Dean Troyer, who kindly reviewed my code and inspired my further work.

The new Altai Dashboard version, Focus2, is a very smooth and modern-looking solution (take a look at its mockups) with a reference implementation, and the Grid Dynamics delegation proposed it as a possible direction for Horizon’s evolution.

After attending Doug Hellmann’s introduction to Ceilometer, it became clearer to me that this project is a robust and powerful metering solution. It can give a second wind to Nova Billing, which concentrates on billing only and is a small server (63 KiB vs 1.5 MiB of Ceilometer code) with good test coverage.

Nova DNS is an event-based server that manages DNS records when instances are created, destroyed, started, or stopped. It can be used out of the box until moniker becomes a more feature-rich solution.

Looking at the summit, I see how interesting and significant it was for me and I am proud to be a member of OpenStack community together with intelligent and experienced people I have met.

Finally, I would like to thank my colleagues Dmitry Maslennikov and Nikita Savin who made my journey possible and so pleasant.


by aababilov at May 02, 2013 08:24 PM

Amar Kapadia

OpenStack Swift Comes of Age with the Grizzly Release

I recently wrote an EVault blog about the recent OpenStack Summit and the Coming of Age of Swift. The blog talks about the dynamics around Swift at the OpenStack Summit rather than about specific features of Grizzly (which have been covered by a number of blogs & articles). For example, I talk about the various unconference sessions, which were of very high quality. Please check out the blog.

by Amar Kapadia (noreply@blogger.com) at May 02, 2013 06:12 PM

OpenStack Blog

Germany, Israel & Hungary – OpenStack Events

Jonathan Bryce and a few members of the OpenStack Foundation team will be heading to Europe later this month to attend three key regional events. Jonathan and other noteworthy members of the OpenStack community will be speaking at each event. If you are in the area and would like to learn more about OpenStack or network with others in the community – please plan to attend!

Help us spread the word, and we hope to see you there!

Berlin, Germany – Friday, May 24


OpenStack DACH Day 2013 will provide attendees with firsthand insights from OpenStack developers and enterprises that are successfully using OpenStack in production environments for both private and public clouds. The lineup includes speakers from industry leaders including:

  • Jonathan Bryce, OpenStack Foundation
  • Kurt Garloff, Deutsche Telekom AG
  • Monty Taylor, HP
  • Bernhard Wiedemann & Sascha Peilicke, SUSE
  • Muharem Hrnjadovic, Rackspace Cloud
  • Dr. Wolfgang Schulze, Inktank
  • Tobias Riedel, Netways
  • Dr. Udo Seidel, Amadeus Data Processing

Register to Attend:

  • When: Friday, May 24, 2013
  • Where: Berlin Fairgrounds (Messegelände unter dem Funkturm), Hall 7, as part of LinuxTag
  • Tickets: Registration is free, and there are 200 tickets available at http://openstackdach2013.eventbrite.com

Tel Aviv, Israel – Monday, May 27


Join the OpenStack community for the third OpenStack Israel event, co-organized by OpenStack community supporters IGTCloud and GigaSpaces. The event is sponsored by the OpenStack Foundation and includes speakers from across the OpenStack community. Hear about OpenStack’s newest Grizzly release from the source, deep-dive into the Quantum network, learn about the new Cinder storage, hear what others are doing with OpenStack technology with real-life case studies from Intel, Liveperson and Alcatel, and meet the top industry leaders from IBM, HP, Rackspace, Red Hat, GigaSpaces, DreamHost, Radware, Ravello, Mirantis, Cloudsoft and Hastexo.

Register to Attend:

  • When: Monday, May 27, 2013
  • Where: Herzliya Arts Center at 15 Jabotinsky Street in Herzliya, Israel
  • Tickets: Registration is free, but there are only 300 tickets available, so register quickly! http://www.openstack-israel.org

Budapest, Hungary – Wednesday, May 29


Join us for OpenStack CEE Day – a large-scale one-day user conference for the Central & Eastern European region. Attendees will get insights into OpenStack from industry-leading keynote speakers, as well as user case studies, workshops and deep-dive sessions. The OpenStack CEE Day welcomes users, prospective users, ecosystem members, partners, developers and everyone who is excited about OpenStack’s open source cloud innovation.

Register to Attend:

####

Check out the latest hastexo blog post about each of these events – It’s May. It must be OpenStack Month! 

Follow @OpenStack on Twitter for the latest news.

by cmassey at May 02, 2013 04:48 PM

Florian Haas

It's May. It must be OpenStack Month!

This month my own schedule, and that of hastexo, is full of OpenStack. Here are the details.

read more

by florian at May 02, 2013 04:03 PM

Cloudscaling Engineering

The Trouble With Link Bonding (LACP, LAG)

One of our favorite sayings at Cloudscaling is “Simplicity Scales.” This saying has a slightly-less-well-known coda, “Complexity Fails.”

Let’s walk through a real-world example of this.


Background

In Open Cloud System (OCS), our high-availability (HA) strategy for services that have persistent datastores is to use a UCARP IP to make sure that one and only one of the backend servers is active at any given time. Then we replicate data between all the backend servers so that if one fails, another can take over the UCARP VIP and the cloud continues operating normally. UCARP works basically like VRRP – multiple devices share a virtual IP address (VIP) and communicate using CARP to figure out which one of them should be active at any given time.

The typical server in an OCS installation has four NICs: one (1G) for hardware management (IPMI), one (1G) for systems management (PXEbooting, chef), and two 10G NICs. In our canonical network design, one of these 10G NICs is used for intra-cloud traffic between VMs and storage resources, and the other is used for external access for VMs to talk to the Internet (or other resources outside the cloud).

Here is a diagram of the standard network layout without bonding.



This is a simple and well-understood network design, easily implemented with standard networking models that have been around for decades. But there’s another option for how OCS can be deployed: using bonded interfaces on the servers and port channels on the switches to take those two 10G NICs and make them appear as a single 20G network link, and pass both intra-cloud and external traffic across that higher-bandwidth virtual link. Many of our customers have preferred this option, which in theory provides higher burst bandwidth and greater resilience to failure of a NIC.  Bonding sounds great, right?

Diagram of the network architecture with bonding.


The Trouble Begins

Let me tell you a story. It’s kind of a detective story. As with everything else in OCS, we test our HA/failover solutions extensively, and during such testing we discovered some odd behavior when running in bonded interface mode. In most of our tests, failover worked great. When a node failed, the other node would take over. Because everything had been replicated from the active node, no data was lost. When the failed node comes back up, it’s supposed to see the broadcasts from the existing master and join the cluster as a backup. This happened most of the time in our tests, but in one environment we saw the wrong behavior: a failed node would come up and take over as master. In some cases, this could happen before replication had finished, which is obviously a big problem. After a ton of time spent debugging and chasing a lot of red herrings, we finally figured out what was happening. If you use the default values for UCARP configurations, you get the following behavior when a node comes up and joins an existing cluster:

  • new node listens for 3 seconds for an announcement from an existing master

  • if the new node does not hear such an announcement it promotes itself to master

  • also important: if a master node hears an announcement from another master, it will demote itself to backup IF the other master has a numerically lower IP address

 

Here’s what was happening. During the boot process on the new node, it took several seconds (more than three) for the port channel on the bonded interfaces to be set up between the server and the switch. Until that happened, each port had link, but no frames (or packets) were being passed. During this time, UCARP was starting and listening for announcements that it couldn’t see, because they come over the bonded interface, which wasn’t working yet. After three seconds the node declared itself master. Then the port channel finished coming up, and both the new node and the previous master saw announcements from a second master. Because the new node had a numerically lower IP address, the other master demoted itself, and the new node wound up as master, potentially before it had replicated data back over from the previous master.
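The race can be sketched in a few lines of Python. This is only an illustration of the behavior described in this post, not UCARP's code: the function names are hypothetical, the three-second window is the default reported above, and the tie-break follows the narrative above, in which the master with the numerically lower IP address stays master.

```python
import ipaddress

LISTEN_SECONDS = 3.0  # listening window reported above for default UCARP settings


def role_after_listening(link_ready_after: float) -> str:
    """Role a booting node picks when its listening window ends.

    Announcements only reach the node once the port channel passes frames,
    so a link that needs the whole window (or longer) hides the master.
    """
    heard_master = link_ready_after < LISTEN_SECONDS
    return "backup" if heard_master else "master"


def surviving_master(ip_a: str, ip_b: str) -> str:
    """Tie-break between two masters (assumption: the lower address wins)."""
    return min(ip_a, ip_b, key=ipaddress.ip_address)


def simulate(old_master_ip: str, new_node_ip: str, link_ready_after: float) -> str:
    """Return the IP of the node that is master once the new node has booted."""
    if role_after_listening(link_ready_after) == "backup":
        return old_master_ip
    # Both nodes now believe they are master; the tie-break decides.
    return surviving_master(old_master_ip, new_node_ip)
```

With a fast switch, simulate("10.0.0.20", "10.0.0.10", 1.0) leaves the old master in place. With a link that takes five seconds, the same call returns the new node's address, which is the failure described above.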

The following diagram depicts UCARP under normal conditions.



And under failure conditions.



We never saw this behavior with unbonded interfaces, because there is no setup delay for the network in that case. The new node comes up, starts UCARP, hears the announcement from the previous master, and joins the cluster as a backup just like it’s supposed to. We also didn’t see this behavior with all models of network switches – some set up the port channels faster than others, and as long as it takes less than three seconds for the port channel to start passing traffic to the node, we see the proper behavior. We only saw it with a certain network switch that took more than three seconds, and we only had that switch in one test environment.

 

Bringing It Home

So back to “simplicity scales, complexity fails.” Interface bonding and port channels are newer technologies than basic switching and routing, and their implementation is more complicated on both the server and the switch sides. Because they are newer and more complex, the implementations from one vendor to another differ in significant ways (and have different bugs). In this case the complexity added by bonding introduced a new failure mode that manifested in a way that is extremely hard to diagnose. Relying on simpler (and older) technologies can save you from having to deal with these kinds of hard-to-diagnose problems. For example, in other parts of OCS we use ECMP at layer 3 to provide HA to servers. This is a time-tested and well-understood mechanism that has been used by ISPs for HA for decades. We’re planning on switching our existing UCARP implementations to such a mechanism in the future, for what should by now be obvious reasons. :)

 

The Moral Of The Story: Keep It Simple

The worst part about this story is that by adding something that was aimed at making the system more reliable (redundant NICs) we introduced a new failure mode (likely multiple new failure modes) that wound up making the system less reliable. This is unfortunately a common theme with HA strategies. What appears at first glance to be a great idea has unexpected (and often negative) consequences on the overall system. The best way to avoid this is to use the simplest and most time-tested strategies you can to keep your systems up and running.

Keep it simple, people. Simplicity scales.

by Paul Guth at May 02, 2013 02:28 PM

Adam Young

Kerberizing PostgreSQL with FreeIPA for Keystone

There are many factors to weigh when choosing which relational database management system (RDBMS) to deploy for a given application. One reason I have been working with PostgreSQL for Keystone is that it supports Kerberos authentication.

Why Kerberize Postgresql

Direct access to the RDBMS might be required for many reasons.

  • A shared instance between servers
  • The database might be in a large replicated cluster managed as a service for the enterprise
  • The database instance might provide a read-only snapshot of live data for reporting
  • Some applications might use the database as a persistent RPC mechanism

In the case of OpenStack, we want to make Keystone highly available.  As such, each Keystone instance will not get its own database instance, but instead will share a back end.

Puppetized Install and Configuration

From a shell prompt:

yum install puppet puppet-server tar postgresql
puppet module install puppetlabs/postgresql

Create a site.pp file that adds a GSSAPI rule to the pg_hba.conf file:

class { 'postgresql::server':
  config_hash => {
    'ip_mask_deny_postgres_user' => '0.0.0.0/32',
    #do not explicitly set 'ip_mask_allow_all_users' 
    #and it will default to localhost only
    'listen_addresses'           => '*',
    'manage_redhat_firewall'     => true,
  },
}
postgresql::pg_hba_rule { 'allow application network to access app database':
  description => "Open up postgresql for access from 192.168.0.0/24",
  type => 'host',
  database => 'all',
  user => 'all',
  address => '192.168.0.0/24',
  auth_method => 'gss'
}

Apply it with

 puppet apply --verbose /root/site.pp

Check the postgres access controls in /var/lib/pgsql/data/pg_hba.conf
You need a line like this.

host    all     all     192.168.0.0/24  gss

Make sure you don’t have some other rule that will conflict with it. For example, in an earlier pass I had to comment out:

#host   all     all     0.0.0.0/0       md5
#host   all     all     ::1/128 md5

These rules preceded it and were triggering a password prompt from the psql command.
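The ordering matters because PostgreSQL uses the first pg_hba.conf record that matches a connection. A minimal Python sketch of that lookup, assuming a simplified rule format of (CIDR, method) pairs; the function name is hypothetical and the real file has more fields:

```python
import ipaddress


def first_matching_method(rules, client_ip):
    """Return the auth method of the first rule whose network contains client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, method in rules:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return method
    return None  # no rule matched; the connection would be rejected


# The broad md5 rule comes first, so it shadows the gss rule.
shadowed = [("0.0.0.0/0", "md5"), ("192.168.0.0/24", "gss")]
# With the md5 rule commented out, the gss rule is the first match.
fixed = [("192.168.0.0/24", "gss")]
```

Here first_matching_method(shadowed, "192.168.0.5") yields md5, which is why psql asked for a password, while the same call with fixed yields gss.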

Kerberos for Postgres: Create new service in IPA.

ipa service-add postgres/pg.openstack.freeipa.org
ipa-getkeytab -s ipa.openstack.freeipa.org -p postgres/pg.openstack.freeipa.org@OPENSTACK.FREEIPA.ORG  -k /var/lib/pgsql/data/pg.keytab
chown postgres:postgres /var/lib/pgsql/data/pg.keytab

Postgres Config

Edit postgresql.conf

The information to do this is out of the Postgres manual

# Kerberos and GSSAPI
krb_server_keyfile = '/var/lib/pgsql/data/pg.keytab'
krb_srvname = 'postgres'
host    all     all     192.168.0.0/24  krb5

Firewall:

Either open port 5432 in iptables with lokkit:

lokkit -p 5432:tcp

Or open it with firewall-cmd:

firewall-cmd --add-port=5432/tcp

To Test:

psql -h pg.openstack.freeipa.org -d keystone -U keystone

Run klist afterwards to see the Postgres service ticket:

Ticket cache: FILE:/tmp/krb5cc_1615800001
Default principal: keystone@OPENSTACK.FREEIPA.ORG

Valid starting     Expires            Service principal
05/02/13 03:31:28  05/03/13 03:31:28  krbtgt/OPENSTACK.FREEIPA.ORG@OPENSTACK.FREEIPA.ORG
05/02/13 03:31:33  05/03/13 03:31:28  postgres/pg.openstack.freeipa.org@OPENSTACK.FREEIPA.ORG

On the Keystone side, install the Postgres client libraries:

yum install python-psycopg2 postgresql

In /etc/keystone/keystone.conf

connection = postgresql://pg.openstack.freeipa.org/keystone?krbsrvname=postgres

Assuming you are going to run this for a non-interactive service, you will need a cron job to fetch the TGT on a regular basis.

crontab /etc/keystone/keystone.crontab
1 0,6,12,18 * * *   su - keystone -c "KRB5CCNAME=FILE:/tmp/krb5cc_1615800001 kinit keystone -k -t /var/kerberos/krb5/user/1615800001/client.keytab"

by Adam Young at May 02, 2013 02:21 PM

Cloudscaling Engineering

Stacker Voices: Monty Taylor, HP

We talked with Monty Taylor of HP at the OpenStack Summit in Portland. Monty is the automation and deployment lead for cloud at HP. He’s also a member of both the OpenStack Technical Committee and the OpenStack Foundation Board of Directors.

// Check out Monty’s feature profile by Cade Metz in Wired Enterprise yesterday. //

Monty leads the CI (continuous integration) project for OpenStack. In that role, he and his group have built testing systems that have made it possible for the OpenStack project to scale from a few dozen contributors for the Bexar release to more than 700 developers now pushing patches *daily* to the project.

Watch the video to learn more about:

  • OpenStack’s integrated code review system and gated commits

  • running the CI system as a single app across two public clouds, with resources donated by HP, Rackspace and eNovance

  • merging about 150 patches each day into the code base, and the 500+ that don’t make it

  • how gated commits interact with the CI system

  • using Google’s Gerrit code review system that feeds into Zuul for gating, which is connected to a Jenkins server with Gearman worker support for scaling

  • running tests in parallel with optimistic pipelining to save time

 Check out the video, below. Or, watch on YouTube.

<iframe allowfullscreen="" frameborder="0" height="360" src="http://www.youtube.com/embed/eqw4zxqPelc?feature=player_detailpage" width="640"></iframe>

by Randy Bias at May 02, 2013 12:00 PM

eNovance

Keystone and PKI Tokens

PKI tokens were implemented in Keystone by Adam Young and others and shipped in the OpenStack Grizzly release. They are available since version 2.0 of the Keystone API.

PKI is a beautiful acronym for public-key infrastructure. Wikipedia describes the underlying public-key cryptography like this:

Public-key cryptography is a cryptographic technique that enables users to securely communicate on an insecure public network, and reliably verify the identity of a user via digital signatures.

As described at greater length in this IBM blog post, Keystone starts by generating a public and a private key and storing them locally.

On its first request, the service (e.g. Swift) fetches the public certificate from Keystone and stores it locally for later use.

When the user is authenticated and a PKI token needs to be generated, Keystone uses the private key to sign the token together with its metadata (e.g. roles, endpoints, services).

The service, by means of the auth_token middleware, verifies the token’s signature with the public certificate and extracts the information to pass on to the service. It sets the *keystone.identity* WSGI environment variable to be used by the other middleware of the service in the paste pipeline.

PKI tokens are therefore much more secure, since the service can trust where the token is coming from, and much more efficient, since the service doesn’t have to validate the token against Keystone on every request as it does for UUID tokens.

Auth token

This brings us to the auth_token middleware. The auth_token middleware is a central piece of Keystone: it provides generic middleware that other Python WSGI services use to integrate with Keystone.

In Grizzly the auth_token middleware was moved to the python-keystoneclient package, so we no longer have to install a full Keystone server package to use it (remember that it is supposed to be integrated directly into services).

You would usually add the auth_token middleware near the beginning of your paste pipeline (other middleware such as logging and catch_errors may come before it, so it is not quite the first one).

 
[filter:authtoken]
signing_dir = /var/cache/service
paste.filter_factory = keystoneclient.middleware.auth_token:filter_factory
auth_host = keystone_host
auth_port = keystone_public_port
auth_protocol = keystone_public_protocol
auth_uri = http://keystone_host:keystone_admin_port/
admin_tenant_name = service
admin_user = service_user
admin_password = service_password
 
 

There are many more options for the auth_token middleware; I invite you to refer to your service’s documentation and to read the top of the auth_token file here.

When the service gets a request with an X-Auth-Token header containing a PKI token, the auth_token middleware intercepts it and starts doing some work.

It validates the token by first computing its MD5 hex digest, which becomes the key in memcache. As you may have seen, a PKI token contains all the metadata, so it can be very long and is too big to serve as a memcache key as is.

It then checks whether that key is in memcache and, if not, starts verifying the signed token.
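The digest-as-key idea can be sketched with nothing but the standard library. This is an illustration, not the middleware's code: the class and function names are hypothetical, and a plain dict stands in for memcache (whose keys are limited to 250 bytes).

```python
import hashlib


def cache_key(pki_token: str) -> str:
    """Return the MD5 hex digest of a token: always 32 characters long."""
    return hashlib.md5(pki_token.encode("utf-8")).hexdigest()


class TokenCache:
    """In-memory stand-in for memcache, keyed by the token digest."""

    def __init__(self):
        self._store = {}

    def get(self, pki_token):
        # A miss returns None, meaning the token still has to be verified.
        return self._store.get(cache_key(pki_token))

    def set(self, pki_token, token_info):
        self._store[cache_key(pki_token)] = token_info
```

However long the token is, the key stays at 32 hexadecimal characters, so a multi-kilobyte PKI token can still be looked up cheaply.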

 

Before anything else, the token is checked against the list of revoked tokens (see my previous article about PKI revoked tokens). To get that list, the middleware first checks whether its cached token revocation list has expired (by default it is refreshed every second).

If the list needs to be refreshed, the middleware sends a request to the URL ‘/v2.0/tokens/revoked‘ on the Keystone admin interface, using an admin token, and gets the list of revoked tokens.

The list is also stored on disk for easy retrieval.
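The expire-then-refresh logic for the revocation list might look roughly like the following. It is a sketch under stated assumptions rather than the middleware's implementation: the class name, the fetch callable and the injectable clock are all hypothetical, and the on-disk copy is left out.

```python
import time


class RevocationList:
    """Cache of revoked token ids that refreshes itself when stale."""

    def __init__(self, fetch, max_age=1.0, clock=time.monotonic):
        self._fetch = fetch        # callable returning an iterable of revoked ids
        self._max_age = max_age    # seconds before the cached list counts as expired
        self._clock = clock        # injectable so the expiry can be tested
        self._revoked = set()
        self._fetched_at = None

    def is_revoked(self, token_id):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            # In the real flow this is the request to /v2.0/tokens/revoked.
            self._revoked = set(self._fetch())
            self._fetched_at = now
        return token_id in self._revoked
```

The one-second default mirrors the refresh interval mentioned above; a longer max_age trades fewer calls to Keystone for slower propagation of revocations.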

 

If the token is not revoked, the middleware converts it to a proper CMS format and starts verifying it.

Using the signing cert filename and the CA filename, it invokes the openssl command-line tool to run cms -verify, which verifies the CMS token and returns the decoded data. If the signing cert file or the CA file is missing, the middleware fetches it again.

The signing cert is fetched with an unauthenticated query to the Keystone admin URL ‘/v2.0/certificates/signing‘. The same goes for the CA, with a query to the Keystone URL ‘/v2.0/certificates/ca‘.

With the decoded data we can now build the environment variable called keystone.token_info, which will be used next by the service’s other middleware. A bunch of new headers are also added to the request, for example the user, project ID and project name.

The MD5 hex digest of the PKI token is then stored in memcache along with the data.

And that’s it. There is much more information in the IBM blog post and on Adam’s blog, which I mentioned earlier.

by Chmouel at May 02, 2013 09:39 AM

Chmouel Boudjnah

Keystone and PKI tokens overview


by chmouel at May 02, 2013 08:00 AM

OpenStack Blog

Contribute to OpenStack Activity Board

We’ve released the complete documentation for OpenStack Insights, with binaries and source code downloadable from SourceForge, while the OpenStack Dash tools are the vanilla MetricsGrimoire set hosted on GitHub. The code is free as in freedom so you’re welcome to play with it. We’re working to put both pieces of code in the hands of the OpenStack Infrastructure team soon.

Following up on the long session hosted during the Summit in Portland and on 1-on-1 discussions, I’ve created a new topic on the Development mailing list. You can join the conversations about OpenStack metrics and the Activity Board while avoiding the high-volume traffic on the Development list by subscribing only to the Metrics topic. You’ll receive only messages that have the words [metrics] or [activity] in the subject and nothing else. Go to http://lists.openstack.org/cgi-bin/mailman/options/openstack-dev to subscribe and pick “Metrics” among the topic categories you would like to subscribe to.

If you want to know how the OpenStack Activity Board can help you understand your team’s activities in the project, build reports, and integrate data from different sources, join the webinar we’re hosting on May 9th. We’ll keep ironing out the known issues while we think about the future of the platform.

by Stefano Maffulli at May 02, 2013 12:29 AM

April 30, 2013

Mirantis

Introducing Murano: Bringing Windows Environments to OpenStack

At Mirantis, within our customer and partner base we see growing demand for deploying and running Windows based applications on OpenStack cloud. Today we are pleased to announce the forthcoming Murano project, which is designed to ease this process.  The main goal of this initiative is to create a native OpenStack component that enables fast [...]

The post Introducing Murano: Bringing Windows Environments to OpenStack appeared first on Mirantis.

by Georgy Okrokvertskhov at April 30, 2013 09:22 PM

Giulio Fidente

OpenStack Cinder - Add more volume nodes

This is the first of a short series of articles intended to cover the steps required to configure Cinder (the OpenStack block storage service) in a mid/large deployment scenario. The idea is to discuss at least three topics: how to scale the service by adding more volume nodes; how to ensure high availability for the API and Scheduler sub-services; and how to leverage the multi-backend feature that landed in Grizzly.

I'm starting with this post on the scaling issue first. Cinder is composed of three main parts, the API server, the scheduler and the volume service. The volume service is some sort of abstraction layer between the API and the actual resources provider.

By adding more volume nodes into the environment you will be able to increase the total offering of block storage to the tenants. Each volume node can either provide volumes by allocating them locally or on a remote container like an NFS or GlusterFS share.

Some assumptions before getting into the practice:

  • you're familiar with the general OpenStack architecture
  • you have at least one Cinder node configured and working as expected

First thing to do on the candidate node is to install the required packages. I'm running the examples on CentOS and using the RDO repository which makes this step as simple as:

# yum install openstack-cinder

If you plan to host new volumes using the locally available storage, don't forget to create a volume group called cinder-volumes (the name can be configured via the volume_group parameter). Also don't forget to configure tgtd to include the config files created dynamically by Cinder. Add a line like the following:

include /etc/cinder/volumes/*

in your /etc/tgt/targets.conf file. Now enable and start the tgtd service:

# chkconfig tgtd on
# service tgtd start

Amongst the three init services installed by openstack-cinder you only need to run openstack-cinder-volume, which gets configured in /etc/cinder/cinder.conf. Configure it to connect to the existing Cinder database (the db in use by the pre-existing node) and to the existing AMQP broker (again, in use by the pre-existing node) by setting the following:

sql_connection=mysql://cinder:${CINDER_DB_PASSWORD}@${CINDER_DB_HOST}/cinder
qpid_hostname=${QPIDD_BROKER}

Set the credentials if needed and/or change the rpc_backend setting if you're not using Qpid as your message broker. One more setting, not really required to change but worth checking if you're using the local resources:

iscsi_ip_address=${TGTD_IP_ADDRESS}

That should match the public IP address of the volume node just installed. The iSCSI targets created locally using tgtadm/tgtd have to be reachable by the Nova nodes. The IP address of each target is stored in the database with every volume created. The iscsi_ip_address parameter sets the IP address that is given to the initiators.

At this point you should be ready to start the volume service:

# service openstack-cinder-volume start

Verify that it started by checking the logs (/var/log/cinder/volume.log) or by issuing on any Cinder node:

# cinder-manage host list

You should see all of your volume nodes listed. From now on you can create new volumes as usual and they will be allocated on any of the volume nodes. Keep in mind that the scheduler will default to the node with the most space available.
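As a rough illustration of that placement rule, here is a tiny Python sketch. It is not Cinder's scheduler code: the function name and the dictionary of free capacities are hypothetical, and only the "most space available" behavior described above is modeled.

```python
def pick_volume_node(free_gb_by_host, size_gb):
    """Pick the host with the most free space that can still fit the volume."""
    candidates = {
        host: free for host, free in free_gb_by_host.items() if free >= size_gb
    }
    if not candidates:
        raise ValueError("no volume node has enough free space")
    # max() over the keys, ranked by each host's free capacity.
    return max(candidates, key=candidates.get)
```

For example, with {"vol1": 120, "vol2": 300} and a 50 GB request, the sketch picks vol2, so new volumes tend to land on the node you just added until capacity evens out.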

by Giulio Fidente at April 30, 2013 12:00 AM

April 29, 2013

Adam Young

Securing OpenStack with FreeIPA

I gave a talk at the OpenStack summit in Portland about using FreeIPA to secure OpenStack. You can see the video here. I have HTMLified my slides if you wish to browse through them.

by Adam Young at April 29, 2013 08:39 PM

Yun Mao

How to run pylint with few false positives


Introduction

Python is a fantastic programming language. It's succinct, easily readable and great for a lot of purposes. For example, all OpenStack projects so far are written in Python. However, it's an interpreted language. There is no (default) compiler to do any static analysis such as type checking for you. The code quality heavily depends on code reviews (i.e. human eyes and brains) and test coverage.
There are several projects, such as pylint, pychecker and pyflakes, that do static code analysis to find bugs and coding style violations. Among them, pylint is the most sophisticated one. Unfortunately, Python code doesn't have any type annotations. It's a dynamically typed language. Therefore, Python type analysis is a really hard problem. In pylint, you will see false positives, meaning it will flag as bugs some lines that are actually correct. In fact, pylint was used early in OpenStack Nova's testing environment (around 2010) but was later dropped because the growing number of false positives became unmanageable.

Meet lintstack

lintstack is designed to address this problem: reduce false positives from pylint as much as possible without sacrificing accuracy. Certainly it's possible to improve pylint itself, and perhaps the type inference theory behind it, but I'd leave that to the programming language researchers.

Key idea

The key idea is to leverage the version control system. I use git in this article but the idea works with others too. In git, you have access to the entire commit history of the project, but pylint only runs against the latest commit, or HEAD. Can we do better?
Let me use an example. Suppose I clone the project in git and HEAD is at A. I write a new feature and commit the patch, so now HEAD is at B. I run pylint against B and see 60 errors in the report. That's an overwhelming number. Which errors are due to my patch? If there were no false positives and commit A had no errors, then all 60 errors would belong to my patch. But in reality there are false positives and real bugs in commit A, so it's hard to tell which ones I should fix.
At a high level, lintstack runs pylint on both commits A and B and does a diff. lintstack treats all errors on A as false positives. Only the new errors are attributed to the patch. Some of the new errors may still be false positives, but the list is much more manageable. Because the patch is small, I am likely to see only one or two errors instead of 60. I will pay attention to those errors and fix them accordingly.
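The diff step can be sketched in Python as follows. This is a simplified illustration, not lintstack's actual code: the LintError type and function names are hypothetical, and errors are matched by file, message id and message text so that shifted line numbers do not count as new errors.

```python
from collections import Counter
from typing import List, NamedTuple


class LintError(NamedTuple):
    path: str
    line: int      # shifts whenever code above it is edited
    code: str      # pylint message id, e.g. "E1101"
    message: str


def fingerprint(err: LintError):
    """Identity of an error that ignores its line number."""
    return (err.path, err.code, err.message)


def new_errors(baseline: List[LintError], patched: List[LintError]) -> List[LintError]:
    """Errors reported on commit B that commit A does not account for."""
    budget = Counter(fingerprint(e) for e in baseline)
    fresh = []
    for err in patched:
        key = fingerprint(err)
        if budget[key] > 0:
            budget[key] -= 1   # already present on A: treated as a false positive
        else:
            fresh.append(err)
    return fresh
```

Counting occurrences, rather than using a plain set, means that a second copy of an already-known error introduced by the patch is still reported.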

Details

The idea of lintstack is really simple, but it requires a little more careful thought.
  • First, how do you do the diff? In the new patch, the code has changed. That means the line numbers, whitespace, or line breaks might change. So the error report on commit A might need to be revised to match that on commit B.
  • Second, the false positives we learned from commit A might help reduce those on commit B. For example, pylint always seems to think that the sha1 module has no digestsize method. If we have seen that on commit A, and a new error from commit B is of the same kind, even though it's in new code, we should classify it as a false positive and suppress it.
  • Last but not least, the git history is a complicated graph. While I'm developing a patch, the master branch moves forward because other developers make commits too! Say master moves from A to A': it's not enough to run the diff between A and B. It should be between A' and merge(A', B) instead. Things get further complicated if I want to commit multiple patches on my own branch and send them in for review.
All of the above-mentioned problems are considered in lintstack. I invite you to check out the code on github to see the details.
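To make the second point concrete, here is a minimal sketch of suppressing errors by kind. It is an illustration under assumptions, not lintstack's implementation: the record format and the function name are hypothetical, and a "kind" is taken to be the pair of message code and message text.

```python
def suppress_known_kinds(baseline, candidates):
    """Drop candidate errors whose kind was already seen on the baseline.

    A "kind" here is the (code, message) pair, regardless of file or line.
    Both arguments are lists of dicts in a hypothetical record format.
    """
    known_kinds = {(e["code"], e["message"]) for e in baseline}
    return [e for e in candidates
            if (e["code"], e["message"]) not in known_kinds]


# Made-up sample data: the same complaint shows up in a brand new file.
seen_on_a = [
    {"path": "nova/utils.py", "line": 10, "code": "E1101",
     "message": "Module 'hashlib' has no 'sha1' member"},
]
new_on_b = [
    {"path": "nova/crypto.py", "line": 5, "code": "E1101",
     "message": "Module 'hashlib' has no 'sha1' member"},
    {"path": "nova/crypto.py", "line": 9, "code": "E0602",
     "message": "Undefined variable 'keypair'"},
]

for error in suppress_known_kinds(seen_on_a, new_on_b):
    print("%(path)s:%(line)d: [%(code)s] %(message)s" % error)
```

The trade-off is that a real bug of an already-seen kind would be hidden too, so how broadly to define a kind is a judgment call.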

Performance

lintstack has been part of OpenStack Nova since late in the Folsom cycle. It's deployed at the CI gate and integrated with the code review system gerrit at https://review.openstack.org. You can check the recent test results at Jenkins. During the entire Nova Grizzly development cycle, in which 1877 change sets were committed, with 127898 lines added and 79370 removed, only 18 errors reported by lintstack were false positives, most of them because pylint was unable to determine the call signatures of external libraries. On the other hand, lintstack finds real bugs (true positives) almost every day.

by Yun Mao (noreply@blogger.com) at April 29, 2013 03:30 PM

April 28, 2013

Flavio Percoco

Glance wants to go public

Havana development has started, some folks have just come back from April's summit, and many things have been discussed - or are still under discussion. As for OpenStack's Image service, one of the targets for this release is to make it ready for public environments. Unfortunately, some important features are missing, and because of that, Glance is not cloud-ready yet.

Does OpenStack Image need to go public?

As for now, I can only see good things coming out of this. Glance is doing its job and has grown gradually over the last 2 releases, but it does lack some features that could improve its integration within different environments.

One thing that should be kept in mind is that this project is meant to provide images, and whether it does that publicly with strong enforcement or privately in some hidden data center doesn't change that.

That being said, there are some ideas that come to my mind where a public image service could be used (I'm pretty sure there are way better use cases for it):

  1. Remote image download for distributions' installer
  2. Vagrant boxes distribution
  3. Public image service within a cloud service (allowing users to do more things than what they're allowed now)
  4. ISO images distribution
  5. ... add yours

Some features missing

Quotas

Maybe not the most important but definitely required. As for now, OpenStack's Image service lacks any kind of quota support, which means it is not possible to set limits on the many operations available - neither globally nor per user. Once Glance reaches the "outside world", it will be mandatory to moderate resource usage in multiple ways: per user, per instance, per action, per tenant and per region.

This is still under discussion, and current thoughts go around supporting it internally or as a separate service.

Robust user roles

Perhaps the most important one: without it, many of the other features can't be implemented. Currently, roles and policies haven't been used heavily throughout Glance, which allows users to execute some actions without any enforcement. The good thing is that the code is there - most of it - and what's really missing is the presence of new policies and roles.

Protected image properties

Images have properties, and those properties are public. However, it is necessary to assign hidden properties to images for other purposes like (the following items are under discussion): billing, permissions, roles. The current properties model doesn't allow this:

class ImageProperty(BASE, ModelBase):
    """Represents an image properties in the datastore"""
    __tablename__ = 'image_properties'
    __table_args__ = (UniqueConstraint('image_id', 'name'), {})

    id = Column(Integer, primary_key=True)
    image_id = Column(String(36), ForeignKey('images.id'),
                      nullable=False)
    image = relationship(Image, backref=backref('properties'))

    name = Column(String(255), index=True, nullable=False)
    value = Column(Text)
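Since the design is still under discussion, the following is only a sketch of what hiding properties by role could look like, not Glance code. The function name, the role names and the shape of the `protected` mapping are all hypothetical; the point is simply that the model above has nowhere to record who may see a property.

```python
def visible_properties(properties, roles, protected):
    """Return the subset of `properties` that a caller with `roles` may read.

    `protected` maps a property name to the set of roles allowed to read it
    (a hypothetical policy format). Names missing from it are public.
    """
    caller_roles = set(roles)
    visible = {}
    for name, value in properties.items():
        allowed = protected.get(name)
        if allowed is None or allowed & caller_roles:
            visible[name] = value
    return visible


# Made-up image properties and policy.
image_properties = {"os_distro": "fedora", "billing_code": "X-42"}
protected = {"billing_code": {"admin", "billing"}}

print(visible_properties(image_properties, ["member"], protected))
print(visible_properties(image_properties, ["admin"], protected))
```

A regular member would see only `os_distro`, while an admin would see both properties.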

Rate Limits

From some points of view this might look like something related to quotas; whether it is or not is not what we'll discuss here. As for Glance, we're treating it as a separate task and different things are being taken into consideration. One of those is to leave this outside Glance and let third-party tools handle it, regardless of whether they are part of OpenStack.

Performance and latency

There are some other discussions going around OpenStack's Image performance and more precisely about improving uploads and downloads. Although this is not a blocker task, it would be nice to see it going forward and being able to reduce the time and bandwidth needed for both operations. Since this is a long topic, I'd like to start sharing some insightful posts and blueprints that some folks already wrote:

Open Discussions

Most of these things are under discussion, so feel free to chime in. I'd like to thank Iccha Sethi for bringing most of these things up and for her great contributions to the project.

by FlaPer87 at April 28, 2013 10:21 PM

Victoria Martínez de la Cruz

Hey ladies! There is a new round of the Outreach Program for Women

The free software world is once again being revolutionized by GNOME's Outreach Program for Women. Many women from different parts of the world are making their first contributions, sending their application letters and crossing their fingers to be the future interns in one of the various free software organizations involved in this new edition. OpenStack is one of them, and has fun ideas for all who want to join our community.

Would you like to participate? Contact me!

Lightning questions and answers

Who can participate?

Women or people who identify as such of any age and from anywhere in the world.

Do I need to know how to code?

No! It’s not necessary. In these internships you can contribute in different ways: systems management, user interface design, graphic design, documentation, community management, marketing, translation … and of course, software development.

Which are these organizations you mentioned?

NESCent (National Evolutionary Synthesis Center), Open Technology Institute, OpenStack, WordPress, Wikimedia, Yocto, Debian, GNOME, Joomla, KDE, MediaGoblin, Mozilla, OpenMRS, Perl, Tor, Twisted and Wikia.

At this point I recommend considering NESCent, OpenStack, WordPress, Wikimedia and Yocto first.

How much time do I have to apply and when do the internships begin?

Start as soon as possible! We’re near the deadline, which is May 1, 2013 (in 4 days!). The internships begin on June 17th and end on September 23rd.

I know it’s short notice, but don’t miss the opportunity. I can help you!

What benefits do I have?

Well, the most important thing is that you will be able to work in very large organizations with people around the world. You’ll learn with the best and most experienced in their area, and you’ll be able to grow alongside them.

In addition, you will feel how fun and exciting it is to work on free software projects.

I really can’t describe in words that last thing, so I suggest you watch this Wikimedia video which somehow sums up my feelings.

<iframe allowfullscreen="allowFullScreen" frameborder="0" height="281" mozallowfullscreen="mozallowfullscreen" src="http://commons.wikimedia.org/wiki/File:Great_Feeling.ogv?embedplayer=yes" webkitallowfullscreen="webkitAllowFullScreen" width="500"></iframe>

Finally, you will also receive a stipend that will help you to dedicate yourself completely to this experience.

What is expected of the interns?

Internships are full-effort, generally eight hours from Monday through Friday or as the mentor suggests.

You should choose a project, usually suggested by the mentor, in which to work during your internship. Plus you’ll be doing other less demanding tasks in parallel.

You should also write your experiences, tips and ideas in a blog.

How can I apply?

You have to make a small contribution to the organization you have chosen and send a letter of application by May 1. More details on the OPW official website.

Tips and tricks from a previous intern

This was the main goal of my post! To avoid making you waste more time, I made a list of what I consider to be the most important:

  • There are more people in the community apart from your mentor, so don’t be afraid to chat with other people and ask for help in the community. Consider that many of those who are there also have another job, so they may not have time to answer right away (but they will answer!).
  • Nobody knows everything and you have to start somewhere. If you are frustrated, let your mentor know about this. Seize the time, read, learn, and if you feel that one day wasn’t good enough, the next day will be better for sure.
  • Organize your tasks. What seems impossible to finish becomes more manageable if you face it from different angles and in smaller pieces.
  • Socialize with other interns and mentors. You will probably discover a great person, as I did :)
  • Enjoy! Have fun at work, and at the end of the day take a moment to see how much you have progressed.

I’m probably forgetting many details, so feel free to contact me and ask me for more details of this internship, help for your application or technical support. Whatever you need! And finally, again, be encouraged to be a part of this amazing experience.

by vkmc at April 28, 2013 02:52 PM

April 27, 2013

Flavio Percoco

Dynamic TTL Collection in Mongodb for Marconi

One of the things that led the team towards choosing Mongodb as Marconi's - the queuing service for OpenStack - first and default storage back-end is its TTL Collections feature.

TTL Collections - perhaps it would be better to call it TTL Indexes - are normal collections with a special index type, which defines the time - in seconds - a record can last in the collection.

This was added in Mongodb version 2.2 and was rapidly accepted and integrated into many deployments.

Implementation

A TTL Index is created by specifying the expireAfterSeconds option in the ensureIndex method:

db.ttl_col.ensureIndex({<field>: <direction>}, {expireAfterSeconds: <seconds>})

The above starts a background thread (if not already running) that will monitor the collection and scrub expired records every minute, which means a record can last at most N + 60 seconds, where N is the number of seconds specified in the index and 60 is the period, in seconds, of the background thread.

Marconi Usage

Even though it is a great feature, it wasn't enough to cover Marconi's needs since the latter supports per-message TTL. In order to cover this, one of the ideas was to implement something similar to Mongodb's thread and have it running server-side, but we didn't want that for a couple of reasons: it needed a separate thread / process and it had a bigger impact in terms of performance.

After digging into this a bit more and doing some tests, we found out it is possible to "fool" the ttl monitor and have dynamic TTL support without much effort. Let me explain this a bit more.

Mongodb's TTL monitor looks for records that have expired by checking if the date field specified in the index is less than the current time minus the seconds specified in the expireAfterSeconds.

BSONObj query;
{
    BSONObjBuilder b;
    b.appendDate( "$lt" , curTimeMillis64() - ( 1000 * idx[secondsExpireField].numberLong() ) );
    query = BSON( key.firstElement().fieldName() << b.obj() );
}

Since the date set in the indexed field must be less than time - ttl, it is possible to have dynamic ttl by setting the index's ttl to 0 and adding it to the field date instead:

> use ttl
> db.ttl_col.ensureIndex({ttl: 1}, {expireAfterSeconds: 0})
> var start  = new Date(new Date().getTime() + 60000)
> db.ttl_col.insert({ttl: start})
> while (true) {
    var count = db.ttl_col.count();

    print("# of records: " + count + " (" + (start - new Date()) + ")");

    if (count == 0)
        break;

    sleep(4000);
}
# of records: 1 (56783)
# of records: 1 (52782)
# of records: 1 (48781)
# of records: 1 (44779)
# of records: 1 (40778)
# of records: 1 (36776)
# of records: 1 (32775)
# of records: 1 (28774)
# of records: 1 (24773)
# of records: 1 (20771)
# of records: 1 (16770)
# of records: 1 (12769)
# of records: 1 (8768)
# of records: 1 (4765)
# of records: 1 (763)
# of records: 1 (-3238)
# of records: 1 (-7239)
# of records: 1 (-11240)
# of records: 1 (-15241)
# of records: 1 (-19242)
# of records: 0 (-23243)

In the above code, the record has a 1 min TTL and lasted ~1:20 - notice that it could have lasted at least 1 min and at most 2 min.
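The timing can be reasoned about without a database. The sketch below is a plain Python model of the behavior described in this post, not MongoDB code: `is_expired` mirrors the monitor's `$lt` comparison, and `removal_time` assumes sweeps happen exactly every 60 seconds, which a real server does not guarantee. The function names and the sample dates are made up for the example.

```python
from datetime import datetime, timedelta


def is_expired(indexed_date, expire_after_seconds, now):
    """Mirror the monitor's query: indexed_date < now - expireAfterSeconds."""
    return indexed_date < now - timedelta(seconds=expire_after_seconds)


def removal_time(indexed_date, expire_after_seconds, first_sweep,
                 sweep_period=60):
    """Return the time of the first periodic sweep that removes the record."""
    sweep = first_sweep
    while not is_expired(indexed_date, expire_after_seconds, sweep):
        sweep += timedelta(seconds=sweep_period)
    return sweep


created = datetime(2013, 4, 27, 12, 0, 0)
message_ttl = 60  # per-message TTL, in seconds

# Dynamic TTL trick: the index TTL is 0 and the message TTL is added to
# the stored date instead.
expires = created + timedelta(seconds=message_ttl)

# Assume the monitor happens to wake up 20 seconds after the insert.
removed = removal_time(expires, 0, created + timedelta(seconds=20))
print("record lived for %d seconds" % (removed - created).total_seconds())
```

With the sweep offset chosen here, the record outlives its TTL by 20 seconds, which is the same kind of overshoot seen in the shell session above.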

This made the implementation way easier and allowed us to use the same behavior for both messages and claims, even though claims expiration doesn't require removing records.

Marconi's current message post method looks like this:

def post(self, queue, messages, client_uuid, tenant=None):
    qid = self._get_queue_id(queue, tenant)

    now = timeutils.utcnow()

    def denormalizer(messages):
        for msg in messages:
            ttl = int(msg["ttl"])
            # Absolute expiration date (now + ttl): the kind of value the
            # TTL trick above stores in the indexed field.
            expires = now + datetime.timedelta(seconds=ttl)

            yield {
                "t": ttl,
                "q": qid,
                "e": expires,
                "u": client_uuid,
                # No claim yet: empty claim id with an expiration of now.
                "c": {"id": None, "e": now},
                "b": msg['body'] if 'body' in msg else {}
            }

    ids = self._col.insert(denormalizer(messages))
    return map(str, ids)

Random thoughts

Would it be possible / better to have this behavior on the database side? What would the cost of this task be? As for now, I can see it being implemented like this:

> db.ttl_col.ensureIndex({datetime: 1}, {expireAfterSeconds: "ttl_field"})

This would require doing some in-query operations, like subtracting the value of the record's ttl field from the current time and then checking whether the result is greater than datetime. There might be a better way, though.
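One way to read this proposal is that a record expires once its datetime plus its own ttl is in the past. The sketch below only models that check in Python; it is not an existing MongoDB feature, and the function name and field defaults are assumptions taken from the hypothetical index definition above.

```python
from datetime import datetime, timedelta


def expired_by_own_ttl(record, now, date_field="datetime",
                       ttl_field="ttl_field"):
    """Hypothetical per-record check: date + record's own ttl < now."""
    lifetime = timedelta(seconds=record[ttl_field])
    return record[date_field] + lifetime < now


# Made-up record with a 30 second TTL.
record = {"datetime": datetime(2013, 4, 27, 12, 0, 0), "ttl_field": 30}

print(expired_by_own_ttl(record, datetime(2013, 4, 27, 12, 0, 10)))
print(expired_by_own_ttl(record, datetime(2013, 4, 27, 12, 0, 31)))
```

The cost in question is that the comparison value now differs per record, so it can no longer be a single range query against the index.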

by FlaPer87 at April 27, 2013 09:09 PM

April 26, 2013

OpenStack Security Blog

OpenStack Common Vulnerability Database

As per my last post I am starting to work on building an OpenStack common vulnerability database. As for the justifications, read here.

This post will discuss some of my proposed architecture.

So this is what I envision the final process workflow will probably resemble:

We can make use of OSLO components in several areas of the architecture. Additionally, we can make use of the new requirements project. And that would certainly be the goal.

The next post I make will contain a rough draft of a proposed schema design for the database.

by openfly at April 26, 2013 10:51 PM

Rob Hirschfeld

OpenStack steps toward Interoperability with Tempest, RAs & RefStack.org

I’m a cautious supporter of OpenStack leading with implementation (over API specification); however, it clearly has risks. OpenStack has the benefit of many live sites operating at significant scale. The short term cost is that those sites were not fully interoperable (progress is being made!). Even if they were, we lack the means to validate that they are.

The interoperability challenge was a major theme of the Havana Summit in Portland last week (panel I moderated). Solving it creates significant benefits for the OpenStack community, and these benefits bring significant financial opportunities for the OpenStack ecosystem.

This is a journey that we are on together – it’s not a deliverable from a single company or a release that we will complete and move on.

There were several themes that Monty and I presented during Heat for Reference Architectures (slides). It’s pretty obvious that interop is valuable (I discuss why you should care in this earlier post) and running a cloud means dealing with hardware, software and ops in equal measures. We also identified lots of important items like Open Operations, Upstreaming, Reference Architecture/Implementation and Testing.

During the session, I think we did a good job stating how we can use Heat for an RA to make incremental steps.   and I had a session about upgrade (slides).

Even with all this progress, Testing for interoperability was one of the largest gaps.

The challenge is not whether we should test, but how to create a set of tests that everyone will accept as adequate. Approaching that goal with a standardization or specification objective is likely an impossible challenge.

Joshua McKenty & Monty Taylor found a starting point for interoperability FITS testing: “let’s use the Tempest tests we’ve got.”

We should question the assumption that faithful implementation test specifications (FITS) for interoperability are only useful with a matching specification and significant API coverage.  Any level of coverage provides useful information and, more importantly, visibility accelerates contributions to the test base.

I can speak from experience that this approach has merit. The Crowbar team at Dell has been including OpenStack Tempest as part of our reference deployment since Essex and it runs as part of our automated test infrastructure against every build. This process does not catch every issue, but passing Tempest is a very good indication that you’ve got a workable OpenStack deployment.


by Rob H at April 26, 2013 08:57 PM

Anne Gentle

Who Wrote OpenStack Grizzly Docs?

Sneaking a peek at the numbers for documentation along with the code should show us pointers about docs keeping up with code. As I suspected, there were about three major contributors to the operations manuals that span all the projects, and about three major contributors to the API docs. Also not a big surprise, I am the major contributor to both. My spidey sense felt it but I had a real gut check with the actual data.

What’s difficult about this data analysis at this time is that we still need to release the docs even while we plan for the next six months. What I really want to do is look at the past six months and all the amazing work and accomplishments we have seen. The growth has been great and the fantastic feat of the Operations Guide really topped off my year. But we are still lacking enough strong doc contributors to keep up with the pace of code growth.

First, let’s look at the OpenStack code analysis. The last six months showed 517 contributors. For example, Object Storage grew their new contributors by over 35 people, which is probably doubling the involvement. Our Infrastructure team continues to raise the bar for helping us slam in more and more bits as fast as our little cloud servers can slam them. Here’s Monty Taylor’s report:

OpenStack code patches

                        Essex   Folsom  Grizzly
Patches Uploaded        11036   17986   29308
Changes Created         5137    5990    12721
Changes Landed          4235    4978    10561
Avg patches per Change  2.6     3.6     2.7
Landing Percentage      82%     83%     83%
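
The Landing Percentage row appears to be Changes Landed divided by Changes Created. That is an assumption, since the report doesn't state its formula, but a quick check in Python reproduces the published figures under that reading:

```python
# Figures copied from the table above.
releases = {
    "Essex": {"created": 5137, "landed": 4235},
    "Folsom": {"created": 5990, "landed": 4978},
    "Grizzly": {"created": 12721, "landed": 10561},
}


def landing_percentage(created, landed):
    """Share of created changes that landed, rounded to a whole percent."""
    return int(round(100.0 * landed / created))


for name in ("Essex", "Folsom", "Grizzly"):
    counts = releases[name]
    print("%-8s %d%%" % (name, landing_percentage(**counts)))
```

So the landing rate stayed essentially flat while the number of changes more than doubled between Folsom and Grizzly.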

What I want to do here is provide similar data that shows the growth of the project relative to the docs. I’m using the openstack-gitdm project to run the numbers for the documentation repos. There are eight in total but I’m just going to look at the top two, openstack-manuals and api-site. The openstack-manuals repository holds the install, configuration, administration, high availability, and operations guides. The api-site repository holds the building blocks for the API reference page, the API Quick Start, and other API guides (but not the API specs).

Here’s a listing of all the OpenStack doc repositories:
openstack/openstack-manuals – for operators and deployers, docs.openstack.org
openstack/api-site – for API consumers, api.openstack.org
openstack/compute-api
openstack/image-api
openstack/object-api
openstack/netconn-api
openstack/volume-api
openstack/identity-api

These are the types of statistics I want to know about doc contributions.
Number of doc contributors: 79. This is a great value.
Number of new doc contributors: 27. I like this from a growth standpoint.
Number of doc contributions: 512. There were 435 doc changes within openstack-manuals during the grizzly release, and 429 during the folsom release. Compared to over 12,000 code changes, I instinctively know these weren’t enough doc updates. While we do have a good base set of docs, they are getting a bit crufty and we want to address that in the Havana release.

Number of employers: 49 (up from 37 last release). This is a high number. The top doc-contributing employer during the Grizzly release is Rackspace.

So, what about quality? The most bugs fixed by a doc contributor is 45 (well over half) by Tom Fifield. Tom is a great doc bug triage expert and I don’t know what we’d do without him.

What about the top docs being read? The most read books are the Ubuntu Install and Deploy and the API Quick Start followed closely by the Identity 2.0 API Spec (wow that surprised me).

Here’s the reported data from openstack-gitdm. Thanks to Daniel Stangel for helping me retrieve this data. One hidden contributor is Jon Proulx, who wrote lots of the Operations Guide. Everett Toews also contributed a lot to the Operations Guide but won’t show up here. This omission leads me to suspect there may be other “ghosts” writing OpenStack docs, but I think the main point is, the top three shown below are far ahead of the fourth, fifth, and sixth-highest doc contributors.

Processed 435 csets from 79 developers
49 employers found
A total of 87457 lines added, 26085 removed (delta 61372)

Developers with the most changesets
Tom Fifield                 99 (22.8%)
annegentle                  86 (19.8%)
Lorin Hochstein             46 (10.6%)
Emilien Macchi              17 (6.0%)
atul jha                    11 (2.5%)
Mate Lakat                  10 (2.3%)
Diane Fleming                9 (2.1%)
dcramer                      8 (1.8%)
Aaron Rosen                  8 (1.8%)
gongysh                      6 (1.4%)
Ed Kern                      6 (1.4%)
Eduardo Patrocinio           6 (1.4%)
Alvaro Lopez Garcia          5 (1.1%)
Kurt Martin                  4 (0.9%)
Dan Wendlandt                4 (0.9%)
Razique Mahroua              4 (0.9%)
Gary Kotton                  4 (0.9%)
Dolph Mathews                4 (0.9%)
Christophe Sauthier          3 (0.7%)
Covers 80.459770% of changesets

Developers with the most changed lines
daisy-ycguo               37578 (39.9%)
Diane Fleming             19381 (20.6%)
annegentle                7624 (8.1%)
Tom Fifield               3126 (3.3%)
Lorin Hochstein           2757 (2.9%)
John Griffith             2390 (2.5%)
gongysh                   2169 (2.3%)
zhangchao010              2036 (2.2%)
Mate Lakat                1927 (2.0%)
Emilien Macchi            1684 (1.8%)
Navneet Singh              970 (1.0%)
Alvaro Lopez Garcia        647 (0.7%)
Brian Rosmaita             580 (0.6%)
dcramer                    554 (0.6%)
Dan Wendlandt              472 (0.5%)
atul jha                   431 (0.5%)
EmilienM                   428 (0.5%)
Joe Topjian                411 (0.4%)
Eric Windisch              376 (0.4%)
Ed Kern                    341 (0.4%)

At the OpenStack Summit last week I started looking for data that will help us shape the scope for the documentation for the coming release. With the right scope, we can keep up with code. Right now the docs scope that DOES release with code is docs for Python developers only, at docs.openstack.org/developers. However it seems people want install docs more than anything around release time. We will release the docs next week, 4/30/13, and have basic install docs in review now. We’ll need to keep track of doc bugs once we release of course. What we want to do in addition to decreasing scope is to increase resources, so we are working with member companies to create and fill upstream OpenStack documentation positions at each member company. Other creative ideas are welcome of course. I find this creative resourcing fascinating and I’m not about to whine about keeping up. Rather, I want to keep rising to the challenge.

by annegentle at April 26, 2013 12:49 PM

OpenStack Blog

OpenStack Community Weekly Newsletter (Apr 12 – 25)

Special post-Summit issue

OpenStack docs and tooling in 20 minutes

How to get started with all the tooling and setup needed to build, review, and contribute to OpenStack Documentation. By Joe Heck.

How It’s Made: the OpenStack API Reference Page

The site at http://api.openstack.org is a collection of HTML pages, and one page has an especially interesting story about how it is built. Anne Gentle reveals the secret.

Storage != Transfer

John Bresnahan argues that concepts of data transfer and data storage should not be conflated into a single solution.  He believes that OpenStack can benefit from a new component that offloads the burden of optimally transferring images from existing components like nova-compute and swift.

Report from Previous Events

Tips and Tricks

Upcoming Events

Other News

Welcome New Developers

  • Tilottama Gaat, Rackspace
  • Zang MingJie, None
  • Jason Dunsmore, Rackspace
  • James Slagle, None

Got answers?

Ask OpenStack is the go-to destination for OpenStack users. Interesting questions waiting for answers:

The weekly newsletter is a way for the community to learn about all the various activities occurring on a weekly basis. If you would like to add content to a weekly update or have an idea about this newsletter, please leave a comment.

by Stefano Maffulli at April 26, 2013 12:00 PM

Michael Still

Merged in Havana: fixed ip listing for single hosts

Nova has supported listing the fixed ips for a single host for a while. Well, except for that time we broke it by removing the database call it used and not noticing. My change to fix that situation has just landed, so this should now work again. To list the fixed ips used on a host, do something like:

    nova-manage fixed list hostname


I will propose a backport to grizzly for this now.

April 26, 2013 08:56 AM

April 25, 2013

John Bresnahan

A Picture Can Beat 1000 Dead Horses

Unless this is your first time reading my blog, you are probably aware that I am beginning to become obsessed with the idea of a data transfer service.  In this post I continue the topic from my previous post by introducing a couple of diagrams.

the_wild_west

A diagram of a possible swift deployment is on the right side. On the left is a client to that service. The swift deployment is very well managed, redundant and highly available. The client speaks to swift via a well defined REST API, using supported client side software to interpret the protocol. However, between the server side network protocol interpreter and the client side network protocol interpreter is the wild west.

The wild west is completely unprotected and unmanaged. Many things can occur that cause a lost, slow, or disruptive transfer. For example:

  • Dropped connections
  • Congestion events
  • Network partitions

Such problems make data transfer expensive. Ideally there would be a service to oversee the transfer. Transfers could be checkpointed as they progress so that if a connection is dropped they could be restarted with minimal loss. The service could also try to maximize the efficiency of the pathway between the source and the destination by tuning the protocols in use (like setting a good value for the TCP window), by using multicast protocols where appropriate (like bittorrent), or by scheduling transfers so as to not shoot itself in the foot.

A safer architecture would look like this:

tamed_west

The transfer service is now in a position to manage the transfer, which allows for the following:

  • A fire and forget asynchronous transfer request from the client.
  • Babysit and checkpoint the transfer. If it fails, restart it from the last checkpoint.
  • Schedule transfers for optimal times.
  • Prioritize transfers and their use of the network.
  • Coalesce transfer requests and schedule them appropriately and into multicast sessions.
  • Negotiate the best possible protocol between the two endpoints.
  • Verify that the data is successfully written to the destination storage system and verify its integrity.
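
The babysitting, checkpointing and verification items above can be sketched with in-memory streams. This is a toy model under stated assumptions, not a design for the service: `DroppedConnection`, `copy_from_checkpoint`, `babysit` and the scripted failures are all hypothetical names invented for the illustration, and a checkpoint is simply the number of bytes safely written so far.

```python
import hashlib
import io


class DroppedConnection(Exception):
    """Simulated network failure carrying the last good checkpoint."""

    def __init__(self, checkpoint):
        super().__init__("connection dropped at byte %d" % checkpoint)
        self.checkpoint = checkpoint


def copy_from_checkpoint(source, destination, checkpoint, chunk_size,
                         fail_after_chunks=None):
    """Copy from `checkpoint` to the end; return the final checkpoint."""
    source.seek(checkpoint)
    destination.seek(checkpoint)
    destination.truncate()  # discard anything past the last good byte
    sent = 0
    while True:
        if fail_after_chunks is not None and sent >= fail_after_chunks:
            raise DroppedConnection(checkpoint)
        chunk = source.read(chunk_size)
        if not chunk:
            return checkpoint
        destination.write(chunk)
        checkpoint += len(chunk)
        sent += 1


def babysit(source, destination, chunk_size=4, failures=(2, 1)):
    """Restart the copy from the last checkpoint until it completes.

    `failures` scripts the simulated drops: attempt i fails after
    failures[i] chunks; later attempts run to completion.
    """
    checkpoint, restarts = 0, 0
    scripted = list(failures)
    while True:
        fail_after = scripted.pop(0) if scripted else None
        try:
            checkpoint = copy_from_checkpoint(
                source, destination, checkpoint, chunk_size, fail_after)
            return checkpoint, restarts
        except DroppedConnection as drop:
            checkpoint = drop.checkpoint
            restarts += 1


payload = b"0123456789abcdef"
src, dst = io.BytesIO(payload), io.BytesIO()
total, restarts = babysit(src, dst)

# Final integrity check on the destination.
intact = (hashlib.sha256(dst.getvalue()).digest()
          == hashlib.sha256(payload).digest())
print("copied %d bytes with %d restarts, intact: %s"
      % (total, restarts, intact))
```

Each restart resumes at the last checkpoint rather than at byte zero, which is where the savings come from on large images.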

by John Bresnahan, Red Hat at April 25, 2013 11:58 PM

Flavio Percoco

glance-gridfs-store

Recently, GridFS support has been added to Glance, which means, it is now possible to store images inside GridFS. For those of you that don't know what GridFS is, it is a convention for storing files inside MongoDB.

About the implementation

The implementation was pretty much straightforward. It implements all the required methods and it needed just a few lines of code. One cool thing about this store is that GridFS already had most (if not all) of the things needed: md5, length, clean API. No rocket science!
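
To show why md5 and length come for free, here is an in-memory sketch of the GridFS convention: a file becomes one metadata document plus a series of numbered chunk documents. It doesn't talk to MongoDB and it is not the Glance store code; the helper names are made up, and the tiny chunk size is chosen only to keep the example readable.

```python
import hashlib


def to_gridfs_documents(file_id, filename, data, chunk_size=4):
    """Split `data` into a GridFS-style file document and chunk documents."""
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "md5": hashlib.md5(data).hexdigest(),
    }
    return file_doc, chunks


def read_back(file_doc, chunks):
    """Reassemble the chunks in order and check length and md5."""
    ordered = sorted(
        (c for c in chunks if c["files_id"] == file_doc["_id"]),
        key=lambda c: c["n"])
    data = b"".join(c["data"] for c in ordered)
    if (len(data) != file_doc["length"]
            or hashlib.md5(data).hexdigest() != file_doc["md5"]):
        raise ValueError("stored image is corrupt")
    return data


image = b"fake image bytes"
file_doc, chunks = to_gridfs_documents("image-1", "cirros.img", image)
print("%d bytes stored in %d chunks" % (file_doc["length"], len(chunks)))
print(read_back(file_doc, chunks) == image)
```

A store built on top of this gets the image size and checksum from the file document without reading the image data itself.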

Some Benefits

  • It's fast
  • It's distributed
  • Can be mounted (using gridfs-fuse)
  • Can be shared with other environments and / or applications
  • Fits perfectly in deployments where mongodb already exists.
  • Reads from secondaries when using replica sets.
  • Sharding support.

Random Thoughts

1) Even though Glance uses gridfs mostly as a bucket to store images, it can still be extended - either changing the store code or from outside it - with other features like tracking accesses to images, for example.

2) When using replica sets, an interesting deployment could be to have a replica, where Glance can read from, in every glance-api node. This would speed reads but it would definitely use more space.

3) When using shards, would it be possible to have glance-api running on the shard nodes and "shard" glance requests as well? (sounds like a crazy idea)

I don't expect this store to be widely used but I do find it interesting how MongoDB fits well in so many environments and such different use cases.

by FlaPer87 at April 25, 2013 10:57 PM

OpenStack Blog

3rd Swiss OpenStack User Group Meetup


Following on from our 2nd meeting, the Swiss OpenStack user group met on 24th of April at the University of Bern. It was an excellent event with many attention grabbing presentations! A big thanks goes out to the sponsors:

 

Once we kicked off, there were five presentations, three of which were more detailed and two that were more like lightning talks. The presentations, in their running order, were:

Upcoming

There are other upcoming Swiss events that will include much talk of OpenStack. Of note are:

The Swiss Informatics Society has also started a cloud computing special interest group, which everyone active in cloud is welcome to join. More details can be found at their site.

Swiss OpenStack User Group Channels

 Original post: ICCLab

by dizz at April 25, 2013 07:11 PM

Julien Danjou

OpenStack Design Summit Havana, from a Ceilometer point of view

Last week was the OpenStack Design Summit in Portland, OR where we, developers, discussed and designed the new OpenStack release (Havana) coming up.

The summit was wonderful. It was my first OpenStack design summit, and my first as a PTL, and bumping into people I had only worked with online and never met in person was a real pleasure!

<figure> <figcaption>Me and Nick ready to talk about Ceilometer new features.</figcaption> </figure>

Nick Barcet from eNovance, our dear previous Ceilometer PTL, and I talked about Ceilometer and presented the work that has been done for Grizzly, with some previews of what we'd like to see done for the Havana release. You can take a look at the slides if you're curious.

Design sessions

Ceilometer had its design sessions during the last days of the summit. We took a lot of notes and comments during the sessions in our Etherpad instances.

The first session was a description of the Ceilometer core architecture for interested people, and was a wonderful success, considering that the room was packed. Doug Hellmann did a wonderful job introducing people to Ceilometer and answering questions.

<figure> <figcaption>Doug explaining Ceilometer architecture.</figcaption> </figure>

The next session was about getting feedback from our users. We were surprised to discover wonderful real use cases and deployments, like CERN using Ceilometer and generating 2 GB of data per day!

The following sessions ran on Thursday and were much more about discussing new features. A lot of already existing blueprints were discussed and quickly validated during the first morning session. Then, Sandy Walsh introduced the architecture they use inside StackTach, so we can start thinking about getting things from it into Ceilometer.

API improvements were discussed without surprises and with a good consensus on what needs to be done. The four following sessions, which occupied much of the day, were about alarming. All were led by Eoghan Glynn, from Red Hat, who did an amazing job presenting the possible architectures with their pros and cons. Actually, all we had to do was nod to his designs and acknowledge the plan for building this.

The last two sessions covered advanced models for billing, where we got some interesting feedback from Daniel Dyer from HP, and a quick follow-up to the StackTach presentation from the morning session.

Havana roadmap

The list of blueprints targeting Havana is available and should be finished by next week. If you want to propose blueprints, you're free to do so; just let us know so we can validate them. The same applies if you wish to implement one of them!

API extension

I do think the API version 2 is going to be heavily extended during this release cycle. We need more features, like group-by functionality.

Healthnmon

In parallel with the design sessions, discussions took place in the unconference room with the Healthnmon developers to figure out a plan to merge some of their efforts into Ceilometer. They should provide a component to help Ceilometer support more hypervisors than it currently does.

Alarming

Alarming is definitely going to be the next big project for Ceilometer. Today, Eoghan and I started building blueprints on alarming, centralized in a general blueprint.

We know this is going to happen for real and very soon, thanks to the involvement of eNovance and Red Hat, who are committing resources to this amazing project!

by Julien Danjou at April 25, 2013 02:49 PM

Sébastien Han

Ceph and Cinder multi-backend

Grizzly brought the multi-backend functionality to cinder and tons of new drivers. The main purpose of this article is to demonstrate how we can take advantage of the tiering capability of Ceph.

I. Ceph

To configure Ceph to use different storage devices see my previous article: Ceph 2 speed storage with CRUSH.


II. Cinder

Assuming your 2 pools are called:

  • cinder-sata points to the SATA rack
  • cinder-ssd points to the SSD rack

II.1 Configuration

Cinder configuration file:

# Multi backend options

# Define the names of the groups for multiple volume backends
enabled_backends=rbd-sata,rbd-ssd

# Define the groups as above
[rbd-sata]
volume_driver=cinder.volume.driver.RBDDriver
rbd_pool=cinder-sata
volume_backend_name=RBD_SATA
# if cephx is enabled
#rbd_user=cinder
#rbd_secret_uuid=<None>
[rbd-ssd]
volume_driver=cinder.volume.driver.RBDDriver
rbd_pool=cinder-ssd
volume_backend_name=RBD_SSD
# if cephx is enabled
#rbd_user=cinder
#rbd_secret_uuid=<None>

Unfortunately the rbd driver doesn’t support this variable yet. This feature has been submitted here: https://review.openstack.org/#/c/27535/.
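A quick way to catch typos in the multi-backend options before restarting the services is to check that every name in enabled_backends has its own group. The following is only a sketch: Cinder itself does not read its configuration this way, the `[DEFAULT]` header in the sample is an assumption (the standard-library parser needs a section for enabled_backends), and `backend_pools` is a hypothetical helper name.

```python
import configparser

# Trimmed copy of the options above; the [DEFAULT] header is an assumption.
SAMPLE = """
[DEFAULT]
enabled_backends=rbd-sata,rbd-ssd

[rbd-sata]
volume_driver=cinder.volume.driver.RBDDriver
rbd_pool=cinder-sata
volume_backend_name=RBD_SATA

[rbd-ssd]
volume_driver=cinder.volume.driver.RBDDriver
rbd_pool=cinder-ssd
volume_backend_name=RBD_SSD
"""


def backend_pools(text):
    """Map each enabled backend's volume_backend_name to its rbd_pool."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    enabled = parser.get("DEFAULT", "enabled_backends").split(",")
    pools = {}
    for name in (n.strip() for n in enabled):
        # Every enabled backend must have a group of the same name.
        if not parser.has_section(name):
            raise ValueError("enabled backend %r has no [%s] group" % (name, name))
        pools[parser.get(name, "volume_backend_name")] = parser.get(name, "rbd_pool")
    return pools


print(backend_pools(SAMPLE))
```

The resulting mapping (backend name to pool) is also a handy reference when checking later where a volume should have landed.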

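Then point each volume type at its backend through the volume_backend_name key (the ssd and sata types must already exist for type-key to work):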

<figure class="code"><figcaption></figcaption>
$ cinder type-key ssd set volume_backend_name=RBD_SSD
$ cinder type-key sata set volume_backend_name=RBD_SATA
$ cinder extra-specs-list
+--------------------------------------+------+---------------------------------------+
|                  ID                  | Name |              extra_specs              |
+--------------------------------------+------+---------------------------------------+
| b1522968-e4fa-4372-8ac4-3925b7c79ee1 | ssd  |  {u'volume_backend_name': u'RBD_SSD'} |
| b50bf5a3-6044-4392-beeb-432302f6421c | sata | {u'volume_backend_name': u'RBD_SATA'} |
+--------------------------------------+------+---------------------------------------+
</figure>
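Conceptually, these extra specs are what lets the scheduler pick a backend: a volume of a given type can only go to a backend whose volume_backend_name matches. The sketch below is a simplified model of that matching, not Cinder's scheduler code; the dictionary layout and the `matching_backends` helper are assumptions made for illustration.

```python
# Backends as configured above; the "capabilities" layout is hypothetical.
BACKENDS = [
    {"host": "rbd-sata", "capabilities": {"volume_backend_name": "RBD_SATA"}},
    {"host": "rbd-ssd", "capabilities": {"volume_backend_name": "RBD_SSD"}},
]

# Volume types and their extra specs, as listed by `cinder extra-specs-list`.
VOLUME_TYPES = {
    "ssd": {"volume_backend_name": "RBD_SSD"},
    "sata": {"volume_backend_name": "RBD_SATA"},
}


def matching_backends(extra_specs, backends):
    """Return the backends whose capabilities satisfy every extra spec."""
    return [
        backend
        for backend in backends
        if all(backend["capabilities"].get(key) == value
               for key, value in extra_specs.items())
    ]


for type_name, specs in sorted(VOLUME_TYPES.items()):
    hosts = [b["host"] for b in matching_backends(specs, BACKENDS)]
    print(type_name, "->", hosts)
```

A type without extra specs matches every backend in this model, which is why the key has to be set for the tiering to mean anything.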

Then restart cinder services:

<figure class="code"><figcaption></figcaption>
$ sudo restart cinder-api ; sudo restart cinder-scheduler ; sudo restart cinder-volume
</figure>

Create 2 volume types, one for each backend. Note that type-key needs these types to exist, so run these commands before setting the keys:

<figure class="code"><figcaption></figcaption>
$ cinder type-create ssd
+--------------------------------------+------+
|                  ID                  | Name |
+--------------------------------------+------+
| b1522968-e4fa-4372-8ac4-3925b7c79ee1 | ssd  |
+--------------------------------------+------+

$ cinder type-create sata
+--------------------------------------+------+
|                  ID                  | Name |
+--------------------------------------+------+
| b50bf5a3-6044-4392-beeb-432302f6421c | sata |
+--------------------------------------+------+
</figure>

II.2. Play with it

<figure class="code"><figcaption></figcaption>
$ cinder create --volume_type ssd --display_name vol-ssd 1
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|     attachments     |                  []                  |
|  availability_zone  |                 nova                 |
|       bootable      |                false                 |
|      created_at     |      2013-04-22T14:54:53.917580      |
| display_description |                 None                 |
|     display_name    |               vol-ssd                |
|          id         | 4c777d96-66e4-4f85-815c-92d4503c5c8c |
|       metadata      |                  {}                  |
|         size        |                  1                   |
|     snapshot_id     |                 None                 |
|     source_volid    |                 None                 |
|        status       |               creating               |
|     volume_type     |                 ssd                  |
+---------------------+--------------------------------------+

$ cinder create --volume_type sata --display_name vol-sata 1
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|     attachments     |                  []                  |
|  availability_zone  |                 nova                 |
|       bootable      |                false                 |
|      created_at     |      2013-04-22T14:54:58.831327      |
| display_description |                 None                 |
|     display_name    |               vol-sata               |
|          id         | 8e347bd1-2044-40a2-ae87-ee9a23cddd71 |
|       metadata      |                  {}                  |
|         size        |                  1                   |
|     snapshot_id     |                 None                 |
|     source_volid    |                 None                 |
|        status       |               creating               |
|     volume_type     |                 sata                 |
+---------------------+--------------------------------------+
</figure>

Does it work?

<figure class="code"><figcaption></figcaption>
$ rbd -p cinder-ssd ls
volume-4c777d96-66e4-4f85-815c-92d4503c5c8c

$ rbd -p cinder-sata ls
volume-8e347bd1-2044-40a2-ae87-ee9a23cddd71
</figure>

It’s nice that multi-backend support came to Cinder; we are gradually getting to enjoy the full power of Ceph!

April 25, 2013 10:03 AM

April 24, 2013

SwiftStack Team

Havana Design Summit: Swift API Discussions

As part of the Swift technical track last week at the OpenStack design summit, we had several topics on the Swift API. Swift has a remarkably stable API. We've added to the API, but we haven't removed anything or changed any existing behavior beyond some minor conformance-to-spec fixes. This means that clients written years ago still work even when talking to Swift clusters deployed just yesterday.

External API

Although it is stable, the Swift API is not without some minor warts. There are a few inconsistencies in the API and a few awkward parts that would break clients if they changed. Cleaning up these parts of the API will help client developers write cleaner Swift applications and will allow end users to more easily use cross-cloud Swift clusters.

Figuring out what will be changed in the Swift API will be a long process, but there are a few important baseline things that need to happen before any changes to the API can be made. First, we need a formal definition of what the Swift API is. We have never had a formal API spec. Swift has always relied on the careful attention of its contributors to ensure that existing clients don't break. A spec won't lessen the need for careful attention by contributors and reviewers, but it will allow client developers to know exactly what they can expect from a particular deployment. A formal API spec also allows deployers to know what must be supported to ensure support for data migration between Swift clusters. As a side-benefit, formally defining the API will expose gaps in our current docs and help us keep our docs more up-to-date.

The second thing we must do as a community is define our API for discovering the supported Swift API. Users need to be able to determine what API a particular Swift cluster supports in order to know how to talk to it. There has been a lot of work on API discoverability in other OpenStack projects, so I hope that we can use some of their techniques and lessons-learned in Swift.
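To make the idea concrete, here is one way a client might use a discovery document, assuming the cluster publishes one as JSON. Nothing here is an agreed-upon design: the document layout, the field names, and the `supports` helper are all invented for illustration.

```python
import json

# Hypothetical discovery document; the layout and field names are invented.
DISCOVERY_DOC = json.dumps({
    "api_versions": ["1.0"],
    "features": {
        "bulk_delete": True,
        "static_large_objects": True,
        "object_versioning": False,
    },
})


def supports(discovery_json, feature, version=None):
    """Return True if the cluster advertises the feature (and version, if given)."""
    doc = json.loads(discovery_json)
    if version is not None and version not in doc.get("api_versions", []):
        return False
    # Unknown features are treated as unsupported rather than raising.
    return bool(doc.get("features", {}).get(feature, False))


print(supports(DISCOVERY_DOC, "bulk_delete", version="1.0"))
print(supports(DISCOVERY_DOC, "object_versioning"))
```

The useful property is that a client can degrade gracefully: it asks before using a feature instead of finding out from a failed request.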

Once we have these two things, an API spec and API discoverability, we can start the discussions around what needs to change in the Swift API and go about implementing the changes in the code.

I expect that all of these questions will create quite a bit of discussion in the community. As a group, we need to get feedback from deployers (of all sizes), developers, and end users. Together, we'll be able to make improvements and find the path that is best for everyone.

Internal API

Since a Swift cluster is a set of cooperating processes running on many servers, it implies that there is an internal API too. This API is how the communication between the nodes works and how the storage nodes talk to the underlying storage volumes.

While this internal API isn't nearly as formal or rigid as Swift's external API, there are opportunities to improve it too. Parts of Swift's code can be refactored to allow cleaner abstractions so that specific optimizations or alternatives can be implemented.

A while back, the concept of a Local File System (LFS) was proposed to Swift. Ultimately, the proposed patch was not merged, but the idea is a good one. The concept allows for filesystem-specific optimizations to be made. For example, an XFS module could optimize the way it walks over inodes or a ZFS module could take advantage of its ZFS-specific self-healing properties.
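As a rough sketch of what such an abstraction could look like, consider the order in which a storage node walks its files. This is not the proposed LFS patch: the class names, the `object_walk_order` method, and the idea that an XFS module would sort by inode number are assumptions made purely for illustration.

```python
from abc import ABC, abstractmethod


class LocalFileSystem(ABC):
    """Hypothetical interface the storage node code would program against."""

    @abstractmethod
    def object_walk_order(self, paths):
        """Return the paths in the order a background process should visit them."""


class GenericFS(LocalFileSystem):
    """Fallback with no filesystem-specific knowledge: plain name order."""

    def object_walk_order(self, paths):
        return sorted(paths)


class InodeOrderedFS(LocalFileSystem):
    """Stand-in for a filesystem-specific module that walks in inode order."""

    def __init__(self, inode_of):
        self._inode_of = inode_of  # callable: path -> inode number

    def object_walk_order(self, paths):
        return sorted(paths, key=self._inode_of)


def make_lfs(fs_type, inode_of=None):
    """Pick an implementation for the filesystem type, else fall back."""
    if fs_type == "xfs" and inode_of is not None:
        return InodeOrderedFS(inode_of)
    return GenericFS()


inodes = {"a.data": 30, "b.data": 10, "c.data": 20}
print(make_lfs("xfs", inodes.get).object_walk_order(list(inodes)))
print(make_lfs("ext4").object_walk_order(list(inodes)))
```

The callers never need to know which implementation they got, which is what makes filesystem-specific optimizations possible without touching the rest of the code.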

Other interested parties have started working on the concept for LFS recently, specifically with the goal of better integrating Swift and GlusterFS. I'm hopeful that the patches will be successfully merged this time around, and I'm looking forward to the additional functionality the LFS feature will allow.

Next Steps

To move forward on improving both the internal and external APIs for Swift, we need community involvement for a few things:

  • Formally defining the current Swift API
  • Implementing API version discoverability into Swift
  • Completion of the LFS patch for talking to storage volumes
  • Refactoring the proxy code to abstract communication with storage servers

I'm looking for people in the Swift community to help complete these tasks. If you're interested, drop by #openstack-swift on freenode and let's talk!

April 24, 2013 10:40 PM

DreamHost

OpenStack Networking Project

Our very own DreamHoster, Mark McClain, who is also an OpenStack Project Technical Lead, had an awesome interview with Mirantis!  In case you haven’t had a chance to read it, check it out below!

This is what Mark McClain looks like


OpenStack Project Technical Lead Interview Series #1: Mark McClain, OpenStack Networking Project 

by: David M. Fishman

We are introducing the first of a continuing series of interviews with OpenStack Project Technical Leads on our Mirantis blog. Our goal is to educate the broader tech community and help people understand how they can contribute to and benefit from OpenStack. Naturally, these are the opinions of the interviewee, not of Mirantis. We’ve edited the interview for clarity and post length.

Our first interview is with Mark McClain, newly elected OpenStack Networking Project (formerly known as “Quantum”) Technical Lead.

Mirantis: Tell us about yourself and how you got started with OpenStack.

Mark McClain: I’m a Senior Cloud Developer at DreamHost and I work on the Cloud Team. DreamHost hired me specifically to work on OpenStack. I was hired during the middle of the Essex cycle, and the cool thing about working on OpenStack, and specifically OpenStack Networking, is it combines two of my favorite things: networking and Python.

Q: What are your responsibilities? What do you find yourself doing on a day to day basis?

A: During the Grizzly release, I was a core contributor focusing on improving the metadata service functionality when using overlapping IP networks. I also worked on database migration so that folks who were deploying OpenStack can seamlessly upgrade from Folsom to Grizzly. During Grizzly, I also led several sub-teams including the L3, database, and bug triage teams.

Now as PTL, I take a much larger view of the project. I’m responsible for running our weekly team meeting and organizing the Network track at the design summit.  On a daily basis, I’ll correspond with the community, coordinate with sub-team leads, review code submissions, triage bugs, and review blueprints. I’ll also coordinate with other members of the Foundation and Technical Committee on cross-project issues.

Q: What is it that makes OpenStack Networking so special? Why does it matter?

A: With Nova you can spin up virtual machines and it provides basic network capabilities. But when you want to use newer technologies, say, tunneling to provide network isolation between tenants or VXLAN – you can’t really leverage those with Nova networking. It limits you to VLANs or flat networking. The new technologies enable the biggest benefit of OpenStack Networking: scalable tenant isolation.

Q: What has the OpenStack Networking community accomplished so far, and what are your plans for the Havana release?

A: OpenStack Networking was originally created at the Diablo Summit. It was an incubator project during Essex and it was integrated in Folsom. During the Essex and Folsom time frame the community really spent a lot of time trying to reach feature parity and build many L2 and L3 features into OpenStack Networking.

In Grizzly we were able to shift focus to adding new services, and also closing the parity gap with Nova Networking. In the Grizzly cycle we added overlapping metadata services, migrations, and security groups. Another big feature of Grizzly was load balancing.

As a matter of fact, several folks from Mirantis actually helped contribute to load balancing. That was a big community project that involved multiple people from various vendors who all worked together to produce an API and the foundation for load balancing.

In Havana we’re going to extend the load balancing service and add more features. Looking forward there are vendors in the community working to improve OpenStack Networking by adding VPN support, enterprise level ACL support, and IPv6 support. Right now the IPv6 functionality is pretty basic and folks want to add some high level services on top of that. Also there are companies and community members working on bare metal support, full multi-host support, providing HA in a little smaller context similar to what Nova multi-host is … also there are several community members who come together to work on other user facing features.

Q: What is genuinely unique about OpenStack Networking or is it just an open source version of Networking as a Service as it already exists?

A: I think the most unique thing about OpenStack Networking compared to almost any of the other Networking as a Service solutions is a very vibrant vendor community. During the Grizzly cycle we added five new plug-ins from different vendors. That’s one of the unique things about OpenStack Networking. It also shows a lot of vendor momentum because most of the vendors have chosen to put their energy and their efforts behind OpenStack.

If you were to compare that to other networking solutions in some other cloud stacks, they have maybe one or two options if you’re lucky. You’re pretty much stuck with the networking option of the stack.

Q: What are some use cases where OpenStack Networking really shines?

A: In multi-tenant environments where isolation and security are a must you can get those systems up and running rapidly and provide those services to tenants, whether it’s a public or private cloud and you can get them running at scale fairly quickly.

In the case of smaller deployments OpenStack Networking can be configured to support even smaller private clouds fairly easily using open source tools. Smaller shops that have limited resources still can take advantage of many of the same features that the folks who are deploying at scale can as well.

Q: When you say “fairly quickly,” can you quantify that?

A: For a smaller shop, just following the OpenStack guides – if you’re familiar with OpenStack – that would be half a day. If you’re unfamiliar with OpenStack, maybe a day or two. The guides will walk you through and get you set up with a pretty realistic set-up that works well for the majority of cases.

The nice thing is you can do that with minimal staff and it all runs on commodity hardware. You don’t need special switches or servers. For those who are trying to experiment and figure out if OpenStack or OpenStack Networking is the solution for them, they can use gear that most businesses have in their labs anyway for testing.

Q: How about the know-how base that you need to have to get OpenStack Networking up and running?

A: It’s the same set of skills that you would find if you had a network engineer or even a DevOps type of position. You really only need a basic familiarity for deploying IP networks.

So there are really no special skills needed, because a lot of what the plug-in authors have done – both open source and proprietary – is abstract away the details of specific protocols so that the deployer can focus on the APIs.

Q: Are there any misconceptions about OpenStack Networking?

A: We don’t battle too many misconceptions with OpenStack Networking. I think most people understand what it does. Some folks will choose to still deploy Nova networking for new installations because they’re concerned about OpenStack Networking’s complexity, maturity or stability.

Now the support materials for installation have caught up and the distributions have done a really good job of packaging OpenStack, so that is no longer the case. For new deployments, you should use OpenStack Networking from the beginning and leverage those features now, versus starting with Nova-network and eventually having to migrate to OpenStack Networking, which is a non-trivial migration.

During Havana, the OpenStack Networking and Nova teams are going to be discussing: How do we bridge that gap and how does OpenStack Networking become the default network provider for Nova?

Q: How about OpenStack Networking scalability?

A: There are several large deployments running OpenStack Networking. Some are running versions of Folsom, and some people are actually running trunk, which is really interesting because it speaks to the maturity of the codebase.

Q: Does OpenStack Network Project still have any “childhood ailments”?

A: We spent a lot of time in Grizzly working on isolated metadata services. A lot of the support questions we got and bug reports after the Folsom release were: How does metadata service work? – and so we spent a lot of time making sure that metadata service was a lot easier to configure and just worked out of the box for a wide variety of deployments. In Grizzly that’s probably one of the biggest diseases that we’ve gotten rid of.

Q: Who would you like to see contributing to OpenStack Network Project?

A: We’re very fortunate. We have contributions from some very well respected companies including: Arista, BigSwitch, Brocade, Cisco, HP, IBM, NEC, Nicira, Juniper, Midokura, Plumgrid, and VMware. During the Grizzly cycle we added even more companies and some new start-ups who are offering their solutions so that drives innovation in the community.

As far as the ideal contributor … it’s somebody who is excited about networking and wants to participate in the OpenStack community, and is willing to trade ideas back and forth amongst the different contributors so that at the end of the day the community benefits as a whole.

Q: What are the biggest opportunities for folks who want to create something awesome and outstanding in OpenStack Network Project?

A: There will be a big push in Havana for VPN-as-a-Service in several different deployment modes. Also, we’ll extend load balancing. In Grizzly we took the baby steps of getting it out and there are several vendors who are now trying to leverage that API.  IPv6 support is also going to be big. More internet service providers are offering v6 services for business deployments. Ensuring that OpenStack Network Project works for the various deployment modes of IPv6 is going to be important as well.

We also have excitement around folks who are working on bare metal with OpenStack Networking and on device management at larger scale: If I’m a hardware vendor – how do I integrate my piece of hardware into OpenStack Networking? Also, we are focused on deployer topics such as: How can I provide different service level offerings?

Q: How would you advise people who want to get started contributing to OpenStack Networking? What steps should they specifically take?

A: The first step is to obviously join the OpenStack development mailing list. It gives you a sense of what the topics are that the OpenStack Networking developers are discussing.

The OpenStack Networking team also maintains a Wiki page for starter bugs. On that page we keep track of simple links for: Here’s how to find the code reviews for the OpenStack Networking server side or the OpenStack Networking client etc. As bugs are reported we will tag the easier ones as low hanging fruit which are an excellent opportunity for new developers and contributors to jump in on the project.

That means whoever is triaging the bugs can say: “Hey, this is something that is not overly complicated and is a good way to become familiar with the OpenStack Networking code base.” We also maintain a list of community projects that would be good to start working on. By being a member of the OpenStack Networking mailing list you can recruit other members of the community and work together. It also builds up trust within the community so that those folks who were reviewing your code are working with you. You have a sense of rapport with them.

Q: Thank you very much, Mark!

A: You’re welcome!

by @missmariss at April 24, 2013 07:01 PM

Rob Hirschfeld

Crowbar and our Pivot (or, how we slipped and shipped Grizzly)

My team at Dell uses Lean process because it forces us to be honest about making hard choices. Our recent decision to pivot back to Crowbar 1.x for the OpenStack Grizzly release is a great example of how the pivot process works.

4/24 note: I have a longer post and ISO for Grizzly on Crowbar waiting until we enter QA. The Crowbar community is already very active around this work and you’re encouraged to join.

Like any refactor, there was schedule risk when we started the Crowbar 2.x release. To mitigate this risk, we made two critical choices. First, we chose to advance the OpenStack barclamps on the 1.x code base in parallel with the 2.x work. Second, we set a pivot date on which the team would decide whether to release Grizzly on the 1.x or 2.x trunk.

Choosing to jump back to 1.x was one of the hardest choices I’ve made in my career. I’m proud that we had the foresight to keep that as an option and prouder that our team rallied to make it happen.

I acknowledge that 1.x has gaps; however, getting Grizzly into the field for PoCs and pilots with 1.x provides substantial benefits to the community.  That said, there are barclamps for HA deployments and other production features that are under development on the 1.x branch and will be available in the community.

The 2.x code base provides important features, but we are building on the 1.x deployment recipes. This means that development, testing and tuning applied to the Grizzly barclamps will translate directly into Crowbar 2.x field readiness. In fact, more completeness on OpenStack can dramatically simplify Crowbar 2.x testing efforts.  This is especially true of the OpenStack Networking (fka Quantum) barclamps because they are new work.

Delivering solutions is a balance between features, timing and field experience.  The Crowbar team’s preference is to collaborate with operators in the field and that means making workable software available quickly.

I hope that you’ll agree with our approach and help us make Grizzly the most deployable OpenStack yet.


by Rob H at April 24, 2013 04:30 PM

Kyle Mestery

OpenStack Summit Portland Aftermath

Last week I attended the OpenStack Summit in Portland. This was my fifth OpenStack Summit, and a lot has changed since I attended my first OpenStack Summit in Santa Clara in 2011. Everything about this spring’s event was bigger: The crowds, the demos, the design summits. It was pretty awesome to see how far OpenStack has come, and even more exciting to see how much is left to be done. So many new ideas around virtual machine scheduling, orchestration, and automation were discussed this week. I thought I’d share some thoughts around the Summit now that things have really sunk in from last week.

Is It Time to Separate the Conference and the Design Summit?

OpenStack Networking Design Summit Session


With the growth of the conference, and the increased attendance by folks new to OpenStack, many folks asked whether the time has come to split the event into a separate Conference and Design Summit. Particularly on Monday, the Design Summit rooms were packed with people, almost to the point of overflowing. The photo above was taken in an OpenStack Networking (formerly known as Quantum) session, but was fairly representative of most Design Summit Sessions. For the most part, the design sessions withstood the influx of people and proceeded as they have in past conferences. And certainly having users participate in design sessions is a good thing. But the scale the conference has now attained means the organizers will need to keep a close eye on this going forward to ensure relevant design sessions are still attainable by attendees interested in this portion of the event.

OpenStack Networking Is Still Hot

With regards to the design summit sessions and the conference in general, the interest in networking in OpenStack is at an all time high. The Networking Design Summit sessions were packed with attendees, and the discussions were very vibrant and exciting. For the most part, the discussions around Networking in OpenStack are all moving beyond basic L2 networks and into higher level items such as service insertion, VPNs, firewalls, and even L3 networks. There was a lot of good material discussed, and some great blueprints (see here and here, among others) are all set to land in Havana.

OpenStack Networking Design Summit Session


In addition to the design discussions around OpenStack Networking, there were panels, conference sessions, and plenty of hallway conversations on the topic. Almost all the networking vendors had a strong presence at the Summit including Cisco (disclosure: I work for Cisco), Brocade, Ericsson, VMware/Nicira, Big Switch, PLUMgrid, and others. The level of interest in networking around OpenStack was truly amazing.

Which leads me to my next observation.

How Many Panels on SDN Does a Single Conference Need?

It’s obvious Software Defined Networking is hot now. And per my prior observation, it’s obvious that OpenStack Networking is hot. So it would seem the two fit together nicely, and in fact, they do. But how many panel discussions around SDN and OpenStack does one conference need? There were at least two of these, and it seemed like there was a large amount of “SDN washing” going on at this conference. To some extent, this was bound to eventually happen. As technologies mature and more and more people and money are thrown at them, the hype level goes crazy. Trying to level set the conversation, especially in the Design Summit sessions, and ensure an even discourse will become increasingly challenging going forward.

Customers, Customers, and More Customers

This conference had the real feel of actual customers deploying OpenStack. Take a look at the video of the Day 2 Keynote which featured Bloomberg, Best Buy, and Comcast for a taste of how some large enterprise customers are deploying and using OpenStack. But even beyond those big three, it was easy to walk around the conference floor and bump into many other people who are in the process of deploying OpenStack into their own data centers. Most of these people come to the OpenStack party for one of two reasons: Price and scalability. But once they enter the ecosystem, they realize there is much more to OpenStack than simple economics and scalability. As I’ve written before, OpenStack is a community, and deploying OpenStack in your datacenter makes you an honorary member of that community. To some customers, the idea of open collaboration with vendors and solutions providers is a new idea. But this type of open collaboration is the way forward, and I think ultimately, this is what will help to keep customers utilizing OpenStack to solve their real business needs.

by mestery at April 24, 2013 03:21 PM

John Bresnahan

Storage != Transfer

In this post I argue that the concepts of data transfer and data storage should not be conflated into a single solution.  As with many problems in computer science, separating each concern into its own solution space makes both easier to solve.  I believe that OpenStack can benefit from a new component that offloads the burden of optimally transferring images from existing components like nova-compute and swift.

Storage Systems

Within the OpenStack world there are a few interesting storage systems. Swift, Gluster, and Ceph are just three that immediately come to mind. These systems do amazing things: data redundancy, distribution, high availability, parallel access, and consistency, to name just a few.  As such systems get more complex they can become aware of caching levels and tertiary storage. Storage systems also need to be concerned with the integrity of the physical media used to store the data, which quickly leads to a system of checksums and forward error correction.  One can imagine how complex that can become.

I have probably missed many other challenges, and yet that list alone is daunting.  In addition, storage systems need an access protocol that enables reading and writing data. The access protocol is used in many ways, including random access, block-level IO, small chunks, large chunks, and parallel IO.

With the access protocol, users can also stream large data sets from the storage system to a client (and thereby another storage system), even across a WAN.  However, I argue that such actions are often best left to a service dedicated to that job (as I described in a previous post). The storage system’s control domain ends at its API.  After that, all the bytes coming and going are in the wild west.

Transfer Service

A transfer service’s primary responsibility is moving data from one place to another in the most efficient, safe, and effective way.  GridFTP and Globus Online provide good examples of transfer services.  The transfer service’s job is to make the lawless land between two storage systems safer.  Its duty is to make sure that all bytes (or bytes that look just like them) make it across the network and to the destination, safely and quickly and without disruption to other travellers.

When dealing with large data set transfers, a transfer service should:

  • Restart transfers that fail after partial completion without having to retransmit large amounts of data.
  • Negotiate the fastest/best protocol between endpoints.
  • Set protocol-specific parameters for optimal performance (e.g., TCP window size).
  • Schedule transfer for an optimal time (which can prevent thrashing).
  • Manage the resources it is using (network, CPU, etc.) on both the source and destination, and prevent overheating.
  • Allow for third-party transfers (do not force the end user to speak every complex protocol).
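
To make the first item in the list above concrete, here is a minimal sketch of a restartable transfer. It is an illustration only, not how GridFTP or any OpenStack component works: it copies between two local files instead of across a network, the names `resumable_copy` and `sha256_of` are hypothetical, and the `fail_after` parameter exists purely to simulate a dropped connection. It also assumes the bytes already at the destination are a valid prefix of the source, and relies on a final checksum comparison to catch the case where they are not.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def resumable_copy(src, dst, chunk_size=1 << 20, fail_after=None):
    """Copy src to dst, resuming from whatever dst already holds.

    fail_after is a hypothetical test hook: raise IOError once sending
    another block would exceed that many bytes in this call, which
    simulates a connection dropping mid-transfer.
    Returns the number of bytes sent during this call.
    """
    # The restart marker is simply the size of the partial destination.
    offset = os.path.getsize(dst) if os.path.exists(dst) else 0
    sent = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(offset)  # skip the bytes that already arrived
        while True:
            block = fin.read(chunk_size)
            if not block:
                break
            if fail_after is not None and sent + len(block) > fail_after:
                raise IOError("simulated network failure")
            fout.write(block)
            sent += len(block)
    return sent
```

A caller would retry `resumable_copy` after a failure and then compare `sha256_of(src)` with `sha256_of(dst)` before declaring the transfer complete. A real transfer service would also have to negotiate the restart offset with the remote endpoint, which is the kind of detail the storage system should not need to know about.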

Just as the transfer service is not concerned with data once it safely hits a storage system, the storage system should not be concerned with the above list.  Yet both services are needed in an offering like OpenStack.

Summary

When data is written to storage it should be kept safe and available.  When it is read the exact same data should be immediately available and correct.  That is the charge placed on the storage system, and that is where its charge should reasonably end.  The storage system cannot also be responsible for making sure the data crosses networks safely and efficiently to other storage systems, which are often outside of its control.  That is asking too much of one logical component.  That is the job of a transfer service.


by John Bresnahan, Red Hat at April 24, 2013 09:40 AM