December 15, 2017


Blog Round-up

It's time for another round-up of the great content that's circulating in our community. But before we jump in, if you know of an OpenStack or RDO-focused blog that isn't featured here, be sure to leave a comment below and we'll add it to the list.

ICYMI, here's what has sparked the community's attention this month, from Ansible to TripleO, emoji-rendering, and more.

TripleO and Ansible (Part 2) by slagle

In my last post, I covered some of the details about using Ansible to deploy with TripleO. If you haven’t read that yet, I suggest starting there:


TripleO and Ansible deployment (Part 1) by slagle

In the Queens release of TripleO, you’ll be able to use Ansible to apply the software deployment and configuration of an Overcloud.


An Introduction to Fernet tokens in Red Hat OpenStack Platform by Ken Savich, Senior OpenStack Solution Architect

Thank you for joining me to talk about Fernet tokens. In this first of three posts on Fernet tokens, I’d like to go over the definition of OpenStack tokens, the different types and why Fernet tokens should matter to you. This series will conclude with some awesome examples of how to use Red Hat Ansible to manage your Fernet token keys in production.


Full coverage of libvirt XML schemas achieved in libvirt-go-xml by Daniel Berrange

In recent times I have been aggressively working to expand the coverage of libvirt XML schemas in the libvirt-go-xml project. Today this work has finally come to a conclusion, when I achieved what I believe to be effectively 100% coverage of all of the libvirt XML schemas. More on this later, but first some background on Go and XML…


Full colour emojis in virtual machine names in Fedora 27 by Daniel Berrange

Quite by chance today I discovered that Fedora 27 can display full colour glyphs for unicode characters that correspond to emojis, when the terminal displaying my mutt mail reader displayed someone’s name with a full colour glyph showing stars:


Booting baremetal from a Cinder Volume in TripleO by higginsd

Up until recently in TripleO, booting from a cinder volume was confined to virtual instances, but now, thanks to some recent work in ironic, baremetal instances can also be booted backed by a cinder volume.


by Mary Thengvall at December 15, 2017 06:20 AM

December 14, 2017

OpenStack Superuser

Travel grants support global community to attend OpenStack Summit

Key contributors from 14 countries attended the recent OpenStack Summit in Sydney. Find out how the Travel Support Program could be your ticket to Vancouver.

Some Summit participants have to travel great distances to attend, but may not always have the resources or support to do so. The OpenStack Foundation helps participants reach their attendance goal via the Travel Support Program.

Winners of travel support grants at the Summit in Sydney.

The Foundation supported 30 people from 14 different countries to come participate in OpenStack Summit in Sydney.

The Travel Support Program is based on the premise of Open Design: a commitment to an open design process that welcomes the public, including users, developers and upstream projects. This year the program also included individual supporters who chose to donate frequent flyer miles or funds to assist the program’s efforts.

The Summit is a great opportunity for participants to network and have important discussions about OpenStack contributions. Rodrigo Barbieri, a core contributor to the Manila project, made his first trip to Australia and, thanks to the program, was able to have key meetings with fellow operators.

Amy Marrich also made her first trip to Australia. Through the support of the program, she was able to give back to the community, mentoring and instructing at the Upstream Institute and at a number of Women of OpenStack sessions. In addition, Marrich participated in the Forum, joining OpenStack-Ansible conversations, and attended User Committee sessions, helping to begin reactivating the Diversity Working Group.

Tony Breeds, OpenStack software engineer, spoke highly of the Summit and Project Team Gatherings (PTGs), stating they are “invaluable if you need to be productive in OpenStack and the travel support program is a fantastic way of assisting the community to attend.”

The deadline to apply for Travel Support to the Vancouver Summit is March 22, 2018. You can submit your application here.

Read more on how to apply for Travel Support. // CC BY

The post Travel grants support global community to attend OpenStack Summit appeared first on OpenStack Superuser.

by Sonia Ramza at December 14, 2017 12:48 PM

Red Hat Stack

Red Hat OpenStack Platform 12 Is Here!

We are happy to announce that Red Hat OpenStack Platform 12 is now Generally Available (GA).

This is Red Hat OpenStack Platform’s 10th release and is based on the upstream OpenStack release, Pike.

Red Hat OpenStack Platform 12 is focused on the operational aspects of deploying OpenStack. OpenStack has established itself as a solid technology choice, and with this release we are working hard to further improve usability and bring OpenStack and operators into harmony.


With operationalization in mind, let’s take a quick look at some of the biggest and most exciting features now available.


As containers are changing and improving IT operations it only stands to reason that OpenStack operators can also benefit from this important and useful technology concept. In Red Hat OpenStack Platform we have begun the work of containerizing the control plane. This includes some of the main services that run OpenStack, like Nova and Glance, as well as supporting technologies, such as Red Hat Ceph Storage. All these services can be deployed as containerized applications via Red Hat OpenStack Platform’s lifecycle and deployment tool, director.

Photo by frank mckenna on Unsplash

Bringing a containerized control plane to OpenStack is important. Through it we can immediately enhance, among other things, stability and security features through isolation. By design, OpenStack services often have complex, overlapping library dependencies that must be accounted for in every upgrade, rollback, and change. For example, if Glance needs a security patch that affects a library shared by Nova, time must be spent to ensure Nova can survive the change; or even more frustratingly, Nova may need to be updated itself. This makes the change effort, and the resulting change window and impact, much more challenging. Simply put, it’s an operational headache.

However, when we isolate those dependencies into a container we are able to work with services with much more granularity and separation. An urgent upgrade to Glance can be done alongside Nova without affecting it in any way. With this granularity, operators can more easily quantify and test the changes helping to get them to production more quickly.

We are working closely with our vendors, partners, and customers to move to this containerized approach in a way that is minimally disruptive. Upgrading from a non-containerized control plane to one with most services containerized is fully managed by Red Hat OpenStack Platform director. Indeed, when upgrading from Red Hat OpenStack Platform 11 to Red Hat OpenStack Platform 12 the entire move to containerized services is handled “under the hood” by director. With just a few simple preparatory steps, director delivers the biggest change to OpenStack in years directly to your running deployment in an almost invisible, simple-to-run upgrade. It’s really cool!

Red Hat Ansible.

Like containers, it’s pretty much impossible to work in operations and not be aware of, or more likely be actively using, Red Hat Ansible. Red Hat Ansible is known to be easier to use for customising and debugging; most operators are more comfortable with it, and it generally provides an overall nicer experience through a straightforward and easy to read format.


Of course, we at Red Hat are excited to include Ansible as a member of our own family. With Red Hat Ansible we are actively integrating this important technology into more and more of our products.

In Red Hat OpenStack Platform 12, Red Hat Ansible takes center stage.

But first, let’s be clear, we have not dropped Heat; there are very real requirements around backward compatibility and operator familiarity that are delivered with the Heat template model.

But we don’t have to compromise because of this requirement. With Ansible we are offering operator and developer access points independent of the Heat templates. We use the same composable services architecture as we had before; the Heat-level flexibility still works the same, we just translate to Ansible under the hood.

Simplistically speaking, before Ansible, our deployments were mostly managed by Heat templates driving Puppet. Now, we use Heat to drive Ansible by default, and then Ansible drives Puppet and other deployment activities as needed. And with the addition of containerized services, we also have positioned Ansible as a key component of the entire container deployment. By adding a thin layer of Ansible, operators can now interact with a deployment in ways they could not previously.

For instance, take the new openstack overcloud config download command. This command allows an operator to generate all the Ansible playbooks being used for a deployment into a local directory for review. And these aren’t mere interpretations of Heat actions, these are the actual, dynamically generated playbooks being run during the deployment. Combine this with Ansible’s cool dynamic inventory feature, which allows an operator to maintain their Ansible inventory file based on a real-time infrastructure query, and you get an incredibly powerful troubleshooting entry point.
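The dynamic inventory idea itself is simple: instead of a static hosts file, Ansible runs an executable that prints JSON describing groups and hosts at query time. Here is a hedged toy sketch of that contract (the hostname and group are made up, and the real TripleO inventory script emits much more data):

```shell
# Create a toy dynamic inventory script (illustrative names only).
cat > /tmp/inventory.sh <<'EOF'
#!/bin/sh
# A dynamic inventory is just an executable that emits JSON:
# group name -> hosts list (and optionally group vars).
echo '{"Controller": {"hosts": ["overcloud-controller-0"], "vars": {}}}'
EOF
chmod +x /tmp/inventory.sh
# Ansible would invoke this (with --list); here we run it directly.
/tmp/inventory.sh
```

Because the script is re-run on each invocation, the inventory always reflects the current state of the infrastructure it queries.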

Check out this short (1:50) video showing Red Hat Ansible and this new exciting command and concept:

Network composability.

Another major new addition for operators is the extension of the composability concept into networks.

As a reminder, when we speak about composability we are talking about enabling operators to create detailed solutions by giving them basic, simple, defined components from which they can build for their own unique, complex topologies.

With composable networks, operators are no longer only limited to using the predefined networks provided by director. Instead, they can now create additional networks to suit their specific needs. For instance, they might create a network just for NFS filer traffic, or a dedicated SSH network for security reasons.

Photo by Radek Grzybowski on Unsplash

And as expected, composable networks work with composable roles. Operators can create custom roles and apply multiple, custom networks to them as required. The combinations lead to an incredibly powerful way to build complex enterprise network topologies, including an on-ramp to the popular L3 spine-leaf topology.

And to make it even easier to put together, we have added automation in director that automatically generates the resources and Heat templates for each composable network across all roles. Fewer templates to edit can mean less time to deployment!

Telco speed.

Telcos will be excited to know we are now delivering production ready virtualized fast data path technologies. This release includes Open vSwitch 2.7 and the Data Plane Development Kit (DPDK) 16.11 along with improvements to Neutron and Nova allowing for robust virtualized deployments that include support for large MTU sizing (i.e. jumbo frames) and multiple queues per interface. OVS+DPDK is now a viable option alongside SR-IOV and PCI passthrough in offering more choice for fast data in Infrastructure-as-a-Service (IaaS) solutions.

Operators will be pleased to see that these new features can be more easily deployed thanks to new capabilities within Ironic, which store environmental parameters during introspection. These values are then available to the overcloud deployment providing an accurate view of hardware for ideal tuning. Indeed, operators can further reduce the complexity around tuning NFV deployments by allowing director to use the collected values to dynamically derive the correct parameters resulting in truly dynamic, optimized tuning.

Serious about security.


Helping operators, and the companies they work for, focus on delivering business value instead of worrying about their infrastructure is core to Red Hat’s thinking. And one way we make sure everyone sleeps better at night with OpenStack is through a dedicated focus on security.

Starting with Red Hat OpenStack Platform 12 we have more internal services using encryption than in any previous release. This is an important step for OpenStack as a community to help increase adoption in enterprise datacenters, and we are proud to be squarely at the center of that effort. For instance, in this release even more services now feature internal TLS encryption.

Let’s be realistic, though, focusing on security extends beyond just technical implementation. Starting with Red Hat OpenStack Platform 12 we are also releasing a comprehensive security guide, which provides best practices as well as conceptual information on how to make an OpenStack cloud more secure. Our security stance is firmly rooted in meeting global standards from top international agencies such as FedRAMP (USA), ETSI (Europe), and ANSSI (France). With this guide, we are excited to share these efforts with the broader community.

Do you even test?

How many times has someone asked an operations person this question? Too many! “Of course we test,” they will say. And with Red Hat OpenStack Platform 12 we’ve decided to make sure the world knows we do, too.

Through the concept of Distributed Continuous Integration (DCI), we place remote agents on site with customers, partners, and vendors that continuously build our releases at all different stages on all different architectures. By engaging outside resources we are not limited by internal resource restrictions; instead, we gain access to hardware and architecture that could never be tested in any one company’s QA department. With DCI we can fully test our releases to see how they work under an ever-increasing set of environments. We are currently partnered with major industry vendors for this program and are very excited about how it helps us make the entire OpenStack ecosystem better for our customers.

So, do we even test? Oh, you bet we do!

Feel the love!

Photo by grafxart photo on Unsplash

And this is just a small piece of the latest Red Hat OpenStack Platform 12 release. Whether you are looking to try out a new cloud, or thinking about an upgrade, this release brings a level of operational maturity that will really impress!

Now that OpenStack has proven itself an excellent choice for IaaS, it can focus on making itself a loveable one.

Let Red Hat OpenStack Platform 12 reignite the romance between you and your cloud!

Red Hat OpenStack Platform 12 is designated as a “Standard” release with a one-year support window. Click here for more details on the release lifecycle for Red Hat OpenStack Platform.

Find out more about this release at the Red Hat OpenStack Platform Product page. Or visit our vast online documentation.

And if you’re ready to get started now, check out the free 60-day evaluation available on the Red Hat portal.

Looking for even more? Contact your local Red Hat office today.


by August Simonelli, Technical Marketing Manager, Cloud at December 14, 2017 01:49 AM

December 13, 2017

OpenStack Superuser

Making your first contact with OpenStack

Everyone has to start from somewhere. To make things easier for new contributors to OpenStack, a First Contact SIG (special interest group) has banded together.

Its mission?

“To provide a place and group of people for new contributors to come to for information and advice. New contributors are the future of OpenStack and the surrounding community. It’s important to make sure they feel welcome and give them the tools to succeed.”

And if it sounds vaguely sci-fi – it does bring to mind the 1996 Star Trek film of the same name whose plot revolves around time travel.

In an effort to make time differences less problematic for newbies, members of the recently formed SIG are looking for more experienced OpenStack contributors to be available on IRC when that crucial first contact (or even the first handful) are made.

First Contact SIG chair Kendall Nelson, upstream developer advocate at the OpenStack Foundation, tells Superuser that founding members cover a range of time zones (Asia, Europe, the United States) and that they are looking to bulk up these welcome-wagon types on IRC wherever they may be. They would also love to have another chair to represent the operations side of OpenStack contributions.

If you’re interested in helping out (or need help!) you can reach out to them over IRC in the #openstack-dev channel:

▪ Zhipeng Huang (zhipeng) UTC+8
▪ Amy Marrich (spotz) UTC-6
▪ Colleen Murphy (cmurphy) UTC+1
▪ Ildikó Váncsa (ildikov) UTC+1 (UTC+2 with daylight saving)

More resources:
If you’re looking for long-term or speed mentoring, or information about Outreachy internships, head over to the mentoring Wiki. For information about learning to contribute or the Upstream Institute, check here. The Contributor Portal is also a good place to start.

Here are some of Superuser’s most popular resources for newcomers:
OpenStack basics: An overview for the absolute beginner

How to set up your work environment to become an OpenStack developer

How to take your OpenStack contributions to the next level

From zero to hero: Your first week as an OpenStack contributor

Cover Photo // CC BY NC

The post Making your first contact with OpenStack appeared first on OpenStack Superuser.

by Superuser at December 13, 2017 05:07 PM

James Slagle

TripleO and Ansible (Part 2)

In my last post, I covered some of the details about using Ansible to deploy
with TripleO. If you haven’t read that yet, I suggest starting there:

I’ll now cover interacting with Ansible more directly.

When using --config-download as a deployment argument, a Mistral workflow will be enabled that runs ansible-playbook to apply the deployment and configuration data to each node. When the deployment is complete, you can interact with the files that were created by the workflow.

Let’s take a look at how to do that.

You need to have a shell on the Undercloud. Since the files used by the workflow potentially contain sensitive data, they are only readable by the mistral user or group. So either become the root user, or add your interactive shell user account (typically “stack”) to the mistral group:

sudo usermod -a -G mistral stack
# Activate the new group
newgrp mistral

Once the permissions are sorted, change to the mistral working directory for
the config-download workflows:

cd /var/lib/mistral

Within that directory, there will be directories named according to the Mistral execution UUID. An easy way to find the most recent execution of config-download is to cd into the most recently created directory and list the files there:

cd 2747b55e-a7b7-4036-82f7-62f09c63d671
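If you'd rather not eyeball the directory names, sorting by modification time works too. A small sketch of the pattern, using a temporary stand-in for /var/lib/mistral so the commands run anywhere:

```shell
# Simulate two execution directories; give "aaa" an older timestamp.
mkdir -p /tmp/mistral-demo/aaa /tmp/mistral-demo/bbb
touch -t 202001010000 /tmp/mistral-demo/aaa
# ls -dt sorts the directories by modification time, newest first.
latest=$(ls -dt /tmp/mistral-demo/*/ | head -1)
echo "latest execution dir: $latest"
```

On the real undercloud the same pattern would be `cd "$(ls -dt /var/lib/mistral/*/ | head -1)"`.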

The following files (or a similar set, as things could change) will exist:


All the files that are needed to re-run ansible-playbook are present. The exact ansible-playbook command is saved in an executable script in this directory. Let’s take a look at that file:

$ cat


ansible-playbook -v /var/lib/mistral/2747b55e-a7b7-4036-82f7-62f09c63d671/deploy_steps_playbook.yaml --user tripleo-admin --become --ssh-extra-args "-o StrictHostKeyChecking=no" --timeout 240 --inventory-file /var/lib/mistral/2747b55e-a7b7-4036-82f7-62f09c63d671/tripleo-ansible-inventory --private-key /var/lib/mistral/2747b55e-a7b7-4036-82f7-62f09c63d671/ssh_private_key $@

You can see how the call to ansible-playbook is reproduced in this script. Also notice that $@ is used to pass any additional arguments directly to ansible-playbook when calling this script, such as --check, --limit, --tags, --start-at-task, etc.
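The `$@` pass-through is plain POSIX shell: any extra arguments given to the wrapper land, untouched, at the end of the ansible-playbook command line. A minimal sketch (this toy wrapper just echoes the command rather than running the real ansible-playbook):

```shell
cat > /tmp/run-deploy.sh <<'EOF'
#!/bin/sh
# "$@" expands to every argument passed to this script, preserving each one.
echo ansible-playbook -v deploy_steps_playbook.yaml "$@"
EOF
chmod +x /tmp/run-deploy.sh
/tmp/run-deploy.sh --check --limit Controller
# → ansible-playbook -v deploy_steps_playbook.yaml --check --limit Controller
```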

Some of the other files present are:

  • tripleo-ansible-inventory
    • Ansible inventory file containing hosts and vars for all the Overcloud nodes.
  • ansible.log
    • Log file from the last run of ansible-playbook.
  • ansible.cfg
    • Config file used when running ansible-playbook.
    • Executable script that can be used to rerun ansible-playbook.
  • ssh_private_key
    • Private ssh key used to ssh to the Overcloud nodes.

Within the group_vars directory, there is a corresponding file per role. In my
example, I have a Controller role. If we take a look at group_vars/Controller we see it contains:

$ cat group_vars/Controller
Controller_pre_deployments:
- HostsEntryDeployment
- DeployedServerBootstrapDeployment
- UpgradeInitDeployment
- InstanceIdDeployment
- NetworkDeployment
- ControllerUpgradeInitDeployment
- UpdateDeployment
- ControllerDeployment
- SshHostPubKeyDeployment
- ControllerSshKnownHostsDeployment
- ControllerHostsDeployment
- ControllerAllNodesDeployment
- ControllerAllNodesValidationDeployment
- ControllerArtifactsDeploy
- ControllerHostPrepDeployment

Controller_post_deployments: []

The <RoleName>_pre_deployments and <RoleName>_post_deployments variables contain the list of Heat deployment names to run for that role. Suppose we wanted to just rerun a single deployment. That command would be:

$ ./ --tags pre_deploy_steps -e Controller_pre_deployments=ControllerArtifactsDeploy -e force=true

That would run just the ControllerArtifactsDeploy deployment. Passing -e force=true is necessary to force the deployment to rerun. Also notice we restrict what tags get run with --tags pre_deploy_steps.

For documentation on what tags are available see:

Finally, suppose we wanted to just run the 5 deployment steps that are the same for all nodes of a given role. We can use --limit <RoleName>, as the role names are defined as groups in the inventory file. That command would be:

$ ./ --tags deploy_steps --limit Controller

I hope this info is helpful. Let me know what you want to see next.




Cross posted at:


by slagle at December 13, 2017 01:12 PM

December 12, 2017

StackHPC Team Blog

Verne Global's hpcDIRECT Service: Bare Metal Powered by Molten Rock

After some months of development, prototyping and early customer trials, Verne Global's hpcDIRECT service has been announced. This system builds upon OpenStack Ironic to deliver the performance of bare metal combined with the dynamic flexibility of cloud. Iceland's abundant geothermal energy ensures the energy is clean but the power costs are kept low.

You can find plenty of details about Verne Global's announcement, for example as covered by Inside HPC or The Next Platform. However, we can also talk about some of the transformative technologies going on within the infrastructure.

Working in partnership with the technical team at Verne Global, StackHPC has designed, developed and deployed an OpenStack control plane that pushes the boundaries of OpenStack and Ironic:

  • OpenStack Pike, the latest release, in an agile deployment that tracks upstream closely.
  • Multi-network support, including both Ethernet and Infiniband, with tenant isolation throughout.
  • Custom resource classes supporting Nova scheduling across multiple bare metal server specs.
  • Kolla-Ansible control plane deployed using Kayobe, the de-facto solution for automating Kolla-Ansible deployments from scratch.

In many other ways, hpcDIRECT is a showcase for the use of OpenStack for HPC applications:

  • Flexible delivery of complex application environments using Ansible and Heat.
  • HPC-oriented storage options, designed to meet a wide range of use cases for HPC and data-intensive analytics.
  • HPC workload telemetry gathered using Monasca or Prometheus and shared with users to provide performance insights into the behaviour of their codes.
  • Consistent management tools everywhere - for the first time, a system that is managed at every level from firmware to OS to application using the same tools (Ansible and friends).

Lewis Tucker, Verne Global Enterprise Solutions Architect, comments "From the outset we had an ambitious goal to provide a bare metal on-demand HPC service. StackHPC's expertise has enabled us to execute our plans to timescales and budget and offer our customers an innovative and agile HPC service to meet their exacting requirements. We look forward to developing our relationship as hpcDIRECT grows".

John Taylor, CEO of StackHPC, adds "the hpcDIRECT project has validated our vision of an HPC-enabled OpenStack and gone even further to make it available as a service. We are thrilled to have been involved in this cutting-edge project."


by Stig Telfer at December 12, 2017 10:00 PM

Chris Dent

TC Report 50

Several members of the TC were at KubeCon last week so there's been limited activity since the last report and, of what has happened, much of it is related to establishing productive relations with the Kubernetes community.

Coping with Strategic Contributions

One of the things that came out of the conversations that happened with the k8s community is that they too are struggling to deal with ensuring what ttx describes as strategic contributions: functionality that is of benefit to the entire community as opposed to features pushed upstream by corporations to enhance their position or visibility in the market.

There's some discussion of this on Thursday and some again today (Tuesday).

I expressed some confusion about the term, as what's strategic is in the eye of the beholder and the executor, and a mind of a certain disposition tends towards thinking of strategic contributions as advancing nefarious gains, and clearly the happy world of open source would never be nefarious, so it must be corporations.

A positive outcome from the conversations last week is "...we'll push the respective corporate sponsors of our respective foundations to report 'how they contribute to the project'". An important distinction here from other attempts at encouraging long-term, of benefit to the entire community, contributions is that this would be human narrative, not numbers. Stories of the ways in which corporate sponsors have enabled lasting improvements.

Thursday afternoon's discussion had some additional details on how and why human curated information may be most useful for this kind of thing, but there are the usual constraints of "who will do that?"

Other KubeCon

There were also summaries from ttx and dhellman on a variety of additional topics from KubeCon in today's log. Some interesting points in there beyond strategic contributions, including:

There are going to be a lot of opportunities for learning and collaborating. I hope we'll see some further summaries (human narrative!) from the people who were there.

Stuck Reviews

There are a couple of governance reviews in progress that need input from the wider community. As things currently stand they are a bit stuck:

by Chris Dent at December 12, 2017 05:45 PM

OpenStack Superuser

Why now is the time for multi-cloud

AUSTIN — Craig McLuckie isn’t a fan of every buzzword flitting across the tech landscape. He admits that multi-cloud wasn’t one of his favorites, but he’s changing his mind.

“I’m starting to see a deep legitimacy to multi-cloud,” says McLuckie, co-founder of Kubernetes and now CEO of Heptio. “I used to nod and smile when people talked about it, but never really believed it. I’m starting to see it for reals now.”

Multi-cloud was one of the three main pillars that he sees for the future of Kubernetes — the other two are improving developer productivity and tackling enterprise — that he elaborated on during keynotes at KubeCon + CloudNativeCon North America 2017.  And while it’s worth noting that multi-cloud is really happening, “we really need strong conformance” to make it materialize, he says.

He gave a shout out to cloud providers who introduced Kubernetes-as-a-service offerings, especially those that are upstream friendly. McLuckie says all signs point to positive because “we’re getting into a situation where it doesn’t matter who delivers it, how it operates or how it’s provisioned – but it matters how it runs.”

Circling back to stress conformance, he added: “For us to get to a high level of assurance that Kubernetes is working exactly as it should, conformance really matters. We need to have the ability to attest that these clusters are semantically consistent, that they have the same behavior.” To this end, the CNCF-certified Kubernetes program is important, he says.

“When you see that Kubernetes logo on someone’s offering, you can have a relatively high degree of assurance that it’s working as it should.” This in turn makes hybrid cloud real. Operators can run something on-premise (something they construct or get from distro providers) alongside something from a public cloud that’s delivered by their preferred container-as-a-service (or Kubernetes-as-a-service).

The next question is to ensure that everything is semantically consistent in these environments. Making multi-cloud a reality hinges on the ability to run the same tests in both places and the ability to certify your on-premise piece. (Here he gave a quick plug to Heptio’s Sonobuoy, a diagnostic tool for running Kubernetes conformance tests.)

For McLuckie, multi-cloud is not about taking an application and running it in two clouds. There are always a few exceptions, he says — for example, banks running long stateless cycles or bitcoin mining. But “if you’re doing anything that has massive amounts of data, nothing creates gravity or inertia like data,” he says adding that when you start to pull in service dependencies they will hold you back, “you have to be judicious about it.”

He defines multi-cloud as the flexibility to pick which cloud to deploy your new service to — being able to run things in different regions, have the data in those regions and the ability to pick cloud providers for those regions as well.

How to get there? Provide not just a “tested stable narrative” but get to the point where you can start to extend governance, risk management, compliance, policy and security enforcement with a common abstraction across the top. “That’s why extensibility will become so important,” he says, giving a nod to Heptio’s recent collaboration with Microsoft to use the Seattle-based startup’s Ark, an open source utility for managing disaster recovery and backup of Kubernetes clusters.

This, he says, is where it really gets exciting. “This technology allows you not only to take your cluster state and restore it there — but to potentially restore somewhere else…” He admits that it’s not possible yet on the persistent volumes “But you can see the promise of this as a way to start moving things around inside clouds directly — whether it’s Microsoft cloud or between clouds or from on-prem clouds…and Kubernetes is key to this whole thing.”



The post Why now is the time for multi-cloud appeared first on OpenStack Superuser.

by Nicole Martinelli at December 12, 2017 05:44 PM

December 11, 2017

Red Hat Stack

Enabling Keystone’s Fernet Tokens in Red Hat OpenStack Platform

As we learned in part one of this blog post, beginning with the OpenStack Kilo release, a new token provider is now available as an alternative to PKI and UUID. Fernet tokens are essentially an implementation of ephemeral tokens in Keystone. What this means is that tokens are no longer persisted and hence do not need to be replicated across clusters or regions.

“In short, OpenStack’s authentication and authorization metadata is neatly bundled into a MessagePacked payload, which is then encrypted and signed as a Fernet token. OpenStack Kilo’s implementation supports a three-phase key rotation model that requires zero downtime in a clustered environment.” (from:
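As a loose analogy only (not the actual Fernet format, which uses AES encryption plus an HMAC over a MessagePack payload), the core idea can be sketched with a shared key and an HMAC signature: because the metadata travels inside the token and validation only needs the key, nothing has to be persisted server-side. This sketch assumes the `openssl` command is available:

```shell
# Illustrative only: build a self-contained signed token.
key="demo-signing-key"
payload='{"user":"alice","project":"demo","expires_at":"2018-01-01T00:00:00Z"}'
# Sign the payload with an HMAC keyed by the shared secret.
sig=$(printf '%s' "$payload" | openssl dgst -sha256 -hmac "$key" | awk '{print $NF}')
token="${payload}.${sig}"
# Validation: recompute the HMAC from the payload and compare signatures.
check=$(printf '%s' "$payload" | openssl dgst -sha256 -hmac "$key" | awk '{print $NF}')
[ "$sig" = "$check" ] && echo "token valid"
```

Any host holding the same key can validate the token, which is exactly why, as described below, every controller must share the same Fernet key repository.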

In our previous post, I covered the different types of tokens, the benefits of Fernet and a little bit of the technical details. In this part of our three-part series we provide a method for enabling Fernet tokens on Red Hat OpenStack Platform 10, both pre and post deployment of the overcloud stack.

Pre-Overcloud Deployment

Official Red Hat documentation for enabling Fernet tokens in the overcloud can be found here:

Deploy Fernet on the Overcloud


We’ll be using Red Hat OpenStack Platform here, which means we’ll be interacting with the director node and heat templates. Our primary tool is the command-line client keystone-manage, part of the tools provided by the openstack-keystone RPM and used to set up and manage keystone in the overcloud. We’ll use the director-based deployment of Red Hat OpenStack Platform to enable Fernet pre and/or post deployment.

Photo by Barn Images on Unsplash

Prepare Fernet keys on the undercloud

This procedure starts with preparation of the Fernet keys, which a default deployment places on each controller in /etc/keystone/fernet-keys. Each controller must have the same keys, as tokens issued on one controller must be able to be validated on all controllers. Stay tuned for part three of this blog for an in-depth explanation of Fernet signing keys.
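Because every controller validates against the same key repository, a token signed on one node verifies anywhere. Here is a deliberately simplified sketch of that idea (signing only; real Fernet tokens are also AES-encrypted, and none of this is Keystone's actual code):

```python
import base64
import hashlib
import hmac
import os

# Three stand-in signing keys, playing the role of /etc/keystone/fernet-keys/0..2.
keys = [os.urandom(32) for _ in range(3)]

def sign(payload: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag and base64-encode, vaguely like a token."""
    mac = hmac.new(key, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(payload + mac)

def validate(token: bytes, key_repo) -> bool:
    """Accept the token if any key in the repository verifies its tag."""
    raw = base64.urlsafe_b64decode(token)
    payload, mac = raw[:-32], raw[-32:]
    return any(hmac.compare_digest(mac, hmac.new(k, payload, hashlib.sha256).digest())
               for k in key_repo)

token = sign(b'{"user_id": "2f6106ce"}', keys[0])
print(validate(token, keys))               # True on any controller holding the same repo
print(validate(token, [os.urandom(32)]))   # False on a node with a different key set
```

Trying every key in the repository is also what makes zero-downtime rotation possible: a token signed with an older, not-yet-deleted key stays valid during the rotation window.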

  1. Source the stackrc file to ensure we are working with the undercloud:
$ source ~/stackrc
  2. From your director, use keystone-manage to generate the Fernet keys as deployment artifacts:
$ sudo keystone-manage fernet_setup \
    --keystone-user keystone \
    --keystone-group keystone
  3. Tar up the keys for upload into a swift container on the undercloud:
$ sudo tar -zcf keystone-fernet-keys.tar.gz /etc/keystone/fernet-keys
  4. Upload the Fernet keys to the undercloud as swift artifacts (we assume your templates exist in ~/templates):
$ upload-swift-artifacts -f keystone-fernet-keys.tar.gz \
    --environment ~/templates/deployment-artifacts.yaml
  5. Verify that your artifact exists in the undercloud:
$ swift list overcloud-artifacts
keystone-fernet-keys.tar.gz

NOTE: These keys should be secured as they can be used to sign and validate tokens that will have access to your cloud.

  6. Let’s verify that deployment-artifacts.yaml exists in ~/templates (NOTE: your URL detail will differ from what you see here – as this is a uniquely generated temporary URL):
$ cat ~/templates/deployment-artifacts.yaml
# Heat environment to deploy artifacts via Swift Temp URL(s)
    - '

NOTE: This is the swift URL that your overcloud deployment will use to copy the Fernet keys to your controllers.

  7. Finally, generate the fernet.yaml template to enable Fernet as the default token provider in your overcloud:
$ cat << EOF > ~/templates/fernet.yaml
parameter_defaults:
  controllerExtraConfig:
    keystone::token_provider: 'fernet'
EOF

Deploy and Validate

At this point, you are ready to deploy your overcloud with Fernet enabled as the token provider, and your keys distributed to each controller in /etc/keystone/fernet-keys.

Photo by Glenn Carstens-Peters on Unsplash

NOTE: This is an example deploy command; yours will likely include many more templates. For the purposes of our discussion, it is only important that you include fernet.yaml as well as deployment-artifacts.yaml.

$ openstack overcloud deploy \
--templates /home/stack/templates \
-e /home/stack/templates/environments/deployment-artifacts.yaml \
-e /home/stack/templates/environments/fernet.yaml \
--control-scale 3 \
--compute-scale 4 \
--control-flavor control \
--compute-flavor compute


Once the deployment is done you should validate that your overcloud is indeed using Fernet tokens instead of the default UUID token provider. From the director node:

$ source ~/overcloudrc
$ openstack token issue
+------------+--------------------------------------------------+
| Field      | Value                                            |
+------------+--------------------------------------------------+
| expires    | 2017-03-22 19:16:21+00:00                        |
| id         | gAAAAABY0r91iYvMFQtGiRRqgMvetAF5spEZPTvEzCpFWr3  |
|            | 1IB8T8L1MRgf4NlOB6JsfFhhdxenSFob_0vEEHLTT6rs3Rw  |
|            | q3-Zm8stCF7sTIlmBVms9CUlwANZOQ4lRMSQ6nTfEPM57kX  |
|            | Xw8GBGouWDz8hqDYAeYQCIHtHDWH5BbVs_yC8ICXBk       |
| project_id | f8adc9dea5884d23a30ccbd486fcf4c6                 |
| user_id    | 2f6106cef80741c6ae2bfb3f25d70eee                 |
+------------+--------------------------------------------------+

Note the length of this token in the “id” field. This is a Fernet token.
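The telltale “gAAAAA” prefix is no accident: the Fernet spec defines a token as the URL-safe base64 encoding of a byte string that starts with the version byte 0x80. A quick check (the zero bytes below are placeholders, not a real token):

```python
import base64

# Fernet layout: version 0x80 || 8-byte timestamp || 16-byte IV || ciphertext || 32-byte HMAC
raw = bytes([0x80]) + b"\x00" * 56
print(base64.urlsafe_b64encode(raw)[:6])  # b'gAAAAA'
```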

Enabling Fernet Post Overcloud Deployment

Part of the power of the Red Hat OpenStack Platform director deployment methodology lies in its ability to easily upgrade and change a running overcloud. Features such as Fernet, scaling, and complex service management can be managed by running a deployment update directly against a running overcloud.

Updating is really straightforward. If you’ve already deployed your overcloud with UUID tokens, you can change them to Fernet by simply following the pre-deploy example above and running the openstack overcloud deploy command again, with the mentioned heat templates enabled, against your running deployment. This will change your overcloud token default to Fernet. Be sure to deploy with your original deploy command, as any other changes there could affect your overcloud. And of course, standard outage windows apply – production changes should be tested and prepared accordingly.


I hope you’ve enjoyed our discussion on enabling Fernet tokens in the overcloud, and that I was able to shed some light on the process. Official documentation on these concepts and on Fernet tokens in the overcloud is available in the Red Hat documentation linked above.

In our last and final installment on this topic, we’ll look at some of the many methods for rotating your newly enabled Fernet keys on your controller nodes. We’ll be using Red Hat’s awesome IT automation tool, Red Hat Ansible, to do just that.

by Ken Savich, Senior OpenStack Solution Architect at December 11, 2017 08:59 PM

OpenStack Superuser

How a new storage system delivers the goods for MercadoLibre

If you can’t find something in a store in Buenos Aires, the shop owner will often recommend you buy it online at MercadoLibre. (Looking for fancy Christmas lights? A yoga mat? It’s often cheaper — as well as faster and more secure – to buy online and have what you need delivered to that same neighborhood store.)

The online marketplace operates in 19 countries stretching from Guatemala to its home country Argentina, offering what amounts to the services of Amazon, eBay, Google AdSense and Paypal under one virtual roof. It’s tough to overstate its dominance of the market — let’s just say that Marcos Galperin, co-founder and CEO, ranked 17 in Bloomberg’s top 50 global business leaders for 2017 after revenue increased 61 percent following a foray into financial services.

To keep that market flowing as freely as the name implies, you need speed. An early adopter of OpenStack, MercadoLibre ran a variety of applications within its OpenStack cloud, including MySQL and MongoDB databases, all of which were suffering from slow, unpredictable performance, according to a recent case study. Its prior storage set-up didn’t scale easily or nondisruptively, and the company needed its new storage solution to scale granularly.

“We were looking for a solution that could scale out so that we could add more and more nodes, growing capacity and performance without downtime,” says Mariano Guelar, a storage specialist at MercadoLibre. Integration with OpenStack block storage (Cinder) and the Horizon dashboard were also key requirements. The team was familiar with OpenStack Foundation gold member NetApp and ended up choosing SolidFire, introduced at the 2013 Summit, as a solution: they now run 12 SolidFire SF6010 nodes deployed in a U.S.-based data center.

The results from this new storage system came quickly: thanks to inline deduplication, compression and thin provisioning, they have reduced data by 6.5 times and cut latency from 50 milliseconds to less than two.
“We are saving a lot of space and power in our data center now, so that’s a great benefit. That was really, really good for us,” says Guelar.

Full case study over at NetApp (.PDF)

The post How a new storage system delivers the goods for MercadoLibre appeared first on OpenStack Superuser.

by Superuser at December 11, 2017 05:49 PM

James Slagle

TripleO and Ansible deployment (Part 1)

In the Queens release of TripleO, you’ll be able to use Ansible to apply the
software deployment and configuration of an Overcloud.

Before jumping into some of the technical details, I wanted to cover some
background about how the Ansible integration works along side some of the
existing tools in TripleO.

The Ansible integration goes as far as offering an alternative to the
communication between the existing Heat agent (os-collect-config) and the Heat
API. This alternative is opt-in for Queens, but we are exploring making it the
default behavior for future releases.

The default behavior for Queens (and all prior releases) will still use the
model where each Overcloud node has a daemon agent called os-collect-config
that periodically polls the Heat API for deployment and configuration data.
When Heat provides updated data, the agent applies the deployments, making
changes to the local node such as configuration, service management,
pulling/starting containers, etc.

The Ansible alternative instead uses a “control” node (the Undercloud) running
ansible-playbook with a local inventory file and pushes out all the changes to
each Overcloud node via ssh in the typical Ansible fashion.
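Conceptually, this is just a regular ansible-playbook run from the Undercloud; a rough sketch (the hostnames, IPs and file names here are illustrative, not the exact artifacts TripleO generates):

```
# inventory
[Controller]
overcloud-controller-0 ansible_host=192.168.24.10

[Compute]
overcloud-novacompute-0 ansible_host=192.168.24.11

# then, on the Undercloud:
# ansible-playbook -i inventory deploy_steps_playbook.yaml
```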

Heat is still the primary API, while the parameter and environment files that
get passed to Heat to create an Overcloud stack remain the same regardless of
which method is used.

Heat is also still fully responsible for creating and orchestrating all
OpenStack resources in the services running on the Undercloud (Nova servers,
Neutron networks, etc).

This sequence diagram will hopefully provide a clear picture:

Replacing the application and transport layer of the deployment with Ansible
allows us to take advantage of features in Ansible that will hopefully make
deploying and troubleshooting TripleO easier:

  • Running only specific deployments
  • Including/excluding specific nodes or roles from an update
  • More real time sequential output of the deployment
  • More robust error reporting
  • Faster iteration and reproduction of deployments

Using Ansible instead of the Heat agent is easy. Just include 2 extra cli args
in the deployment command:

-e /path/to/templates/environments/config-download-environment.yaml \
--config-download

Once Heat is done creating the stack (which will be much faster than usual), a
separate Mistral workflow will be triggered that runs ansible-playbook to
finish the deployment. The output from ansible-playbook will be streamed to
stdout so you can follow along with the progress.

Here’s a demo showing what a stack update looks like:

(I suggest making the demo full screen or watching it here:

Note that we don’t get color output from ansible-playbook since we are
consuming the stdout from a Zaqar queue. However, in my next post I will go
into how to execute ansible-playbook manually, and detail all of the related
files (inventory, playbooks, etc) that are available to interact with manually.

If you want to read ahead, have a look at the official documentation:


The infrastructure that hosts this blog may go away soon. In which case I’m
also cross posting to:


by slagle at December 11, 2017 03:14 PM

December 08, 2017

Lee Yarwood

OpenStack TripleO FFU M2 progress report

This is a brief progress report from the Upgrades squad for the fast-forward upgrades (FFU) feature in TripleO, covering Newton (N) to Queens (Q) upgrades.

tl;dr Good initial progress, missed the M2 goal of non-voting CI jobs, pushing on to M3.


For anyone unfamiliar with the concept of fast-forward upgrades the following sentence from the spec gives a brief high level introduction:

> Fast-forward upgrades are upgrades that move an environment from release `N`
> to `N+X` in a single step, where `X` is greater than `1` and for fast-forward
> upgrades is typically `3`.

The spec itself obviously goes into more detail and I’d recommend anyone wanting to know more about our approach for FFU in TripleO to start there:

Note that the spec is being updated at present by the following change, introducing more details on the FFU task layout, ordering, dependency on the on-going major upgrade rework in Q, canary compute validation etc:

WIP ffu: Spec update for M2

M2 Status

The original goal for Queens M2 was to have one or more non-voting FFU jobs deployed somewhere, able to run through the basic undercloud and overcloud upgrade workflows and exercising as many compute service dependencies as we could, up to and including Nova. Unfortunately, while Sofer has made some great progress with this, we do not have any running FFU jobs at present:

We do however have documented demos that cover FFU for some limited overcloud environments from Newton to Queens:

OpenStack TripleO FFU Keystone Demo N to Q

OpenStack TripleO FFU Nova Demo N to Q

These demos currently use a stack of changes against THT with the first ~4 or so changes introducing the FFU framework:

FWIW getting these initial changes merged would help avoid the current change storm every time this series is rebased to pick up upgrade or deploy related bug fixes.

Also note that the demos currently use the raw Ansible playbooks from the stack outputs to run through the FFU tasks, upgrade tasks and deploy tasks. This is by no means what the final UX will be; the python-tripleoclient and workflow work will be completed ahead of M3.

M3 Goals

The squad will be focusing on the following goals for M3:

  • Non-voting RDO CI jobs defined and running
  • FFU THT changes tested by the above jobs and merged
  • python-tripleoclient & required Mistral workflows merged
  • Use of ceph-ansible for Ceph upgrades
  • Draft developer and user docs under review

FFU squad

Finally, a quick note to highlight that this report marks the end of my own personal involvement with the FFU feature in TripleO. I’m not going far, returning to work on Nova and happy to make time to talk about and review FFU related changes etc. The members of the upgrade squad taking this forward and your main points of contact for FFU in TripleO will be:

  • Sofer (chem)
  • Lukas (social)
  • Marios (marios)

My thanks again to Sofer, Lukas, Marios, the rest of the upgrade squad and wider TripleO community for your guidance and patience when putting up with my constant inane questioning regarding FFU over the past few months!

December 08, 2017 05:00 PM

OpenStack Superuser

Kubernetes: Why it’s time to be boring

AUSTIN — Waiting for your phone to charge: boring. Refreshing a web page multiple times: boring. Glacially slow wifi: boring.

Tiresome tech isn’t usually a good thing, but the Kubernetes community hopes to become the good kind of boring. So boring that there’s nothing to fiddle with, no workarounds needed. Day two morning keynotes at KubeCon + CloudNativeCon North America 2017 focused on the aspiration to good-boring and how to get there.

First up was Kelsey Hightower, staff developer advocate at Google, who brought up the energy, the star power and the demos. (The “rockstar” thing gets thrown around too often and too easily, but when audience members shout “I love you” at the speaker, the person probably deserves the title.)

“We’ve reached a major milestone: the most recent changes were so boring that I have no updates for you,” says Hightower, adding that this was the goal the whole time. “We want to get Kubernetes to a place where we can build things on top of it, grow the community and the ecosystem and keep the core boring — so a round of applause to the entire community for getting it close to boring.”

That said, he was quick to advise newcomers against making it the wrong kind of boring. “If you’re new to Kubernetes, it’s going to set you free, but first it’s going to piss you off,” especially if you’re doing it wrong, says the author of a guide called “Kubernetes the Hard Way.”

His target: people over-relying on the command-line interface kubectl. “kubectl is the new SSH… If you’re using kubectl to deploy from your laptop you’re missing the point. If you’re doing it right no one should know you’re using Kubernetes.”

To better make his point, he launched into a live demo to show how much easier it can get by using Google Now to provision an 8-node cluster. “That’s what I call Kubernetes the easy way,” the phone shot back at him when it was all said and done. The crowd was impressed — but he had another thing to show on the quest for boring: a developer workflow featuring Grafana.

“As a dev you have one workflow in mind, everything else is noise,” he says. “This is how you see things…you don’t want to install Kubernetes.” Using a registry with a hello world app, he cloned the repo, then developed against it. He made a simple change (from hello world to “go is the best programming language ever”) and then pushed the commit to great effect.
Using the voice commands on his smartphone again, he scaled to 10 replicas. “I gotta admit, that was pretty dope,” the phone pronounced.

Clayton Coleman, architect, Kubernetes and OpenShift at Red Hat, underlined Hightower’s message. “Red Hat helps build boring software,” he says. “Open Source is not always an exciting thing. People have to chop wood and carry water.”

People tend to think that “exciting” means launching a new feature, he says, but in the infrastructure world “exciting means everything is on fire, again.” Stressing that the goal isn’t infrastructure but what you build with it, he talked about three typical kinds of fire and what to do about them.

So uninteresting you can run “Game of Thrones” with it

Illya Chekrygin and Zihao Yu from HBO’s digital delivery team offered up a stellar case study about how they’re pushing the limits of Kubernetes with the popular drama series.
The short version: they run everything on Kubernetes, EC2 and a lot of nginx pods.


Future boring

Engineering director Chen Goldberg came out in a cape to talk about the “superpowers” of Kubernetes — extensibility and automation. She brought out software engineer Anthony Yeh to demo a metacontroller that makes it easy for anyone to write controllers. Called kube-metacontroller it offers lightweight Kubernetes controllers-as-a-service. You can check it out here.

“We’re creating a general purpose platform for developers running any application, what we showed today is an important building block towards that,” Goldberg said. “We want to make the ecosystem richer and make more of the Google tools available to everybody as open source,” she added. “Remember: we all possess this superpower, I hope you take advantage of it.”

The post Kubernetes: Why it’s time to be boring appeared first on OpenStack Superuser.

by Nicole Martinelli at December 08, 2017 03:20 PM


Gate repositories on Github with Software Factory and Zuul3


Software Factory is an easy to deploy software development forge. It provides, among other features, code review and continuous integration (CI). The latest Software Factory release features Zuul V3, which provides integration with Github.

In this blog post I will explain how to configure a Software Factory instance, so that you can experiment with gating Github repositories with Zuul.

First we will setup a Github application to define the Software Factory instance as a third party application and we will configure this instance to act as a CI system for Github.

Secondly, we will prepare a Github test repository by:

  • Installing the application on it
  • Configuring its master branch protection policy
  • Providing Zuul job description files

Finally, we will configure the Software Factory instance to test and gate Pull Requests for this repository, and we will validate this CI by opening a first Pull Request on the test repository.

Note that Zuul V3 is not yet released upstream however it is already in production, acting as the CI system of OpenStack.


A Software Factory instance is required to execute the instructions given in this blog post. If you need an instance, you can follow the quick deployment guide in this previous article. Make sure the instance has a public IP address and TCP/443 is open so that Github can reach Software Factory via HTTPS.

Application creation and Software Factory configuration

Let's create a Github application named myorg-zuulapp and register it on the instance. To do so, follow this section from Software Factory's documentation.

But make sure to:

  • Replace fqdn in the instructions with the public IP address of your Software Factory instance. Indeed, the default hostname won't be resolved by Github.
  • Check "Disable SSL verification" as the Software Factory instance is by default configured with a self-signed certificate.
  • Check "Only on this account" for the question "Where can this Github app be installed".

Configuration of the app part 1 Configuration of the app part 2 Configuration of the app part 3

After adding the github app settings in /etc/software-factory/sfconfig.yaml, run:

sudo sfconfig --enable-insecure-slaves --disable-fqdn-redirection

Finally, make sure Github can contact the Software Factory instance by clicking on "Redeliver" in the advanced tab of the application. Getting the green tick is the prerequisite to going further. If you cannot get it, the rest of this article cannot be completed successfully.

Configuration of the app part 4

Define Zuul3 specific Github pipelines

On the Software Factory instance, as root, create the file config/zuul.d/gh_pipelines.yaml.

cd /root/config
cat <<EOF > zuul.d/gh_pipelines.yaml
- pipeline:
    name: check
    description: |
      Newly uploaded patchsets enter this pipeline to receive an
      initial +/-1 Verified vote.
    manager: independent
    trigger:
      github.com:
        - event: pull_request
          action:
            - opened
            - changed
            - reopened
        - event: pull_request
          action: comment
          comment: (?i)^\s*recheck\s*$
    start:
      github.com:
        status: 'pending'
        status-url: "{}/status.html"
        comment: false
    success:
      github.com:
        status: 'success'
    failure:
      github.com:
        status: 'failure'

- pipeline:
    name: gate
    description: |
      Changes that have been approved by core developers are enqueued
      in order in this pipeline, and if they pass tests, will be
      merged.
    success-message: Build succeeded (gate pipeline).
    failure-message: Build failed (gate pipeline).
    manager: dependent
    precedence: high
    require:
      github.com:
        review:
          - permission: write
        status: "myorg-zuulapp[bot]:local/"
        open: True
        current-patchset: True
    trigger:
      github.com:
        - event: pull_request_review
          action: submitted
          state: approved
        - event: pull_request
          action: status
          status: "myorg-zuulapp[bot]:local/"
    start:
      github.com:
        status: 'pending'
        status-url: "{}/status.html"
        comment: false
    success:
      github.com:
        status: 'success'
        merge: true
    failure:
      github.com:
        status: 'failure'
EOF
sed -i s/myorg/myorgname/ zuul.d/gh_pipelines.yaml

Make sure to replace "myorgname" by the organization name.
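As an aside, the comment trigger above uses the regular expression (?i)^\s*recheck\s*$, which matches a PR comment consisting solely of the word "recheck", case-insensitively and with surrounding whitespace allowed. For example:

```python
import re

# The same pattern used in the check pipeline's comment trigger.
recheck = re.compile(r'(?i)^\s*recheck\s*$')

print(bool(recheck.match("recheck")))        # True
print(bool(recheck.match("  Recheck  ")))    # True
print(bool(recheck.match("recheck please"))) # False: extra text, job not retriggered
```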

git add -A .
git commit -m"Add pipelines"
git push git+ssh://gerrit/config master

Setup a test repository on Github

Create a repository called ztestrepo, initialize it with an empty README.md.

Install the Github application

Then follow the process below to add the application myorg-zuulapp to ztestrepo.

  1. Visit your application page, e.g.:
  2. Click “Install”
  3. Select ztestrepo to install the application on
  4. Click “Install”

Then you should be redirected to the application setup page. This can be safely ignored for the moment.

Define master branch protection

We will set up the branch protection policy for the master branch of ztestrepo. We want a Pull Request to have at least one code review approval and all CI checks passed with success before the PR becomes mergeable.

You will see later in this article that the final job run and the merging phase of the Pull Request are handled by Zuul.

  1. Go to
  2. Choose the master branch
  3. Check "Protect this branch"
  4. Check "Require pull request reviews before merging"
  5. Check "Dismiss stale pull request approvals when new commits are pushed"
  6. Check "Require status checks to pass before merging"
  7. Click "Save changes"

Attach the application

Add a collaborator

A second account on Github is needed to act as a collaborator on the repository ztestrepo. This collaborator will act as the PR reviewer later in this article.

Define a Zuul job

Create the file .zuul.yaml at the root of ztestrepo.

git clone
cd ztestrepo
cat <<EOF > .zuul.yaml
- job:
    name: myjob-noop
    parent: base
    description: This is a noop job
    run: playbooks/noop.yaml
    nodeset:
      nodes:
        - name: test-node
          label: centos-oci

- project:
    name: myorg/ztestrepo
    check:
      jobs:
        - myjob-noop
    gate:
      jobs:
        - myjob-noop
EOF
sed -i s/myorg/myorgname/ .zuul.yaml

Make sure to replace "myorgname" by the organization name.

Create playbooks/noop.yaml.

mkdir playbooks
cat <<EOF > playbooks/noop.yaml
- hosts: test-node
  tasks:
    - name: Success
      command: "true"
EOF

Push the changes directly on the master branch of ztestrepo.

git add -A .
git commit -m"Add zuulv3 job definition"
git push origin master

Register the repository on Zuul

At this point, the Software Factory instance is ready to receive events from Github and the Github repository is properly configured. Now we will tell Software Factory to consider events for the repository.

On the Software Factory instance, as root, create the file zuulV3/myorg.yaml.

cd /root/config
cat <<EOF > zuulV3/myorg.yaml
- tenant:
    name: 'local'
    source:
      github.com:
        untrusted-projects:
          - myorg/ztestrepo
EOF
sed -i s/myorg/myorgname/ zuulV3/myorg.yaml

Make sure to replace "myorgname" by the organization name.

git add zuulV3/myorg.yaml && git commit -m"Add ztestrepo to zuul" && git push git+ssh://gerrit/config master

Create a Pull Request and see Zuul in action

  1. Create a Pull Request via the Github UI
  2. Wait for the pipeline to finish with success

Check test

  1. Ask the collaborator to set his approval on the Pull request


  1. Wait for Zuul to detect the approval
  2. Wait for the pipeline to finish with success

Gate test

  1. Wait for the Pull Request to be merged by Zuul


As you can see, after the run of the check job and the reviewer’s approval, Zuul detected that the Pull Request was ready to enter the gating pipeline. During the gate run, Zuul executed the job against the Pull Request code change rebased on the current master, then made Github merge the Pull Request once the job ended in success.

Other powerful Zuul features such as cross-repository testing or Pull Request dependencies between repositories are supported but beyond the scope of this article. Do not hesitate to refer to the upstream documentation to learn more about Zuul.

Next steps to go further

To learn more about Software Factory please refer to the upstream documentation. You can reach the Software Factory team on the freenode IRC channel #softwarefactory or by email via the mailing list.

by fboucher at December 08, 2017 02:13 PM

December 07, 2017

Red Hat Stack

An Introduction to Fernet tokens in Red Hat OpenStack Platform

Thank you for joining me to talk about Fernet tokens. In this first of three posts on Fernet tokens, I’d like to go over the definition of OpenStack tokens, the different types and why Fernet tokens should matter to you. This series will conclude with some awesome examples of how to use Red Hat Ansible to manage your Fernet token keys in production.

First, some definitions …

What is a token? OpenStack tokens are bearer tokens, used to authenticate and validate users and processes in your OpenStack environment. Pretty much any time anything happens in OpenStack, a token is involved. The OpenStack Keystone service is the core service that issues and validates tokens. Users and software clients authenticate via APIs, receive a token, and finally use that token when requesting operations ranging from creating compute resources to allocating storage. Services like Nova or Ceph then validate that token with Keystone and continue on with or deny the requested operation. The following diagram shows a simplified version of this dance.

Screen Shot 2017-12-05 at 12.06.02 pm
Courtesy of the author

Token Types

Tokens come in several types, referred to as “token providers” in Keystone parlance. These types can be set at deployment time, or changed post deployment. Ultimately, you’ll have to decide what works best for your environment, given your organization’s workload in the cloud.

The following types of tokens exist in Keystone:

UUID (Universal Unique Identifier)

The default token provider in Keystone is UUID. This is a 32-character bearer token that must be persisted (stored) across controller nodes, along with its associated metadata, in order to be validated.
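As a rough illustration (Keystone's real provider also persists the token with its metadata, which is exactly the problem discussed below), the token ID itself is essentially a random UUID4 in hex form:

```python
import uuid

token_id = uuid.uuid4().hex  # e.g. '2f6106cef80741c6ae2bfb3f25d70eee'
print(len(token_id))  # 32 -- just random hex characters; all meaning lives in the backing store
```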

PKI & PKIZ (public key infrastructure)

This token format is deprecated as of the OpenStack Ocata release, which means it is deprecated in Red Hat OpenStack Platform 11. This format is also persisted across controller nodes. PKI tokens contain catalog information of the user that bears them, and thus can get quite large, depending on how large your cloud is. PKIZ tokens are simply compressed versions of PKI tokens.


Fernet

Fernet tokens (pronounced fehr:NET) are message-packed tokens that contain authentication and authorization data. Fernet tokens are signed and encrypted before being handed out to users. Most importantly, however, Fernet tokens are ephemeral. This means they do not need to be persisted across clustered systems in order to successfully be validated.

Fernet was originally a secure messaging format created by Heroku. The OpenStack implementation of this lightweight and more API-friendly format was developed by the OpenStack Keystone core team.

The Problem

As you may have guessed by now, the real problem solved by Fernet tokens is one of persistence. Imagine, if you will, the following scenario:

  1. A user logs into Horizon (the OpenStack Dashboard)
  2. User creates a compute instance
  3. User requests persistent storage upon instance creation
  4. User assigns a floating IP to the instance

While this is a simplified scenario, you can clearly see that there are multiple calls to different core components being made. In even the most basic of examples you see at least one authentication, as well as multiple validations along the way. Not only does this require network bandwidth, but when using persistent token providers such as UUID it also requires a lot of storage in Keystone. Additionally, the token table in the database used by Keystone grows as your cloud gets more usage. When using UUID tokens, operators must implement a detailed and comprehensive strategy to prune this table at periodic intervals to avoid real trouble down the line. This becomes even more difficult in a clustered environment.

Photo by Eugenio Mazzone on Unsplash

It’s not only backend components which are affected. In fact, all services that are exposed to users require authentication and authorization. This leads to increased bandwidth and storage usage on one of the most critical core components in OpenStack. If Keystone goes down, your users will know it and you no longer have a cloud in any sense of the word.

Now imagine the impact as you scale your cloud;  the  problems with UUID tokens are dangerously amplified.
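For UUID-based clouds, the usual pruning approach is a periodic run of keystone-manage token_flush to drop expired rows; a sketch of a nightly cron entry (the schedule and log path are illustrative):

```
# /etc/cron.d/keystone-token-flush: expire old token rows at 04:00 every day
0 4 * * * keystone /usr/bin/keystone-manage token_flush >> /var/log/keystone/token-flush.log 2>&1
```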

Benefits of Fernet tokens

Because Fernet tokens are ephemeral, you have the following immediate benefits:

  • Tokens do not need to be replicated to other instances of Keystone in your controller cluster
  • Storage is not affected, as these tokens are not stored

The end result offers increased performance overall. This was the design imperative of Fernet tokens, and the OpenStack community has more than delivered.

Show me the numbers

All of these benefits sound good, but what are the real numbers behind the performance differences between UUID and Fernet? One of the core Keystone developers, Dolph Mathews, wrote a great post about Fernet benchmarks.

Note that these benchmarks are for OpenStack Kilo, so you’ll most likely see even greater performance numbers in newer releases.

The most important benchmarks in Dolph’s post are the ones comparing the various token formats to each other on a globally-distributed Galera cluster. These show the following results using UUID as a baseline:

Token creation performance

Fernet: 50.8 ms (85% faster than UUID); 237.1 (42% faster than UUID)

Token validation performance

Fernet: 5.55 ms (8% faster than UUID); 1957.8 (14% faster than UUID)

As you can see, these numbers are quite remarkable. More informal benchmarks can be found at the CERN OpenStack blog, OpenStack in Production.

Security Implications

Photo by Praveesh Palakeel on Unsplash

One important aspect of using Fernet tokens is security. As these tokens are signed and encrypted, they are inherently more secure than plain text UUID tokens. One really great aspect of this is the fact that you can invalidate a large number of tokens, either during normal operations or during a security incident, by simply changing the keys used to validate them. This requires a key rotation strategy, which I’ll get into in the third part of this series.
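To make the key-dependence property concrete, here is a minimal, illustrative sketch. It is not Keystone's actual implementation (real Fernet tokens, as used by Keystone via the Python cryptography library, are AES-encrypted as well as HMAC-signed); it only signs payloads, to show why dropping a key from the validation set invalidates every token it issued:

```python
import base64
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes

def make_token(key: bytes, payload: bytes) -> bytes:
    # Simplified illustration: sign the payload with the current key.
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(payload + sig)

def validate(keys, token: bytes) -> bool:
    raw = base64.urlsafe_b64decode(token)
    payload, sig = raw[:-DIGEST_SIZE], raw[-DIGEST_SIZE:]
    # A token is valid if any key in the current rotation set signs it.
    return any(
        hmac.compare_digest(hmac.new(k, payload, hashlib.sha256).digest(), sig)
        for k in keys
    )

old_key, new_key = b"0" * 32, b"1" * 32
token = make_token(old_key, b"user:project:expiry")
assert validate([old_key], token)            # valid under the issuing key
assert validate([new_key, old_key], token)   # still valid mid-rotation
assert not validate([new_key], token)        # invalid once old key is dropped
```

During a normal rotation the old key stays in the validation set for a while, so existing tokens keep working; removing it immediately is what invalidates tokens in bulk during a security incident.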

While there are security advantages to Fernet tokens, it must be said they are only as secure as the keys that created them. Keystone creates the tokens with a set of keys in your Red Hat OpenStack Platform environment. Using advanced technologies like SELinux, Red Hat Enterprise Linux is a trusted partner in this equation. Remember, the OS matters.


While OpenStack functions just fine with its default UUID token format, I hope that this article shows you some of the benefits of Fernet tokens. I also hope that you find the knowledge you’ve gained here to be useful, once you decide to move forward to implementing them.

In our follow-up blog post in this series, we’ll be looking at how to enable Fernet tokens in your OpenStack environment — both pre and post-deploy. Finally, our last post will show you how to automate key rotation using Red Hat Ansible in a production environment. I hope you’ll join me along the way.

by Ken Savich, Senior OpenStack Solution Architect at December 07, 2017 07:06 PM

OpenStack Superuser

KubeCon scales up, sticks to script

AUSTIN — More than weird, Austin greeted attendees to KubeCon + CloudNativeCon North America 2017 with a drizzly cold snap at what, for many, is the last conference of the year.

The first morning keynotes tried to keep things on the sunny side with a fairly standard agenda: lots of stats, a few corny puns, a flurry of announcements, one cool demo. (New to the keynote stage: a Taylor Swift GIF.)

Here’s our round-up, stay tuned for more.

Contain your excitement: The numbers

Cloud Native Computing Foundation executive director Dan Kohn took the stage and boosted the energy with some stats. The sold-out conference had 4,000 people attending — more than the previous four events combined. CNCF had a similar growth spurt, going from four to 14 hosted projects. Oh, and of the 1.5 million organizations on GitHub, they're ranked ninth for commits, just behind Linux. (Here's the dashboard.) Kubernetes is the most used CNCF project in production (84 percent), followed by Prometheus (48 percent) and Fluentd (38 percent.)

Kohn also announced a bumper crop of new and upgrading members. He brought Alibaba, a newly minted platinum member, onstage for a case study. Hong Tang, chief architect at Alibaba Cloud, offered some jolt-you-awake stats. The cloud division, founded in 2009, is now the leading cloud provider in China with 47.6 percent market share. Alibaba handled $25.3 billion in sales on the most recent November 11 Singles Day (yeah, that's more than the entire U.S. Thanksgiving-to-Christmas season), logging 325,000 orders per second at peak.

The CNCF also supersized the scholarship program from five in Berlin to 103 in Austin — for a total of $250,000 in funds. Michelle Noorali, senior software engineer at Microsoft Azure, who co-hosted the keynotes, called it the “largest investment in diversity for any conference, ever.” (More on this program to come.)

.@michellenoorali pulling through with some “cheesy #cloudnative metaphors” — you know we love a good GIF 🙌🏼 [LIVE from #KubeCon + #CloudNativeCon]

— CNCF (@CloudNativeFdn) December 6, 2017

“This is the year that companies of all sizes have become engaged in our community,” Kohn concluded, showing off this crazy-crowded map of the CNCF landscape.

Project updates and name bingo

If you’ve been interested in tech long enough to have a smart phone, you’ve probably noticed ‘the name thing.’ All those cool new innovations have to be called something: maybe it’s borrowed from a foreign language, a two-word mashup with emphasis in the non-obvious place or one vowel short with annoying punctuation.

And if your community is growing this fast there are both new people and lots of new projects, making it even more fun. There were so many of them in the project update section of the keynotes, one can only hope a buzzword drinking game is in the works. (Double points if you pronounce them wrong on purpose and no one corrects you.)

Yes, so here are some new and new-to-us project updates.

Three projects reached 1.0 versions:

  • containerd (pronounced con-tay-ner-D) an industry-standard runtime for building container solutions. containerd is already being used by Kubernetes for its cri-containerd project, which enables users to run Kubernetes clusters using containerd as the underlying runtime.
    The project was a new donation to the CNCF, along with the initial release of rktlet (rocket-let), rkt’s Kubernetes Container Runtime Interface (CRI) implementation.
  • Jaeger, which got started at Uber (one of two innovations started at a ride share in the presentation — see Lyft’s contribution with Envoy below), features user interface and usability upgrades for viewing large traces, a new C++ library (in addition to Go, Java, Python, Node) as well as integration with Kubernetes, Prometheus and Envoy. (The pronunciation on this one sounded like YAY-gər, along the lines of the digestif.)

  • Eduardo Silva of TreasureData talked to the crowd about solving logging issues with the latest version of Fluentd, an open-source data collector for unified logging layer. It already has over 50,000 pulls a day and the 1.0 version features multi-process workers, sub-second time resolution, native TLS/SSL support, optimized buffers and better support for Kafka (data streaming) and Windows.

Old projects, new projects and the demo with all the feels

  • Envoy, originally an internal project at Lyft, reached a 1.5 release. It’s an edge and service proxy that makes the network transparent to apps. Built on the shoulders of solutions including NGINX, HAProxy, hardware load balancers and cloud load balancers, Envoy runs alongside every application and abstracts the network by providing common features in a platform-agnostic manner.
  • Kata Containers: Intel’s Imad Sousou introduced this new project, which combines the Intel Clear Containers and Hyper runV technologies to create an open standard for virtualized containers and build a community around it. He called the project a way of “accelerating the digital transformation” as it allows you to pair speed with security.

  • Conduit is a lightweight service mesh built specifically for Kubernetes.

It was introduced to great effect by Oliver Gould, CTO of Buoyant. He won over many a tired/jetlagged participant by stumbling out onstage and asking in a tired voice, “How’s everyone doing? I see a lot of laptops and phones and you’re not really listening…” So he invited the crowd to head over to and pick the graphic that best represented how they were feeling.

He then installed Conduit, which incorporates the many lessons they’ve learned from over 18 months of production service mesh experience with Linkerd, into a running Kubernetes cluster and rolled out a service that took live traffic from the audience. The demo had intentional errors on the web app to illustrate Conduit’s power.





These were the highlights – we’ll add the video with the whole enchilada when it’s available.

Stay tuned for more coverage from Austin.

The post KubeCon scales up, sticks to script appeared first on OpenStack Superuser.

by Nicole Martinelli at December 07, 2017 02:25 PM

Cisco Cloud Blog

Cloud Unfiltered Podcast, Episode 29: The State of OpenStack, and a Few Other Things, with Ben Kepes

He’s an ultra-marathoner, a highly respected tech industry reporter, and an angel investor (he’s not at all fond of that last label for its sheer pretentiousness, but I can’t think of any other way to describe what he does). He works a lot, but doesn’t really work for anyone—except himself. All of which makes Ben […]

by Ali Amagasu at December 07, 2017 02:16 PM

Daniel P. Berrangé

Full coverage of libvirt XML schemas achieved in libvirt-go-xml

In recent times I have been aggressively working to expand the coverage of libvirt XML schemas in the libvirt-go-xml project. Today this work has finally come to a conclusion, when I achieved what I believe to be effectively 100% coverage of all of the libvirt XML schemas. More on this later, but first some background on Go and XML….

For those who aren’t familiar with Go, the core library’s encoding/xml module provides a very easy way to consume and produce XML documents in Go code. You simply define a set of struct types and annotate their fields to indicate what elements & attributes each should map to. For example, given the Go structs:

type Person struct {
    XMLName xml.Name `xml:"person"`
    Name    string   `xml:"name,attr"`
    Age     string   `xml:"age,attr"`
    Home    *Address `xml:"home"`
    Office  *Address `xml:"office"`
}

type Address struct {
    Street string `xml:"street"`
    City   string `xml:"city"`
}
You can parse/format XML documents looking like

<person name="Joe Blogs" age="24">
  <home>
    <street>Some where</street><city>London</city>
  </home>
  <office>
    <street>Some where else</street><city>London</city>
  </office>
</person>

Other programming languages I’ve used required a great deal more work when dealing with XML. For parsing, there’s typically a choice between an XML stream based parser where you have to react to tokens as they’re parsed and stuff them into structs, or a DOM object hierarchy from which you then have to pull data out into your structs. For outputting XML, apps either build up a DOM object hierarchy again, or dynamically format the XML document incrementally. Whichever approach is taken, it generally involves writing a lot of tedious & error prone boilerplate code. In most cases, the Go encoding/xml module eliminates all the boilerplate code, only requiring the data type definitions. This really makes dealing with XML a much more enjoyable experience, because you effectively don’t deal with XML at all! There are some exceptions to this though, as the simple annotations can’t capture every nuance of many XML documents. For example, integer values are always parsed & formatted in base 10, so extra work is needed for base 16. There’s also no concept of unions in Go, or the XML annotations. In these edge cases custom marshaling / unmarshalling methods need to be written. BTW, this approach to XML is also taken for other serialization formats including JSON and YAML too, with one struct field able to have many annotations so it can be serialized to a range of formats.

Back to the point of the blog post, when I first started writing Go code using libvirt it was immediately obvious that everyone using libvirt from Go would end up re-inventing the wheel for XML handling. Thus about 1 year ago, I created the libvirt-go-xml project whose goal is to define a set of structs that can handle documents in every libvirt public XML schema. Initially the level of coverage was fairly light, and over the past year 18 different contributors have sent patches to expand the XML coverage in areas that their respective applications touched. It was clear, however, that taking an incremental approach would mean that libvirt-go-xml is forever trailing what libvirt itself supports. It needed an aggressive push to achieve 100% coverage of the XML schemas, or as near as practically identifiable.

Alongside each set of structs we had also been writing unit tests with a set of structs populated with data, and a corresponding expected XML document. The idea for writing the tests was that the author would copy a snippet of XML from a known good source, and then populate the structs that would generate this XML. In retrospect this was not a scalable approach, because there is an enormous range of XML documents that libvirt supports. A further complexity is that Go doesn’t generate XML documents in exactly the same manner as libvirt. For example, it never generates self-closing tags, instead always outputting a full opening & closing pair. This is semantically equivalent, but makes a plain string comparison of two XML documents impractical in the general case.

Considering the need to expand the XML coverage, and provide a more scalable testing approach, I decided to change approach. The libvirt.git tests/ directory currently contains 2739 XML documents that are used to validate libvirt’s own native XML parsing & formatting code. There is no better data set to use for validating the libvirt-go-xml coverage than this. Thus I decided to apply a round-trip testing methodology. The libvirt-go-xml code would be used to parse the sample XML document from libvirt.git, and then immediately serialize them back into a new XML document. Both the original and new XML documents would then be parsed generically to form a DOM hierarchy which can be compared for equivalence. Any place where documents differ would cause the test to fail and print details of where the problem is. For example:
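The round-trip methodology itself is simple enough to sketch. The real tests are written in Go inside libvirt-go-xml, but as an illustration of the parse → serialize → generic DOM comparison loop, a hypothetical Python version (function names are mine, not from the project) might look like this:

```python
import xml.etree.ElementTree as ET

def trees_equal(a: ET.Element, b: ET.Element) -> bool:
    # Generic DOM comparison: tag, attributes, text and children must match,
    # regardless of how the serializer formatted the document textually.
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    return len(a) == len(b) and all(
        trees_equal(x, y) for x, y in zip(a, b))

def roundtrip_ok(doc: str) -> bool:
    # Parse the sample document, immediately serialize it again, then
    # compare the two DOM hierarchies for structural equivalence.
    original = ET.fromstring(doc)
    regenerated = ET.tostring(original, encoding="unicode")
    return trees_equal(original, ET.fromstring(regenerated))

sample = '<person name="Joe Blogs"><home><city>London</city></home></person>'
assert roundtrip_ok(sample)
```

Comparing DOM hierarchies rather than strings is what makes the approach robust to cosmetic differences such as self-closing tags versus full opening/closing pairs.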

$ go test -tags xmlroundtrip
--- FAIL: TestRoundTrip (1.01s)
	xml_test.go:384: testdata/libvirt/tests/vircaps2xmldata/vircaps-aarch64-basic.xml: \
            /capabilities[0]/host[0]/topology[0]/cells[0]/cell[0]/pages[0]: \
            element in expected XML missing in actual XML

This shows the filename that failed to correctly roundtrip, and the position within the XML tree that didn’t match. Here the NUMA cell topology has a ‘<pages>’ element expected but not present in the newly generated XML. Now it was simply a matter of running the roundtrip test over & over & over & over & over & over & over……….& over & over & over, adding structs / fields for each omission that the test identified.

After doing this for some time, libvirt-go-xml now has 586 structs defined containing 1816 fields, and has certified 100% coverage of all libvirt public XML schemas. Of course when I say 100% coverage, this is probably a lie, as I’m blindly assuming that the libvirt.git test suite has 100% coverage of all its own XML schemas. This is certainly a goal, but I’m confident there are cases where libvirt itself is missing test coverage. So if any omissions are identified in libvirt-go-xml, these are likely omissions in libvirt’s own testing.

On top of this, the XML roundtrip test is set to run in the libvirt Jenkins and Travis CI systems, so as libvirt extends its XML schemas, we’ll get build failures in libvirt-go-xml and thus know to add support there to keep up.

In expanding the coverage of XML schemas, a number of non-trivial changes were made to existing structs defined by libvirt-go-xml. These were mostly in places where we have to handle a union concept defined by libvirt. Typically with libvirt an element will have a “type” attribute, whose value then determines what child elements are permitted. Previously we had been defining a single struct, whose fields represented all possible children across all the permitted type values. This did not scale well and gave the developer no clue what content is valid for each type value. In the new approach, for each distinct type attribute value, we now define a distinct Go struct to hold the contents. This will cause API breakage for apps already using libvirt-go-xml, but on balance it is worth it to get a better structure over the long term. There were also cases where a child XML element previously represented a single value and this was mapped to a scalar struct field. Libvirt then added one or more attributes on this element, meaning the scalar struct field had to turn into a struct field that points to another struct. There is no nice way to make these kinds of changes, so while we endeavour not to gratuitously change current structs, if the libvirt XML schema gains new content, it might trigger further changes in the libvirt-go-xml structs that are not 100% backwards compatible.

Since we are now tracking libvirt.git XML schemas, going forward we’ll probably add tags in the libvirt-go-xml repo that correspond to each libvirt release. So for app developers we’ll encourage use of Go vendoring to pull in a precise version of libvirt-go-xml instead of blindly tracking master all the time.

by Daniel Berrange at December 07, 2017 02:14 PM

December 06, 2017

Stephen Finucane

Building PDFs for OpenStack documentation

I've only ever really worked with HTML and man page builds for the documentation of various OpenStack projects. However, OpenStack uses Sphinx across the board and Sphinx, being the awesome tool that it is, supports many other output formats. In this instance, I was interested in PDF. Sphinx doesn't actually provide a native PDF builder (although other packages do). Instead, you have to generate LaTeX sources and then generate a PDF from them.
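As a rough sketch of what that involves: a Sphinx project's conf.py declares its LaTeX output via the latex_documents setting, then the latex builder writes the sources and a Makefile that drives pdflatex. The names and title below are illustrative, not from any particular project:

```python
# conf.py fragment (illustrative values): describe the LaTeX document
# Sphinx should generate. Each tuple is
# (startdocname, targetname, title, author, theme).
latex_documents = [
    ("index", "projectdoc.tex", "Project Documentation",
     "OpenStack contributors", "manual"),
]
```

With that in place, something like `sphinx-build -b latex doc/source build/latex` followed by `make` in the output directory produces the PDF.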

December 06, 2017 04:03 PM

OpenStack Superuser

Check out these OpenStack project updates

If you’re interested in getting up to speed on what’s next for OpenStack software in the current development cycle, the project update videos from the recent OpenStack Summit Sydney are available now.

In them you’ll hear from the project team leaders (PTLs) and core contributors about what they’ve accomplished, where they’re heading for future releases plus how you can get involved and influence the roadmap.

You can find the complete list of them on the OpenStack video page. You can also get a complete overview of all 60 of them on the project navigator.

Some project updates that you won’t want to miss include:

Heat orchestration

Ironic bare metal provisioning service

Keystone identity service

Octavia load balancer

Swift object storage

Trove database-as-a-service

Vitrage root cause analysis (RCA)

And check out these more detailed write-ups of Kuryr, Magnum, Nova and Neutron.

The post Check out these OpenStack project updates appeared first on OpenStack Superuser.

by Superuser at December 06, 2017 01:29 PM


SUSE’s First Solution Partner in APJ

Aptira SUSE partnership

Becoming a Solution Partner in the SUSE Partner Program is no mean feat – it is a status awarded to select partners. As the highest partner tier in the Program, it recognises partners who have the deep technical expertise and commitment to building only the best solutions that provide maximum efficiency and high availability to demanding enterprise business clients.

“Aptira embodies what we are looking for in a Solution Partner”, says Mark Salter, VP Channels for SUSE. “They have superior technical knowhow and we are impressed with their commitment to providing their customers with the best solution possible with no vendor lock-in, which is a philosophy that SUSE, as the open, open source company, also subscribes to.”

“We like to partner with like-minded companies. In SUSE, we have a mutually complementary portfolio, and a joint desire to concentrate on offerings that are commercially better for customers”, says Tristan Goode, Founder, CEO and Board Director of Aptira. “Becoming SUSE’s first Solution Partner in APJ for Storage shows that we do much more than just OpenStack. We also offer a full range of technical services from consulting, solution delivery, systems integration through to managed services and support”.

Read more about our partnership with SUSE on the SUSE blog.

The post SUSE’s First Solution Partner in APJ appeared first on Aptira Cloud Solutions.

by Jessica Field at December 06, 2017 01:14 PM

December 05, 2017

Chris Dent

TC Report 49

After last week's rather huge TC Report, I'll keep this one short, reporting on some topics that came up in the #openstack-tc IRC channel over the last week.

Interop Tests and Tempest Plugins

There's a review in process attempting to clarify testing for interop programs. It's somewhat stuck and needs additional input from any community members who are interested in or concerned about interop testing and tempest plugins.

For a bit more context, there was some discussion on Wednesday.

Kata Containers

Today, OpenStack got a sibling project managed by the Foundation, Kata Containers (there's a press release). It provides a way of doing "extremely lightweight virtual machines" that can work within a container ecosystem (such as Kubernetes).

The expansion of the Foundation was talked about at the summit in Sydney, but having something happen this quickly was a bit of a surprise, leading to some questions in IRC today. Jonathan Bryce showed up to help answer them.

Turns out this was all above board, but some communication had been dropped.

by Chris Dent at December 05, 2017 08:00 PM

OpenStack Superuser

Building out OpenStack’s integration engine

The demand for open infrastructure is expanding rapidly, a building boom driven by new use cases. It’s as if we started out seven years ago with a starter set of Lego bricks and now people are building CERN’s Atlas detector with it.  A few of these exciting new uses include NFV, internet of things, edge computing, container workloads, serverless architecture and machine learning/artificial intelligence.

With 85,000 community members and an average of 220 merged patches every week, OpenStack is already one of the largest and most active open source communities in history. Over the past year, we’ve been working on steps to ensure the Foundation is meeting this unprecedented demand and serving our vast and diverse user base.

We helped devise the original building blocks for thousands of clouds, and we want to understand how to serve communities as they construct the future of infrastructure. The biggest problem in open source today is not innovation, it’s integration. How all of these projects fit together (or often how they don’t – no matter how hard you hammer them) is what causes the most headaches for users — and often leads them back to buying closed, proprietary products out of frustration. 

Recognizing that pain point, the OpenStack Foundation plans to host new open source projects that tightly align with OpenStack’s mission to build open infrastructure, without necessarily becoming part of the original set of OpenStack datacenter cloud projects. Previously, we added functionality for new use cases and verticals directly within OpenStack, however we learned that approach ultimately diluted the meaning of OpenStack and caused confusion about which services were production-ready and which ones were emerging. This approach (sometimes called “the Big Tent”) was centered around innovation, but success in open source today also requires integration.

This week at KubeCon, we are launching Kata Containers, a new project that combines the Intel Clear Containers and Hyper runV technologies to create an open standard for virtualized containers and build a community around it. We view the technology as lower level container infrastructure that is designed to plug seamlessly into the container ecosystem.

The OpenStack Foundation plans to continue investing in building this integration for open infrastructure. We’re following a four-pronged strategy for these investments: develop and document joint use cases, support cross-community contributions, host new projects for the “glue” that ties core components together, and implement cross-community testing.

What’s next?

We’ve already been laying the foundation for this new structure in a number of ways. This base will strengthen the ties we’ve been forging with adjacent communities over the last three years. Some ways we’ve been building it include collaboration between components and communities, like the work that is happening between the OpenStack and Kubernetes communities this week at KubeCon. We have also introduced new projects focused on practicalities of deploying software together with the recent addition of OpenStack-Helm and OpenStack LOCI and prioritized testing end-to-end systems, supporting community initiatives like OpenLab.

Through all these activities, we intend to strengthen relationships and integration testing across open source projects, including those hosted at the Linux Foundation, Apache Software Foundation and others. We believe in the power of open source and that open-source organizations should lead by example in demonstrating this collaboration.

We want to hear from you as we decide where to focus our efforts and engage our community resources. Together, we’re building an impressive future for open infrastructure.

Cover Photo // CC by NC

The post Building out OpenStack’s integration engine appeared first on OpenStack Superuser.

by Jonathan Bryce at December 05, 2017 03:00 PM

December 04, 2017

OpenStack Superuser

Working together: OpenStack and Kubernetes

When it comes to building a thriving, healthy environment, collaboration is the key to success. In an open source software community, this may include the coordination among thousands of developers.

While working together across open source communities goes way back, the OpenStack Foundation has been spearheading efforts to extend the work of its community and enable them to better collaborate with others. Jonathan Bryce, executive director of the OpenStack Foundation has said that this request is really from OpenStack end users who rely on different open source technologies to deliver business value to their organizations.

Collaboration was also a theme at the recent Open Source Summit, where Jim Zemlin, executive director of the Linux Foundation echoed Bryce’s sentiment on collaboration saying that the ecosystem must collaborate in order to meet the explosive growth of open source adoption.

This week at KubeCon North America, there are several opportunities to not only learn how you can integrate OpenStack with Kubernetes, but also how you can get more plugged into groups who are focused on the cross-community collaboration between the two communities.

The OpenStack special interest group (SIG) coordinates improvements to and documentation of the OpenStack cloud provider implementation in Kubernetes as well as supporting efforts to deploy OpenStack itself using Kubernetes. Chris Hoge and Stephen Gordon will provide a SIG OpenStack update on Thursday afternoon and on Friday afternoon, they will lead a deep dive discussion to openly collaborate on the SIG’s future plans. 

Another session discussing the integration of OpenStack and Kubernetes will be a demo led by John Griffith, who first took the stage at the OpenStack Summit Barcelona to demo how to containerize Cinder. At KubeCon, Griffith will demonstrate a simplified deployment architecture by integrating Containerized standalone Cinder services with bare metal Kubernetes.

Want to get involved? You’ll find OpenStack in the sponsor expo at booth G16 and stop by one of the SIG sessions to start collaborating.

Cover photo CC BY NC

The post Working together: OpenStack and Kubernetes appeared first on OpenStack Superuser.

by Allison Price at December 04, 2017 07:58 PM

OpenStack Blog

Developer Mailing List Digest November 25 to December 1st


  • Registration for the Project Team Gathering (PTG) in Dublin is live [0]

Community Summaries

  • TC report by Chris Dent [0]
  • Release countdown [1]
  • Technical Committee Status updated [2]
  • POST /api-sig/news [3]
  • Nova notification update [4]
  • Nova placement resource providers update [5]

Dublin PTG Format

We will continue themes as we did in Denver (Monday-Tuesday), but with shorter slots such as half days. Flexibility is added for other groups to book the remaining available rooms in 90-minute slots on demand, driven by the PTG Bot (Wednesday-Friday).

First Contact SIG

A wiki has been created for the group [0]. The group is looking for interested people to be points of contact for newcomers in specified time zones. Resource links like the contributor portal, mentoring wiki, Upstream Institute and Outreachy are being collected on the wiki page. A representative from the operator side to chair and represent the group would be good.

Policy Goal Queens-2 Update

With Queens-2 coming to a close, we recap our community-wide goal for policies [0]. If you want your status changed, contact Lance Bragstad. Use the topic policy-and-docs-in-code for tracking related code changes.

Not Started

  • openstack/ceilometer
  • openstack/congress
  • openstack/networking-bgpvpn
  • openstack/networking-midonet
  • openstack/networking-odl
  • openstack/neutron-dynamic-routing
  • openstack/neutron-fwaas
  • openstack/neutron-lib
  • openstack/solum
  • openstack/swift

In Progress

  • openstack/barbican
  • openstack/cinder
  • openstack/cloudkitty
  • openstack/glance
  • openstack/heat
  • openstack/manila
  • openstack/mistral
  • openstack/neutron
  • openstack/panko
  • openstack/python-heatclient
  • openstack/tacker
  • openstack/tricircle
  • openstack/trove
  • openstack/vitrage
  • openstack/watcher
  • openstack/zaqar


Completed

  • openstack/designate
  • openstack/freezer
  • openstack/ironic
  • openstack/keystone
  • openstack/magnum
  • openstack/murano
  • openstack/nova
  • openstack/octavia
  • openstack/sahara
  • openstack/searchlight
  • openstack/senlin
  • openstack/zun

Tempest Plugin Split Goal

A list of open reviews [0] is available for the Tempest plugin split goal [1].

Not Started

  • Congress
  • ec2-api
  • freezer
  • mistral
  • monasca
  • senlin
  • tacker
  • Telemetry
  • Trove
  • Vitrage

In Progress

  • Cinder
  • Heat
  • Ironic
  • magnum
  • manila
  • Neutron
  • murano
  • networking-l2gw
  • octavia


Completed

  • Barbican
  • CloudKitty
  • Designate
  • Horizon
  • Keystone
  • Kuryr
  • Sahara
  • Solum
  • Tripleo
  • Watcher
  • Winstackers
  • Zaqar
  • Zun

by Mike Perez at December 04, 2017 05:07 PM


Enhancing OpenStack Swift to support edge computing context

As the trend continues to move towards Serverless Computing, Edge Computing and Functions as a Service (FaaS), the need for a storage system that can adapt to these architectures grows ever bigger. In a scenario where smart cars have to make decisions on a whim, there is no time for the car to ask a data center what to do. Such scenarios are a driver for new storage solutions in more distributed architectures. In our work, we have been considering a scenario in which a distributed storage solution exposes different local endpoints to applications distributed over a mix of cloud and local resources; such applications can give the storage infrastructure an indicator of the nature of the data, which can then be used to determine where it should be stored. For example, data could be considered either latency-sensitive (in which case the storage system should try to store it as locally as possible) or loss-sensitive (in which case the storage system should ensure it is on reliable storage).

Because Object Storage is a good fit for the idea of FaaS (see here), we decided to use OpenStack Swift – with which we had some experience – and make some modifications to support an edge computing context. The way we envisioned Swift to work in this scenario is that there are two Swift instances, one being the local or edge deployment and the other being the remote or data center deployment, each of which offers a local endpoint to a distributed application: functionality running on the edge can communicate with the edge Swift instance; functionality running in the cloud can access the cloud Swift instance. When data is latency-sensitive it is usually stored on the edge, and when it is to be more persistently saved it is pushed to the data center, where there are more resources and storage space to appropriately save the data.

Approach 1: modifying the proxy server

The first approach we considered was to modify the Swift proxy server so that it can distinguish between the different types of data. The path to an object in Swift is http://localhost:8080/v1/AUTH_test/container/object, and we used this path to make the data distinguishable. At first we thought about adding metadata to an object on upload to identify where the object should be stored. This didn’t work for us, as we had trouble parsing the metadata at the point where we implemented the remote-versus-local distinction. So instead, the approach relies on the container name containing an indicator that the object is to be pushed remote from the current Swift storage: the prefix 'remote_' is added to the container name. This was implemented in the proxy server by parsing the path of the object and looking for the indicator. A call to the remote Swift therefore looks something like this: http://localhost:8080/v1/AUTH_test/remote_container/object

The changes have been made in /swift/swift/proxy/, which is the entry point to the whole Swift environment. We implemented our changes in the get_controller method, as that is where the controllers for the account, container and object are instantiated, and hence where the code needs to make the distinction before the object is stored locally.
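A minimal sketch of the kind of path check described above (the helper names here are illustrative, not the project's actual patch to Swift):

```python
# Hedged sketch of the path-based remote/local check; split_path and
# REMOTE_PREFIX are illustrative, not Swift's actual code.

REMOTE_PREFIX = 'remote_'

def split_path(path):
    """'/v1/AUTH_test/remote_container/object' -> (version, account, container, obj)."""
    parts = path.lstrip('/').split('/', 3)
    parts += [None] * (4 - len(parts))
    return tuple(parts)

def is_remote(path):
    """Return True when the container name carries the 'remote_' indicator."""
    _, _, container, _ = split_path(path)
    return bool(container) and container.startswith(REMOTE_PREFIX)

print(is_remote('/v1/AUTH_test/remote_container/object'))  # True
print(is_remote('/v1/AUTH_test/container/object'))         # False
```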


The test environment consists of three VMs, each configured with:

  • Ubuntu 16.04
  • 4GB RAM
  • 2 VCPU
  • 40GB disk space

The VMs are meant to simulate three different setups:

  • A device that accesses the Swift storage
  • An edge “micro data center” that represents limited storage capacity but low latency
  • And the cloud, which represents higher storage capacity and redundancy, but higher latency

With tc qdisc we added latency between the VMs to approximate the round-trip delay we expect between these machines.
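For illustration, latency like this can be injected with netem; the interface name and the 20 ms value below are assumptions, not the post's actual figures:

```shell
DEV=eth0   # placeholder interface name
# requires root; skip gracefully when not privileged
if [ "$(id -u)" -eq 0 ]; then
  tc qdisc add dev "$DEV" root netem delay 20ms  # add artificial latency
  tc qdisc show dev "$DEV"                       # confirm the netem qdisc
  tc qdisc del dev "$DEV" root                   # remove it again
else
  echo "run as root to apply the netem delay on $DEV"
fi
```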

The tests were run from the device VM by making HTTP requests with curl, first uploading files ranging in size from 10 kB to 100 MB.

First, the uploads – using an HTTP PUT with curl – were done without the remote indicator, so that all the files were stored on the edge (blue line). Afterwards, the uploads were done with the remote indicator, so that Swift automatically pushed the files on to the cloud (red line).

The objects were then retrieved using HTTP GET with curl, fetching each file from the edge and from the cloud alike.
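The test requests would look roughly like this (the token and file names are hypothetical; the endpoints follow the post's 'remote_' container-name convention, and a reachability guard makes the sketch safe to run without a live proxy):

```shell
SWIFT=http://localhost:8080/v1/AUTH_test
TOKEN="AUTH_tk_example"   # hypothetical auth token
if curl -s -o /dev/null --max-time 2 "$SWIFT"; then
  # store on the edge (no indicator)
  curl -X PUT -H "X-Auth-Token: $TOKEN" --data-binary @file10k \
    "$SWIFT/container/file10k"
  # push to the cloud (remote_ indicator)
  curl -X PUT -H "X-Auth-Token: $TOKEN" --data-binary @file10k \
    "$SWIFT/remote_container/file10k"
  # retrieve the edge copy
  curl -X GET -H "X-Auth-Token: $TOKEN" -o file10k.out \
    "$SWIFT/container/file10k"
else
  echo "no Swift proxy reachable at $SWIFT"
fi
```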

As the graphs quite clearly show, this approach is slow and not viable in a production deployment. It scales very badly and is nowhere near useful for daily use. Both the PUT and the GET are very slow, but the GET scaled so much worse than the upload that we had to scrap this idea, because we didn’t believe that optimizing our code would help much.

Approach 2: write a middleware

As the results from the modified proxy server show that this approach is not viable, we decided to try something more conventional in the Swift universe: writing a middleware for Swift to handle the redirects. Middlewares in Swift are comparable to Lego pieces that can be plugged into the pipeline, which defines the code flow of Swift, and can be used to add extra functionality such as access control, authentication, logging, etc. The default pipeline of Swift looks like this (it can be found in the configuration file of the proxy server):

# This sample pipeline uses tempauth and is used for SAIO dev work and
# testing.
pipeline = catch_errors gatekeeper healthcheck proxy-logging cache container_sync bulk tempurl ratelimit crossdomain authtoken keystoneauth tempauth formpost staticweb copy container-quotas account-quotas slo dlo versioned_writes proxy-logging proxy-server

When writing your own middleware you have to add it to the pipeline and add its configuration. We advise you to place your middleware's .py file in the /swift/swift/common/middleware directory, and to add the entry point of your middleware to the file /swift/swift.egg-info/entry-points.txt like this:

redirectmiddleware = swift.common.middleware.redirectmiddleware:redirect_factory
After adding all the files of your middleware and its configuration, restart Swift and your middleware should be running in the Swift pipeline. For more info on how to write a Swift middleware, the following two links are recommended: Example with ClamAV and the OpenStack documentation on Swift middleware.
The way our redirect middleware works is that, similar to the first approach, it looks for the remote indicator, but this time everything happens inside the new middleware and there is no need to change the working Swift code. That is exactly the purpose of middlewares: introducing something into the Swift code flow without touching code that already exists. When the remote indicator is present, the middleware returns a 307 HTTP response with the URL of the remote Swift installation.
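A minimal, self-contained WSGI sketch of that idea (the class name, the factory and REMOTE_URL are assumptions for illustration, not the project's actual code):

```python
# Hedged sketch of the 307-redirect middleware described above; all names
# (RedirectMiddleware, redirect_factory, REMOTE_URL) are illustrative.

REMOTE_PREFIX = 'remote_'
REMOTE_URL = 'http://cloud-swift:8080'  # assumed remote Swift endpoint

class RedirectMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        parts = path.lstrip('/').split('/')
        # path layout: /v1/<account>/<container>/<object>
        container = parts[2] if len(parts) > 2 else ''
        if container.startswith(REMOTE_PREFIX):
            # tell the client to replay the request against the cloud Swift
            start_response('307 Temporary Redirect',
                           [('Location', REMOTE_URL + path)])
            return [b'']
        return self.app(environ, start_response)

def redirect_factory(global_conf, **local_conf):
    """Paste-deploy-style factory returning the wrapped app."""
    def factory(app):
        return RedirectMiddleware(app)
    return factory
```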
The biggest issue with running a middleware like this is authentication on the remote machine. The redirect works, but the client will try to use the same auth_token that it used on the local Swift. The workaround we use at the moment is that the authentication token is hard-coded to be the same in both Swift environments; not a good solution, but it makes performance tests a whole lot easier.


The test setup was the same, with the only difference being that the middleware was added to Swift instead of the proxy server being modified. As the graphs show, the performance is significantly better than with the previous approach.

The results for remote uploads scale far more closely in line with the local uploads than in the first approach, which makes it more realistic to imagine this approach being used in a daily scenario. Of course the performance could improve, and the authentication issue is not yet solved, but we believe this to be the better solution, especially compared to the first approach.

The HTTP GET doesn’t scale as well as the PUT: the time gap between remote and local downloads grows with object size. The download time for bigger files is significantly longer for remote files than for local ones, yet still a lot better than with the previous approach.

Lessons learnt

After spending quite some time looking into the inner workings of Swift, we learnt a lot about how it works and handles requests. We can also say that the Swift code itself should not be touched or changed in a deployed or even a test environment, as it is very intricate and has a lot of components. If you want to experiment with changes to Swift's code, we advise you to create a simple VM to test your changes and get comfortable with how Swift works (a tutorial on creating a Swift VM will appear on this blog very soon). If you want to introduce new functionality into Swift, writing a middleware is highly advised: it will make your life a lot easier, and middlewares are intended to be used in exactly that way.

Future direction

One thing that could be looked into is a solution where Swift takes data and independently decides whether an object is to be stored locally or remotely, without any user input or indicator. Another thing worth investigating is the possibility of using metadata to label objects instead of indicators in the container names. The last issue to resolve is the authentication problem that at the moment doesn’t allow the redirect to complete.

by anke at December 04, 2017 12:47 PM

December 02, 2017


Australia vs the Rest of the World: A discussion on how OpenStack compares down under.

At the OpenStack Summit in Sydney last month, Aptira’s COO and Inventor of Solutionauting Roland Chan, and Project Manager John Spillane joined representatives from Telstra, SUSE and NeCTAR to discuss how OpenStack adoption in Australia compares to the rest of the world. They covered:

  • The Amazon effect on open source cloud business models.
  • OpenStack and the Edge: how much hype, how much substance?
  • The largest challenges facing the OpenStack community today.
  • The cultural and organizational changes needed to drive innovation.
  • Whether Australia is progressive or conservative in cloud technology adoption.

The full panel video can be viewed below:

The post Australia vs the Rest of the World: A discussion on how OpenStack compares down under. appeared first on Aptira Cloud Solutions.

by Jessica Field at December 02, 2017 12:15 PM

December 01, 2017

Lee Yarwood

OpenStack TripleO FFU Nova Demo N to Q

Update 04/12/17: The initial deployment documented in this demo no longer works due to the removal of a number of plan migration steps that have now been promoted into the Queens repos. We are currently looking into ways to reintroduce these for use in master UC / Newton OC FFU development deployments; until then, anyone attempting to run through this demo should start with a Newton OC and UC before upgrading the UC to master.

This is another TripleO fast-forward upgrade demo post, this time focusing on a basic stack of Keystone, Glance, Cinder, Neutron and Nova. At present there are several workarounds still required to allow the upgrade to complete, please see the workaround sections for more details.


As with the original demo I’m still using tripleo-quickstart to deploy my initial environment, this time with 1 controller and 1 compute, with a Queens undercloud and Newton overcloud. In addition I’m also using a new general config to deploy a minimal control stack able to host Nova.

$ bash -w $WD -t all -R master-undercloud-newton-overcloud  \
   -c config/general_config/minimal-nova.yml $VIRTHOST

UC - docker_registry.yaml

Again with this demo we are not caching containers locally, the following command will create a docker_registry.yaml file referencing the RDO registry for use during the final deployment of the overcloud to Queens:

$ ssh -F $WD/ssh.config.ansible undercloud
$ openstack overcloud container image prepare \
  --namespace \
  --tag tripleo-ci-testing \
  --output-env-file ~/docker_registry.yaml

UC - tripleo-heat-templates

We then need to update the version of tripleo-heat-templates deployed on the undercloud host:

$ ssh -F $WD/ssh.config.ansible undercloud
$ cd /home/stack/tripleo-heat-templates
$ git fetch git:// refs/changes/19/518719/9 && git checkout FETCH_HEAD

Finally, as we are using a customised controller role the following services need to be added to the overcloud_services.yml file on the undercloud node under ControllerServices:

       - OS::TripleO::Services::Docker
       - OS::TripleO::Services::Iscsid
       - OS::TripleO::Services::NovaPlacement

UC - tripleo-common

At present we are waiting for a promotion of tripleo-common that includes various bugfixes for updating the overcloud stack, generating outputs, etc. For the time being we can simply install directly from master to work around these issues.

$ ssh -F $WD/ssh.config.ansible undercloud
$ git clone ; cd tripleo-common
$ sudo python install ; cd ~

OC - Update heat-agents

As documented in my previous demo post, we need to remove any legacy hieradata from all overcloud hosts prior to updating the heat stack:

$ sudo rm -f /usr/libexec/os-apply-config/templates/etc/puppet/hiera.yaml \
             /usr/libexec/os-refresh-config/configure.d/40-hiera-datafiles \

We also need to update the heat-agents on all nodes to their Ocata versions:

$ git clone ; cd tripleo-repos
$ sudo python install
$ sudo tripleo-repos -b ocata current
$ sudo yum update -y python-heat-agent \
$ sudo yum install -y openstack-heat-agents \
                      python-heat-agent-ansible \
                      python-heat-agent-apply-config \
                      python-heat-agent-docker-cmd \
                      python-heat-agent-hiera \

OC - Workarounds #1

$ sudo yum remove openstack-ceilometer* -y

UC - Update stack outputs

With the workarounds in place we can now update the stack using the updated version of tripleo-heat-templates on the undercloud. Once again we need to use the original deploy command with a number of additional environment files included:

$ . stackrc
$ openstack overcloud deploy \
  --templates /home/stack/tripleo-heat-templates \
  -e /home/stack/docker_registry.yaml \
  -e /home/stack/tripleo-heat-templates/environments/docker.yaml \
  -e /home/stack/tripleo-heat-templates/environments/fast-forward-upgrade.yaml \
  -e /home/stack/tripleo-heat-templates/environments/noop-deploy-steps.yaml

UC - Download config

Once the stack has been updated we can download the config with the following command:

$ . stackrc
$ openstack overcloud config download
The TripleO configuration has been successfully generated into: /home/stack/tripleo-Oalkee-config

UC - FFU and Upgrade plays

Before running through any of the generated playbooks I personally like to add the profile_tasks callback to the callback_whitelist for Ansible within /etc/ansible/ansible.cfg. This provides timestamps during the playbook run and a summary of the slowest tasks at the end.

# enable callback plugins, they can output to stdout but cannot be 'stdout' type.
callback_whitelist = profile_tasks

We first run the fast_forward_upgrade_playbook to complete the upgrade to Pike:

$ . stackrc
$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory \
PLAY RECAP *****************************************************************************************************************************              : ok=62   changed=8    unreachable=0    failed=0              : ok=123  changed=55   unreachable=0    failed=0   

Friday 01 December 2017  20:39:58 +0000 (0:00:03.967)       0:06:16.615 ******* 
Stop neutron_server ------------------------------------------------------------------------------------------------------------ 32.53s
stop openstack-cinder-volume --------------------------------------------------------------------------------------------------- 16.03s
Stop neutron_l3_agent ---------------------------------------------------------------------------------------------------------- 14.36s
Stop and disable nova-compute service ------------------------------------------------------------------------------------------ 13.16s
Cinder package update ---------------------------------------------------------------------------------------------------------- 12.73s
stop openstack-cinder-scheduler ------------------------------------------------------------------------------------------------ 12.68s
Setup cell_v2 (sync nova/cell DB) ---------------------------------------------------------------------------------------------- 11.79s
Cinder package update ---------------------------------------------------------------------------------------------------------- 11.30s
Neutron package update --------------------------------------------------------------------------------------------------------- 10.99s
Keystone package update -------------------------------------------------------------------------------------------------------- 10.77s
glance package update ---------------------------------------------------------------------------------------------------------- 10.28s
Keystone package update --------------------------------------------------------------------------------------------------------- 9.80s
glance package update ----------------------------------------------------------------------------------------------------------- 8.72s
Neutron package update ---------------------------------------------------------------------------------------------------------- 8.62s
Stop and disable nova-consoleauth service --------------------------------------------------------------------------------------- 7.94s
Update nova packages ------------------------------------------------------------------------------------------------------------ 7.62s
Update nova packages ------------------------------------------------------------------------------------------------------------ 7.24s
Stop and disable nova-scheduler service ----------------------------------------------------------------------------------------- 6.36s
Run puppet apply to set tranport_url in nova.conf ------------------------------------------------------------------------------- 5.78s
install tripleo-repos ----------------------------------------------------------------------------------------------------------- 4.70s

We then run the upgrade_steps_playbook to start the upgrade to Queens:

$ . stackrc
$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory \
PLAY RECAP *****************************************************************************************************************************              : ok=57   changed=45   unreachable=0    failed=0              : ok=165  changed=146  unreachable=0    failed=0   

Friday 01 December 2017  20:51:55 +0000 (0:00:00.038)       0:10:47.865 ******* 
Update all packages ----------------------------------------------------------------------------------------------------------- 263.71s
Update all packages ----------------------------------------------------------------------------------------------------------- 256.79s
Install docker packages on upgrade if missing ---------------------------------------------------------------------------------- 13.77s
Upgrade os-net-config ----------------------------------------------------------------------------------------------------------- 5.71s
Upgrade os-net-config ----------------------------------------------------------------------------------------------------------- 5.12s
Gathering Facts ----------------------------------------------------------------------------------------------------------------- 3.36s
Install docker packages on upgrade if missing ----------------------------------------------------------------------------------- 3.14s
Stop and disable mysql service -------------------------------------------------------------------------------------------------- 1.97s
Check for os-net-config upgrade ------------------------------------------------------------------------------------------------- 1.66s
Check for os-net-config upgrade ------------------------------------------------------------------------------------------------- 1.57s
Stop keepalived service --------------------------------------------------------------------------------------------------------- 1.48s
Stop and disable rabbitmq service ----------------------------------------------------------------------------------------------- 1.47s
take new os-net-config parameters into account now ------------------------------------------------------------------------------ 1.31s
take new os-net-config parameters into account now ------------------------------------------------------------------------------ 1.08s
Check if openstack-ceilometer-compute is deployed ------------------------------------------------------------------------------- 0.70s
Check if iscsid service is deployed --------------------------------------------------------------------------------------------- 0.67s
Start keepalived service -------------------------------------------------------------------------------------------------------- 0.48s
Check for nova placement running under apache ----------------------------------------------------------------------------------- 0.46s
Stop and disable mongodb service on upgrade ------------------------------------------------------------------------------------- 0.45s
remove old cinder cron jobs ----------------------------------------------------------------------------------------------------- 0.45s

OC - Workarounds #2

On overcloud-novacompute-0 the following file needs to be removed to workaround a known issue:

$ ssh -F $WD/ssh.config.ansible overcloud-novacompute-0
$ sudo rm /etc/iscsi/.initiator_reset

UC - Deploy play

Finally we run through the deploy_steps_playbook:

$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory \
PLAY RECAP *****************************************************************************************************************************              : ok=48   changed=11   unreachable=0    failed=0              : ok=76   changed=10   unreachable=0    failed=0   
localhost                  : ok=1    changed=0    unreachable=0    failed=0   

Friday 01 December 2017  21:04:58 +0000 (0:00:00.041)       0:10:24.723 ******* 
Run docker-puppet tasks (generate config) ------------------------------------------------------------------------------------- 186.65s
Run docker-puppet tasks (bootstrap tasks) ------------------------------------------------------------------------------------- 101.10s
Start containers for step 3 ---------------------------------------------------------------------------------------------------- 98.61s
Start containers for step 4 ---------------------------------------------------------------------------------------------------- 41.37s
Run puppet host configuration for step 1 --------------------------------------------------------------------------------------- 32.53s
Start containers for step 1 ---------------------------------------------------------------------------------------------------- 25.76s
Run puppet host configuration for step 5 --------------------------------------------------------------------------------------- 17.91s
Run puppet host configuration for step 4 --------------------------------------------------------------------------------------- 14.47s
Run puppet host configuration for step 3 --------------------------------------------------------------------------------------- 13.41s
Run docker-puppet tasks (bootstrap tasks) -------------------------------------------------------------------------------------- 10.39s
Run puppet host configuration for step 2 --------------------------------------------------------------------------------------- 10.37s
Start containers for step 5 ---------------------------------------------------------------------------------------------------- 10.12s
Run docker-puppet tasks (bootstrap tasks) --------------------------------------------------------------------------------------- 9.78s
Start containers for step 2 ----------------------------------------------------------------------------------------------------- 6.32s
Gathering Facts ----------------------------------------------------------------------------------------------------------------- 4.37s
Gathering Facts ----------------------------------------------------------------------------------------------------------------- 3.46s
Write the config_step hieradata ------------------------------------------------------------------------------------------------- 1.80s
create libvirt persistent data directories -------------------------------------------------------------------------------------- 1.21s
Write the config_step hieradata ------------------------------------------------------------------------------------------------- 1.03s
Check if /var/lib/docker-puppet/docker-puppet-tasks4.json exists ---------------------------------------------------------------- 1.00s


I’ll revisit this in the coming days and add a more complete set of tasks to verify the end environment, but for now we can run a simple boot-from-volume instance (as Swift, the default store for Glance, was not installed):

$ cinder create 1
$ cinder set-bootable 46d278f7-31fc-4e45-b5df-eb8220800b1a true
$ nova flavor-create 1 1 512 1 1
$ nova boot --boot-volume 46d278f7-31fc-4e45-b5df-eb8220800b1a --flavor 1 test 
$ nova list
| ID                                   | Name | Status | Task State | Power State | Networks          |
| 05821616-1239-4ca9-8baa-6b0ca4ea3a6b | test | ACTIVE | -          | Running     | priv= |

We can also see the various containerised services running on the overcloud:

$ ssh -F $WD/ssh.config.ansible overcloud-controller-0
$ sudo docker ps
CONTAINER ID        IMAGE                                                                                             COMMAND                  CREATED             STATUS                      PORTS               NAMES
d80d6f072604                  "kolla_start"            13 minutes ago      Up 12 minutes (healthy)                         glance_api
61fbf47241ce                    "kolla_start"            13 minutes ago      Up 13 minutes                                   nova_metadata
9defdb5efe0f                    "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         nova_api
874716d99a44             "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         nova_vnc_proxy
21ca0fd8d8ec              "kolla_start"            13 minutes ago      Up 13 minutes                                   neutron_api
e0eed85b860a               "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         cinder_volume
0882e08ac198            "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         nova_consoleauth
e3ebc4b066c9                    "kolla_start"            13 minutes ago      Up 13 minutes                                   nova_api_cron
c7d05a04a8a3                  "kolla_start"            13 minutes ago      Up 13 minutes                                   cinder_api_cron
2f3c1e244997   "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         neutron_ovs_agent
bfeb120bf77a      "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         neutron_metadata_agent
43b2c09aecf8              "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         nova_scheduler
a7a3024b63f6          "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         neutron_dhcp
3df990a68046            "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         cinder_scheduler
94461ba833aa            "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         neutron_l3_agent
4bee34f9fce2                  "kolla_start"            13 minutes ago      Up 13 minutes                                   cinder_api
e8bec9348fe3              "kolla_start"            13 minutes ago      Up 13 minutes (healthy)                         nova_conductor
22db40c25881                    "/bin/bash -c '/usr/l"   15 minutes ago      Up 15 minutes                                   keystone_cron
26769acaaf5e                    "kolla_start"            16 minutes ago      Up 16 minutes (healthy)                         keystone
99037a5e5c36                      "kolla_start"            16 minutes ago      Up 16 minutes                                   iscsid
9f4aae72c201          "kolla_start"            16 minutes ago      Up 16 minutes                                   nova_placement
311302abc297                     "kolla_start"            16 minutes ago      Up 16 minutes                                   horizon
d465e4f5b7e6                     "kolla_start"            17 minutes ago      Up 17 minutes (unhealthy)                       mysql
b9e062f1d857                    "kolla_start"            18 minutes ago      Up 18 minutes (healthy)                         rabbitmq
a57f053afc03                   "/bin/bash -c 'source"   18 minutes ago      Up 18 minutes                                   memcached
baeb6d1087e6                       "kolla_start"            18 minutes ago      Up 18 minutes                                   redis
faafa1bf2d2e                     "kolla_start"            18 minutes ago      Up 18 minutes                                   haproxy
$ exit
$ ssh -F $WD/ssh.config.ansible overcloud-novacompute-0
$ sudo docker ps
CONTAINER ID        IMAGE                                                                                             COMMAND             CREATED             STATUS                    PORTS               NAMES
0363d7008e87   "kolla_start"       12 minutes ago      Up 12 minutes (healthy)                       neutron_ovs_agent
c1ff23ee9f16                        "kolla_start"       12 minutes ago      Up 12 minutes                                 logrotate_crond
d81d8207ec9a                "kolla_start"       12 minutes ago      Up 12 minutes                                 nova_migration_target
abd9b79e2af8          "kolla_start"       12 minutes ago      Up 12 minutes                                 ceilometer_agent_compute
aa581489ac9a                "kolla_start"       12 minutes ago      Up 12 minutes (healthy)                       nova_compute
d4ade28175f0                      "kolla_start"       14 minutes ago      Up 14 minutes                                 iscsid
ae4652853098                "kolla_start"       14 minutes ago      Up 14 minutes                                 nova_libvirt
aac8fea2d496                "kolla_start"       14 minutes ago      Up 14 minutes                                 nova_virtlogd


So, in conclusion, this demo takes a simple multi-host OpenStack deployment of Keystone, Glance, Cinder, Neutron and Nova from baremetal Newton to containerised Queens in ~26 minutes. There are many things still to resolve and validate with FFU, but for now, ahead of M2, this is a pretty good start.

December 01, 2017 05:00 PM

OpenStack Superuser

How to containerize GPU applications

By providing self-contained execution environments without the overhead of a full virtual machine, containers have become an appealing proposition for deploying applications at scale. The credit goes to Docker for making containers easy-to-use and hence making them popular. From enabling multiple engineering teams to play around with their own configuration for development, to benchmarking or deploying a scalable microservices architecture, containers are finding uses everywhere.

GPU-based applications, especially in the deep learning field, are rapidly becoming part of the standard workflow; deploying, testing and benchmarking these applications in a containerized environment has quickly become the accepted convention. But the native implementation of Docker containers does not support NVIDIA GPUs yet — that’s why we developed the nvidia-docker plugin. Here I’ll walk you through how to use it.


NVIDIA GPUs require kernel modules and user-level libraries to be recognized and used for computing. There is a workaround for this, but it requires installing the NVIDIA drivers inside the container and mapping the character devices corresponding to the NVIDIA GPUs. However, if the host NVIDIA driver changes, the driver version installed inside the container is no longer compatible, which breaks container usage on that host. This goes against the prime feature of containers, i.e. portability. But with nvidia-docker, which is a wrapper around Docker, one can seamlessly provision a container with the GPU devices visible and ready to execute a GPU-based application.

NVIDIA’s blog on nvidia-docker highlights the two critical points for using a portable GPU container:

  • Driver-agnostic CUDA images
  • A Docker command line wrapper that mounts the user mode components of the driver and the GPUs (character devices) into the container at launch.

Getting started with Nvidia-docker

Installing NVIDIA Docker

Update the NVIDIA drivers for your system before installing nvidia-docker, and make sure that Docker is installed on the system. Once you’ve done that, follow the installation instructions here.

The next step is to test nvidia-docker:

rtaneja@DGX:~$ nvidia-docker
Usage: docker COMMAND
A self-sufficient runtime for containers
--config string Location of client config files (default "/home/rtaneja/.docker")
-D, --debug Enable debug mode
--help Print usage
-H, --host list Daemon socket(s) to connect to (default [])
-l, --log-level string Set the logging level ("debug", "info", "warn", "error", "fatal") (default "info")
--tls Use TLS; implied by --tlsverify
--tlscacert string Trust certs signed only by this CA (default "/home/rtaneja/.docker/ca.pem")
--tlscert string Path to TLS certificate file (default "/home/rtaneja/.docker/cert.pem")
--tlskey string Path to TLS key file (default "/home/rtaneja/.docker/key.pem")
--tlsverify Use TLS and verify the remote
-v, --version Print version information and quit

Now, let’s test whether you can pull the hello-world image from Docker Hub using the nvidia-docker command instead of the docker command:

rtaneja@DGX:~$ nvidia-docker run --rm hello-world
Using default tag: latest
latest: Pulling from library/hello-world
9a0669468bf7: Pull complete
Digest: sha256:cf2f6d004a59f7c18ec89df311cf0f6a1c714ec924eebcbfdd759a669b90e711
Status: Downloaded newer image for hello-world:latest
Hello from Docker!

The message above shows that your installation appears to be working correctly.

Developing GPU applications

For CUDA development, you can start by pulling the nvidia/cuda image from Docker Hub:

rtaneja@DGX:~$ nvidia-docker run --rm -ti nvidia/cuda:8.0 nvidia-smi
8.0: Pulling from nvidia/cuda
16da43b30d89: Pull complete
1840843dafed: Pull complete
91246eb75b7d: Pull complete
7faa681b41d7: Pull complete
97b84c64d426: Pull complete
ce2347c6d450: Pull complete
f7a91ae8d982: Pull complete
ac4e251ee81e: Pull complete
448244e99652: Pull complete
f69db5193016: Pull complete
Digest: sha256:a73077e90c6c605a495566549791a96a415121c683923b46f782809e3728fb73
Status: Downloaded newer image for nvidia/cuda:8.0

Building your own application

Now, instead of running nvidia-smi from the Docker command line, you can build a Docker image and use CMD to run nvidia-smi once a container is launched. To build images, Docker reads instructions from a Dockerfile and assembles an image. An example Dockerfile from Docker Hub looks like this:

# FROM defines the base image
FROM nvidia/cuda:7.5
# RUN executes a shell command
# You can chain multiple commands together with &&
# A \ is used to split long lines to help with readability
RUN apt-get update && apt-get install -y --no-install-recommends \
cuda-samples-$CUDA_PKG_VERSION && \
rm -rf /var/lib/apt/lists/*

# CMD defines the default command to be run in the container
# CMD is overridden by supplying a command + arguments to
# `docker run`, e.g. `nvcc --version` or `bash`
CMD nvidia-smi
#end of Dockerfile

$ docker build -t my-nvidia-smi .   # builds an image named my-nvidia-smi; assumes a Dockerfile in the current directory
$ nvidia-docker images              # or docker images; lists the my-nvidia-smi image

Now we are ready to run the image. By default, all GPUs on the host are visible inside the container. Setting the NV_GPU environment variable when invoking nvidia-docker restricts which GPUs the container is provisioned with. For example, with the command below the container sees only one GPU from the host:

NV_GPU=1 nvidia-docker run --rm my-nvidia-smi

| NVIDIA-SMI 384.81               Driver Version: 384.81                      |
| GPU Name          Persistence-M| Bus-Id      Disp.A  |Volatile  Uncorr. ECC |
| Fan Temp   Perf   Pwr:Usage/Cap|        Memory-Usage |GPU-Util   Compute M. |
| 0 Tesla V100-SXM2... On        |00000000:06:00.0 Off |                    0 |
| N/A    37C   P0     45W / 300W | 10MiB / 16152MiB    |           0% Default |

| Processes:                                                        GPU Memory|
| GPU         PID       Type       Process         name             Usage     |
| No running processes found                                                  |
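NV_GPU is interpreted as a comma-separated list of GPU indices (or UUIDs) rather than a count, so you can expose specific devices; the indices below are illustrative:

```shell
# NV_GPU takes a comma-separated list of GPU indices (or UUIDs); the
# indices here are illustrative. This exposes host GPUs 0 and 1 only:
NV_GPU=0,1 nvidia-docker run --rm my-nvidia-smi
```

Since NV_GPU is an ordinary environment-variable prefix on the command, it applies only to that single invocation.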

Getting started with Nvidia optimized containers for Deep Learning

To get started with DL development, you can pull the NVIDIA DIGITS container from Docker Hub and launch the DIGITS web service. The command below maps port 8000 on the host to port 5000 in the container, so after running it you can access DIGITS at http://localhost:8000:

nvidia-docker run --name digits -ti -p 8000:5000 nvidia/digits

Looking forward

A newer version of the nvidia-docker project (2.0), based on an alpha release of libnvidia-container, has been announced and will be the preferred way of deploying GPU containers in the future. Now that you have an overview of deploying GPU-based containerized applications, note that NVIDIA also provides an easy-to-use, fully optimized deep learning software stack in containers through the newly announced NVIDIA GPU Cloud container registry service.

About the author

Rohit Taneja is a solutions architect at Nvidia working on supporting deep learning and data analytics applications.

Cover Photo // CC BY NC

The post How to containerize GPU applications appeared first on OpenStack Superuser.

by Rohit Taneja at December 01, 2017 04:09 PM

SUSE Conversations

SUSE Wins Software Defined Infrastructure Product of the Year!

Every year the SVC Awards celebrate excellence in storage, cloud and digitalization, recognizing the achievements of end-users, channel partners and vendors. The annual event was held last week, and SUSE was proud to win the Software Defined Infrastructure Product of the Year for SUSE OpenStack Cloud. SUSE OpenStack Cloud was recognized for delivering improved agility, …

+read more

The post SUSE Wins Software Defined Infrastructure Product of the Year! appeared first on SUSE Blog. Terri Schlosser

by Terri Schlosser at December 01, 2017 03:04 PM

Cisco Cloud Blog

Cloud Unfiltered Podcast, Episode 28: VPP and Kubernetes with Ed Warnicke

I know what you’re going to say when I reveal that this week’s guest is going to talk about Vector Packet Processing (VPP): “Why do we—the Cloud Unfiltered audience—care about VPP?! I mean, it’s a networking technology, right? And this is a cloud show. We’re here to talk cloud technology and share cloud strategies…not muck […]

by Ali Amagasu at December 01, 2017 02:03 PM

Daniel P. Berrangé

Full colour emojis in virtual machine names in Fedora 27

Quite by chance today I discovered that Fedora 27 can display full colour glyphs for unicode characters that correspond to emojis, when the terminal displaying my mutt mail reader displayed someone’s name with a full colour glyph showing stars:

Mutt in GNOME terminal rendering color emojis in sender name

Chatting with David Gilbert on IRC I learnt that this is a new feature in Fedora 27 GNOME, thanks to recent work in the GTK/Pango stack. David then pointed out this works in libvirt, so I thought I would illustrate it.

Virtual machine name with full colour emojis rendered

No special hacks were required to do this; I simply entered the emojis as the virtual machine name when creating it from virt-manager’s wizard.

Virtual machine name with full colour emojis rendered

As mentioned previously, GNOME terminal displays colour emojis, so these virtual machine names appear nicely when using virsh and other command line tools.

Virtual machine name rendered with full colour emojis in terminal commands

The more observant readers will notice that the command line args have a bug, as the snowman in the machine name is incorrectly rendered in the process listing. The actual data in /proc/$PID/cmdline is correct, so something about the “ps” command appears to be mangling it prior to output. It isn’t simply a font problem, because other commands besides “ps” render properly, and if you grep the “ps” output for the snowman emoji no results are displayed.
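Checking /proc/$PID/cmdline directly is straightforward: the file is NUL-separated, so translating the NULs to spaces reveals the argv exactly as the kernel recorded it. In this sketch $$ (the current shell) stands in for the QEMU process ID so the example is self-contained:

```shell
# Print a process's true argv, bypassing whatever ps does to it;
# /proc/<pid>/cmdline is NUL-separated. $$ stands in for the QEMU PID.
pid=$$
tr '\0' ' ' < /proc/$pid/cmdline
```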

by Daniel Berrange at December 01, 2017 01:28 PM

November 30, 2017


OpenStack Summit Sydney: Welcome to Australia

Earlier this month, Aptira’s founder and CEO Tristan Goode, and the OpenStack Foundation’s Community Manager Tom Fifield welcomed the OpenStack Summit to Sydney. As the founders of Australia’s rapidly growing OpenStack user group, they opened the summit in true Aussie style, warning attendees of the dangerous drop bears and bin chickens.

Check out the keynote video below.

The post OpenStack Summit Sydney: Welcome to Australia appeared first on Aptira Cloud Solutions.

by Jessica Field at November 30, 2017 11:25 PM


Open Source Summit, Prague

In October, RDO had a small presence at the Open Source Summit (formerly known as LinuxCon) in Prague, Czechia.

While this event does not traditionally draw a big OpenStack audience, we were treated to a great talk by Monty Taylor on Zuul, and Fatih Degirmenci gave an interesting talk on cross-community CI, in which he discussed the joint work between the OpenStack and OpenDaylight communities to help one another verify cross-project functionality.

centos_fedora On one of the evenings, members of the Fedora and CentOS community met in a BoF (Birds of a Feather) meeting, to discuss how the projects relate, and how some of the load - including the CI work that RDO does in the CentOS infrastructure - can better be shared between the two projects to reduce duplication of effort.

This event is always a great place to interact with other open source enthusiasts. While, in the past, it was very Linux-centric, the event this year had a rather broader scope, and so drew people from many more communities.

Upcoming Open Source Summits will be held in Japan (June 20-22, 2018), Vancouver (August 29-31, 2018) and Edinburgh (October 22-24, 2018), and we expect to have a presence of some kind at each of these events.

by Rich Bowen at November 30, 2017 09:29 PM

Upcoming changes to test day

TL;DR: A live RDO cloud will be available for testing on the upcoming test day. Read on for more info.

The last few test days have been somewhat lackluster, and have not had much participation. We think that there are a number of reasons for this:

  • Deploying OpenStack is hard and boring
  • Not everyone has the necessary hardware to do it anyways
  • Automated testing means that there's not much left for the humans to do

In today's IRC meeting, we were brainstorming about ways to improve participation in test day.

We think that, in addition to testing the new packages, it's a great way for you, the users, to see what's coming in future releases, so that you can start thinking about how you'll use this functionality.

One idea that came out of it is to have a test cloud, running the latest packages, available to you during test day. You can get on there, poke around, break stuff, and help test it, without having to go through the pain of deploying OpenStack.

David has written more about this on his blog.

If you're interested in participating, please sign up.

Please also give some thought to what kinds of test scenarios we should be running, and add those to the test page. Or, respond to this thread with suggestions of what we should be testing.

Details about the upcoming test day may be found on the RDO website.


by Rich Bowen at November 30, 2017 06:47 PM

Getting started with Software Factory and Zuul3


Software Factory 2.7 has recently been released. Software Factory is an easy-to-deploy software development forge that provides, among other features, code review and continuous integration (CI). This new release features Zuul V3, which is now the default CI component of Software Factory.

In this blog post I will explain how to deploy a Software Factory instance for testing purposes in less than 30 minutes and initialize two demo repositories to be tested via Zuul.

Note that Zuul V3 is not yet released upstream however it is already in production, acting as the CI system of OpenStack.


Software Factory requires CentOS 7 as its base Operating System so the commands listed below should be executed on a fresh deployment of CentOS 7.

The default FQDN of a Software Factory deployment must be added to your /etc/hosts, pointing at the IP address of your deployment, for the web UI to be accessible in your browser.
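As a sketch, the entry can be appended like this; both the IP address and the host name below are placeholders, to be replaced with your deployment's IP and the default Software Factory FQDN:

```shell
# Placeholder values: substitute your deployment's IP address and the
# default Software Factory FQDN before running.
echo "192.0.2.10  sf.example.com" | sudo tee -a /etc/hosts
```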


First, let's install the repository for the latest version, then install sf-config, the configuration management tool.

sudo yum install -y
sudo yum install -y sf-config

Activating extra components

Software Factory has a modular architecture that can be easily defined through a YAML configuration file, located in /etc/software-factory/arch.yaml. By default, only a limited set of components are activated to set up a minimal CI with Zuul V3.

We will now add the hypervisor-oci component to configure a container provider, so that OCI containers can be consumed by Zuul when running CI jobs. In other words, you won't need an OpenStack cloud account to run your first Zuul V3 jobs with this Software Factory instance.

Note that the OCI driver, on which hypervisor-oci relies, while totally functional, is still under review and not yet merged upstream.

echo "      - hypervisor-oci" | sudo tee -a /etc/software-factory/arch.yaml

Starting the services

Finally run sf-config:

sudo sfconfig --enable-insecure-slaves --provision-demo

When the sf-config command finishes you should be able to access the Software Factory web UI from your browser. You should then be able to log in using the username admin and password userpass (click on "Toggle login form" to display the built-in authentication).

Triggering a first job on Zuul

The --provision-demo option is a special command that provisions two demo Git repositories on Gerrit, along with two demo jobs.

Let's propose a first change on it:

sudo -i
cd demo-project
touch f1 && git add f1 && git commit -m"Add a test change" && git review
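If a job fails and you need to revise the change, Gerrit expects the fix as an amended commit rather than a new one, since it tracks a change by its Change-Id. A scratch-repo sketch of that shape (git review itself is not run here):

```shell
# Scratch-repo sketch (no git review): revisions to a Gerrit change are
# amended into the same commit, then re-pushed as a new patch set.
repo=$(mktemp -d) && cd "$repo" && git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Add a test change"
touch f2 && git add f2
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --amend --no-edit
git log --oneline | wc -l   # still a single commit
```

After amending, running git review again pushes a new patch set to the same Gerrit change.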

Then you should see the jobs being executed on the ZuulV3 status page.

Zuul buildset

And get the jobs' results on the corresponding Gerrit review page.

Gerrit change

Finally, you should find the links to the generated artifacts and the ARA reports.

ARA report

Next steps to go further

To learn more about Software Factory please refer to the user documentation. You can reach the Software Factory team on IRC freenode channel #softwarefactory or by email at the mailing list.

by fboucher at November 30, 2017 06:47 PM

NFVPE @ Red Hat

Are you exhausted? IPv4 almost is — let’s setup an IPv6 lab for Kubernetes

It’s no secret that there’s the inevitability that IPv4 is becoming exhausted. And it’s not just tired (ba-dum-ching!). Since we’re a bunch of Kubernetes fans, and we’re networking fans – we really want to check out what we can do with IPv6 with Kubernetes. Thanks to some slinky automation by my colleague, Feng Pan, contributed to kube-centos-ansible, he was able to implement some creative work by leblancd. In this simple setup today, we’re going to deploy Kubernetes with custom binaries from leblancd and have two pods (ideally on different nodes) ping one another with ping6 and declare victory! In the future let’s hope to iterate on what’s necessary to get IPv6 functionality in Kubernetes.

by Doug Smith at November 30, 2017 06:40 PM

OpenStack Superuser

What’s next for OpenStack container orchestration: Magnum updates

At the recent Sydney Summit OpenStack project team leads (PTLs) and core team members offered updates for the OpenStack projects they manage, what’s new for this release and what to expect for the next one, plus how you can get involved and influence the roadmap.

Superuser features summaries of the videos; you can also catch them on the OpenStack Foundation YouTube channel.


What Magnum makes container orchestration engines such as Docker Swarm, Kubernetes, and Apache Mesos available as first-class resources in OpenStack. Magnum uses Heat to orchestrate an OS image which contains Docker and Kubernetes and runs that image in either virtual machines or bare metal in a cluster configuration.

Who  Spyros Trigazis, the project team lead (PTL) who works on the compute management and provisioning team at CERN.

What’s new
Trigazis talks about the Pike release —  bugs that contributors discovered and fixed — and what you can expect from the Queens release. He also covers the available cluster life-cycle operations that the team has planned for Queens as well as new features and configuration options for the cluster drivers the project currently supports.

What’s next
In Queens, he says they expect to deliver cluster rolling upgrades for Swarm and Kubernetes as well as cluster healing (fixing malfunctioning nodes). Magnum will soon have an openSUSE driver in-tree, as well as the CentOS DC/OS driver. Finally, he summarizes the ongoing work on Magnum Ironic integration with Heat.

How to get involved
Use Ask OpenStack for general questions
For roadmap or development issues, subscribe to the OpenStack development mailing list, and use the tag [magnum]. The project’s weekly meetings are held on alternating Tuesdays at 16:00 UTC.

You can check out the whole 20-minute session below.


The post What’s next for OpenStack container orchestration: Magnum updates appeared first on OpenStack Superuser.

by Superuser at November 30, 2017 05:44 PM

November 29, 2017

Cisco Cloud Blog

Cloud Unfiltered Podcast, Episode 27: Cloud Storage with Joe Arnold

Are you a lover of data storage technology? Do your ears perk up when you hear terms like “data gravity” and “erasure coding”? Do you enjoy a good debate about the proper uses of deterministic data placement versus algorithmic placement? Then this is the podcast for you. In this episode, SwiftStack founder and CEO Joe […]

by Ali Amagasu at November 29, 2017 11:59 PM

Julio Villarreal Pelegrino

Bringing Worlds Together: Designing and Deploying Kubernetes on an OpenStack multi-site environment

Bringing Worlds Together: Designing and Deploying Kubernetes on an OpenStack multi-site environment

At OpenStack Summit Sydney, fellow Red Hatter Roger Lopez and I presented on designing and deploying kubernetes on an OpenStack multi-site environment.
Here is the abstract and video of the presentation.

Bringing Worlds Together: Designing and Deploying Kubernetes on an OpenStack multi-site environment

As companies expand their reach to meet new customer demands and needs, so do their IT infrastructures. This expansion brings to the forefront the complexities of managing technologies such as OpenStack in multiple regions and/or countries. Prior to building and expanding these technologies IT teams are likely to ask themselves:

  • How will we manage our growing infrastructure and applications?
  • How will we handle authentication between regions and/or countries?
  • How will we backup/restore these environments?

In order to simplify these complexities and to answer these questions, we look towards a multi-site solution. This session will focus on the best practices on building a highly available multi-site Kubernetes container platform environment on OpenStack.

This session is best suited for OpenStack administrators, system administrators, cloud administrators and container platform administrators.


by Julio Villarreal Pelegrino at November 29, 2017 08:13 PM

OpenStack: The Perfect Virtual Infrastructure Manager (VIM) for a Virtual Evolved Packet Core (vEPC)

OpenStack: The Perfect Virtual Infrastructure Manager (VIM) for a Virtual Evolved Packet Core (vEPC)

I had the honor to present at OpenStack Summit Sydney. One of my presentations was with my co-worker Rimma Iontel. Here is the abstract and the video recording of the presentation.

OpenStack: The Perfect Virtual Infrastructure Manager (VIM) for a Virtual Evolved Packet Core (vEPC)

Virtualizing core services to reduce costs and increase efficiency is a priority for the telecommunications industry. A great example of this trend is the virtualization of the evolved packet core (EPC), a key component that provides voice and data on 4G long-term evolution (LTE) networks.

This presentation will address, with real-life examples and architectures, why OpenStack is the perfect virtual infrastructure manager for this use case. We will also answer the following questions:

  • How does OpenStack fit within the ETSI NFV Reference Architecture?
  • What is the use case for virtual evolved packet core (vEPC)?
  • Why OpenStack?
  • How to architect and design a vEPC deployment on OpenStack to meet a provider’s scale and performance requirements?
  • What are the considerations and best practices?

This session is best suited for telco operators and OpenStack and cloud administrators that want to get exposure to real-life vEPC deployments, their use case, and architectures.


by Julio Villarreal Pelegrino at November 29, 2017 07:47 PM


A summary of Sydney OpenStack Summit docs sessions

Here I'd like to give a summary of the Sydney OpenStack Summit docs sessions that I took part in, and share my comments on them with the broader OpenStack community.

Docs project update

At this session, we discussed a recent major refocus of the Documentation project work and restructuring of the OpenStack official documentation. This included migrating documentation from the core docs suite to project teams who now own most of the content.

We also covered the most important updates from the Documentation planning sessions held at the Denver Project Teams Gathering, including our new retention policy for End-of-Life documentation, which is now being implemented.

This session was recorded; you can watch the recording here:

Docs/i18n project onboarding

This was a session jointly organized with the i18n community. Alex Eng, Stephen Finucane, and yours truly gave three short presentations on translating OpenStack, OpenStack + Sphinx in a tree, and introduction to the docs community, respectively.

As it turned out, the session was not attended by newcomers to the community, instead, community members from various teams and groups joined us for the onboarding, which made it a bit more difficult to find out what the proper focus of the session should be to better accommodate the different needs and expectations of those in the audience. Definitely something to think about for the next Summit.

Installation guides updates and testing

I held this session to identify what are the views of the community on the future of installation guides and testing of installation procedures.

The feedback received was mostly focused on three points:

  • A better feedback mechanism for new users who are the main audience here. One idea is to bring back comments at the bottom of install guides pages.

  • To help users better understand the processes described in instructions and the overall picture, provide more references to conceptual or background information.

  • Generate content from install shell scripts, to help with verification and testing.

The session etherpad with more details can be found here:

Ops guide transition and maintenance

This session was organized by Erik McCormick from the OpenStack Operators community. There is an ongoing effort driven by the Ops community to migrate retired OpenStack Ops docs over to the OpenStack wiki, for easy editing.

We mostly discussed a number of challenges related to maintaining the technical content in wiki, and how to make more vendors interested in the effort.

The session etherpad can be found here:

Documentation and relnotes, what do you miss?

This session was run by Sylvain Bauza and the focus of the discussion was on identifying gaps in content coverage found after the documentation migration.

Again, Ops-focused docs turned out to be a hot topic, as did providing more detailed conceptual information together with the procedural content, and the structuring of release notes. We should also seriously consider (semi-)automating checks for broken links.

You can read more about the discussion points here:

by Petr Kovar at November 29, 2017 07:25 PM

OpenStack Superuser

What’s next for OpenStack container networking: Kuryr updates

At the recent Sydney Summit OpenStack project team leads (PTLs) and core team members offered updates for the OpenStack projects they manage, what’s new for this release and what to expect for the next one, plus how you can get involved and influence the roadmap.

Superuser features summaries of the videos; you can also catch them on the OpenStack Foundation YouTube channel.

What Kuryr is a project aimed on bringing OpenStack networking and storage to container platforms like Docker, Kubernetes and Mesos. Like the name (“courier”) implies, its goal is to be the “integration bridge” between the communities.

Who Antoni Segura Puimedon, PTL, and Daniel Mellado Area, contributor, both work at Red Hat.

What’s new

The latest version of Kuryr offers support for kubeadm. “Until now we were using hyperkube to deploy Kubernetes…” Mellado says. “So it should be more reliable now.”

Other important new features for this release include:

“One of the things we’ll be working on for next cycle is to try to containerize the whole output into a container instead of using it as a virtual machine.”

What’s next: Lightning fast pods

“This is the first time we have shown these results in public, this shows some scale tests that were running with OpenShift. Even if Kubernetes is cooler, with OpenShift it works just as well. We tested it quite a bit and we’re going to add support for that also in DevStack,” Segura says.

The blue line in the chart is the time it takes for the Kubernetes API to create a pod, the red line is the time it takes the API to report it as running. “The good thing is with just a bit of work we went from four pods per second to 22 in one week.  We still think there’s a lot of room for improvement,” he adds.

To further boost speed, here are some of the upcoming improvements:

And a full list of what’s coming in Queens:

On the health endpoints, Segura adds: “As more and more OpenStack services are run by Kubernetes, it’s more important that they have these health endpoints, so that Kubernetes can know if Neutron being run in a container is healthy enough to talk to Keystone or MySQL, and if something happens it can restart it.”

Get involved!
Use Ask OpenStack for general questions
For roadmap or development issues, subscribe to the OpenStack development mailing list, and use the tag [kuryr]
Participate in the meetings: Weekly on Mondays at 1400 UTC in #openstack-meeting-4 (IRC webclient)

You can catch the whole 26-minute video below.

Cover Photo // CC BY NC

The post What’s next for OpenStack container networking: Kuryr updates appeared first on OpenStack Superuser.

by Superuser at November 29, 2017 05:15 PM

David Moreau Simard

An experiment: Come try a real OpenStack Queens deployment !

The RDO project community provides vanilla RPM packages and mirrors for deploying OpenStack on the CentOS or RHEL linux distributions. The packages provided by RDO can be deployed manually or through different OpenStack installers such as TripleO, Kolla, Packstack and Puppet-OpenStack. OpenStack-Ansible also relies on RDO for dependencies although it currently installs OpenStack projects from source. At each OpenStack development cycle milestone, the RDO community holds a test day. This gives the opportunity to the greater community of OpenStack users, developers and operators to try out the latest and the greatest of OpenStack with people around to help on IRC in the #rdo channel.

November 29, 2017 12:00 AM

November 28, 2017

Chris Dent

TC Report 48

Due to the recent Summit in Sydney, related travel, and Thanksgiving, it has been a while since I put a TC Report together. It is hard to get back in the groove. Much of the recent discussion has either been reflecting on Summit-initiated discussions or trying to integrate results from those discussions into plans for the future.

Summit Reflections

A lot of my TC-related summit thinking is in a series of blog posts I made last week. This isn't the "Chris promotes his blog report" but I do think that these represent some important OpenStack issues, related to stuff the TC talks about often, so here they are:

Some other summit summaries that might be of interest:

Graham mentions a few things of interest from the joint leadership meeting that happened the Sunday before summit:

  • The potential expansion of the Foundation to include other projects, separate from OpenStack and with separate governance, to address the complexities of integrating all the pieces that get involved in doing stuff with clouds. OpenStack itself continues with its focus on the base infrastructure. There's a press release with a bit more information, and it was talked about during the keynote.

  • A somewhat bizarre presentation suggesting the Board and the TC manage the OpenStack roadmap. There wasn't time to actually discuss this as previous topics ran way over, but at a superficial glance it appeared to involve a complete misunderstanding of not just how open source works in OpenStack, but how open source works in general.

A Tech/Dev/? Blog

Throughout the past week there's been a lot of discussion of how to address the desire for a blog that's been variously described as a "dev blog" (news of what's going on with OpenStack development) or a "tech blog" (a kind of "humble brag" about any cool (dev-related) stuff going on, to remind people that OpenStack does interesting things).

On Thursday there was talk about technology to use, differences of opinion on what content should be present, and the extent to which curation should be involved. If none, why not just carry on with planet?

There was more on Monday and then an email thread.

The eventual outcome is that the existing but rarely used OpenStack Blog would make sense for this but only if there were human involvement in choosing what content should be present. An Acquisitions Editor was suggested. Josh Harlow was press ganged, but it's not clear if the hook set.

PTL Meeting (or tech leadership void filling)

Another topic on Thursday was the notion of having some kind of formal process whereby project roadmaps were more actively visible to other projects in the OpenStack ecosystem. There's an etherpad started but probably best to start with the log which also links to some twitter discussion. A summary (common throughout all the discussion this past week) is "maybe we should get people talking to each other more often?"

The topic evolved and went what might look like two ways: how do we address the perceived void of technical leadership and

I think underlying all of this is that there are people in the commu[n]ity who are concerned that sometimes we have bad or at least not on the same page actors, and we have no mechanism for dealing with that. me

but to some extent that's part and parcel of the same thing.

PTG Timing

Yet more on Thursday: initial discussion of how to divide up time and otherwise format things at the forthcoming PTG in Dublin. There's further discussion on an os-dev thread. Most people seem to be coming down in favor of sticking with what we know.

Project Goals

The Rocky cycle approaches, and that means it is time to start thinking about goals. Logs today for more on that. We are at the stage where candidate goals are being sought. Meanwhile there's some discussion on how best to manage tracking the goals. The current process can be somewhat noisy.

Engaging with the Board

Another topic that happened throughout the week was reflection on the difficulty engaging in full and inclusive conversation with the Board at the leadership meeting. People who either won't or can't engage in an interruption and interjection style of interaction are left out of the discussion. Entry points into the log at Thursday, Friday, Monday.

In the discussion there appear to be two different approaches or attitudes in response to this problem. One is that the problems are the result of too many people attending the meetings and that smaller meetings could address the problems.

The other is that meeting formalisms and general rules of good behavior are not being followed and that as it is important for the entire TC to be engaged with the board, something ought to be done to raise awareness that while people would like to participate the current set up does not make that easy.

I'm in the latter camp. The TC is intentionally an elected body that is fairly large, large enough for it to have a diversity of perspectives. Whatever the base definitions are of "governance", being elected makes the TC representatives of the people who elected them. The leadership meetings are the one time when the TC gets to engage in an official capacity with the Board and User Committee. We should do our best to make sure that it is a setting where all representatives have an opportunity to be present, hear, and be heard.

by Chris Dent at November 28, 2017 04:30 PM

OpenStack Superuser

What’s next for OpenStack compute: Nova updates

At the recent Sydney Summit OpenStack project team leads (PTLs) and core team members offered updates for the OpenStack projects they manage, what’s new for this release and what to expect for the next one, plus how you can get involved and influence the roadmap.

Superuser features summaries of the videos; you can also catch them on the OpenStack Foundation YouTube channel.

What: Nova, OpenStack's compute service. The project aims to implement services and associated libraries to provide massively scalable, on-demand, self-service access to compute resources, including bare metal, virtual machines and containers.

Who: two Nova core contributors, Matt Riedemann of Huawei and Melanie Witt of Red Hat.

What’s new

Witt offered an overview of new features delivered with the Pike release:

Other improvements include four main microversions. There are more – which you can find under “new features” in the release notes – but Riedemann says “these are the sexy user-API microversions, instead of the technical-debt deprecated stuff…”

What’s next

Here’s what to expect for the upcoming Queens release.

“The big efforts have to do with cells v2,” Riedemann says. “In Pike, you can list instances across cells but we’re not merge-sorting results so you get this barber pole stripe, it’s an unsorted list of cells, in Queens that’s been fixed. Regardless of the number of cells you’ll get a merged sorted result.”  You can check out all the specifications for the next release here.

Get involved!
Use Ask OpenStack for general questions
For roadmap or development issues, subscribe to the OpenStack development mailing list, and use the tag [nova]
Check out the Nova wiki for more information on how to get involved – whether you’re just getting started or interested in going deeper. Participate in the weekly meetings: Thursdays alternating 14:00 UTC (#openstack-meeting) and 21:00 UTC (#openstack-meeting).

View the entire 40-minute session below and download the slides here.


The post What’s next for OpenStack compute: Nova updates appeared first on OpenStack Superuser.

by Superuser at November 28, 2017 03:47 PM


Anomaly Detection in CI logs

Continuous Integration jobs can generate a lot of data, and it can take a long time to figure out what went wrong when a job fails. This article demonstrates new strategies to assist with failure investigations and to reduce the need to crawl boring log files.

First, I will introduce the challenge of anomaly detection in CI logs. Second, I will present a workflow to automatically extract and report anomalies using a tool called LogReduce. Lastly, I will discuss the current limitations and how more advanced techniques could be used.


Finding anomalies in CI logs using simple patterns such as "grep -i error" is not enough because interesting log lines don't necessarily feature obvious anomalous messages such as "error" or "failed". Sometimes you don't even know what you are looking for.

In comparison to regular logs, such as the system logs of a production service, CI logs have a very interesting characteristic: they are reproducible. Thus, it is possible to carefully look for new events that are not present in other job execution logs. This article focuses on this particular characteristic to detect anomalies.

The challenge

For this article, baseline events are defined as the collection of log lines produced by nominal job executions, and target events are defined as the collection of log lines produced by a failed job run.

Searching for anomalous events is challenging because:

  • Events can be noisy: they often include unique features such as timestamps, hostnames or UUIDs.
  • Events can be scattered across many different files.
  • False positive events may appear for various reasons, for example when a new test option has been introduced. However, they often share a common semantic with some baseline events.

Moreover, there can be a very high number of events, for example more than 1 million lines for TripleO jobs. Thus, we cannot simply look up each target event individually in the baseline events.

OpenStack Infra CRM114

It is worth noting that anomaly detection is already happening live in the openstack-infra operated review system using classify-log.crm, which is based on CRM114 Bayesian filters.

However it is currently only used to classify global failures in the context of the elastic-recheck process. The main drawbacks to using this tool are:

  • Events are processed per word without considering complete lines: it only computes distances of up to a few words.
  • Reports are hard to find for regular users: they would have to go to elastic-recheck uncategorized and click the crm114 links.
  • It is written in an obscure language.


This part presents the techniques I used in LogReduce to overcome the challenges described above.

Reduce noise with tokenization

The first step is to reduce the complexity of the events to simplify further processing. Here is the line processor I used, see the Tokenizer module:

  • Skip known bogus events such as ssh scan: "sshd.+[iI]nvalid user"
  • Remove known words:
    • Hashes, which are hexadecimal words that are 32, 64 or 128 characters long
    • UUID4
    • Date names
    • Random prefixes such as (tmp|req-|qdhcp-)[^\s\/]+
  • Discard every character that is not [a-z_\/]

For example this line:

  2017-06-21 04:37:45,827 INFO [nodepool.builder.UploadWorker.0] Uploading DIB image build 0000000002 from /tmpxvLOTg/fake-image-0000000002.qcow2 to fake-provider

Is reduced to:

  INFO nodepool builder UploadWorker Uploading image build from /fake image fake provider
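These rules can be sketched in Python with a few regular expressions. This is a hypothetical approximation for illustration only; the real Tokenizer module implements more rules (date names, for instance) and will not produce byte-identical output:

```python
import re

# Hypothetical simplified tokenizer; the real LogReduce Tokenizer
# module implements more rules than shown here.
BOGUS = re.compile(r'sshd.+[iI]nvalid user')
UUID4 = re.compile(r'[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-'
                   r'[89ab][0-9a-f]{3}-[0-9a-f]{12}', re.I)
HEXHASH = re.compile(
    r'\b(?:[0-9a-fA-F]{128}|[0-9a-fA-F]{64}|[0-9a-fA-F]{32})\b')
RANDOM_PREFIX = re.compile(r'(tmp|req-|qdhcp-)[^\s/]+')

def tokenize(line):
    # Skip known bogus events such as ssh scans entirely.
    if BOGUS.search(line):
        return ''
    # Remove known noisy words: UUIDs, hashes, random prefixes.
    for pattern in (UUID4, HEXHASH, RANDOM_PREFIX):
        line = pattern.sub(' ', line)
    # Discard every character that is not a letter, '_' or '/',
    # then collapse whitespace.
    line = re.sub(r'[^a-zA-Z_/ ]', ' ', line)
    return ' '.join(line.split())
```

Running tokenize() on the nodepool line above yields something close to, but not exactly, the reduced form shown, since the real tokenizer applies additional filters.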

Index events in a NearestNeighbors model

The next step is to index baseline events. I used a NearestNeighbors model to query the distance of target events from baseline events. This helps remove false-positive events that are similar to known baseline events. The model is fitted with all the baseline events, transformed using Term Frequency-Inverse Document Frequency (tf-idf). See the SimpleNeighbors model.

import sklearn.feature_extraction.text
import sklearn.neighbors

vectorizer = sklearn.feature_extraction.text.TfidfVectorizer(
    analyzer='word', lowercase=False, tokenizer=None,
    preprocessor=None, stop_words=None)
# Brute-force search with cosine distance works with sparse tf-idf vectors.
nn = sklearn.neighbors.NearestNeighbors(
    algorithm='brute', metric='cosine')
train_vectors = vectorizer.fit_transform(train_data)
nn.fit(train_vectors)

Instead of having a single model per job, I built a model per file type. This requires some pre-processing work to figure out what model to use per file. File names are converted to model names using another Tokenization process to group similar files. See the filename2modelname function.

For example, the following files are grouped like so:

audit.clf: audit/audit.log audit/audit.log.1
merger.clf: zuul/merger.log zuul/merge.log.2017-11-12
journal.clf: undercloud/var/log/journal.log overcloud/var/log/journal.log
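A minimal sketch of such a grouping function might look like this. It is hypothetical: the real filename2modelname function also groups near-identical names such as merger.log and merge.log, which this sketch does not attempt:

```python
import os
import re

def filename2modelname(filename):
    """Map a log file path to a model name, grouping rotated and
    dated variants of the same file (hypothetical sketch)."""
    name = os.path.basename(filename)
    # Strip extensions plus rotation/date/compression suffixes:
    # audit.log.1 -> audit, merge.log.2017-11-12 -> merge
    name = re.sub(r'\.(log|txt|gz)(\.[\w-]+)*$', '', name)
    # Drop any remaining trailing digits or separators.
    name = re.sub(r'[._\d-]+$', '', name)
    return name + '.clf'
```

With this sketch, both audit/audit.log and audit/audit.log.1 map to audit.clf, and undercloud/var/log/journal.log and overcloud/var/log/journal.log both map to journal.clf.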

Detect anomalies based on kneighbors distance

Once the NearestNeighbors model is fitted with baseline events, we can repeat the Tokenization and tf-idf transformation on the target events. Then, using the kneighbors query, we compute the distance of each target event.

test_vectors = vectorizer.transform(test_data)
distances, _ = nn.kneighbors(test_vectors, n_neighbors=1)

Using a distance threshold, this technique can effectively detect anomalies in CI logs.
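Putting the pieces together, here is a minimal self-contained sketch of the detection step using scikit-learn, with toy tokenized lines standing in for real CI logs (the 0.2 threshold matches the one used by the worker described later):

```python
import sklearn.feature_extraction.text
import sklearn.neighbors

# Baseline events, i.e. tokenized lines from successful job runs.
train_data = [
    "INFO nodepool builder UploadWorker Uploading image build",
    "INFO zuul merger updated repository",
]
# Target events, i.e. tokenized lines from the failed job run.
test_data = [
    "INFO nodepool builder UploadWorker Uploading image build",
    "ERROR zuul scheduler connection to gerrit refused",
]

vectorizer = sklearn.feature_extraction.text.TfidfVectorizer(
    analyzer='word', lowercase=False)
# Brute-force search with cosine distance works with sparse tf-idf vectors.
nn = sklearn.neighbors.NearestNeighbors(algorithm='brute', metric='cosine')
nn.fit(vectorizer.fit_transform(train_data))

# Distance of each target event to its nearest baseline event.
distances, _ = nn.kneighbors(
    vectorizer.transform(test_data), n_neighbors=1)

# Report only the lines far away from any baseline event.
anomalies = [line for line, dist in zip(test_data, distances[:, 0])
             if dist > 0.2]
# anomalies now contains only the ERROR line.
```

The first target line is identical to a baseline line, so its distance is near zero and it is filtered out; the ERROR line shares almost no vocabulary with the baseline and is reported.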

Automatic process

Instead of manually running the tool, I added a server mode that automatically searches and reports anomalies found in failed CI jobs. Here are the different components:

  • listener connects to the mqtt/gerrit event-stream and collects all successful and failed jobs.

  • worker processes jobs collected by the listener. For each failed job, it does the following in pseudo-code:

Build model if it doesn't exist or if it is too old:
	For each of the last 5 successful jobs (baseline):
		Fetch logs
	For each baseline file group:
		Tokenize lines
		TF-IDF fit_transform
		Fit file group model
Fetch target logs
For each target file:
	Look for the file group model
	Tokenize lines
	TF-IDF transform
	file group model kneighbors search
	yield lines that have distance > 0.2
Write report
  • publisher processes each report computed by the worker and notifies:
    • IRC channel
    • Review comment
    • Mail alert (e.g. a periodic job which doesn't have an associated review)

Reports example

Here are a couple of examples to illustrate LogReduce reporting.

In this change I broke a service configuration (zuul gerrit port), and logreduce correctly found the anomaly in the service logs (zuul-scheduler can't connect to gerrit): sf-ci-functional-minimal report

In this tripleo-ci-centos-7-scenario001-multinode-oooq-container report, logreduce found 572 anomalies out of 1,078,248 lines. The interesting ones are:

  • Non-obvious new DEBUG statements in /var/log/containers/neutron/neutron-openvswitch-agent.log.txt.
  • A new firewall_driver=openvswitch setting in Neutron was detected in:
    • /var/log/config-data/neutron/etc/neutron/plugins/ml2/ml2_conf.ini.txt
    • /var/log/extra/docker/docker_allinfo.log.txt
  • New usage of cinder-backup was detected across several files such as:
    • /var/log/journal contains a new puppet statement
    • /var/log/cluster/corosync.log.txt
    • /var/log/pacemaker/bundles/rabbitmq-bundle-0/rabbitmq/rabbit@centos-7-rax-iad-0000787869.log.txt.gz
    • /etc/puppet/hieradata/service_names.json
    • /etc/sensu/conf.d/client.json.txt
    • pip2-freeze.txt
    • rpm-qa.txt

Caveats and improvements

This part discusses the caveats and limitations of the current implementation and suggests other improvements.

Empty success logs

This method doesn't work when the debug events are only included in the failed logs. To successfully detect anomalies, failure and success logs need to be similar, otherwise all the extra information in failed logs will be considered anomalous.

This situation happens with testr results where success logs only contain 'SUCCESS'.

Building good baseline model

Building a good baseline model with nominal job events is key to anomaly detection. We could use periodic execution (with or without failed runs), or the gate pipeline.

Unfortunately Zuul currently lacks build reporting, so we have to scrape Gerrit comments or status web pages, which is sub-optimal. Hopefully the upcoming zuul-web builds API and zuul-scheduler MQTT reporter will make this task easier to implement.

Machine learning

I am by no means proficient at machine learning. Logreduce happens to be useful as it is now. However here are some other strategies that may be worth investigating.

The model currently uses a word dictionary to build the feature vectors; this may be improved by using different feature-extraction techniques better suited for log line events, such as MinHash and/or Locality-Sensitive Hashing.

The NearestNeighbors kneighbors query tends to be slow for large samples; this may be improved by using a Self-Organizing Map, RandomForest or OneClassSVM model.

When line sizes are not homogeneous in a file group, the model doesn't work well. For example, mistral/api.log line sizes vary between 10 and 8,000 characters. Using per-bin models based on line size may be a great improvement.

CI logs analysis is a broad subject on its own, and I suspect someone good at machine learning might be able to find other clever anomaly detection strategies.

Further processing

Detected anomalies could be further processed by:

  • Merging similar anomalies discovered across different files.
  • Looking for known anomalies in a system like elastic-recheck.
  • Reporting new anomalies to elastic-recheck so that affected jobs could be grouped.


CI log analysis is a powerful service to assist failure investigations. The end goal would be to report anomalies instead of exhaustive job logs.

Early results of LogReduce models look promising, and I hope we can set up such services for any CI jobs in the future. Please get in touch by mail or IRC (tristanC on Freenode) if you are interested.

by tristanC at November 28, 2017 06:13 AM

OpenStack Blog

Developer Mailing List Digest November 18-27

Community Summaries

  • Glance priorities [0]
  • Nova placement resource provider update [1]
  • Keystone Upcoming Deadlines [2]
  • Ironic priorities and subteam reports [3]
  • Keystone office hours [4]
  • Nova notification update [5]
  • Release countdown [6]
  • Technical committee status update [7]

Self-healing SIG created

Adam Spiers announced the formation of a SIG around self-healing. Its scope is to coordinate the use and development of several OpenStack projects which can be combined in various ways to manage OpenStack infrastructure in a policy-driven fashion, reacting to failures and other events by automatically healing and optimising services.

Proposal for a QA SIG

A proposal to have a co-existing QA special interest group (SIG) that would be a common place for downstream efforts to collaborate and share tests. For example, OPNFV performs QA on OpenStack releases today and is actively looking for opportunities to share tools and test cases. While a SIG can exist to do some code, the QA team will remain for now, since there are around 15 existing QA projects such as Tempest and Grenade.

Improving the Process for Release Marketing

Collecting and summarizing "top features" at release time is difficult for both PTLs and Foundation marketing. A system is now in place for PTLs to highlight release notes [0]. Foundation marketing will work with the various teams as needed to understand and make things more press-friendly.

by Mike Perez at November 28, 2017 12:24 AM

November 27, 2017


Emilien Macchi talks TripleO at OpenStack Summit

While at OpenStack Summit, I had an opportunity to talk with Emilien Macchi about the work on TripleO in the Pike and Queens projects.

by Rich Bowen at November 27, 2017 09:39 PM

OpenStack @ NetApp

Setting up an Edge system All in One with OpenStack and ONTAP Select

Some sites aren’t large enough to need or have an entire datacenter infrastructure. Sometimes you want to be able to run just a datacenter in a box. Remote locations, small sites, any place where you might need the entire features of a data center, but only have one box. How do you get all those ... Read more

The post Setting up an Edge system All in One with OpenStack and ONTAP Select appeared first on thePub.

by David Blackwell at November 27, 2017 07:00 PM


How to install OpenStack on your local machine using Devstack

The purpose of this guide is to allow you to install and deploy OpenStack on your own laptops or cloud VMs.

by Guest Post at November 27, 2017 06:52 PM

OpenStack Superuser

What’s next for OpenStack networking: Neutron updates

At the recent Sydney Summit OpenStack project team leads (PTLs) and core team members offered updates for the OpenStack projects they manage, plus how you can get involved and influence the roadmap.

Superuser features summaries of the videos; you can also catch them on the OpenStack Foundation YouTube channel.

Neutron’s goal is to implement services and associated libraries to provide on-demand, scalable and technology-agnostic network abstraction. Neutron provides networking-as-a-service between interface devices (e.g., vNICs) managed by other OpenStack services (e.g., Nova, the compute service).

Who: Armando Migliaccio, PTL for the M, N, and O releases, and Miguel Lavalle, PTL for the Queens release.

What’s new

The pair outlined a ton of updates for this latest release:

What’s up for the next release

“For the next release, we’ll continue being focused on stability and community-led improvements,” Migliaccio says. That means “Python 3, once and for all” and solidifying the ability to deliver on fewer use cases but “in a rock-solid fashion,” he adds.

The team is also busy on cross-project work:

Optimization of Nova instance migration with multiple port bindings:

  • Currently, port binding is triggered in the post_live_migration stage, after the migration has completed. If the binding process fails, the migrated instance goes to an error state.
  • The proposed solution is to allow multiple port bindings. A new inactive port binding will be created on the destination host during the pre_live_migration stage. If this step succeeds, the migration proceeds.
  • Once the instance is migrated, the destination host port binding will be activated and the instance and binding on the source host will be removed.
  • This will also minimize the interval with no network connectivity for the instance.

Check out the full 40-minute talk below and download the slides here.

Get involved!
Use Ask OpenStack for general questions
For roadmap or development issues, subscribe to the OpenStack development mailing list, and use the tag [neutron]
To get code, ask questions, view blueprints, etc, see: Neutron Launchpad Page
Check out Neutron’s regular IRC meetings on the #openstack-meeting channel:

The post What’s next for OpenStack networking: Neutron updates appeared first on OpenStack Superuser.

by Superuser at November 27, 2017 03:48 PM

Adam Spiers

Announcing OpenStack’s Self-healing SIG

One of the biggest promises of the cloud vision was the idea that all infrastructure could be managed in a policy-driven fashion, reacting to failures and other events by automatically healing and optimising services.

In OpenStack, most of the components required to implement such an architecture already exist, and are nicely scoped, for the most part without too much overlap:

However, there is not yet a clear strategy within the community for how these should all tie together. (The OPNFV community is arguably further ahead in this respect, but hopefully some of their work could be applied outside NFV-specific environments.)

Designing a new SIG

To address this, I organised an unofficial kick-off meeting at the PTG in Denver, at which it became clear that there was sufficient interest in this idea from many of the above projects to create a new “Self-healing” SIG. However, there were still open questions:

  1. What exactly should be the scope of the SIG? Should it be for developers and operators, or also end users?
  2. What should the name be? Is “self-healing” good enough, or should it also include, say, non-failure scenarios like optimization?

In an attempt to answer these, I formally proposed the creation of the SIG, asking the community to fill in a short survey to vote on its creation, and to provide their feedback regarding the name and scope. Unfortunately whilst everyone unanimously supported its creation, opinions were split more or less 50%-50% on the name and the scope! So on advice from Thierry, I listed the SIG as “forming”, created the corresponding wiki page, and proposed a session for the Sydney Forum, which was subsequently accepted.

A SIG is born!

We had around 30 people attend the Sydney Forum session, which was extremely encouraging! You can read more details in the etherpad, but here is the quick summary …

Most importantly, we resolved the naming and scoping issues, concluding that to avoid biting off too much in one go, it was better to be pragmatic and start small:

  • Initially focus on cloud infrastructure, and not worry too much about the user-facing impact of failures yet; we can add that concern whenever it makes sense (which is particularly relevant for telcos / NFV).
  • Not worry too much about optimization initially; Watcher is possibly the only project focusing on this right now, and again we can expand to include optimization any time we want.

So now that the naming and scoping issues are resolved, I am excited to announce that the Self-healing SIG is officially formed!

Discussion went beyond mere administrivia, however:

  • We collected a few initial use cases.
  • We informally decided the governance of the SIG. I asked if anyone else would like to assume leadership, but no one seemed keen, dashing my hopes of avoiding extra work 😉 But Eric Kao, PTL of Congress, generously offered to act as co-chair.
  • We discussed health check APIs, which were mentioned in at least 2 or 3 other Forum sessions this time round.
  • We agreed that we wanted an IRC channel, and that it could host bi-weekly meetings. However as usual there was no clean solution to choosing a time which would suit everyone ;-/ I’ll try to figure out what to do about this!

Get involved

You are warmly invited to join, if this topic interests you:

Next steps

I have sent out a similar announcement to the mailing list, and next will set up the IRC channel, and see if we can make progress on agreeing times for regular IRC meetings.

Other than this administrivia, it is of course up to the community to decide in which direction the SIG should go, but my suggestions are:

  • Continue to collect use cases. It makes sense to have a very lightweight process for this (at least, initially), so Eric has created a Google Doc and populated it with a suggested template and a first example. Feel free to add your own based on this template.
  • Collect links to any existing documentation or other resources which describe how existing services can be combined. This awesome talk on Advanced Fault Management with Vitrage and Mistral is a perfect example, and here is another, but we need to make it easier for operators to understand which combinations like this are possible, and easier for them to be set up.
  • Finish the architecture diagram drafted in Denver.
  • At a higher level, we could document reference stacks which address multiple self-healing cases.
  • Talk more with the OPNFV community to find out what capabilities they have which could be reused within non-NFV OpenStack clouds.
  • Perform gap analysis on the use cases, and liaise with specific projects to drive development in directions which can address those gaps.

The origin of the idea for the SIG

In case you’re interested in the history …

I first became aware of the need for this SIG while working upstream within the community on OpenStack HA – specifically on compute plane HA, where failures of compute nodes or hypervisors are automatically handled by resurrecting affected VMs on other compute nodes. I saw many groups independently trying to solve the same problem, so I created the #openstack-ha IRC channel, organised weekly meetings, and tried to bring all stakeholders together to converge on a single upstream solution. Progress was gradually made, which we presented in Austin, Boston, and most recently in Tel Aviv at OpenStack Day Israel 2017.

After the talk I had a great conversation with Ifat Afek, who is the PTL of OpenStack Vitrage, which is an awesome project providing RCA (Root Cause Analysis) of faults within OpenStack. Since Vitrage can do things like receive an alert about a fault on a compute node (e.g. from Aodh) and then automatically determine all affected VMs and call out to another service like Mistral to enact appropriate remediation, there was obvious synergy between our work.

However Vitrage goes much further than just compute HA: since it can receive various alerts from multiple types of data source, model relationships between many types of resource, and trigger external services to take action, this kind of combination has tremendous potential for building automatically self-healing cloud infrastructure. And as shown above, there are several other OpenStack projects operating in the same space, which could take this approach further; for example, Congress can be used to specify policies regarding how failures should be handled.

Talking with Ifat resulted in the idea to create a new SIG with the goals of identifying self-healing use cases, establishing and documenting what can already be achieved by combining existing OpenStack services, and enhancing collaboration between the projects and with operators to fill in any remaining gaps. And now you know the rest of the story 🙂


The post Announcing OpenStack’s Self-healing SIG appeared first on Structured Procrastination.

by Adam at November 27, 2017 02:24 PM


OpenStack 3rd Party CI with Software Factory


When developing for an OpenStack project, one of the most important aspects to cover is to ensure proper CI coverage of our code. Each OpenStack project runs a number of CI jobs on each commit to test its validity, so thousands of jobs are run every day in the upstream infrastructure.

In some cases, we will want to set up an external CI system, and make it report as a 3rd Party CI on certain OpenStack projects. This may be because we want to cover specific software/hardware combinations that are not available in the upstream infrastructure, or want to extend test coverage beyond what is feasible upstream, or any other reason you can think of.

While the process to set up a 3rd Party CI is documented, some implementation details are missing. In the RDO Community, we have been using Software Factory to power our 3rd Party CI for OpenStack, and it has worked very reliably for several cycles.

The main advantage of Software Factory is that it integrates all the pieces of the OpenStack CI infrastructure in an easy to consume package, so let's have a look at how to build a 3rd party CI from the ground up.


You will need the following:

  • An OpenStack-based cloud, which will be used by Nodepool to create temporary VMs where the CI jobs will run. It is important to make sure that the default security group in the tenant accepts SSH connections from the Software Factory instance.
  • A CentOS 7 system for the Software Factory instance, with at least 8 GB of RAM and 80 GB of disk. It can run on the OpenStack cloud used for nodepool, just make sure it is running on a separate project.
  • DNS resolution for the Software Factory system.
  • A 3rd Party CI user on the upstream Gerrit. Follow this guide to configure it.
  • Some previous knowledge on how Gerrit and Zuul work is advisable, as it will help during the configuration process.

Basic Software Factory installation

For a detailed installation walkthrough, refer to the Software Factory documentation. We will highlight here how we set it up on a test VM.

Software installation

On the CentOS 7 instance, run the following commands to install the latest release of Software Factory (2.6 at the time of this article):

$ sudo yum install -y
$ sudo yum update -y
$ sudo yum install -y sf-config

Define the architecture

Software Factory has several optional components, and can be set up to run them on more than one system. In our setup, we will install the minimum required components for a 3rd party CI system, all in one.

$ sudo vi /etc/software-factory/arch.yaml

Make sure the nodepool-builder role is included. Our file will look like:

description: "OpenStack 3rd Party CI deployment"
inventory:
  - name: managesf
    roles:
      - install-server
      - mysql
      - gateway
      - cauth
      - managesf
      - gitweb
      - gerrit
      - logserver
      - zuul-server
      - zuul-launcher
      - zuul-merger
      - nodepool-launcher
      - nodepool-builder
      - jenkins

In this setup, we are using Jenkins to run our jobs, so we need to create an additional file:

$ sudo vi /etc/software-factory/custom-vars.yaml

And add the following content

nodepool_zuul_launcher_target: False

Note: As an alternative, we could use zuul-launcher to run our jobs and drop Jenkins. In that case, there is no need to create this file. However, later when defining our jobs we will need to use the jobs-zuul directory instead of jobs in the config repo.

Edit Software Factory configuration

$ sudo vi /etc/software-factory/sfconfig.yaml

This file contains all the configuration data used by the sfconfig script. Make sure you set the following values:

  • Password for the default admin user.
  admin_password: supersecurepassword
  • The fully qualified domain name for your system.
  • The OpenStack cloud configuration required by Nodepool.
  - auth_url:
    name: microservers
    password: cloudsecurepassword
    project_name: mytestci
    region_name: RegionOne
    regions: []
    username: ciuser
  • The authentication options if you want other users to be able to log into your instance of Software Factory using OAuth providers like GitHub. This is not mandatory for a 3rd party CI. See this part of the documentation for details.

  • If you want to use LetsEncrypt to get a proper SSL certificate, set:

  use_letsencrypt: true

Run the configuration script

You are now ready to complete the configuration and get your basic Software Factory installation running.

$ sudo sfconfig

After the script finishes, just point your browser to https:// and you will see the Software Factory interface.

SF interface

Configure SF to connect to the OpenStack Gerrit

Once we have a basic Software Factory environment running, and our service account set up upstream, we just need to connect the two. The process is quite simple:

  • First, make sure the local Zuul user's SSH key, found at /var/lib/zuul/.ssh/, is added to the upstream service account.

  • Then, edit /etc/software-factory/sfconfig.yaml again, and edit the zuul section to look like:

  default_log_site: sflogs
  external_logservers: []
  gerrit_connections:
    - name: openstack
      port: 29418
      username: mythirdpartyciuser
  • Finally, run sfconfig again. Log information will start flowing in /var/log/zuul/server.log, and you will see a connection to port 29418.

Create a test job

In Software Factory 2.6, a special project named config is automatically created on the internal Gerrit instance. This project holds the user-defined configuration, and changes to the project must go through Gerrit.

Configure images for nodepool

All CI jobs will use a predefined image, created by Nodepool. Before creating any CI job, we need to prepare this image.

  • As a first step, add your SSH public key to the admin user in your Software Factory Gerrit instance.

Add SSH Key

  • Then, clone the config repo on your computer and edit the nodepool configuration file:
$ git clone ssh:// sf-config
$ cd sf-config
$ vi nodepool/nodepool.yaml
  • Define the disk image and assign it to the OpenStack cloud defined previously:
diskimages:
  - name: dib-centos-7
    elements:
      - centos-minimal
      - nodepool-minimal
      - simple-init
      - sf-jenkins-worker
      - sf-zuul-worker
    env-vars:
      DIB_CHECKSUM: '1'
      QEMU_IMG_OPTIONS: compat=0.10

labels:
  - name: dib-centos-7
    image: dib-centos-7
    min-ready: 1
    providers:
      - name: microservers

providers:
  - name: microservers
    cloud: microservers
    clean-floating-ips: true
    image-type: raw
    max-servers: 10
    boot-timeout: 120
    pool: public
    rate: 2.0
    networks:
      - name: private
    images:
      - name: dib-centos-7
        diskimage: dib-centos-7
        username: jenkins
        min-ram: 1024
        name-filter: m1.medium

First, we are defining the diskimage-builder elements that will create our image, named dib-centos-7.

Then, we are assigning that image to our microservers cloud provider, and specifying that we want to have at least 1 VM ready to use.

Finally we define some specific parameters about how Nodepool will use our cloud provider: the internal (private) and external (public) networks, the flavor for the virtual machines to create (m1.medium), how many seconds to wait between operations (2.0 seconds), etc.

  • Now we can submit the change for review:
$ git add nodepool/nodepool.yaml
$ git commit -m "Nodepool configuration"
$ git review
  • In the Software Factory Gerrit interface, we can then check the open change. The config repo has some predefined CI jobs, so you can check if your syntax was correct. Once the CI jobs show a Verified +1 vote, you can approve it (Code Review +2, Workflow +1), and the change will be merged in the repository.

  • After the change is merged in the repository, you can check the logs at /var/log/nodepool and see the image being created, then uploaded to your OpenStack cloud.

Define test job

There is a special project in OpenStack meant to be used to test 3rd Party CIs, openstack-dev/ci-sandbox. We will now define a CI job to "check" any new commit being reviewed there.

  • Assign the nodepool image to the test job
$ vi jobs/projects.yaml

We are going to use a pre-installed job named demo-job. All we have to do is ensure it uses the image we just created in Nodepool.

- job:
    name: 'demo-job'
    defaults: global
    builders:
      - prepare-workspace
      - shell: |
          cd $ZUUL_PROJECT
          echo "This is a demo job"
    wrappers:
      - zuul
    node: dib-centos-7
  • Define a Zuul pipeline and a job for the ci-sandbox project
$ vi zuul/upstream.yaml

We are creating a specific Zuul pipeline for changes coming from the OpenStack Gerrit, and specifying that we want to run a CI job for commits to the ci-sandbox project:

pipelines:
  - name: openstack-check
    description: Newly uploaded patchsets enter this pipeline to receive an initial +/-1 Verified vote from Jenkins.
    manager: IndependentPipelineManager
    source: openstack
    precedence: normal
    require:
      open: True
      current-patchset: True
    trigger:
      openstack:
        - event: patchset-created
        - event: change-restored
        - event: comment-added
          comment: (?i)^(Patch Set [0-9]+:)?( [\w\\+-]*)*(\n\n)?\s*(recheck|reverify)
    success:
      openstack:
        verified: 0
    failure:
      openstack:
        verified: 0

projects:
  - name: openstack-dev/ci-sandbox
    openstack-check:
      - demo-job

Note that we are telling our job not to send a vote for now (verified: 0). We can change that later if we want to make our job voting.
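The comment-added trigger regex is worth a closer look. A quick way to sanity-check what it matches (here in Python, purely as an illustration; Zuul applies the same pattern to Gerrit comment events):

```python
import re

# The comment-added trigger regex from the pipeline definition above.
recheck = re.compile(
    r'(?i)^(Patch Set [0-9]+:)?( [\w\\+-]*)*(\n\n)?\s*(recheck|reverify)')

# A bare "recheck" comment re-triggers the job...
assert recheck.match('recheck')
# ...as does Gerrit's "Patch Set N:" prefix followed by the keyword,
assert recheck.match('Patch Set 3:\n\nrecheck')
# but the keyword buried mid-sentence does not trigger anything.
assert recheck.match('please recheck this') is None
```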

  • Apply configuration change
$ git add zuul/upstream.yaml jobs/projects.yaml
$ git commit -m "Zuul configuration for 3rd Party CI"
$ git review

Once the change is merged, Software Factory's Zuul process will be listening for changes to the ci-sandbox project. Just try creating a change and see if everything works as expected!


If something does not work as expected, here are some troubleshooting tips:

Log files

You can find the Zuul log files in /var/log/zuul. Zuul has several components, so start with checking server.log and launcher.log, the log files for the main server and the process that launches CI jobs.

The Nodepool log files are located in /var/log/nodepool. builder.log contains the log from image builds, while nodepool.log has the log for the main process.

Nodepool commands

You can check the status of the virtual machines created by nodepool with:

$ sudo nodepool list

Also, you can check the status of the disk images with:

$ sudo nodepool image-list

Jenkins status

You can see the Jenkins status from the GUI, at https:///jenkins/, when logged in as the admin user. If no machines show up in the 'Build Executor Status' pane, that means that either Nodepool could not launch a VM, or there was some issue in the connection between Zuul and Jenkins. In that case, check the Jenkins logs at `/var/log/jenkins`, or restart the service if there are errors.

Next steps

For now, we have only run a test job against a test project. The real power comes when you create a proper CI job for a project you are interested in. You should now:

  • Create a file under jobs/ with the JJB definition for your new job.

  • Edit zuul/upstream.yaml to add the project(s) you want your 3rd Party CI system to watch.

by jpena at November 27, 2017 11:58 AM

November 24, 2017

OpenStack Superuser

How to use OpenStack Glance Image Import

Glance image services include discovering, registering and retrieving virtual machine images. Glance has a RESTful API that allows querying of VM image metadata as well as retrieval of the actual image. VM images made available through Glance can be stored in a variety of locations from simple filesystems to object-storage systems like the OpenStack Swift project.

The long-awaited Image Import Refactor was delivered with the Pike release. There has been “much rejoicing” say Brian Rosmaita (the project's current project team lead, or PTL) and Erno Kuvaja (senior software engineer at Red Hat and Glance core), who recently spoke about it at the OpenStack Summit Sydney. (More on updates to Glance here.)

It replaces the image upload process and has been years in the making, Kuvaja says. For now, it's a minimum viable product that enables development for a lot of popular use cases and sets the stage for more to come in Queens. In the 40-minute talk, they walk you through what's currently available and what you can do to help get the features that will be most useful to you into the next releases.

Glance Image Import was carefully designed with community input from many people and many OpenStack projects to be interoperable and discoverable. The pair explain exactly what that means, why it’s so important and what it means for you as a cloud operator, cloud administrator or cloud user.
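Under the hood, the interoperable import boils down to a small sequence of Glance v2 REST calls. As a rough sketch (the helper function below is illustrative, not part of any client library): the 'glance-direct' method delivered in Pike stages the image bytes before triggering the import, while the 'web-download' method targeted at Queens has Glance fetch a URL itself:

```python
def image_import_calls(image_id, method="glance-direct"):
    """Return the Glance v2 REST call sequence for the import workflow.

    Illustrative sketch only: 'glance-direct' stages the bytes first;
    'web-download' has Glance fetch the data itself, so no stage call.
    """
    calls = [("POST", "/v2/images")]  # create the image record (queued)
    if method == "glance-direct":
        # upload the raw bytes to the staging area
        calls.append(("PUT", "/v2/images/%s/stage" % image_id))
    # kick off the asynchronous import of the staged or remote data
    calls.append(("POST", "/v2/images/%s/import" % image_id))
    return calls
```

The separate stage/import steps are what make the flow discoverable: a cloud can advertise which import methods it supports, and tooling can pick one without hard-coding a single upload path.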

“In some small clouds, you might not care what users upload and if something happens you can go find them in their office if they do something bad,” Rosmaita says. “But as clouds get larger, you have different requirements and you may have different levels of trust for your users. The challenge was to come up with an interoperable way to be flexible with how stuff gets into your cloud but make it regimented enough that it’s interoperable between different OpenStack clouds.”

This single API workflow lets operators and tooling developers define how image data can be uploaded, and users choose among those methods; it also makes uploads more performant. Check out the whole talk below and download the slides here.



The post How to use OpenStack Glance Image Import appeared first on OpenStack Superuser.

by Superuser at November 24, 2017 04:02 PM

5 new OpenStack resources

Keep tabs on what's happening on the technical side of OpenStack with these guides, tutorials, and other great learning assets.

by Jason Baker at November 24, 2017 08:00 AM

November 23, 2017

Chris Dent

OpenStack Forum View

I've been lucky enough to go to the last few OpenStack Summits and to the first two PTG (Project Team Gathering) events. This means I've witnessed the transition from the unified conference + design summit to the separated conference + forum and the PTG. I think I've got enough data to say that the transition is not yet complete and needs some additional effort to be optimal.

Part of this is because though there are documents, such as the Forum wiki page, which define what the forum is pretty clearly, there isn't complete agreement on that definition nor on the distinctions between the Forum, the PTG, any remaining mid-cycles, and why any particular individual might like to go to them.

The wiki page currently says:

At the Forum the entire OpenStack community (users and developers) gathers to brainstorm the requirements for the next release, gather feedback on the past version and have strategic discussions that go beyond just one release cycle.

That sounds pretty good, but it doesn't emphasize one of the primary benefits of the Forum versus the PTG:

At the PTG, and the mid-cycles that came before, it is assumed and expected that the people involved are active regular contributors (not necessarily, but often, developers) to the project; people with ongoing relationships with one another. They talk to one another regularly outside the PTG, in IRC, email, hangouts, etc. They talk to each other some more, with higher fidelity, in person, at the PTG.

At the Forum, the hope was that there would be enhanced feedback from, and interaction with, other people: users, operators, casual contributors, members of adjacent communities. All of this does happen, but with some challenges. Some members of the community who either fondly remember the design summit days of yore, or who simply don't appreciate the expanded feedback goals, tend to dominate the conversation, excluding others.

As a case in point, a Forum discussion was held to discuss the future of Mogan, up for review as a potential official project. After the session, I noted on the review:

The forum session was frustrating and I don't think was able to really help move the discussion here forward in any substantial way. After an initial introduction of Mogan, the conversation turned to typically dominant members of the community who frequently speak to one another in IRC speaking to one another in the room to discuss solving problems in Nova and Ironic (and not Mogan) while other people listened. [emphasis added]

Elsewhere many sessions proceeded in an interrupt-driven style where only people comfortable with that style were able to participate.

Again this was often the people who already had established relationships, spoke to one another on IRC often, and were accustomed to each other's pattern of speech. This behavior sidelines other people in the room who can't or won't behave in what they may perceive to be rudeness. They become passive listeners rather than active participants and more than likely experience the clubbiness that has been identified as a limiting factor in developer satisfaction.

It's simple to say, but hard to do: If we want fulsome feedback and inclusivity we all must make a conscious and active effort to listen before we speak.

by Chris Dent at November 23, 2017 11:00 AM

November 22, 2017

Graham Hayes

Sydney OpenStack Summit

OpenStack Down Under

This year the travelling circus that is the OpenStack summit migrated to Sydney. A lot of us in Europe / North America found out exactly how far away from our normal venues it really is. (#openstacksummit on twitter for the days before the summit was an entertaining read :) )

Sunday Board / Joint Leadership Meeting

As I was in Sydney, and staying across the road from the meeting, I decided to drop in and listen. It was an interesting discussion, with a couple of highlights.

Chris Dent had a very interesting item about developer satisfaction - he has blogged about it, and it is well worth the read.

Jonathan Bryce led the presentation of a proposed new expansion of the foundation, which he touched on in the keynote the next day - I have a few concerns, but they are all much longer term issues, and may just be my own internal biases. I think the first new addition to the foundation will let us know how the rest of the process is going to go.

Colleen Murphy and Julia Kreger told us that they (along with Flavio Percoco) will be starting research to help improve our inclusiveness in the community.

The last item was brought forward by 2 board members, and they focused on LTS (Long Term Support / Stable) branches. The time from an upstream release until a user has it in production is actually longer than expected - with a lot of time being used by distros packaging and ensuring installers are up to date.

This means that by the time users have a release in production, the upstream branches may be fully deprecated. There was a follow up Forum Session, and there is now an effort to co-ordinate a new methodology for long term collaboration in the LTS Etherpad.

There seems to be an assumption that distros are keeping actual git branches around for the longer term, and not layering patches inside of deb / rpm files, which I think is much more likely. I hope this effort succeeds, but my cynical side thinks this is more of a "fix it for us" cry, than "help us fix it". I suppose we will see if people show up.

One slide from this section was not discussed but concerned me. It was talking about having an enforced "TC Roadmap" which had lines from various workgroups and SIGs. Coming from a project that gets a lot of "Can you do x feature?" (to which I usually respond with "Do you have anyone to write the code?") this concerns me. I understand that it can be hard to get things changed in OpenStack, really I do, but a top down enforced "Roadmap" is not the way forward. Honestly, the fact that two board members of an Open Source foundation think it is, is worrying.


Designate

Designate had 3 sessions in Sydney:

  • Our project update
  • Project On Boarding
  • Ops Feedback

The project update was good - much improved from Boston, where the 2 presenters were not paid to work on the project. We covered the major potential features, and where we were with cycle goals (both Pike goals completed, and Queens goals underway).

Project on boarding was not hugely attended, but I am hoping that was a side effect of the summit being both smaller and far away.


Ops feedback was great - we got a lot of bugs that were impacting our users and deployers, and collected it in our Feedback Etherpad (any comments welcome).

Cross Project Work

I went to quite a few cross project sessions - there was a good amount of discussion, and some useful work came out of it.

Application Tokens

This is something that had completely slipped past me until now, but the ideas were great, and it would have made things I have done in previous companies much much easier.

Healthchecks per service

We came to a good agreement on how we can do standardised health checks across OpenStack, we now need to write a spec and start coding a new piece of middleware :)

Edge Computing

Not so sure this was worth a visit - it was much more crowded than any of the other Forum sessions I went to, and ended up bikeshedding on where the Edge ends (we literally spent 10 minutes talking about whether a car was part of the Edge or a thing managed by the Edge).

I kept hearing "smaller and lighter OpenStack" in that session, but have yet to hear what is too heavy about what we currently have. Nearly all our services scale down to some extent, and you can run a complete infrastructure on an 8GB VM.

Overall, it was a good summit - not too busy, and short. Looking forward to not traveling for the next PTG; I think the DUB -> DOH -> SYD trip and back has drained my enthusiasm for flights for the next few months.

by Graham Hayes at November 22, 2017 07:35 PM

OpenStack Superuser

Why atmail chose OpenStack for email-as-a-service

In a world where most companies broadcast what they’re having for lunch, atmail likes working hard in the background. For users of over 150 million email accounts in nearly 100 countries, the Australian company toils behind the scenes making sure those emails get delivered.

Working from headquarters in the small town of Peregian Beach, Queensland, atmail provides email solutions for service providers (internet service providers (ISPs), hosted service providers (HSPs), telecoms, global corporations and government agencies) in Australia and the U.S.

OpenStack is critical to getting the job done: currently, more than 15 percent of atmail's infrastructure runs on the OpenStack-powered DreamHost Cloud. That infrastructure has gradually migrated from hardware to virtualized.

In a recent talk at the OpenStack Summit Sydney, atmail senior dev ops engineer Matt Bryant talks about why the company chose OpenStack, the journey into cloud and future plans. He was joined onstage by Dreamhost’s VP of product and development, Jonathan LaCour.

atmail started looking to the cloud to replace aging hardware in multiple data centers, maximize cost efficiency and increase flexibility. They searched for the right partner, finding a good match with DreamHost. “A lot of OpenStack providers now are regional,” says LaCour. “They’re serving very specific use cases in particular markets that Amazon probably doesn’t care about that much; this is a good example of why OpenStack has a long-term future.”

The pair set out with a very high goal: “to make the transition with little or no impact on our customers,” says Bryant. “What that meant in practice was we had to revisit our architecture at the software and infrastructure layer.”

Looking at their needs through a “prism” of security, performance and scalability, Bryant adds what they started with was pre-cloud, based on the idea of a mail server in a box. “There were a few decisions early on that didn’t play well with a cloud environment. We had to decide what to re-code and what to work around.” Automation was another big component — they went with Ansible’s OpenStack module as well as in-house Perl scripts — and then it came time to “test the hell out of it,” Bryant says, to see if it was possible to maintain the servers and level of service.

“We got to the stage where both of us were happy, and then we were on to migration.” This is where both companies learned a few important lessons. “You can’t do mail migrations completely without customer interaction,” Bryant says. “There was a whole lot of data, a massive amount of storage (a few terabytes) to pore over,” adds LaCour. They ran into a number of issues with networking and storage.

Among the other takeaways were to keep it simple (forget traditional network topologies and VLANs), the importance of reviewing architecture, dedicating enough resources, having direct access to engineers (they had an IRC channel with their DreamHost counterparts) and finally, test, test and test again.  “Know your failure scenarios and what can go wrong,” Bryant underlines.

“We went from a fairly simple architecture in bare metal to one behind load balancers and multiple nodes behind load balancers,” Bryant says. When you have an intermittent problem on a five-node cluster, it may only happen so often but can be much more complicated to fix, he adds.

What’s next? Bryant says they’re looking into a number of OpenStack projects, namely: Manila (shared file systems), Octavia (load balancer), Monasca (monitoring), Heat (orchestration) and Vitrage (Root Cause Analysis service).

“The more that we can push off into services and concentrate more on our core product, the better,” Bryant says.

You can catch the whole 27-minute talk below.

The post Why atmail chose OpenStack for email-as-a-service appeared first on OpenStack Superuser.

by Superuser at November 22, 2017 03:32 PM

Derek Higgins

Booting baremetal from a Cinder Volume in TripleO

Up until recently in TripleO, booting from a cinder volume was confined to virtual instances, but now, thanks to some recent work in ironic, baremetal instances can also be booted backed by a cinder volume.

Below I’ll go through the process of how to take a CentOS cloud image, prepare and load it into a cinder volume so that it can be used to back the root partition of a baremetal instance.

First, I make a few assumptions:

  1. You have a working ironic in a TripleO overcloud
    – if this isn’t something you’re familiar with, you’ll find some instructions here
    – if you can boot and ssh to a baremetal instance on the provisioning network then you’re good to go
  2. You have a working cinder in the TripleO overcloud, with enough storage to store the volumes
  3. I’ve tested TripleO (and OpenStack) using RDO as of 2017-11-14; earlier versions had at least one bug and won’t work


Baremetal instances in the overcloud traditionally use a config-drive for cloud-init to read config data from. Config-drive isn’t supported with ironic boot from volume, so we need to make sure that the metadata service is available instead. To do this, if your subnet isn’t already attached to one, you need to create a neutron router and attach it to the subnet you’ll be booting your baremetal instances on:

 $ neutron router-create r1
 $ neutron router-interface-add r1 provisioning-subnet

Each node defined in ironic that you would like to use for booting from volume needs to use the cinder storage driver, needs the iscsi_boot capability set, and requires a unique connector id (increment <NUM> for each node):

 $ openstack baremetal node set --property capabilities=iscsi_boot:true --storage-interface cinder <NODEID>
 $ openstack baremetal volume connector create --node <NODEID> --type iqn --connector-id <NUM>

The last thing you’ll need is an image capable of booting from iscsi. We’ll be starting with the CentOS cloud image, but need to alter it slightly so that it’s capable of booting over iscsi.

1. download the image

 $ curl > /tmp/CentOS-7-x86_64-GenericCloud.qcow2.xz
 $ unxz /tmp/CentOS-7-x86_64-GenericCloud.qcow2.xz

2. mount it and change root into the image

 $ mkdir /tmp/mountpoint
 $ guestmount -i -a /tmp/CentOS-7-x86_64-GenericCloud.qcow2 /tmp/mountpoint
 $ chroot /tmp/mountpoint /bin/bash

3. load the dracut iscsi module into the ramdisk

 chroot> mv /etc/resolv.conf /etc/resolv.conf_
 chroot> echo "nameserver" > /etc/resolv.conf
 chroot> yum install -y iscsi-initiator-utils
 chroot> mv /etc/resolv.conf_ /etc/resolv.conf
 # Be careful here to update the correct ramdisk (check /boot/grub2/grub.cfg)
 chroot> dracut --force --add "network iscsi" /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img 3.10.0-693.5.2.el7.x86_64

4. enable rd.iscsi.firmware so that dracut gets the iscsi target details from the firmware[1]

The kernel must be booted with rd.iscsi.firmware=1 so that the iscsi target details are read from the firmware (passed to it by iPXE); this needs to be added to the grub config.

In the chroot, edit the file /etc/default/grub and add rd.iscsi.firmware=1 to GRUB_CMDLINE_LINUX=…
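If you are scripting this step, the edit amounts to appending one argument inside the quoted GRUB_CMDLINE_LINUX value. A minimal sketch of that text edit (the helper name is made up for illustration; it operates on the contents of /etc/default/grub):

```python
import re

def add_kernel_arg(grub_defaults, arg="rd.iscsi.firmware=1"):
    """Append a kernel argument to GRUB_CMDLINE_LINUX="..." if missing.

    Illustrative helper for editing /etc/default/grub text; the grub
    config still has to be regenerated afterwards with grub2-mkconfig.
    """
    def repl(m):
        if arg in m.group(1).split():
            return m.group(0)  # already present, leave the line untouched
        return 'GRUB_CMDLINE_LINUX="%s %s"' % (m.group(1), arg)
    return re.sub(r'GRUB_CMDLINE_LINUX="([^"]*)"', repl, grub_defaults)
```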


5. leave the chroot, unmount the image and update the grub config

 chroot> exit
 $ guestunmount /tmp/mountpoint
 $ guestfish -a /tmp/CentOS-7-x86_64-GenericCloud.qcow2 -m /dev/sda1 sh "/sbin/grub2-mkconfig -o /boot/grub2/grub.cfg"

You now have an image that is capable of mounting its root disk over iscsi. Load it into glance and create a volume from it:

 $ openstack image create --disk-format qcow2 --container-format bare --file /tmp/CentOS-7-x86_64-GenericCloud.qcow2 centos-bfv
 $ openstack volume create --size 10 --image centos-bfv --bootable centos-test-volume

Once the cinder volume has finished creating (wait for it to become “available”), you should be able to boot a baremetal instance from the newly created cinder volume:

 $ openstack server create --flavor baremetal --volume centos-test-volume --key default centos-test
 $ nova list 
 $ ssh centos@
[centos@centos-test ~]$ lsblk
sda 8:0 0 10G 0 disk 
└─sda1 8:1 0 10G 0 part /
vda 253:0 0 80G 0 disk 
[centos@centos-test ~]$ ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx. 1 root root 9 Nov 14 16:59 -> ../../sda
lrwxrwxrwx. 1 root root 10 Nov 14 16:59 -> ../../sda1
lrwxrwxrwx. 1 root root 9 Nov 14 16:58 virtio-pci-0000:00:04.0 -> ../../vda

To see how the cinder volume target information is being passed to the hardware, take a look at the iPXE template for the server in question, e.g.

 $ cat /var/lib/ironic/httpboot/<NODEID>/config
set username vRefJtDXrEyfDUetpf9S
set password mD5n2hk4FEvNBGSh
set initiator-iqn
sanhook --drive 0x80 || goto fail_iscsi_retry
sanboot --no-describe || goto fail_iscsi_retry

[1] – due to a bug in dracut (now fixed upstream [2]), setting this means that the image can’t be used for local boot
[2] –

by higginsd at November 22, 2017 12:23 AM


Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.


Last updated:
December 15, 2017 07:50 AM
All times are UTC.
