September 18, 2018

Chris Dent

TC Report 18-38

Rather than writing a TC Report this week, I've written a report on the OpenStack Stein PTG.

by Chris Dent at September 18, 2018 02:55 PM

OpenStack Stein PTG

For the TL;DR see the end.

The OpenStack PTG finished yesterday. For me it is six days of continuous meetings and discussions and one of the most busy and stressful events in my calendar. I always get ill. There are too few opportunities to refresh my introversion. The negotiating process is filled with hurdles. I often have a sense of not being fully authentic. I have a lot of sympathy for people who come away from the event making tweets like:

I wasn't in the nova room when that happened, so I don't know the full context, but whatever it was, it sounds wrong.

For some people the PTG is a grand time, for some it is challenging and difficult. For most it is a mix of both. Telling it how it is can help to make it better, even if it is uncomfortable.

There was a great deal of discussion about placement being extracted from nova. In the weeks leading up to the PTG there was quite a lot of traffic, some of it summarized in a recent TC Report. Because I've been involved in a lot of that discussion I got to hear a lot of jokes about placement this past week. I'm sure most of them are meant well, but when the process of extraction has been so long, and sometimes so frustrating, the jokes are tiresome and another load on my already taxed brain. Much of the time they just made me want to punch someone or something.

I'd like to thank Eric Fried, Balazs Gibizer, Ed Leafe, Tetsuro Nakamura, and Matt Riedemann for doing a huge amount of work the past few weeks to get the extracted placement to a state where it has a working collection of tests and creates an operating service. As a team, we've made progress on a thing people have been saying they want for years: making nova smaller and decomposing it into smaller parts. Let's make it a trend.

The PTG was a long week, and I want to remember what happened, so I'm going to write down my experience of the event. This will not be the same as the experiences other people have had, but I hope it is useful.

This was written partially on Saturday while I was still in Denver, and the rest on Tuesday after I returned. On Saturday I was already starting to forget details and now on Tuesday it's all fading away.


Sunday afternoon we held the first of two Technical Committee sessions (the other was Friday). The agenda had a few different topics. The big one was reviewing the existing commitments that TC members have. No surprise: Most people are way over-extended and many tasks, both personal and organisational, fall on the floor. Based on that information we were able to remove several tasks from the TC Tracker: items that will never get done or that should not be the TC's responsibility.

We also talked about needing to manage applications to be official projects with a little more care and attention so that the process is less open-ended than it often is. To help with this there will be some up-front time limits on applications and we'll ensure that each application has a shepherd from the TC from early in the process.

Alan Clark, from the Foundation board, joined in on the conversation for a while. We discussed how to make the joint leadership meetings more effective and what the board needs from the TC: Information about the exciting and interesting things that are in progress in the technical community. To some extent this is needed to help the board members understand why they are bothering to participate and while there is always plenty of cool and interesting stuff going on, it is not always obvious.

This is useful advice as it helps to focus the purpose of the meetings, which sometimes have a sense of "why are we here?"

Doug produced a summary of his notes from both days.

Lance Bragstad also made a report.



The API-SIG had a room for all of Monday. At the first PTG in Atlanta we had two days, and used them both. Now we struggled to use all of Monday. In part this is because the great microversion battles have lost their energy, but also the SIG is currently going through a bit of a lull while other topics have more relevance.

We talked about most of the issues on the etherpad and kept notes on the discussion.

One interesting question was whether the SIG was interested in being a home for people interested in distributed APIs that are not based on HTTP. The answer is: "Sure, of course, but those people need to show up."

(People needing to show up was a theme throughout the week.)

Prior to the PTG we tried to kill off the common healthcheck middleware due to lack of attention. This threat drew out some interested parties and brought it back to life.
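For context, a healthcheck middleware is conceptually tiny, which is perhaps part of why it languished. A minimal sketch of the idea in plain WSGI terms (this is an illustration only, not the actual oslo.middleware implementation):

```python
# A minimal sketch of the healthcheck idea in plain WSGI terms; this is
# an illustration, not the actual oslo.middleware implementation.
def healthcheck_middleware(app, path="/healthcheck"):
    """Answer requests to <path> directly; pass everything else to the app."""
    def middleware(environ, start_response):
        if environ.get("PATH_INFO") == path:
            body = b"OK"
            start_response("200 OK", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return middleware

# A trivial app wrapped by the middleware, for demonstration:
def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"app body"]

wrapped = healthcheck_middleware(app)

captured = {}
def start_response(status, headers):
    captured["status"] = status

result = wrapped({"PATH_INFO": "/healthcheck"}, start_response)
```

The value of a common implementation is that every OpenStack service answers the same path the same way, so load balancers need only one probe configuration.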


Right after lunch Ed Leafe (the other API-SIG "leader" who was able to attend the PTG) and I were pulled away to attend a discussion about how cyborg interacts with nova and placement.



Tuesday morning there was a gathering of blazar, nova and placement people to figure out the best ways for blazar to interact. There are some notes on the related etherpad.

The first of the two main outcomes from that discussion was that it ought to be possible to satisfy many of the desired features by implementing "not member of" functionality in placement, which would allow a caller to say "I'll accept resources that are not in this aggregate". A spec for that has been started.

The second outcome was the realization that the existing member_of functionality is not entirely correct for nested resource providers. The current functionality requires all the participants in a nested tree to be in an aggregate in order to show up in results. We decided this is not what we want. A bug was created.
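To make the "not member of" idea concrete, here is a hypothetical sketch of how a client might build such a query. The `!` negation syntax is an assumption drawn from the in-progress spec, not a settled API, and the aggregate UUID is made up:

```python
from urllib.parse import urlencode

# Hypothetical sketch: build a GET /allocation_candidates query that asks
# placement for resources NOT in a given aggregate. The "!" negation
# syntax is an assumption based on the proposed spec, not a settled API.
def allocation_candidates_url(resources, forbidden_aggregate):
    params = {
        "resources": ",".join(
            "%s:%d" % (rc, amount) for rc, amount in resources.items()),
        "member_of": "!" + forbidden_aggregate,
    }
    return "/allocation_candidates?" + urlencode(params)

url = allocation_candidates_url(
    {"VCPU": 2, "MEMORY_MB": 2048},
    "9a4a483c-6c4b-4699-a6a0-5bd6a28f4370",
)
```

In blazar's case the point is that a reservation service can steer ordinary (unreserved) workloads away from hosts that are set aside in a reservation aggregate.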

placement governance

Right before lunch there was an impromptu gathering of the various people involved in placement to create a list of technical milestones that need to be reached to allow placement to be an independent project. A good summary of that was posted to the mailing list.

It was a useful thing to do and the plan is solid, but nobody seemed to be in the right frame of mind to get into any of the personal, social, and political issues that have caused so much tension, whether locally in the past few weeks or over the last two years.


Later in the afternoon there was a meeting with cinder to see if there was a way that placement could be useful to cinder. It turns out there is a bit of a conceptual mismatch between placement and cinder.

Placement wants to represent a hard measurement of resources as they actually are while cinder, especially when "thin provisioning" is being used, needs to be more flexible. Representing that flexibility to placement in a way that is "safe" is difficult.

Dynamic inventory management is considered either too costly or too racy. I'm not certain this has to be the case. Architecturally, the system ought to be able to cope. There are some risks, but if we wanted to accommodate the risk it might be manageable and would make placement useful to more types of resources.


nova retrospective

Wednesday morning started with a nova cycle retrospective. There was limited attention to that etherpad before the event, but once we got rolling in person it turned out to be a pretty important topic. The main takeaway, for me, was that when we have to change priorities because of unforeseen events, we must trim the list of existing priorities to remove something. It was surprisingly difficult to get people to agree that this was necessary. Time and resources are finite. What other conclusion can we make?

placement topics

Then began a multi-day effort to cover all of the placement topics on the nova etherpad. A lot of this was done Wednesday, but in gaps on Thursday and Friday people returned to placement. Rather than trying to cover each day's topics on the day it happened, all the discussion is aggregated here in this section.

Interestingly (at least to me), during these discussions I had a very clear moment explaining why I often feel culturally alienated while working in the OpenStack community. While trying to argue that we should wait to do something, I used the term YAGNI ("You Aren't Gonna Need It"). Few people in the room were familiar with it, and once it was explained, few people appeared to be sympathetic to the concept. In my experience this is a fundamental concept and driver of good software development.

This was then followed by a lack of sympathy for wanting or needing to define when a project can be considered "done". This too is something I find critical to software development: What are we striving for? How will we know when we get there? When do we get to stop? The reaction in the room seemed to be something along the lines of "never" and "why would we want to?".

These two experiences combined may explain why my experience of OpenStack development, especially in nova, feels so unconstrained and cancerous: There's a desire to satisfy everything, in advance, and to never be done. This is exactly the opposite of what I want: narrow what we satisfy, do only what is required, now, and figure out a way to reach done.

I suspect the reality of things is much less dramatic, but in the moment it felt that way and helped me understand things more.

Once through that, I felt like we managed to figure out some things that we need to do:

  • An idempotent upgrade script that makes it easy for a deployment to move placement data to a new home. Dan has started something.

  • Long term goals include managing affinity in placement, and enabling a form of service sharding so that one placement can manage multiple openstacks.

  • GET /allocation_candidates needs, eventually, an in_tree query parameter to allow the caller to say "give me candidates from these potential trees".

  • Highest priority at this time is getting nested resource providers working on the nova side, in nova-scheduler, in the resource tracker and in the virt drivers.

  • As other services increase their use of placement, and we have more diverse types of hardware being represented as resource providers, we badly need documentation that explains best practices for using resource classes and traits to model that hardware.

  • We need to create an os-resource-classes library, akin to os-traits, to contain the standard resource classes and manage the existing static identifiers associated with those classes. Since naming things is the hardest problem we spent a long time trying to figure out how to name such a thing. There are issues to be resolved with not causing pain for packagers and deployers.

    While we figure that out I went ahead and created a cookiecutter-based os-resource-classes.

  • Getting shared providers working on the nova side is not an immediate concern in the face of the attention required to finish placement extraction and get nested providers working. However Tushar and his colleagues may devote some time to it.
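To illustrate the os-resource-classes idea from the list above: akin to os-traits, such a library would mostly be a list of string constants whose positions preserve the existing static identifiers. A sketch (the module layout here is my assumption; only the class names and the idea of append-only IDs come from placement itself):

```python
# Hypothetical sketch of an os-resource-classes style library: standard
# resource classes are string constants, each paired with the fixed
# integer ID that placement's database already associates with it.
STANDARD_RESOURCE_CLASSES = [
    "VCPU",
    "MEMORY_MB",
    "DISK_GB",
    "PCI_DEVICE",
    "SRIOV_NET_VF",
    "NUMA_SOCKET",
    "NUMA_CORE",
    "NUMA_THREAD",
    "NUMA_MEMORY_MB",
    "IPV4_ADDRESS",
    "VGPU",
    "VGPU_DISPLAY_HEAD",
]

# The existing static identifiers must be preserved, so the list is
# append-only: the ID is simply the position in the list.
STANDARD_IDS = {name: idx for idx, name in enumerate(STANDARD_RESOURCE_CLASSES)}

CUSTOM_PREFIX = "CUSTOM_"

def is_custom(resource_class):
    """Custom resource classes are namespaced with a CUSTOM_ prefix."""
    return resource_class.startswith(CUSTOM_PREFIX)
```

The packaging concern mentioned above is real: both nova and placement would need to depend on the new library, so its release cadence has to suit both.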


There was continued discussion of placement on Thursday, mostly noted above. Towards the end of this day I was running out of attention and working more on making minor changes to the placement repo. The energy required to give real attention to the room is so high, especially when it is couched in making sure I don't say something that's going to be taken the wrong way. After a while it is easier and a more productive use of time to give attention to something else. The people who are able to stick through a solid three days in the nova room are made of sterner stuff than me.


On Friday it was back to TC-related discussions, following the agenda on the etherpad. As stated above, Doug made a good summary email.

We started off by reviewing team health. There are lots of different issues, but a common thread is that many teams are suffering from a lack of contributors. Some teams report burnout in their core reviewers. In the room we discussed why we sometimes only find out about issues in a team late: Why aren't project team members seeking out the assistance of the TC sooner? I suggested that perhaps there's insufficient evidence that the TC is empowered to resolve things.

Even if that's the case (we did not resolve that question), reaching out to the TC sooner rather than later is going to be beneficial for all involved, as it increases awareness and can help direct people to the right resources.

There was a great deal of discussion in the room about making OpenStack (including the TC) more accessible to contributors from China. This resulted in a proposed resolution for a tc role in global reachout.

There was also a lot of discussion about strategies for increasing traction for SIGs, such as the public cloud sig. Some of this reflected the orchestration thread that Matt Riedemann started. During the discussion another resolution was proposed to Add long term goal proposal to community wide goals.

Discussion of the pending tech vision was around clarifying what the vision is for and making sure we publicize it well enough to get real feedback. The two main reasons to have the vision are to help drive the decision making process when evaluating projects that wish to be "official" and when selecting community wide goals. These are both important things, but I think the main value of a vision we've all agreed to is as a guide in any decisions in OpenStack. If we are able to point at a thing as the overarching goal of all of OpenStack, it becomes easier to say "no" to things that are clearly out of scope and thus have more energy for the things to which we clearly say "yes".

Throughout the discussion of project health and gaps in contribution I kept thinking it's important that we make gaps more visible, not come up with ways to do more with the resources we have. Many many people are expressing that they are overextended. We cannot take on more and remain healthy. If something is important enough people will come. If they don't come, the needs are either not important or not visible enough. The role of the TC should be to make things visible.

Feature-wise we need to be more reactive and enabling, more "we will make space for you to do this thing" and less "we're listening and will do this thing for you".

This includes allowing things to be less than perfect so their brokenness operates as an attracting influence. As a community we've been predisposed to thinking that if we don't make things proper from the start people will ignore us. I think we need to have some confidence that we are making useful stuff and make room for people to come and help us, for the sake of helping themselves.

What Now?

Based on what I've been able to read from various members of the community in blog posts, tweets, and posts to the os-dev mailing list, it sounds like it was a pretty good week: We made some plans and figured out solutions to some difficult problems. The trick now is to follow through and focus on those things while avoiding adding yet more to the list.

For me, however, it is hard to say that it is worth it. I do not come away from these things motivated and focused. I'm overwhelmed by the vast array of things we seem to promise and concerned by the unaddressed disconnects and differences in our perceptions and actions. I'm sure once I've recovered I'll be back to making steady progress, but for now if I'm "telling it how it is" I have to wonder if the situation would be any different if I hadn't gone, or if none of us had gone.

by Chris Dent at September 18, 2018 02:30 PM

OpenStack Superuser

How to manage micro-services with Istio

No matter how you stack up the projections, containers will only grow in popularity.

Olaph Wagoner launched into his talk about micro-services citing research that projects container revenue will approach $3.5 billion in 2021, up from a projected $1.5 billion in 2018.

Wagoner, a software engineer and developer advocate at IBM, isn’t 100 percent sure about that prediction but he is certain that containers are here to stay.

In a talk at the recent OpenInfra Days Vietnam, he offered an overview on micro-services, Kubernetes and Istio.

Small fry?

There is no industry consensus yet on the properties of micro-services, he says, though defining characteristics include being independently deployable and easy to replace.

As for these services being actually small, that’s up for debate. “If you have a hello world application, that might be considered small if all it does is print to the console,” he explains. “A database server that runs your entire application could still be considered a micro-service, but I don’t think anybody would call that small.”


Kubernetes defines itself as a portable, extensible open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. “Simply put, it’s a way to manage a bunch of containers or services whatever you want to call them,” he says.

Meshing well

Istio is an open-source service mesh that layers transparently onto existing distributed applications, allowing you to connect, secure, control and observe services. And one last definition: service mesh is the network of micro-services that make up these distributed applications and the interactions between them.

What does this mean for users?

“Istio expands on all the things you can do with the Kubernetes cluster,” Wagoner says. And that’s a long list: automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic; fine-grained control of traffic behavior with rich routing rules, retries, failovers and fault injection; a pluggable policy layer and configuration API supporting access controls, rate limits and quotas; and secure service-to-service authentication with strong identity assertions between services in a cluster.
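As a concrete illustration of those routing rules, a weighted route with retries might look like the following sketch, using the v1alpha3 VirtualService resource that was current at the time (the service name and subsets are placeholders):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    # Send 10% of traffic to the new version, 90% to the old one.
    - destination:
        host: reviews
        subset: v2
      weight: 10
    - destination:
        host: reviews
        subset: v1
      weight: 90
    retries:
      attempts: 3
      perTryTimeout: 2s
```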

“The coolest things are the metrics, logs and traces,” he says. “All of a sudden you can read logs to your heart’s content and know exactly which services are talking to each other, how often and who’s mad at who.”

What’s inside Istio

Wagoner outlined Istio’s main components:

“Getting down to the nitty gritty”

At the 13:07 mark, Wagoner goes into detail about how Istio works with OpenStack and why it’s worthwhile.

Once you’ve got Kubernetes and installed Istio on top, the K8s admin can create a cluster using the OpenStack API. The K8s user in OpenStack can just use the cluster — they don’t get to see all the APIs doing the heavy lifting behind the scenes. All of this is now possible thanks to the Kubernetes OpenStack Cloud Provider. “What does this bad boy do?” Wagoner asks. “The Kubernetes services sometimes need stuff from the underlying cloud, services, endpoints etc., and that’s the goal.” Ingress is a perfect example, he says: it relies on the OpenStack Cloud Provider for load balancing and adding endpoints.

“This is my favorite part of the whole idea of why you would want to make OpenStack run Kubernetes in the first place — the idea of mesh expansion. You’ve got your cloud and you’ve got Kubernetes running on your OpenStack cloud and it’s telling you everything your cluster is doing. You can expand that service mesh to include not only virtual machines from your OpenStack cloud but also bare metal instances.”


Catch the full 27-minute talk here or download the slides.

The post How to manage micro-services with Istio appeared first on Superuser.

by Nicole Martinelli at September 18, 2018 02:20 PM

Trinh Nguyen

My first month as Searchlight's PTL

At the end of the Rocky development cycle of OpenStack, the Technical Committee intended to remove Searchlight [1] and some other projects (e.g. Freezer) from Governance (the list of official OpenStack projects). The reasons were that Searchlight had missed several milestones and there was a lack of communication from its PTLs (Project Technical Leads). I couldn't stand the fact that a pretty cool project would be abandoned, so I nominated myself as PTL of Searchlight for the Stein cycle. I hope that I can do something to revive it.

Why should Searchlight be revived? 

That is the question I tried to answer when I decided to take the lead. It's harder and harder for cloud operators to get information about the resources of other services (e.g., Nova, Swift, Cinder, etc.) because those services evolve very fast and more and more resources and APIs are added (have you heard about Nova's placement API?). Searchlight is pretty cool in this case because it centralizes all of that information behind one set of APIs. So you don't have to update your software whenever there is a new API update; Searchlight does it for you. Searchlight has a plugin-style architecture, so it's pretty easy to add support for a new service. Currently, you can search for information about these services with Searchlight: Ironic, Neutron, Nova, Swift, Cinder, Designate and Glance.
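As an illustration of the "one set of APIs" point, a query against Searchlight looks conceptually like the following sketch. The exact request shape, endpoint, and type names here are assumptions for illustration rather than a precise rendering of Searchlight's API:

```python
import json

# Hypothetical sketch of a Searchlight-style query: one search API across
# services, with an Elasticsearch-style query body. The request shape and
# resource type names are assumptions for illustration only.
def build_search_request(resource_types, name_fragment):
    return {
        "type": resource_types,
        "query": {"match": {"name": name_fragment}},
        "limit": 10,
    }

payload = build_search_request(
    ["OS::Nova::Server", "OS::Cinder::Volume"], "web"
)
body = json.dumps(payload)  # would be POSTed to a search endpoint
```

The point is that one request spans servers and volumes at once, instead of a client paging through the Nova and Cinder APIs separately.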

My job as the PTL
  • Analyze Searchlight's current situation and propose the next action plan
  • Clean up the mess (bugs, unreviewed patches, blueprints)
  • Organize meetings with the active contributors
  • Attract more contributors that would create a sustainable project
  • Finally, the most important thing is to release Searchlight in Stein :)
What I did so far
  • Cleaned up most of the patches, fixed some bugs, and merged some other patches [4], [5], [6], [7]
  • Moved Searchlight from Launchpad to Storyboard [8]
  • Made contact with the other 2 active contributors of the team :)
  • Lined up some features to release in Stein-1 (e.g. ElasticSearch 5.x support)
Nearest milestone to look for

Stein-1 (Oct 22-26)

My expectations
  • Maintain a solid number of core reviewers (currently we have 3 including me)
  • Fix most of the existing and new-found bugs in Stein-1
  • Create some use cases of Searchlight so that it can attract more users and developers
  • Can release Searchlight's stable/stein

How can people get involved?

It's really easy to contribute to Searchlight. You can just simply do one of these below:
  • Join the IRC channel #openstack-searchlight and discuss with us how you want to use Searchlight
  • Review some patches [4] [5] [6] [7]. This will help others improve their code and learn from you.
  • Fix some bugs / docs



by Trinh Nguyen ( at September 18, 2018 12:14 AM

September 17, 2018

Lance Bragstad

OpenStack Stein PTG TC Report

I spent all day Friday, except for one nova session, in the TC room. I’ll admit I wasn’t able to completely absorb all discussions and outcomes, but these are the discussions I was able to summarize.

Note, if the picture in this post looks familiar, it’s because you saw it while ordering a beer at Station 26, located right behind the conference hotel. The brewery operates out of an old fire station, hence the name. They show support for first responders, firefighters, law enforcement, military, and emergency medical services by displaying department patches throughout their establishment. Besides that, their beer is delicious.

Project Health

The morning started off discussing project health. This initiative is relatively new and came out of the Vancouver summit. The purpose is to open lines of communication between project leads and members of the TC. It also helps the TC keep a pulse on overall health across OpenStack project teams. The discussion focused on feedback, determining how useful it was, and the ways it could be improved.

Several TC members reported varying levels of investment in the initiative, ranging from an hour to several hours. Responses from PTLs varied from community goal status to contributor burnout. The TC decided to refine the phrasing used when reaching out to projects hoping that it clarifies the purpose, reduces time spent collecting feedback, and makes it easier for PTLs to formulate accurate responses. Action items included amending the PTL guide to include a statement about open communication with the TC and sending welcome emails to new PTLs with a similar message.

The usefulness of Help Wanted lists surfaced a few times during this discussion. Several people in the room voiced concerns that the lists were not driving contributions as effectively as we'd initially hoped. No direct action items came from this as far as I could tell, but this is a topic for another day.

Global Outreach

We spent the remainder of the morning discussing ways we can include contributors in other regions, specifically the Asia-Pacific region. Not only do different time zones and language barriers present obstacles in communication, but finding common tooling is tough. Most APAC developers struggle with connecting to IRC, which can have legal ramifications depending on location and jurisdiction. The ask was to see if participants would be receptive to a non-IRC-based application to facilitate better communication, specifically WeChat, which is a standard method of communication in that part of the world. Several people in the room made it clear that officially titling a chat room as "OpenStack Technical Committee" would be a non-starter if there wasn't unanimous support for the idea. Another concern was that having a TC-official room might eventually be empty as TC members rotate, thus resulting in a negative experience for the audience we're trying to reach.

The OpenStack Foundation does have formal WeChat groups for OpenStack discussions, and a few people were open to joining as a way to bridge the gap. It helped to have a couple of APAC contributors participating in the discussion, too. They were able to share a perspective that only a few other people in the room have experienced first-hand.

Ultimately, I think everyone agreed that fragmenting communication would be a negative side-effect of doing something like this. Conversely, using WeChat as a way to direct APAC contributors to formal mailing list communication could be very useful in building our contributor base and improving project health.

Howard sent a note to the mailing list after the session, continuing the discussion with a specific focus on asking TC candidates for their opinions.

Evolving Service Architecture & Dependency Management

After lunch, I stepped out to attend a nova session about unified limits. When I returned to the TC room, they were in the middle of discussing service dependencies and architectures.

OpenStack has a rich environment full of projects and services, some of which aren't under OpenStack governance but provide excellent value for developers and operators. On the other hand, there is much duplication across OpenStack services, symptomatic of a hesitation to add dependencies, in particular service dependencies that raise the bar for operators. A great example of this duplication is the amount of user-secret or security-specific code for storing sensitive data across services, when Barbican was developed to solve exactly that issue. Another good example is the usage of etcd, which was formally accepted as a base service shortly after the Boston summit in 2017. How do we allow developers the flexibility to solve problems using base services without continually frustrating operators because of changing architectural dependencies?

Luckily, there were some operators in the room who were happy to share their perspective. More often than not, the initial reaction operators have when told they need to deploy yet another service is "no". Developers either continue to push the discussion or decide to fix the problem another way. The operators in the room made it clear that justification is the next logical step in that conversation. It's not that operators oppose architectural decisions made by developers, but the reason behind them needs to be explicit. Informing operators that a dependency is needed for secure user secret storage probably isn't going to result in as much yelling and screaming as you might think. Ultimately, developers need to build services in ways that make sense with the tools available to them, and they need to justify why specific dependencies are required. This concise clarification is imperative for operators, deployers, and packagers.

In my opinion, explanations like this are a natural fit for the constellation work in OpenStack, especially since deployers and operators would consume constellations to deploy OpenStack for a particular use-case. I didn't raise this during the meeting, and I'm unsure if others feel the same way. I might try and bring this up in a future office hours session.

Long-Term Community Goals

Community goals fall within the domain of the TC. Naturally, so do long-running community goals. Some points raised in this discussion weren't specific to long-running goals, but community goals in general.

As a community, we started deciding on community-wide initiatives during the Ocata development cycle. Community goals are useful, but they are contentious for multiple reasons. Since they usually affect many projects, resources are always a bottleneck. They are also subject to the priorities of a particular project. Long-running goals are difficult to track, especially when they involve a considerable amount of work across 30+ projects.

While those things affect the success rate of community-wide goals, we made some progress on making it easier to wrangle long-running initiatives. First and foremost, breaking complicated goals into more digestible sub-goals was a requirement. Some previous goals that were relatively trivial are good examples that even straight-forward code changes can take the entire cycle to propagate across the OpenStack ecosystem. That said, breaking a goal into smaller pieces makes pushing change through our community easier, especially significant change. However, this introduces another problem, which is making the vision for multiple goals clear. Often there are only a few people who understand the end game. We need to leverage the domain-knowledge of those experts to document how all the pieces fit together. A document like this disseminates the knowledge, making it easier for people to chip in effectively and understand the approach. At the very least, it helps projects get ahead of changes and incorporate them into their roadmap early.

There is a patch up for review to clarify what this means for goal definitions. I'd like to try this process with the granular RBAC work that we've been developing over the last year. We already have a project-specific document describing the overall direction in our specification repository. At the very least, going through this process might help other people understand how we can make OpenStack services more consumable to end-users and less painful for deployers to maintain.

by Lance Bragstad at September 17, 2018 08:12 PM

OpenStack Superuser

How Arm is becoming a first class citizen in OpenStack

Over the past two years, the software defined infrastructure (SDI) team at Linaro has worked to successfully deliver a cloud running on Armv8-A AArch64 (Arm64) hardware that’s interoperable with any OpenStack cloud.

To measure interoperability with other OpenStack clouds, we use the OpenStack interop guidelines as a benchmark. Since 2016, we’ve run tests against different OpenStack releases – Newton, Queens and most recently Rocky. With Rocky, OpenStack on Arm64 hardware passed 100 percent of the tests in the 2018.02 guidelines, with enough projects enabled that Linaro’s deployment with Kolla and kolla-ansible is now compliant. This is a big achievement. Linaro is now able to offer a cloud that is, from a user perspective, fully interoperable while running on Arm64 hardware.

So what have we done so far towards making Arm a first-class citizen in OpenStack?

The Linaro Reference Architecture: A simple OpenStack multi-node deployment

We started with a handful of Arm64 servers, installed Debian/CentOS on them and tried to use distro packages to create a very basic OpenStack multi node deployment. At the time, Mitaka was the release and we didn’t get very far. We kept finding and fixing bugs, trying to contribute them upstream and to Linaro’s setup simultaneously, with all the backporting problems that entails. In the end, we decided to build our own packages from OpenStack master (which would later become Newton) and start testing/fixing issues as we found them.

The deployment was a very simple three-node control plane with N compute nodes, plus Ceph for storage of virtual machines and volumes. It was called the Linaro Reference Architecture to ensure all Linaro engineers conducting testing remotely generated comparable results and were able to accurately reproduce failures.

The Linaro Developer Cloud generates data centers

In 2016, Arm hardware was very scarce and challenging to manage (with a culture of boards sitting on engineers’ desks). The team therefore built three data centers (in the United States, China and the United Kingdom) so that Linaro member engineers would find it easier to share hardware and automate workloads.

Linaro co-location at London data center (five racks)

During Newton, Linaro cabled servers, installed operating systems (almost by hand from a local PXE server) and tried to install/run OpenStack master with Ansible scripts that we wrote to install our packages. A cloud was installed in the UK with this rudimentary tooling and a few months later we were at the OpenStack Summit Barcelona, demoing it during the OpenStack Interoperability Challenge 2016.

These were early days for Linaro’s cloud offering and the workload was very simple (LAMP), but it spun up five VMs and ran on a multi-node Newton Arm64 cloud with a Ceph backend – successfully and fully automated, without any architecture-specific changes. Linaro’s clouds are built on hardware donated by Linaro members, and capacity from the clouds is contributed to specific open-source projects for Arm64 enablement under the umbrella of the Linaro Developer Cloud.

After that Interoperability Challenge, a team of fewer than five people spent significant time working on the Newton release, fixing bugs on the entire stack (kernel, qemu, libvirt, OpenStack) and keeping up with new OpenStack features. For every new release we used, the interop bar was raised: we were testing against a moving target, the interop guidelines and OpenStack itself.

Going upstream: Kolla

During Pike we decided to move to containers with Kolla, rather than building our own. Working with an upstream project meant our containers would be consumable by others and they would be production ready from the start. With this objective in mind, we joined the Kolla team and started building Arm64 containers alongside the ones already being built. Our goal was to make the build scripts multi-architecture aware and ensure we could build as many containers as necessary to run a production cloud. Kolla builds a lot of containers that we don’t really use or test on our basic setup, so we only enabled a subset of them. We agreed with the Kolla team that our containers would be Debian based, so we added Debian support back into Kolla, which was at risk of being deprecated because no one was responsible for maintaining it at the time.
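As a rough sketch of what building such a subset looks like (the tag and image list below are illustrative, not Kolla defaults or Linaro's exact invocation), kolla-build can be pointed at a Debian base on an Arm64 build host:

```shell
# Build Debian-based images for a subset of services on an Arm64 host.
# kolla-build accepts regexes naming which images to build, so only the
# containers actually deployed need to be produced.
kolla-build --base debian --tag rocky-arm64 \
    keystone nova neutron glance cinder horizon
```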

Queens was the first release that we could install with kolla-ansible; Rocky is the first that’s truly interoperable with other OpenStack deployments. For comparison, during Pike we didn’t have object storage support due to a lack of manpower to test it. That support was added during Queens and enabled as part of the services of the UK Developer Cloud.

Once Linaro had working containers and working kolla-ansible playbooks to deploy them, we started migrating the Linaro Developer Cloud away from the self-built Newton packages and onto a Kolla-based deployment.

Being part of upstream Kolla efforts also meant committing to test the code we wrote. This is something we have started doing, but there’s still more ground to cover. As a first step, Linaro has contributed capacity in its China cloud to OpenStack-Infra, and the infra team were most helpful in bringing up all their tooling on Arm64. This cloud has connectivity challenges when talking to the rest of the OpenStack Foundation infrastructure that still need resolution. In the meantime, Linaro has given OpenStack-Infra access to the UK cloud.

The UK Linaro Developer Cloud was upgraded to Rocky before ever going into production on Queens, which makes it Linaro’s first availability zone that is fully interoperable with other OpenStack clouds. The other zones will shortly be upgraded to a Kolla-based Rocky deployment.

There are a few changes we’d like to highlight that were necessary to enable Arm64 OpenStack clouds:

  • Ensuring images boot in UEFI mode
  • Being able to delete instances with NVRAM
  • Adding virtio-mmio support
  • Adding a graphical console
  • Making the number of PCIe ports configurable
  • Enabling Swift
  • Updating documentation

We’ve also been contributing other changes that were not necessarily architecture related but that matter for the day-to-day operation of the Linaro Developer Cloud. For example, we’ve contributed changes to Kolla that improve the ability to deploy monitoring. For the Rocky cycle, Linaro is the fifth-largest contributor to the Kolla project according to Stackalytics data.

What’s next?

Once the Linaro Developer Cloud is fully functional on OpenStack-Infra, we’ll be able to add gate jobs for the Arm64 images and deployments to the Kolla gates. This is currently a work in progress. The agreement with Infra is that any project that wants to have a go at running on Arm64 can also run tests on the Linaro Developer Cloud, if desired. This enables anyone in the OpenStack community to run tests on Arm64. Linaro is still working on adding enough capacity to make this a reality during peak testing times; in the meantime, experimental runs can be added to get ready for it.


It’s been an interesting journey, particularly when engineers asked us whether we were running OpenStack on Raspberry Pis! Our response has always been: “We run OpenStack on real servers with IPMI, multiple hard drives, reasonable amounts of memory and Arm64 CPUs!”

We’re actively using servers with processors from Cavium, HiSilicon, Qualcomm and others. We’ve also found and fixed bugs in the kernel, in server firmware and in libvirt, and added features like a guest console. Libvirt made multi-architecture improvements when it reached version 3; we’ve been eagerly keeping up with libvirt releases since, especially when Arm64 improvements came along. There are still issues with libvirt fully coping with Arm64 servers and hardware configurations. We’re looking into all the pieces still missing from the stack so that live migration will work across different vendors.

As with any first deployment, we found issues in Ceph and OpenStack when running the Linaro Developer Cloud in production, since running tests on a test cloud is hardly equivalent to operating a long-standing cloud with VMs that survive host reboots and upgrades. Subsequently, we’ve had to improve maintainability and debuggability on Arm64. In our 18.06 release (we produce an Enterprise Reference Platform that gives interested stakeholders a preview of the latest packages), we added a few patches to the kernel that allow us to get crashdumps when things go wrong.

We’re currently starting to work with the OpenStack-Helm and LOCI teams to see if we can deploy Kubernetes smoothly on Arm64.

If you are interested in running your projects on Arm64, get in touch with us!

About the author

Gema Gomez, technical lead of the SDI team at Linaro Ltd, joined the OpenStack community in 2014.

The post How Arm is becoming a first class citizen in OpenStack appeared first on Superuser.

by Superuser at September 17, 2018 03:39 PM

September 14, 2018

OpenStack Superuser

Spot the difference: Tenant, provider and external Neutron networks

To this day I see confusion surrounding the terms tenant, provider and external networks. No doubt countless words have been spent trying to tease apart these concepts, so I thought that it’d be a good use of my time to write 470 more.

At a glance

A closer look

Tenant networks are created by users and Neutron is configured to automatically select a network segmentation type like VXLAN or VLAN. The user cannot select the segmentation type.

Provider networks are created by administrators, who can set one or more of the following attributes:

  • Segmentation type (flat, VLAN, Geneve, VXLAN, GRE)
  • Segmentation ID (VLAN ID, tunnel ID)
  • Physical network tag
    Any attributes not specified will be filled in by Neutron.

OpenStack Neutron supports self-service networking – the notion that a user in a project can articulate their own networking topology, completely isolated from other projects in the same cloud, via the support of overlapping IPs and other technologies. A user can create their own network and subnets without the need to open a support ticket or the involvement of an administrator. The user creates a Neutron router, connects it to the internal and external networks (defined below) and off they go. Using the built-in ML2/OVS solution, this implies using the L3 agent, tunnel networks, floating IPs and liberal use of NAT techniques.

Provider networks (read: pre-created networks) are an entirely different networking architecture for your cloud. You’d forgo the L3 agent, tunneling, floating IPs and NAT. Instead, the administrator creates one or more provider networks, typically using VLANs, shares them with users of the cloud, and disables the ability of users to create networks, routers and floating IPs. When a new user signs up for the cloud, the pre-created networks are already there for them to use. In this model, the provider networks are typically routable – they are advertised to the public internet via physical routers using BGP. Therefore, provider networks are often said to be mapped to pre-existing data center networks, both in terms of VLAN IDs and subnet properties.

External networks are a subset of provider networks with an extra flag enabled (aptly named ‘external’). The ‘external’ attribute of a network signals that virtual routers can connect their external facing interface to the network. When you use the UI to give your router external connectivity, only external networks will show up on the list.
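To make the three types concrete, here is a hedged sketch using the openstack CLI; the network names, VLAN ID and physnet labels are illustrative, not values from any particular cloud.

```shell
# Tenant network: created by a regular user; Neutron picks the
# segmentation type (e.g. VXLAN) and segmentation ID automatically.
openstack network create my-tenant-net

# Provider network: created by an admin, pinned to an existing
# data center VLAN on a known physical network.
openstack network create \
    --provider-network-type vlan \
    --provider-physical-network physnet1 \
    --provider-segment 100 \
    datacenter-net

# External network: a provider network with the 'external' flag set,
# so virtual routers can attach their gateway interface to it.
openstack network create \
    --provider-network-type flat \
    --provider-physical-network physnet2 \
    --external public-net
```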

To summarize, I think that the confusion is due to a naming issue. Had the network types been called self-service networks, data center networks and external networks, this blog post would not have been necessary and the world would have been even more exquisite.

About the author

Assaf Muller manages the OpenStack network engineering team at Red Hat. This post first appeared on his blog.

Superuser is always looking for tutorials and opinion pieces about open infrastructure; get in touch at

The post Spot the difference: Tenant, provider and external Neutron networks appeared first on Superuser.

by Superuser at September 14, 2018 02:09 PM

September 13, 2018

OpenStack Superuser

Why it’s time to get serious about NFV

Over the years, the term network functions virtualization may have deviated a little bit from its original definition. This post tries to get back to basics and to start a discussion around the possible deviations.

Let’s start by going back to 2012 and the question: What is NFV trying to achieve?

If you read the original white paper, you’ll see that it was written to address common concerns of the telco industry, brought on by the increasing variety of proprietary hardware appliances, including:

  • Capital expenditure challenges
  • Space and power to accommodate boxes
  • Scarcity of skills necessary to design, integrate and operate them
  • Procure-design-integrate-deploy cycles repeated with little or no revenue benefit due to accelerated end-of-life
  • Hardware life cycles becoming shorter

You’ll also see that NFV was defined to address these problems by “…leveraging standard IT virtualization technology to consolidate many network equipment types onto industry standard high-volume servers, switches and storage, which could be located in data centers, network nodes and in the end-user premises…”

In other words, to implement our network functions and services in a virtualized way, using commodity hardware — but is this really happening?

Almost six years later, we’re still at the early stages of NFV, even though the technology is ready and multiple carriers are experimenting with different vendors and integrators with the vision of achieving the NFV promise. So, is the vision becoming real?

Some initiatives that are clearly following that vision are AT&T’s Flexware and Telefonica’s UNICA projects, where the operators are deploying vendor agnostic NFVI and VIM solutions, creating the foundations for their VNFs and Network Services.

However, most NFV implementations around the world are still not led by the operators, but by the same vendors and integrators who participated in the original root cause of the problem (see: “increasing variety of proprietary hardware appliances.”) This is not inherently bad, because they’re all evolving, but the risk lies in the fact that most of them still rely on a strong business based on proprietary boxes.

The result is that “vertical” NFV deployments, comprised of NFV full stacks, from servers to VNFs, are all provided, supported (and understood) by just a single vendor. There are even instances of some carriers deploying the whole NFV stack multiple times, sometimes even once per VNF (!), so we’re back to appliances with a VNF tag on them.

We could accept this as part of a natural evolution, where operators feel more comfortable working with a trusted vendor or integrator while starting to get familiar with the new technologies. However, this approach might be creating a distortion in the market, making NFV architectures look more expensive than traditional legacy solutions when, in fact, people are expecting the opposite.

But in the future, is this the kind of NFV that vendors and integrators really want to recommend to their customers? Is this the kind of NFV deployments operators should accept in their networks, with duplicated components and disparate technologies all over the place?

To comply with the NFV vision, operators should lead their NFV deployments and shift gradually from the big appliances world (in all of its forms, including big NFV full stacks) towards “horizontal” NFV deployments, where a common telco cloud (NFVI/VIM) and management (MANO) infrastructure is shared by all the VNFs, with operators having complete control and knowledge of that new telco infrastructure.

Most of the industry believes in that vision, it is technically possible and it’s even more cost-effective, so what is playing against it?

I think we need to admit that the same vendors and integrators will take time to evolve their business models to match this vision. In the meantime, many operators have started to realize they need to invest in understanding new technologies, in creating new roles for taking ownership of their infrastructure and even in opening the door to a new type of integrator: those born with the vision of software-defined, virtualized network functions, using commodity hardware and open-source technologies in their DNA.

Three key elements are not only recommended, but necessary for any horizontal NFV deployment to be feasible:

  • Embrace the NFV concept, with both technology and business outcomes, ensuring the move away from appliances and/or full single-vendor proprietary stacks towards a commodity infrastructure.
  • Get complete control of that single infrastructure and its management and orchestration stack, which should provide life cycle management for all the network services at different abstraction levels.
  • Maximize the usage of components based on open-source technologies, as a mechanism to (1) accelerate innovation by using solutions built by and for the whole industry, and (2) to decrease dependency on a reduced set of vendors and their proprietary architectures.

Regarding the latter, OpenStack is playing a key role by providing software validated by the whole industry for managing a telco cloud as an NFV VIM, while other open-source projects like Open Source MANO and ONAP are starting to provide the management software needed at higher layers of abstraction to control the life cycle of virtualized network services.

In particular, at Whitestack we selected the OpenStack + Open Source MANO combination for building a solution for NFV MANO. If you want to explore why, see our presentation from the OpenStack Summit 2018 (Vancouver): Achieving end-to-end NFV with OpenStack and Open Source MANO.

There are also over a dozen sessions on NFV at the upcoming Berlin Summit.

About the author

Gianpietro Lavado is a network solutions architect interested in the latest software technologies to achieve efficient and innovative network operations. He currently works at Whitestack, a company whose mission is to promote SDN, NFV, cloud and related deployments all around the world, including a new OpenStack distribution.

The post Why it’s time to get serious about NFV appeared first on Superuser.

by Gianpietro Lavado at September 13, 2018 02:13 PM

September 12, 2018

OpenStack Superuser

Talking to the women of “Chasing Grace”: Nithya Ruff

If you’ve been to a major open source conference in, say, the last five years, you’ve either spotted Nithya Ruff’s warm smile in the hallway track or heard her speak.

Currently charged with founding and growing an open-source practice at Comcast, Ruff was also founding director of the open source strategy and engagement office for SanDisk and chaired the SanDisk Open Source Working Group. Her open-source credentials stretch back to 1998 while working at the venerable SGI and include projects like Tripwire, Wind River Linux, Yocto Project, Tizen Automotive, Ceph and OpenStack.

Few of us know how she got started, though. The super advocate shares her origin story in a recent interview for the documentary series “Chasing Grace.” The series was created to share the stories of women who faced adversity and became change agents in the tech workplace. Inspired by pioneering computer scientist Grace Hopper, early sponsors include the Linux Foundation and the Cloud Foundry Foundation.

Growing up in Bangalore, India, she watched her engineer father show her a path by not only hiring women in technical positions but also by including her when colleagues visited their home. And while her mother advocated for an early arranged marriage, he nixed the idea in favor of his daughter making it on her own first.

Meet the Women of Chasing Grace: Nithya from Wicked Flicks Productions on Vimeo.

“He showed me that there were very strong careers that women could have in business,” she says. “He really involved me in those conversations, treated me as an equal.” He insisted that she finish her education before settling down and when one of his colleagues planted the idea of studying in the United States, choosing computer science seemed natural.

“It’s important to have people around who push us to do more than we believe we can,” she says. These days Ruff hopes to inspire the next generation as an active participant in cross-community initiatives including the recent Diversity Empowerment Summit and the Women of OpenStack group.

Stay tuned for details on how you can catch “Chasing Grace” screenings at upcoming OpenStack Summits.

Superuser would love to hear how you got started in open source; drop us a line at

H/T Shilla Saebi

The post Talking to the women of “Chasing Grace”: Nithya Ruff appeared first on Superuser.

by Superuser at September 12, 2018 02:10 PM

September 11, 2018

OpenStack Superuser

Zuul case study: Software Factory

Zuul drives continuous integration, delivery and deployment systems with a focus on project gating and interrelated projects. In a series of interviews, Superuser asks users about why they chose it and how they’re using it.

Here Superuser talks to the Software Factory team: David Moreau Simard, Fabien Boucher, Nicolas Hicher, Matthieu Huin and Tristan Cacqueray.

SF is a collection of services that provides a powerful platform to build software. Designed to be deployed in anyone’s infrastructure, the project started four-and-a-half years ago and is vastly influenced by the OpenStack project infrastructure. The team operates an instance at where the project is being developed. RDO’s CI at is also based on an SF deployment.

In one of the project’s blog posts it’s explained that “SF is for Zuul what OpenShift is for Kubernetes.” What are the advantages of Software Factory for users?

SF is a distribution that integrates all the components as CentOS packages with an installer/operator named sfconfig to manage service configuration, backup, recovery and upgrades. The main advantage for our users is the simplicity of use. There’s a single configuration file to customize the settings and all the services are configured with working defaults so that it is usable out of the box. For example, Zuul is set up with default pipelines and the base job automatically publishes job artifacts to a log server.

Another advantage is how customizable deployments can be: whether you need a whole software development pipeline from scratch, from a code review system to collaborative pads, or if you just want to deploy a minimal Zuul gating system running on containers, or if you need an OpenStack third party CI quickly up and running, SF has got you covered.

SF also sets up CI/CD jobs for the configuration repository, similar to the openstack-infra/project-config repository. For example, SF enables users to submit project creation requests through code review and a config-update post job automatically applies the new configuration when approved.
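As an illustration of what such a code-reviewed configuration change might contain (the job, playbook and node label names below are made up for this sketch, not SF defaults), a .zuul.yaml fragment could add a job and attach it to a project like this:

```yaml
# A minimal Zuul job inheriting from the base job, which in SF
# already handles publishing artifacts to the log server.
- job:
    name: my-project-unit-tests
    parent: base
    run: playbooks/unit-tests.yaml
    nodeset:
      nodes:
        - name: test-node
          label: cloud-centos-7

# Attach the job to the project's check and gate pipelines.
- project:
    check:
      jobs:
        - my-project-unit-tests
    gate:
      jobs:
        - my-project-unit-tests
```

Once such a change is approved and merged, the config-update post job applies the new configuration automatically.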

To learn more, check out this presentation: and the project documentation: .

How are people using SF?

The RDO project has been using SF successfully for about three years now. The goal is to keep the same user experience from upstream to downstream, allowing users to share configurations, jobs and pipelines and to easily run third-party CI jobs.

With the addition of GitHub support to Zuul, SF can now be used without Gerrit and people are looking into running their own gating CI/CD for GitHub organizations with SF to use the log processing features.

As mentioned earlier, we dogfood the platform since we do our CI on SF, but we also have a release pipeline for the developers team’s blog: the CI pre-renders the tentative articles and once they’re approved they are automatically published.

The latest Software Factory release adds support for tenant deployment and Zuul configuration management from the resources. What does this mean for users and why is it important to ship now?

While Zuul supports multi-tenancy, the rest of the services such as Gerrit or log processors do not. The latest SF release enables tenants to deploy these services on a dedicated and isolated instance. This is important for us because we can now consolidate the RDO deployment to share the same Zuul instance used by, so that we can rationalize the use of our shared resources.

We are also onboarding the Ansible network organization at to leverage OpenStack cloud provider for network appliance testing.

There are a lot of possible future Zuul contributions; which are the highest priority?

The highest priority is to provide a seamless experience for Jenkins users. Similar to the OpenStack project infrastructure, SF used to rely on Jenkins for job execution. To improve the user experience, we contributed some features such as jobs and builds web interface in Zuul. However, we still lack the capacity to abort and manually trigger jobs through the REST API and we’d like to see that sooner rather than later.

Another goal is to be able to run jobs on OpenShift and we would like to see Nodepool be able to do that too.

Generally speaking, a modular, plugin-based architecture would allow third parties like us to experiment with Zuul without burdening the community with contributions that might be of lower priority to them.

What are you hoping the Zuul community will focus on / deliver?

As packagers of Zuul for CentOS, it would make our lives easier if Zuul followed a release schedule. It may sound at odds with the continuous deployment trend, but when packaging and consuming RPMs, stability and reliability are of the utmost importance. We rely on Zuul’s continuous integration and testing capabilities to ensure that.

The user experience can really make or break Zuul’s acceptance outside of the OpenStack community. As users and operators who have helped the RDO community migrate from Jenkins to Zuul 3 and Ansible, we realized how important it is to educate users about what Zuul is (the concept of gating system itself being very novel) and what it does (for instance, it does not aim to be a one-to-one replacement of Jenkins).

We hope the community will work on spreading the word, help decrease the entry costs and learning curves and generally improve Zuul’s user experience. Of course we will help toward this goal as well!

The days of annual, monthly or even weekly releases are long gone. How is CI/CD defining new ways to develop and manage software within open infrastructure?

As hinted above, we have a more nuanced opinion regarding continuous deployment: it’s great for some workflows but overkill for some others, so it really depends on your production environment and needs. We are, however, firm believers in continuous integration in general and code gating in particular and we hope to see more development teams adopting the “OpenStack way” of doing CI, possibly after trying out SF!

Continuous delivery is a big challenge for packaging RPMs, especially when following an upstream source and especially when this upstream source is OpenStack where hundreds of changes get merged every day. The RDO community came up with a “master chaser” called DLRN which, combined with Software Factory, allows them to ensure that OpenStack packages are always healthy, or at least notify packagers as fast as possible when a change introduces a problem. This new way of managing software is what allows RDO to deliver working OpenStack packages for CentOS just mere hours after an upstream release.

Generally speaking, CI/CD becomes incredibly powerful when paired with the concept of everything as code. Virtualization, containerization and provisioning technologies turn whole infrastructures into code repositories on which you can apply CI and CD just like you would on “ordinary” code. We actually toyed a while ago with a proof-of-concept where SF would be used to test and deploy configuration changes on an OpenStack infrastructure with OSP Director. Imagination is the limit!

Got questions? You’ll find the Zuul team at AnsibleFest Oct. 2-3.  You can also check out the sessions featuring Zuul at the upcoming Berlin Summit.

Photo // CC BY NC

The post Zuul case study: Software Factory appeared first on Superuser.

by Nicole Martinelli at September 11, 2018 02:12 PM

September 10, 2018

OpenStack Superuser

OpenStack Rocky case study: Flying high now

It’s a show of strength to run a new software release on day one. Even more so if you’re using it to power a brand new region.

Vexxhost did just that, getting into the ring with OpenStack Rocky on the first day of its release.

The Canadian company launched a new region in Santa Clara, California, in the heart of Silicon Valley, making it the first cloud provider to run Rocky. Founded back in 2006, Vexxhost started out as a web hosting provider offering services from shared hosting to VPS. They adopted OpenStack software in 2011 to provide infrastructure-as-a-service public cloud, private cloud and hybrid cloud solutions to companies of varying sizes around the world.

The new region offers users the latest features and bug fixes right out of the gate. Also on offer are 40Gbps internal networking and nested virtualization, allowing users to take advantage of technology like Kata Containers, an open-source project building lightweight virtual machines that seamlessly plug into the containers ecosystem. Each virtual machine has access to 10Gbps public internet connectivity, and the new data center is also equipped with high-performance, triple-replicated and distributed SSD storage.

With the Rocky release, Vexxhost can also now deploy across three operating systems – openSUSE, Ubuntu and CentOS – which helps people use the operating system they’re most comfortable with.

The confidence to take on this challenge comes in part from long experience with OpenStack. CEO Mohammed Naser has been involved since 2011, including as an elected member of the Technical Committee and the current project team lead of OpenStack-Ansible, as well as a core contributor to Puppet OpenStack.

“The really really cool thing is that the cloud is already being used by the upstream OpenStack infrastructure team,” says Naser. “And it’s always awesome to see the community work together and get all these things done.”

Check out the case study on the community webinar starting at the 13:25 mark or the press release here.

Photo // CC BY NC

The post OpenStack Rocky case study: Flying high now appeared first on Superuser.

by Superuser at September 10, 2018 02:09 PM

September 08, 2018


Interviews at OpenStack PTG Denver

I’m attending PTG this week to conduct project interviews. These interviews have several purposes. Please consider all of the following when thinking about what you might want to say in your interview:

  • Tell the users/customers/press what you’ve been working on in Rocky
  • Give them some idea of what’s (what might be?) coming in Stein
  • Put a human face on the OpenStack project and encourage new participants to join us
  • You’re welcome to promote your company’s involvement in OpenStack but we ask that you avoid any kind of product pitches or job recruitment

In the interview I’ll ask some leading questions and it’ll go easier if you’ve given some thought to them ahead of time:

  • Who are you? (Your name, your employer, and the project(s) on which you are active.)
  • What did you accomplish in Rocky? (Focus on the 2-3 things that will be most interesting to cloud operators)
  • What do you expect to be the focus in Stein? (At the time of your interview, it’s likely that the meetings will not yet have decided anything firm. That’s ok.)
  • Anything further about the project(s) you work on or the OpenStack community in general.

Finally, note that there are only 40 interview slots available, so please consider coordinating with your project to designate the people that you want to represent the project, so that we don’t end up with 12 interviews about Neutron, or whatever.

I mean, I LOVE me some Neutron, but let’s give some other projects love, too.

It’s fine to have multiple people in one interview – maximum 3, probably.

Interview slots are 30 minutes, in which time we hope to capture somewhere between 10 and 20 minutes of content. It’s fine to run shorter, but 15 minutes is probably an ideal length.

by Rain Leander at September 08, 2018 04:36 PM

September 07, 2018

Corey Bryant

OpenStack Rocky for Ubuntu 18.04 LTS

The Ubuntu OpenStack team at Canonical is pleased to announce the general availability of OpenStack Rocky on Ubuntu 18.04 LTS via the Ubuntu Cloud Archive. Details of the Rocky release can be found at:

To get access to the Ubuntu Rocky packages:

Ubuntu 18.04 LTS

You can enable the Ubuntu Cloud Archive pocket for OpenStack Rocky on Ubuntu 18.04 installations by running the following commands:

sudo add-apt-repository cloud-archive:rocky
sudo apt update

The Ubuntu Cloud Archive for Rocky includes updates for:

aodh, barbican, ceilometer, ceph (13.2.1), cinder, designate, designate-dashboard, glance, gnocchi, heat, heat-dashboard, horizon, ironic, keystone, magnum, manila, manila-ui, mistral, murano, murano-dashboard, networking-bagpipe, networking-bgpvpn, networking-hyperv, networking-l2gw, networking-odl, networking-ovn, networking-sfc, neutron, neutron-dynamic-routing, neutron-fwaas, neutron-lbaas, neutron-lbaas-dashboard, neutron-vpnaas, nova, nova-lxd, octavia, openstack-trove, openvswitch (2.10.0), panko, sahara, sahara-dashboard, senlin, swift, trove-dashboard, vmware-nsx, watcher, and zaqar.

For a full list of packages and versions, please refer to:

Python 3 support
Python 3 packages are now available for all of the above packages except swift. All of these packages have successfully been unit tested with at least Python 3.6. Functional testing is ongoing and fixes will continue to be backported to Rocky.

Python 3 enablement
In Rocky, Python 2 packages will still be installed by default for all packages except gnocchi and octavia, which are Python 3 by default. In a future release, we will switch all packages to Python 3 by default.

To enable Python 3 for existing installations:

# upgrade to latest Rocky package versions first, then:
sudo apt install python3-<service> [1]
sudo apt install libapache2-mod-wsgi-py3 # not required for all packages [2]
sudo apt purge python-<service> [1]
sudo apt autoremove --purge
sudo systemctl restart <service>-*
sudo systemctl restart apache2 # not required for all packages [2]

For example:

sudo apt install aodh-*
sudo apt install python3-aodh libapache2-mod-wsgi-py3
sudo apt purge python-aodh
sudo apt autoremove --purge
sudo systemctl restart aodh-* apache2

To enable Python 3 for new installations:

sudo apt install python3-<service> [1]
sudo apt install libapache2-mod-wsgi-py3 # not required for all packages [2]
sudo apt install <service>-<name>

For example:

sudo apt install python3-aodh libapache2-mod-wsgi-py3 aodh-api

[1] The naming convention of python packages is generally python-<service> and python3-<service>. For horizon, however, the packages are named python-django-horizon and python3-django-horizon.
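The convention above, including the horizon exception, can be captured in a tiny shell helper. This is a hedged sketch: `pkg_for` is a hypothetical function written for this post, not a real tool.

```shell
# Derive the Python 3 package name from a service name following the
# convention above. "pkg_for" is a hypothetical helper; horizon is the
# documented exception.
pkg_for() {
  case "$1" in
    horizon) echo "python3-django-horizon" ;;
    *)       echo "python3-$1" ;;
  esac
}

pkg_for aodh     # prints python3-aodh
pkg_for horizon  # prints python3-django-horizon
```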

[2] The following packages are run under apache2 and require installation of libapache2-mod-wsgi-py3 to enable Python 3 support:

aodh-api, cinder-api, barbican-api, keystone, nova-placement-api, openstack-dashboard, panko-api, sahara-api

Other notable changes
sahara-api: sahara API now runs under apache2 with mod_wsgi

Branch package builds
If you would like to try out the latest updates to branches, we deliver continuously integrated packages on each upstream commit via the following PPAs:

sudo add-apt-repository ppa:openstack-ubuntu-testing/mitaka
sudo add-apt-repository ppa:openstack-ubuntu-testing/ocata
sudo add-apt-repository ppa:openstack-ubuntu-testing/pike
sudo add-apt-repository ppa:openstack-ubuntu-testing/queens
sudo add-apt-repository ppa:openstack-ubuntu-testing/rocky

Reporting bugs
If you have any issues please report bugs using the ‘ubuntu-bug’ tool to ensure that bugs get logged in the right place in Launchpad:

sudo ubuntu-bug nova-conductor

Thanks to everyone who has contributed to OpenStack Rocky, both upstream and downstream. Special thanks to the Puppet OpenStack modules team and the OpenStack Charms team for their continued early testing of the Ubuntu Cloud Archive, as well as the Ubuntu and Debian OpenStack teams for all of their contributions.

Have fun and see you in Stein!

(on behalf of the Ubuntu OpenStack team)

by coreycb at September 07, 2018 03:21 PM

OpenStack Superuser

How to auto scale a self-healing cluster with Heat

Heat is the core project in the OpenStack orchestration program.

It implements an orchestration engine to launch multiple composite cloud applications based on templates in the form of text files that can be treated like code. A native Heat template format is evolving, but Heat also aims to provide compatibility with the AWS CloudFormation template format, so that many existing CloudFormation templates can be launched on OpenStack. Heat provides both an OpenStack-native REST API and a CloudFormation-compatible Query API.
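For a sense of what a native template looks like, here is a minimal HOT sketch. This is a hedged example: the image, flavor and network names are illustrative placeholders, not values from the talk.

```yaml
# Minimal Heat Orchestration Template sketch. The image, flavor and
# network names below are illustrative placeholders.
heat_template_version: 2013-05-23

description: Boot a single server (sketch)

resources:
  server:
    type: OS::Nova::Server
    properties:
      image: cirros
      flavor: m1.small
      networks:
        - network: private
```

A template like this can be launched with `openstack stack create -t server.yaml mystack`.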

Rico Lin offered this tutorial on how to auto-scale a self-healing cluster with Heat at the recent OpenInfra Days in Vietnam. Lin has been the project team lead for Heat in the Rocky, Pike and Queens cycles as well as a Heat core contributor member since the Liberty release. He’s currently a software engineer at EasyStack.

Here he walks you through how to configure Heat, set up Heat container agents before discussing options for auto-scaling, choosing your structure and then launching a self-healing cluster.

For the full 14-minute demo see below and check out his slides here.

Get involved

Check out the Heat self-healing special interest group (SIG). The auto-scaling templates for Heat can be found at GitHub.

The developers use IRC in #heat on Freenode for development discussion.


Meetings are held on IRC in #heat on Freenode every Wednesday. See the Heat agenda page for times and details.

Mailing list

Discussions about Heat happen on the openstack-dev mailing list. Please use the tag [Heat] in the subject line for new threads.

Getting started guides

There are guides for a number of distributions available in the Heat documentation.



The post How to auto scale a self-healing cluster with Heat appeared first on Superuser.

by Superuser at September 07, 2018 02:10 PM

Chris Dent

Placement Update 18-36

Welcome back to the placement update. The last one was 5 weeks ago. I took a break to focus on some other things for a while. I plan to make it a regular thing again, but will be skipping next week for the PTG.

The big news is that there is now a placement repository. That's the thing I was focussing on. Work is progressing to get it healthy and happy.

Because of that, henceforth the shape of this update will change a bit. If I'm able to find them, I'm going to try to include anything that directly relates to placement. Primarily this will be stuff in the placement repo itself, and related changes in nova, but hopefully it will also include work in Blazar, Cyborg, Neutron, Zun and other projects that are either already working with placement or planning to do so soon. I can't see everything though so if I miss something, please let me know. For this edition I'm not going to go out of my way to report on individual reviews, rather set the stage for the future.

Most Important

If you're going to be at the PTG next week there will be plenty to talk about related to placement.

  • On Monday between 2-3pm, Cyborg-, Nova-, and Placement-interested people will meet in the Cyborg room.
  • On Tuesday at 10am it's with Blazar.
  • Sometime, maybe Tuesday afternoon (TBD), with Cinder.
  • Much of Wednesday: in the Nova room to discuss topics related to Placement (the service) and placement (the process).

The other pending issues are related to upgrades (from-nova, to-placement), migrating existing data, and management of schema migrations. Matt posted a summary of some of that to get feedback from the wider community.

What's Changed


Propose your changes to placement there, not nova. Nova still has placement code within itself, but for the time being the placement parts are frozen.


For now, bugs are still being tracked under nova using the tag placement. There will likely be some changes in this, but it works for now. There's also an etherpad where cleanups and todos are being remembered.


It's that time in the cycle, so let's have a specs section. This currently includes proposals in nova-specs (where placement-service-related specs will live for a while). In the future it will also have any other stuff I can find out there in the world.

Main Themes

We'll figure out what the main themes are next week at the PTG, once that happens this section will have more. In the meantime:

Reshape Provider Trees

Testing of the /reshaper from libvirt and xen drivers is showing some signs of success moving VGPU inventory from the compute node to a child provider.

Consumer Generations

There continues to be work in progress on the nova side to make best use of consumer generations.



The placement repo is currently small enough that looking at all open patches isn't too overwhelming.

Because of all the recent work with extraction, and because the PTG is next week, I'm not up to date on which placement-related patches are in need of review. In the meantime, if you want to go looking around, anything with 'placement' in the commit message is fun.

Next time I'll provide more detail.


Thanks to everyone for getting placement this far.

by Chris Dent at September 07, 2018 01:30 PM

September 06, 2018

OpenStack Superuser

Finding the edge: The next OpenStack Hackathon

It’s time to bring your favorite tech toys and start hacking on the edge. The next OpenStack Hackathon takes place on Saturday and Sunday November 10-11 ahead of the Berlin Summit.

The event, free to participants, is hosted by Open Telekom Cloud at their accelerator co-working space Hub:raum in the heart of the German capital.

The theme for this one is about exploring distributed infrastructure, so participants are advised to “get out your toys and apply this idea to a sample cloud with Raspberry Pis, edge routers, system-on-a-chip board designs and other gadgets” that you like. If you can’t pack your favorite devices, organizers will have some on hand. Mentors will also be available to help get you up to speed with the systems and technology. If you’re an OpenStack pro, check out the event page for info on how to volunteer and help out.

For team members of all roles (app developers, devops, UX, sysadmins or network engineers), hackathons are a great way to learn quickly in a fun and competitive environment. There will be prizes for the best technical solution, the most complete project, the most inclusive approach, the most clever design and more.

OpenStack Hackathons offer participants a chance to learn more about developing applications for OpenStack clouds from experts and put their skills to use by building applications. Launched in 2016, they were designed as a fast and furious weekend of work that rewards the best projects with fantastic prizes.

New to hackathons? Check out this post about how to survive and thrive over the weekend.

Sign up for the event here.

The post Finding the edge: The next OpenStack Hackathon appeared first on Superuser.

by Superuser at September 06, 2018 02:06 PM

Robert Collins

Is OpenStack’s mission broken?


  1. Betteridge’s law applies.
  2. Ease of development is self inflicted and not mission creep.
  3. Ease of use is self inflicted and not mission creep.
  4. Ease of operations is self inflicted and not mission creep.
  5. I have concrete suggestions for 2/3/4 but to avoid writing a whole book I’m just going to tackle (2) today.

Warning: this is a little ranty. It’s not aimed at any individual; it just crystallised out after a couple of years focused on different things, and was seeded by Jay Pipes when he recently put a strawman up about two related discussions that we haven’t really had as a community:

  1. What should the scope of OpenStack’s mission be?
  2. A technical proposal for ‘mulligan’, a narrowly defined new mission.

And yes, I know that OpenStack has incredible velocity. Just imagine what it could be if the issues I describe didn’t exist.

So is it the mission?

I think OpenStack has lots of “issues”, to use the technical term, across, well, everything. I don’t think the mission is even slightly related to the problems though.

The mission has ultimately just brought a huge number of folk together with the idea that they might produce a thing that can act like a cloud.

This has been done before: organisations like AWS, Microsoft, Google and smaller players like Digital Ocean and Rackspace (before OpenStack).

I reject the idea that having such a big, hairy, inclusive mission is a problem.

We can be more rigorous about that though: if a smaller mission would structurally prevent a given issue, then it’s the mission that is the problem. Otherwise, it’s not.

I do think the mission is somewhat ridiculous, but there’s a phrase in some companies: a company’s mission defines what it doesn’t do, not what it does.

And I think the current OpenStack mission does that quite well: there are two basic filters that can be applied, and unless at least one matches, it’s out of scope for OpenStack.

  • Can you get $thing from a Public Cloud?
  • Do you uniquely need $thing to run a Cloud?

And yes, there are a billion things in the grey cloud around the edge.

Know what else has this problem? Linux. Well over three-fifths of its code is in that grey edge: 170M of core, 130M of architectures, 530M in drivers. x86 + ARM account for 50M of that 130M of architectures.

Linux’s response has been dramatically different to ours though. They have a single conceptual project being built, with enormous configurability in how it’s deployed. We’ve decided that we’re building a billion different things under the same umbrella, and that comes down to a cultural norm.

Cultural norms and silos

Concretely, Swift and Nova: the two original projects, have never conceptually regarded themselves as one project.

Should they?

I honestly don’t know :). But by not being one project (with enormous configurability in how it’s deployed), we set a cultural expectation in OpenStack that variation in workload implied a new project and new codebase.

Every split out takes years to accomplish – both the literal ones like Glance, and the moral ones like Neutron.

The lines for the split-outs are drawn inconsistently.

To illustrate this, ask yourself: what manages a node in an OpenStack cloud? What’s the component that is responsible for working with the machine’s actual resources, reporting usage, reporting back to service discovery, healthchecks, liveness and so on?

In a clean slate architecture you might design a single agent, and then make it extensible/modular. OpenStack has many separate agents, one per siloed team.

Similarly the scheduling problem for net/disk/compute: there is an enormous vertical stack of cloud-APIs that can be built on a solid base, many of which OpenStack has in its portfolio. But that stack is not being built on a common scheduler – and can’t be because the cultural norm is to split things out, not to actually figure out how to maintain things more effectively without moving the code around.

Some things really are better off as separate projects – and I’m not talking monorepo vs repo-per-project; that’s really only about the ability to do some changes atomically. A reusable library like oslo.config is only reusable by being a separate project. oslo.db, though, exists solely because we have many separate projects that all look like ‘REST on one side, database on the other’. That is a concrete problem: high deployment overheads, redundant information in some places, inappropriate transaction boundaries in others. The objects work – passing structured objects around and centralising the DB access – makes things a lot better, but it’s broken into vertical silos much too early.

Our domain specific services include huge amounts of generic, common problem space code: persistence, placement, access control…

Cultural norms and agility

Back in the dawn of OpenStack, there were some very very strong personalities. Codebases got totally overhauled and replaced without code review. Distrust got baked in as another cultural norm. Code review became a control point. It’s extraordinarily common to spend weeks or months getting patches through.

In some of the most effective teams I’ve worked in code review is optional. Trust and iterate is the norm there: bypassing code review is a thing that needs to be justified, but code review is not how quality is delivered. Quality is delivered by continual improvement, rather than by the quality of any one individual commit.

A related thing is being super risk averse around what lands in master (more on that below). Some very very very clever folk have written very clever code to facilitate this combination of siloed projects + trying super hard not to let regressions into master. This is very hard to deliver – and in fact we stepped back from being an absolute-approach there, about 4 years ago, to a model where we try very hard to prevent it just within a small set of connected projects.

OpenStack has a deeply split personality. Many folk want to build a downloadable cloud construction kit (e.g. Ubuntu). Many more want to build a downloadable cloud product (direct release users). And many wanted (are there still public clouds running master?) to be able to use master directly with confidence. This last use case is a major driver for wanting master to be regression free…

Agility requires the ability to react to new information in a short timeframe. Doing CD (continuous deployment) requires a pipeline that starts with code review and ends with deployed code. OpenStack doesn’t do that. There’s a huge discontinuity between upstream and actual deployments, and effectively none of the developers of any part of upstream OpenStack are doing operations day to day. Those that do – at Rackspace, previously at HP (where I was working when I was full time on OpenStack), and I’m going to presume at OVH and other public clouds – are having to separate out their operations work from their upstream changes.

Every initiative in a project will miss some details that have to be figured out later – that’s the nature of all but the most exacting software development processes, and those processes are hugely expensive (formal methods, just to start with). OpenStack copes with that by running huge planning cycles – 3-6 months apart.

Commits-as-control-points + long planning cycles + many developers not operating what they build => reaction to new information happens at a glacial scale.

To illustrate this, consider request tracing. 8 years ago Google released the Dapper whitepaper, Twitter wrote Zipkin and open sourced it, and we’re now at the point where distributed tracing is de rigueur – it’s one of the standard things a service operator will expect for any system. We spent years dealing with pushback from developers in service teams that didn’t understand the benefits of the proposed analogous system for OpenStack. Rackspace wrote their own and patched it in as part of their productionisation of master. Then we also got to have a debate about whether OpenStack should have one such system, or a plugin interface to allow Rackspace not to change. [Sidebar: Rackers, I love you and :heart: your company, but that drove me up the wall! I wish we’d managed to just join forces and get everyone to at least bring a damn tracing interface in for everything.]

Test reliability

With TripleO we had the idea that we’d run a cloud based on master, provide feedback on what didn’t work, and create a virtuous circle. I think that that was ultimately flawed because the existing silos (e.g. of Nova, or Glance) were not extended into owning those components within TripleO: TripleO was just another deployer, rather than part of the core feedback cycle.

More generally, we had a team of people (TripleO) running other people’s code (all of OpenStack and commit rights were hard to get in other projects) with no SLA around that code.

I didn’t think of this that way at the time, for all that we understood that that was what we were doing, but that structure is actually structurally fragile: it’s the very antithesis of agile. When something broke it could stay broken for weeks, simply because the folk responsible for the break were not accountable for the non-brokenness of the system. (I’m not whinging about the teams we worked with – people did care, but caring and being accountable are fundamentally different things.)

There is another place with that pattern: devstack. Devstack is a code base that exists to deploy all the other openstack components. It’s the purest essence of ‘run other people’s code with no SLA’, and devstack is the engine for pre-merge testing and pre-review testing in OpenStack.

I now believe that to be a key problem for OpenStack. Monty loves to talk about how many clouds OpenStack deploys daily in testing. Every one of those tests is running some number of components (typically the dependency graph for the service under test) which have not changed and are not written by the author, from scratch. And then of course the actual service being tested.

That’s structurally fragile: it’s running 5 or 10 times as much code as is relevant to the test being conducted. And the people able to fix any problems in those dependencies don’t feel the friction at the same time, in the same way, as their users do. (This isn’t a critique of the people, it’s just maths.)

I’ll probably write more about this in detail later, as it ties into a larger discussion about testing and deployment of microservices, or testing in production. But imagine if we got rid of devstack for review and merge testing. It has several other use cases of course – ‘give me an OpenStack to hack on’ is an important, discrete test case, and folk probably care that that works. For simplicity I’m going to ignore that for now.

So, if we don’t use devstack, how do we deploy a cloud for pre-merge testing?

We don’t. We don’t need to. What we need to do is deploy the changed code into a cloud whose other components are expected to be compatible with that code. Devstack did this by taking a given branch of a bunch of components and bringing them up from scratch. Instead, we run a production grade, monitored and alerted deployment of all the components. Possibly we run many such deployments, for configurations that cannot coexist (e.g. different federation modes in keystone?). The people answering the pages for those alerts could be the service developers, or it could be an operations team with escalation back to the developers as-needed (to filter noise like ‘oh, cloud $X has just had an outage’). But ultimately the developers would be directly accountable in some realtime fashion.

Then the test workflow becomes:

  1. Build the code under test. (e.g. clean VM, pip install, whatever)
  2. Deploy that code into the existing cluster as a new shard
  3. Exercise it as desired
  4. Tear it down

Let’s use nova-compute as an example.

  1. pip install
  2. Run nova-compute reporting to an existing API server with some custom label on the hypervisor to allow targeting workloads to it
  3. Deploy a VM targeted at it
  4. tear it down
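The four steps above can be sketched as a shell transcript. This is a hedged sketch: `run` only prints each command rather than executing it, and the config file, label and server names are invented for illustration, so this shows the shape of the flow, not a working recipe.

```shell
# Hedged sketch of the shard-based pre-merge flow for nova-compute.
# "run" prints the command it is given instead of executing it.
run() { echo "+ $*"; }

run pip install ./nova                                          # 1. build the code under test
run nova-compute --config-file shard.conf                       # 2. join the existing cloud as a labelled shard
run openstack server create --availability-zone shard-test vm1  # 3. target a workload at the shard
run openstack server delete vm1                                 # 4. tear it down
```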

I’m sure this raises lots of omg-we-can’t-do-that-because-technical-reason-X-about-what-we-do-today.

That’s fine, but for the purposes of this discussion, consider the destination – not the path.

If we did this:

  • Individual test runs could use substantially less resources
  • And perform substantially less work
  • Which implies better performance
  • Failures due to other components than the service under test would be a thing of the past (when you’re on the hook for your service running reliably, you engineer it to do that)

I think this post is long enough, so let me recap briefly. If there is interest out there I can drill into what sort of changes would be needed to transition to such a system, the suggestions I have for ease of use and ease of operations, and I think I’m also ready to provide some discussion about what the architecture of OpenStack should be.

Recap: why is development hard

Cultural problem #1: silos rather than collaboration in place. Moving the code rather than working with others.

Cultural problem #2: excessive entry controls. Make each commit right rather than trend upwards with a low-latency, high change rate.

Cultural problem #3: developer feedback cycle is measured in weeks (optimistically), or years (realistically).

Technical problem #1: excessive code executed in tests: 80% of test activity is not testing the code under test.

Technical problem #2: our testing is optimised for new-cloud deployments: as our userbase grows upgrades become the common use case and testing should match that.

by rbtcollins at September 06, 2018 01:16 AM

September 05, 2018

OpenStack Superuser

Disrupting the disruptor: Diversity efforts at Google

Google’s motto of “Don’t be evil” hasn’t made it any easier for the tech giant to achieve better diversity.

The most recent statistics available show a need for improvement. For starters, despite the company’s efforts to get more women on board, in the tech corridors of the company the percentage has risen about 25 percent since 2014, to 21 percent total. The global percentage of women working in any department at Google is 30 percent and it hasn’t budged for four years. Then there were the infamous James Damore memo, the harassment of Danielle Brown, VP of diversity, and a Wired investigation declaring a “dirty war” surrounding these issues at the company.

None of that is stopping Valeisha Butterfield-Jones. As the global head of women and black community engagement, she says it’s high time for disruption.

The greatest challenge? “Decoding what the real barriers to entry are, for people of color and for women,” she says in a profile at Harper’s Bazaar. One of these efforts is Google’s decoding race series, organized as the first step of a longer-term strategy intended to inform and empower Googlers to have open and constructive conversations on race. One of the more provocative discussions, “Programming and Prejudice: Can Computers Be Racist?,” moderated by Van Jones, is available on YouTube. Butterfield-Jones is also tackling the pipeline problem with a scholarship program aimed at historically black colleges.

“We’re trying to break the system, to rebuild it and to make it better. It is hard work,” she says. “Having good intentions isn’t enough. You have to actually do the work. I’m committed, and I know we are, to doing the work.”

The intro on the company’s diversity website states that “Google should be a place where people from different backgrounds and experiences come to do their best work. That’s why we continue to support efforts that fuel our commitments to progress.”

Full story over at Harper’s Bazaar.

The post Disrupting the disruptor: Diversity efforts at Google appeared first on Superuser.

by Nicole Martinelli at September 05, 2018 02:05 PM

Galera Cluster by Codership

Releasing Galera Cluster 3.24 with Improved Deadlock Error Management

Codership is pleased to announce the release of Galera Replication library 3.24, implementing wsrep API version 25.  The new release includes improved deadlock error management with foreign keys and security fixes. As always, Galera Cluster is now available as targeted packages and package repositories for a number of Linux distributions, including Ubuntu, Red Hat, Debian, CentOS, OpenSUSE and SLES, as well as FreeBSD. Obtaining packages using a package repository removes the need to download individual files and facilitates the easy deployment and upgrade of Galera Cluster nodes.

This release incorporates all changes up to MySQL 5.7.23, MySQL 5.6.41 and MySQL 5.5.61.


Galera Replication Library 3.24
New features and notable fixes in Galera replication since the last binary release by Codership (3.23):

* Support for a new certification key type was added to allow
more relaxed certification rules for foreign key references (galera#491). Previous releases caused excessive (phantom) deadlock errors for transactions which inserted rows into a table containing a foreign key constraint. This has been optimized, and only true conflicts will now result in a deadlock error.

* New status variables were added to display the number of open transactions
and referenced client connections inside Galera provider (galera#492).

* GCache was sometimes cleared unnecessarily on startup if the recovered
state had a smaller sequence number than the highest found in GCache.
Now only entries with a sequence number higher than the recovery point will be
cleared (galera#498).

* Non-primary configuration is saved into grastate.dat only when the
node is in closing state (galera#499).

* An exception from GComm was not always handled properly, resulting in
Galera remaining in a half-closed state. This was fixed by propagating the
error condition appropriately to upper layers (galera#500).

* A new status variable displaying the total weight of the cluster nodes
was added (galera#501).

* The value of pc.weight did not reflect the actual effective value after
setting it via wsrep_provider_options. This was fixed by making sure that
the new value is taken into use before returning control back to the
caller (galera#505, MDEV-11959).

* Use of ECDH algorithms with old OpenSSL versions was enabled (galera#511).

* Default port value is now used by garbd if the port is not explicitly
given in cluster address (MDEV-15531).

* Correct error handling for posix_fallocate().

* Failed causal reads are retried during configuration changes.


MySQL 5.7
New release of Galera Cluster for MySQL 5.7, consisting of MySQL-wsrep 5.7.23 and wsrep API version 25. Notable bug fixes in MySQL 5.7.23 and known issues:

* New configuration option wsrep_certification_rules to
enable more relaxed certification rules for foreign key references
on child table inserts. This option is effective only with Galera
version 3.24 or higher (galera#491).
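As a hedged sketch, enabling the relaxed rules might look like the following my.cnf fragment. The value `optimized` is an assumption based on the option's documented purpose (with the stricter behaviour as the default); check the Galera documentation for the exact accepted values.

```ini
[mysqld]
# Sketch only: relax certification for foreign key references on
# child-table inserts. Requires Galera 3.24 or higher; the value name
# "optimized" is an assumption.
wsrep_certification_rules=optimized
```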

* Resource leak in case of ROLLBACK TO SAVEPOINT followed
by COMMIT has been fixed (mysql-wsrep#318).

* FK constraint violation in applier after ALTER TABLE ADD FK was fixed
by adding both parent and child table keys into ALTER TOI write set

* Possible node hang with conflicting inserts in FK child table was
fixed (mysql-wsrep#335).

* Memory leak with native MySQL replication when InnoDB was used
as a relay log info and master info repository has been fixed

Known issues with this release:

* Server cannot be started using ‘service’ command on Debian Stretch.

* SST between 5.6 and 5.7 nodes is not supported

* The –wsrep-replication-bundle option has no effect and may be removed in
a future release

* InnoDB tablespaces outside of the data directory are not supported, as they
may not be copied over during SST

* Compilation with DTrace enabled may fail, so -DENABLE_DTRACE=:BOOL=OFF
may be given to cmake to disable DTrace


MySQL 5.6
New release of Galera Cluster for MySQL consisting of MySQL-wsrep 5.6.41 and wsrep API version 25. Notable bug fixes and known issues in MySQL 5.6.41:

* New configuration option wsrep_certification_rules to
enable more relaxed certification rules for foreign key references
on child table inserts. This option is effective only with Galera
version 3.24 or higher (galera#491).

* Fixed a resource leak in case of ROLLBACK TO SAVEPOINT which was followed
by COMMIT (mysql-wsrep#318).

* InnoDB undo tablespaces are now included in rsync SST (mysql-wsrep#337).

* FK constraint violation in applier after ALTER TABLE ADD FK was fixed
by adding both parent and child table keys into ALTER TOI write set

* Memory leak with native MySQL replication when InnoDB was used
as a relay log info and master info repository has been fixed.

Known issues with this release:

* If using the Ubuntu 16.04 Xenial package, the server cannot be bootstrapped
using systemd. Please use the SysV init script with the ‘bootstrap’ option to
bootstrap the node. Note that a server that has been started that way cannot
be controlled via systemd and must be stopped using the SysV script. Normal
server startup and shutdown is possible via systemd.

* Server cannot be started using ‘service’ command on Debian Stretch.


MySQL 5.5
New release of Galera Cluster for MySQL, consisting of MySQL-wsrep 5.5.61 and wsrep API version 25. This release incorporates all changes up to MySQL 5.5.61.

* New configuration option wsrep_certification_rules to
enable more relaxed certification rules for foreign key references
on child table inserts. This option is effective only with Galera
version 3.24 or higher (galera#491).

* Resource leak in case of ROLLBACK TO SAVEPOINT followed
by COMMIT has been fixed (mysql-wsrep#318).

* FK constraint violation in applier after ALTER TABLE ADD FK was fixed
by adding both parent and child table keys into ALTER TOI write set


Reminder: Changes to Repositories Structure

With the new release the repository structure has changed
to allow all of the currently supported wsrep-patched MySQL
versions, 5.5 through 5.7, to coexist.
Users will therefore need to adjust their repository
configuration to accommodate these changes. In order to have
WSREP and the Galera library installed, one would need to
add the following:
1. Galera-3 repository for the galera library:
2. Corresponding mysql-wsrep repository:
Here, *ldist* is the Linux or BSD distribution (Ubuntu, CentOS) and *mversion* is the MySQL version, i.e.
5.5, 5.6 or 5.7.
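As an illustrative sketch only (REPO_HOST and the path layout below are placeholders, not the real repository location; substitute *ldist* and *mversion* as described above), the resulting apt configuration would look something like:

```
# Hypothetical /etc/apt/sources.list.d/galera.list; REPO_HOST, the
# paths and the suite name are placeholders for illustration.
deb http://REPO_HOST/galera-3/<ldist> <release> main
deb http://REPO_HOST/mysql-wsrep-<mversion>/<ldist> <release> main
```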


How To Install

The repositories contain dummy (meta) packages, called mysql-wsrep-<mversion>,
which are convenience packages for installing the corresponding version
of WSREP. One can install the whole suite by running, for example:
`apt-get install mysql-wsrep-5.6 galera-3`

#### Quirks for Ubuntu Xenial and 5.6
Due to peculiarities in how apt resolves packages, and the presence of 5.7
libraries in the Xenial repositories, the command above might require
additional steps in order to succeed.

One would need either to configure apt pinning so the Codership repositories
take priority over the upstream packages, or to explicitly specify the
mysql-common package version as the one located in the WSREP repositories, in
order to get things installed.
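As a rough sketch of the pinning option (the origin value and priority here are illustrative assumptions; use the actual host of your configured WSREP repository):

```
# Hypothetical /etc/apt/preferences.d/galera.pref giving the WSREP
# repository priority over Ubuntu's own mysql packages.
Package: *
Pin: origin REPO_HOST
Pin-Priority: 1001
```

Alternatively, pinning only the mysql-* packages, or passing an explicit mysql-common=<version> to apt-get install, achieves the same with a narrower scope.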


by Sakari Keskitalo at September 05, 2018 12:50 PM

September 04, 2018

Chris Dent

TC Report 18-36

It's been a rather busy day, so this TC Report will be a quick update of some discussions that have happened in the past week.

PEP 8002

With Guido van Rossum stepping back from his role as the BDFL of Python, there's work in progress to review different methods of governance used in other communities to come up with some ideas for the future of Python. Those reviews are being gathered in PEP 8002. Doug Hellman has been helping with those conversations and asked for input on a draft.

There was some good conversation, especially the bits about the differences between "direct democracy" and whatever it is we do here in OpenStack.

The resulting text was quickly merged into PEP 8002.

Summit Sessions

There was discussion about concerns some people experience with some summit sessions feeling like advertising.

PTG Coming Soon

The PTG is next week! TC sessions are described on this etherpad.

Elections Reminder

TC election season is right now. The nomination period ends at the end of the day (UTC) on the 6th of September, so there isn't much time left. If you're toying with the idea, nominate yourself; the community wants your input. If you have any questions, please feel free to ask.

by Chris Dent at September 04, 2018 08:32 PM

StackHPC Team Blog

Heads Up: Ansible Galaxy Breaks the World

One of the great advantages arising from our technology choices has been that through standardising on Ansible we have been able to use a single, simple tool to drive everything we do.

Ansible is not really a programming language, and modularity cannot be ensured without some amount of programmer discipline. One great tool in providing a level of modularity and component reuse has been Ansible Galaxy. Our OpenStack deployment toolbag has been steadily growing and we've been thrilled to see others make use of our components as well. Share and enjoy!

Unfortunately, we are writing this post because of an event today which, apparently without notice, broke all our builds, as well as the work of our clients who use our technology.

It's Working Great, What Could Possibly Go Wrong?

We started to notice oddities when updating some of our roles on Galaxy earlier today. The first thing was that the implicit naming convention used for git repos such as our new BeeGFS role was no longer being honoured, so that the role name on Galaxy changed from beegfs to ansible-role-beegfs. As a result, the role could no longer be found by playbooks that required it.

We fixed this by adding a role_name metadata tag, which explicitly sets the name, to each of our 32 roles. Our repos are long established, many are cloned, some are forked. We can't simply rename them on a whim.
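For reference, the role_name override lives under galaxy_info in a role's meta/main.yml; a minimal sketch (fields other than role_name are illustrative, modeled on the beegfs role mentioned above) looks like:

```yaml
# meta/main.yml (excerpt); role_name pins the Galaxy name regardless
# of the git repository's name. Other fields shown are illustrative.
galaxy_info:
  role_name: beegfs
  author: StackHPC
  description: Deploys and configures BeeGFS
```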

On pushing the change that sets this metadata tag, every one of our roles with a hyphenated name was silently converted to using underscores instead. This may seem innocuous, but the consequence is that, again, every playbook that referenced these roles - which is every playbook we write - could no longer retrieve the roles it required from Ansible Galaxy.

The root cause appears to be the combined effect of two changes. Ansible has removed the implicit naming convention for the git repos that back Galaxy roles. Around the same time they have introduced a newer, stricter naming convention for Galaxy roles that prevents names containing hyphens. The backwards-compatibility plans for these two changes are mutually exclusive. Unfortunately most of our roles fall into both categories.

We are not out of the woods as it appears the role_name tag that we now require to explicitly set the correct name for our roles may also be about to be deprecated. This may leave us needing to rename all the git repos for our roles.

What about Kayobe?

OpenStack Kayobe is a project that makes extensive use of Galaxy for reuse and modularity. At the time of writing Kayobe's CI is also broken by this change, and an extensive search-and-replace patchset is required, pending the outcome of our requests for upstream resolution.

What Do Our Clients Need to Do?

In summary, there seem to be a number of tedious but simple changes that must be applied everywhere:

  • All our roles now have underscores instead of hyphens in their names. This appears to be an inevitable change to accommodate forwards compatibility with future versions of Galaxy. We'd like to see a server-side fix to Galaxy enabling it to recognise either hyphens or underscores, thus enabling a smooth transition.
  • The requirements and role invocations of every playbook that references them will need to be updated to replace occurrences of - with _. We will commit those changes to our repos, but all clients will need to pull in the new changes. This should happen automatically when repos are cloned.
  • We might not be done with these build-breaking changes yet, although hopefully there will be a way forward that doesn't break things for users.
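The renaming itself is mechanical; as a hedged sketch (the role reference is modeled on the beegfs role mentioned above, and the exact requirements-file layout will vary), the substitution can be scripted:

```shell
# Rewrite a hyphenated Galaxy role name to its underscore form.
# Only hyphens change; the namespace separator (.) is not a hyphen,
# so it is untouched by the substitution.
role="stackhpc.ansible-role-beegfs"
fixed=$(printf '%s' "$role" | sed 's/-/_/g')
echo "$fixed"
```

The same sed expression can be applied across a tree with find and sed -i, though reviewing the diff before committing is advisable, since hyphens also appear in URLs and repository names that must not change.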

Let's hope this kind of event doesn't happen too often in future...

Mushroom cloud!

by Stig Telfer at September 04, 2018 03:40 PM

OpenStack Superuser

A passion for sharing and collaboration makes OpenInfra Days Vietnam a success

HANOI — The global community came together to celebrate the themes of sharing and collaboration for the first OpenInfra Days in Vietnam.

Held at the Sheraton Hanoi, the August 25 event was the culmination of two years of experience and four months of group work.  The efforts paid off: over 250 paying participants heard a packed morning of keynotes and three tracks of  speakers in the afternoon. This first edition was bolstered by 22 sponsors and sunny weather, too.

Deployment highlights

Here are some highlights from companies running OpenStack in production provided by members of the Vietnam OpenStack User Group. The user group, the main driver of the event, boasts some 3,000 members who have hosted 31 events in recent years before coming together to organize OpenInfra Days. For a further look inside the event, check out the slides from the presentations and take a look at the group’s photo album here.


  • One of the largest telecoms in Vietnam. They mainly provide IaaS (cloud servers, object storage, block storage, backup services) and cloud security with anti-DDoS and WAF from ISP networks. They’re currently testing and planning to integrate OVS-DPDK and Octavia, and to migrate legacy network functions to VNFs.
  • Currently deploying OpenStack core services (Nova, Neutron L3, Cinder, Glance and Keystone), Heat, Masakari, Ironic, Ceph in three data centers.


  • A web services hosting company providing exclusively cloud servers, block storage and backup.
  • Current deployment: OpenStack core services and Heat in three data centers.


  • A cloud infrastructure provider, providing CDN on IaaS and PaaS.
  • Deployed OpenStack projects:  OpenStack core services, Octavia and Heat in two data centers.

Vega Corp

  • A multimedia company that provides VAS and CDN services based on OpenStack.
  • Deployed OpenStack projects: OpenStack core services, Designate.


  • A cloud infrastructure provider company providing IaaS and CDN.
  • Currently deploying OpenStack core services in at least two data centers.

Nhan Hoa

  • A web services hosting company that provides IaaS with OpenStack core services.

Hosting Viet

  • A web services hosting company that provides IaaS with OpenStack core services.


  • A web services hosting company that provides IaaS with OpenStack core services.

Can Tho Government People’s Committee

  • A private OpenStack cloud for e-government services using OpenStack core services in one data center.


  • Another large national telecom offering IaaS.
  • Current deployment: OpenStack core services in two data centers with over 1,000 virtual machines.

Ministry of Health

  • OpenStack core services in one data center through a private cloud with MediTechJSC.


  • A cloud infrastructure IaaS provider. Using OpenStack core services in one data center, deployed by MediTechJSC.

Passion is the key ingredient

“Passion” is the word that the founder of the Vietnam User Group chose to describe the way they build community, reports Rico Lin. Based in Taiwan, Lin is the current project team lead for the Heat orchestration project and software engineer at EasyStack.

He spoke at the event – more on this in an upcoming post – and says that he especially appreciated the efforts made by organizers to provide simultaneous translation for all the sessions, ensuring that global OpenStack members could also participate and feel welcome. Organizers also attracted speakers from outside the country, including board member Monty Taylor, OpenStack ambassador John Studarus, Shintaro Mizuno of the OpenStack Japan User Group and IBM’s Olaph Wagoner.

“They took steps to involve the global community and make local OpenInfra Days not local at all! I’m here just like an attendee from Vietnam, can go to whatever sessions I like and enjoyed them very much,” says Lin.

Join the Vietnam User Group or follow them on Twitter or Facebook for more on how to get involved next year.


The post A passion for sharing and collaboration makes OpenInfra Days Vietnam a success appeared first on Superuser.

by Nicole Martinelli at September 04, 2018 02:02 PM

Thierry Carrez

The Future of Project Teams Gatherings

Next week, OpenStack contributors will come together in Denver, Colorado at the Project Teams Gathering to discuss in-person the work coming up for the Stein release cycle. This regular face-to-face meeting time is critical: it allows us to address issues that are not easily fixed in virtual communications, like brainstorming solutions, agreeing on implementation details, or building up personal relationships. Since day 0 in OpenStack we have had such events, but their shape and form evolved with our community.

A brief history of contributor events

It started with the Austin Design Summit in July 2010, where the basics of the project were discussed. The second Design Summit in San Antonio at the end of 2010 introduced a parallel business track, which grew in importance as more organizations and potential users joined the fray. The contributor gathering slowly became a sub-event happening at the same time as the main "Summit". By 2015, Summits were 5-day events attracting 6,000 people. That made for a very busy week, and made it very difficult for contributors to focus on the necessary discussions amid the distractions and commitments of the main event going on at the same time.

The time was ripe for a change, and that is when we introduced the idea of a Project Teams Gathering (PTG). The PTG was a separate 5-day event for contributors to discuss in person in a calmer, more productive setting. By the Austin Summit in 2016, it was pretty clear that this was the only option to get productive gatherings again, and the decision was made to roll out our first PTG in February 2017 in Atlanta. Attendees loved the small-event feel and their restored productivity. Some said they got more done during that week than in all the old Design Summits combined, despite some challenges in navigating the event. We iterated on that formula in Denver and Dublin, creating tools to make the unstructured and dynamic event agenda more navigable by making what is currently happening more discoverable. The format was extended to include other forms of contributor teams, such as SIGs, workgroups and Ops meetups. Feedback on the event from attendees was extremely good.

The limits of the PTG model

While the feedback at the event was excellent, over the last year it became pretty clear that holding a separate PTG created a lot of tension. The most obvious tension was between PTG and Summit. The PTG was designed as an additional event, not a replacement. In particular, developers were still very much wanted at the main Summit event, to maintain the technical level of the event, to reach out to new contributors and users, and to discuss the future of the project with operators at the Forum. But it is hard to justify traveling internationally 4 times per year to follow a mature project, so a lot of people ended up choosing one or the other. Smaller teams usually skipped the PTG, while many in larger teams would skip the Summit. That created community fragmentation between those who could attend 4 events per year and those who could not. And those who could not were on the rise: with the growth in OpenStack adoption in China, the number of contributors, team leaders and teams where most members are based in China increased significantly.

Beyond that, the base of contributors to OpenStack is changing: less and less vendor-driven and more and more user-driven. That is a generally good thing, but it means that we are slowly moving away from contributors who are 100% employed to work upstream (and therefore travel as many times a year as necessary to maximize that productivity) toward contributors that spend a couple of hours per week to help upstream (for which travel is at a premium). There are a lot of things OpenStack needs to change to be more friendly to this type of contributor, and the PTG format was not really helping in this transition.

Finally, over the last year it became clear that the days of the 5-day-long 5000-people events were gone. Once the initial curiosity and hype-driven attendance is passed, and people actually start to understand what OpenStack can be used for (or not used for), you end up with a less overwhelming event, with a more reasonable number of attendees and days. Most of the 2015-2016 reasons for a separate event are actually no longer applying.

Trying a different trade-off

We ran a number of surveys to evaluate our options -- across Foundation sponsors, across PTG attendees, across contributors at large. About 60% of contributors supported co-locating the PTG with the Summit. Even only considering past PTG attendees, 53% still support co-location. 85% of the 22 top contributing organizations also supported co-location, although some of the largest ones would prefer to keep it separate. Overall, it felt like enough of the environment changed that even for those who had benefited from the event in the past, the solution we had was no longer necessarily the optimal choice.

In Dublin, then in Vancouver, options were discussed with the Board, the Technical Committee and the User Committee, and the decision was made to co-locate the Project Teams Gathering with the Summits in 2019. The current plan is to run the first Summit in 2019 from Monday to Wednesday, then a 3-day PTG from Thursday to Saturday. The Forum would still happen during the Summit days, so the more strategic discussions that happened at the PTG could move there.

Obviously, some of the gains of holding the PTG as a separate event will be lost. In particular, a separate event allowed strategic discussions (the Forum at the Summit) to happen at a different point in the development cycle from the more tactical discussions (the PTG). Some of the frustration of discussing both in the same week, when it is a bit late to influence the cycle focus, will return. The Summit will happen close to releases again, without giving vendors much time to build products on the release, or deployers to try it, reducing the quality of the feedback we get at the Forum.

That said, the co-location will strive to keep as much as we can of what made the PTG unique and productive. In order to preserve the distinct productive feel of the event, the PTG will be organized as a completely separate event, with its own registration and branding. It will keep its unstructured content and dynamic schedule tools. In order to prevent the activities of the Summit from distracting PTG attendees with outside commitments, the co-located PTG will happen on entirely separate days, once the Summit is over. There are only so many days in a week though, so the trade-off here is to end the PTG on the Saturday.

This change is likely to anger or please you, depending on where you stand. It is important to realize that there is no perfect solution here. Any solution we choose will be a trade-off between a large number of variables: including more contributors, maximizing attendee productivity, getting a critical mass of people to our events, containing travel costs... We just hope that this new trade-off will strike a better balance for OpenStack in 2019, and that the Foundation will continue to adapt its event strategy to changing conditions in the future.

by Thierry Carrez at September 04, 2018 10:12 AM

September 03, 2018

OpenStack Superuser

An overview of micro-services, cloud native, containers and serverless

This three-part series was written by Mina Andrawos, author of  “Cloud Native programming with Golang,” which provides practical techniques and architectural patterns for cloud native micro-services. He’s also the author of the “Mastering Go Programming” and the “Modern Golang Programming” video courses.

This series hopes to shed some light and provide practical input on some key topics in the modern software industry, namely micro-services, cloud native applications, containers and serverless applications. It will cover both practical advantages and disadvantages of these technologies.


Micro-services

Micro-service architecture has gained a reputation as a powerful approach for building modern software applications. So what are micro-services? Micro-services separate the functionality required from a software application into multiple independent small software services. Each of these micro-services is responsible for an individual, focused task. In order for micro-services to work together to form a large scalable application, they must communicate and exchange data. Because the services can be spread across multiple servers, the application can grow by adding servers, a practice known as horizontal scaling.

Each service can be deployed on a different server with dedicated resources or in separate containers (more on that below.) These services can be written in different programming languages, enabling greater flexibility and allowing separate teams to focus on each service, rendering the final application of higher quality.

Another notable advantage to using micro-services is the ease of continuous delivery, or the ability to deploy software often and at any time. Micro-services make continuous delivery easier because a new feature deployed to one micro-services is less likely to affect other micro-services.

Cloud native applications

Micro-service architectures are a natural fit for cloud native applications  — applications built from the ground up for cloud computing. An application is cloud native if designed with the expectation of deploying on a distributed and scalable infrastructure.

For example, building an application with a redundant micro-services architecture – we’ll see an example shortly – makes the application cloud native, since this architecture allows our application to be deployed in a distributed manner that keeps it scalable and almost always available. A cloud native application doesn’t always need to be deployed to a public cloud like Amazon Web Services; it can be deployed to distributed cloud-like infrastructure as well.

In fact, what makes an application fully cloud native goes beyond just using micro-services. Your application should employ continuous delivery, the ability to continuously deliver updates to your production applications without disruptions. Your application should also make use of services like message queues and technologies like containers (covered in the next section).

Cloud native applications assume access to numerous server nodes, having access to pre-deployed software services like message queues or load balancers, ease of integration with continuous delivery services, among other things.

If you deploy your cloud native application to a commercial cloud like AWS or Azure, your application has the option to utilize cloud-only software services. For example, DynamoDB is a powerful database engine that can only be used on AWS for production applications. Another example is the DocumentDB database in Azure. There are also cloud-only message queues such as Amazon Simple Queue Service (SQS), that can be used to allow communication between micro-services in the Amazon Web Services cloud.

As mentioned earlier, cloud native micro-services should be designed to allow redundancy between services. If we take the events booking application as an example, the application will look like this:

Multiple server nodes would be allocated per micro-service, allowing a redundant micro-services architecture to be deployed. If the primary node or service fails for any reason, the secondary can take over, ensuring lasting reliability and availability for cloud native applications. This availability is vital for fault-intolerant applications such as e-commerce platforms, where downtime translates into lost revenue.

A notable tool worth mentioning in the world of micro-services and cloud computing is Prometheus. Prometheus is an open-source system monitoring and alerting tool that can be used to monitor complex micro-services architectures and send alerts when action needs to be taken. Originally created by SoundCloud to monitor their systems, it grew into an independent project and is now part of the Cloud Native Computing Foundation.


Containers

A container is simply the idea of encapsulating some software inside an isolated user space, or “container.” For example, a MySQL database can be isolated inside a container where the environmental variables and the configurations that it needs will live. Software outside the container will not, by default, see the environmental variables or configuration contained inside the container. Multiple containers can exist on the same local virtual machine, cloud virtual machine, or hardware server.
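As a sketch of that isolation (the service name, image tag and credentials below are illustrative assumptions, not from the original text), a Compose-style definition carries the MySQL container's environment with it:

```yaml
# Hypothetical docker-compose.yml fragment: the environment variables
# live inside the container's definition, invisible by default to
# software outside the container.
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example-password
      MYSQL_DATABASE: events
```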

Containers provide the ability to run numerous isolated software services, with all their configurations, software dependencies, runtimes, tools, and accompanying files, on the same machine. In a cloud environment, this ability translates into saved costs and effort, because the need to provision and buy server nodes for each micro-service diminishes, since different micro-services can be deployed on the same host without disrupting each other. Containers combined with micro-services architectures are powerful tools for building modern, portable, scalable and cost-efficient software. In a production environment, more than a single server node, combined with numerous containers, would be needed to achieve scalability and redundancy.

Containers add more benefits to cloud native applications. With a container, you can move your micro-service, with all the configuration, dependencies and environmental variables it needs, to fresh server nodes without having to reconfigure the environment, achieving powerful portability.

Due to the power and popularity of the software containers technology, some new operating systems like CoreOS or Photon OS are built from the ground up to function as hosts for containers.

One of the most popular software container projects in the software industry is Docker. Major organizations such as Cisco, Google and IBM utilize Docker containers in their infrastructure as well as in their products. Another notable project in the software containers world is Kubernetes. Kubernetes is a tool that allows the automation of deployment, management, and scaling of containers. Built by Google to facilitate the management of their containers, Kubernetes provides some powerful features such as load balancing between containers, restart for failed containers, and orchestration of storage utilized by the containers.

Disadvantages of cloud native computing

It’s important to point out the disadvantages of these technologies. One notable drawback of relying heavily on micro-services is that they can become too complicated to manage in the long run as they grow in number and scope. There are approaches to mitigate this: utilizing monitoring tools such as Prometheus to detect problems, using container technologies such as Docker to avoid polluting the host environment, and avoiding over-designing the services. However, these approaches take effort and time.

For cloud native applications, if the need arises to migrate some or all of the application, some challenges will be faced. There are multiple reasons for that, depending on where your application is deployed.

One is that if your cloud native application is deployed on a public cloud like AWS, cloud native APIs are not portable across cloud platforms. For example, a DynamoDB database API utilized in an application will only work on AWS, not on Azure, since DynamoDB belongs only to AWS. The API will also never work in a local environment, because DynamoDB can only be utilized in AWS for production applications.

Another reason is that some assumptions are made when cloud native applications are built, such as the assumption that a virtually unlimited number of server nodes will be available when needed and that a new server node can be made available very quickly. These assumptions are sometimes hard to guarantee in a local data center environment, where real servers, networking hardware and wiring need to be purchased.

In the case of containers, the task of managing them can sometimes get rather complex, for the same reasons as managing expanding numbers of micro-services. As containers or micro-services grow in number, there needs to be a mechanism to identify where each container or micro-service is deployed, what its purpose is, and what resources it needs to keep running.

Serverless applications

Serverless architecture is a new software architectural paradigm that was popularized with the AWS Lambda service. To fully understand serverless applications, let's start by defining function-as-a-service (FaaS). This is the idea that a cloud provider such as Amazon, or even a local piece of software such as funktion, provides a service where a user can request that a function run remotely in order to perform a very specific task; after the function concludes, its results are returned to the user. No services or stateful data are maintained, and the function code is provided by the user to the service that runs it.

The idea behind properly designed production applications that utilize the serverless architecture is that, instead of building multiple micro-services expected to run continuously in order to carry out individual tasks, you build an application with fewer micro-services, combined with FaaS for tasks that don't need a continuously running service.

FaaS is a smaller construct than a micro-service. For example, in the case of the events booking application we covered earlier, there were multiple micro-services covering different tasks. If we use a serverless application model, some of those micro-services would be replaced with a number of functions that serve their purpose. For example, here's a diagram that showcases the application utilizing a serverless architecture:

In this diagram, the event handler and booking handler micro-services were replaced with a number of functions that provide the same functionality, eliminating the need to run and maintain those two micro-services.

By contrast, a monolithic application, in which all functionality lives in a single codebase, will work well with small to medium loads. It can run on a single server, connect to a single database and will probably be written in a single programming language.

Now, what happens if business booms and hundreds of thousands or millions of users need to be handled? The short-term solution is to ensure that the server on which the application runs has powerful enough hardware to withstand higher loads, i.e. vertical scaling (increasing hardware specifications such as RAM and disk to run heavier applications). Typically, though, vertical scaling is not sustainable as the load on the application continues to grow.

Another challenge with monolithic applications is the inflexibility caused by being limited to only one or two programming languages. This inflexibility can affect the overall quality and efficiency of the application. For example, node.js is a popular JavaScript framework for building web applications, whereas R is popular for data science applications. A monolithic application will make it difficult to utilize both technologies, whereas in a micro-services application, you can simply build a data science service written in R and a web service written in Node.js.

A third notable challenge with monolithic applications is collaboration. For example, in the event booking application, a change that caused a bug in the single front-end user interface layer could affect the other pieces of the application using it like the search, events and bookings handlers.

If you were to create a micro-services version of the events application, it would take the following form:

This application will be capable of horizontal scaling among multiple servers. Each service can be deployed on a different server with dedicated resources, or in separate containers.

How do micro-services work with cloud native applications?

Micro-service architectures are a natural fit for cloud native applications. A cloud native application is simply defined as an application built from the ground up to run on a cloud platform. A cloud platform has numerous advantages such as obtaining multiple server nodes at a moment’s notice without the pains of IT hardware infrastructure planning. Building your application in a microservices architecture is an important step to develop scalable cloud native applications.

Micro-services cloud native applications combine the advantages of micro-services with the benefits of cloud computing to provide great value for developers, enterprises and startups.

Another notable advantage (aside from availability and reliability) is continuous delivery. Micro-services make continuous delivery easier because a new feature deployed to one micro-service is far less likely to affect the others.


Cloud computing has opened avenues for developing efficient, scalable and reliable software. Here, we’ve covered some significant concepts in the world of cloud computing such as microservices, cloud native applications and serverless applications.

When designing an application, architects must choose whether to build a monolithic application, a microservices cloud native application, or a serverless application. Hopefully, I’ve given you some insight on how to make that decision.

In part two of this series, we’ll dive into cloud-native applications and containers, and also shed some light on the disadvantages of depending on them. Part three will take you through serverless applications in detail and show you how all three fit together.


Content courtesy Packt Publishing

The post An overview of micro-services, cloud native, containers and serverless appeared first on Superuser.

by Superuser at September 03, 2018 01:27 PM

August 31, 2018

Aija Jauntēva

Outreachy: Summary

This is the last blog post about my Outreachy internship that will summarize what I have done.

These are the 'main' additions to the sushy and sushy-tools projects:

  • initial version for BIOS resource support in sushy
  • initial version for @Redfish.Settings used by BIOS and other resources in sushy
  • emulation of BIOS resource in sushy-tools for libvirt driver and openstacksdk driver
  • emulation of Ethernet Interface resource in sushy-tools (took over another patch and added openstacksdk driver part)
  • support for Message registries in sushy (some parts still in code review)

However, the implementation is not entirely complete for BIOS and @Redfish.Settings support. Some additional fields were left out of the first version. In BIOS there is the Attribute Registry, which is not yet exposed to sushy users but can help determine the allowed attributes for a particular BIOS, along with other metadata about those attributes. At the moment no input validation happens when setting new BIOS attributes; any failure messages only appear after applying the settings and reading back the results in @Redfish.Settings. Nothing is implemented yet for @Redfish.SettingsApplyTime, which, if supported by the Redfish service, would allow indicating the preferred time to apply the updates. Multi-client support for @Redfish.Settings is also missing: if two or more users tried to update BIOS settings at the same time, it would be hard for each user to determine whether the update succeeded or whether failures in the results were caused by their own update or a peer's. When sushy starts supporting these features, they can also be added to the sushy-tools emulation.

In addition to these patches, I did some smaller ones, either follow-ups to the main patches or things I encountered while working on them. One of them concerned tox, which was configured to use these Python environments: py27, py35, pypy. I believe these were copied from a standard template or taken from another project. When running tox, I got interpreter-not-found errors for py35 and pypy. Though tox has a flag to skip them, I asked my mentors what the intention was. After that conversation I removed pypy, because no one is expected to run sushy under pypy, and replaced py35 with py3 so that the latest Python 3 version is used; on my machine that was py36 (though I had installed py35 side by side before removing the exact version from tox). With these updates I was testing with py36 locally, and I encountered two cases where Zuul CI, which was still using py35, failed while my environment passed. I haven't looked into this further, but for some reason something about methods behaves differently in py35 while working fine in py27 and py36. So I am not entirely sure it was OK to drop py35 locally while it still needs to be supported. I started running the py35 environment explicitly before submitting code for review, just in case my code hits another odd py35 exception. Recently py36 was also added to Zuul CI, so py36 gets tested there too now.

In one of my first blog posts I drew a diagram to help comprehend the bare metal server domain, which was new to me, and I want to look at that diagram again now that the internship has ended. There are not many changes: only one new component was introduced, for sushy-emulator to start using openstacksdk, which was added around the time my internship started. I did not get to work on the Ironic part, and overall I did not have to interact with the other components in the diagram, but it was useful to explore the surroundings back then.

diagram: Context of sushy, updated

I think sushy and sushy-tools are a good starting point for new contributors: the projects are small and easy to get around, though it did not appear so at the beginning, and I had to spend some time getting familiar with Redfish and the building blocks of sushy. At the start of the internship I had that 'I have no idea what I'm doing' feeling (though it was overshadowed by excitement that I'm working on OpenStack), but now I feel comfortable in this domain.

Overall I learned a lot during this three-month internship, and I would like to thank everyone who made it happen, especially my mentors Ilya and Dmitry. Best summer ever.

If anyone else is interested in Outreachy and would like to participate in the next round, you can check your eligibility and start applying very soon, from September 10th; see the Outreachy web page.

by ajya at August 31, 2018 05:37 PM

OpenStack Superuser

What you need to know about the OpenStack Rocky release

The 18th edition of OpenStack packs a mighty punch.

Rocky is driven by use cases like artificial intelligence, machine learning, NFV and edge computing, delivering enhanced upgrade features and support for diverse hardware architectures, including bare metal.

Here are highlights from the community webinar. You can listen to or download the complete webinar here, including a case study from Vexxhost, updates on community wide goals and new projects like Cyborg and Qinling. For more on the Rocky release check out the press release, release notes, source code, contributor stats and the OpenStack project map.


Ironic is an integrated OpenStack program that aims to provision bare metal machines instead of virtual machines. Project team lead Julia Kreger offered updates at the 7:15 mark.
This past cycle has been about building foundations for future features that make operators’ lives easier, with a focus on scalability, she says. “We really took the feedback from the last two cycles to heart and direct what we were doing to meet that goal,” Kreger adds.

Three new developments were released with Rocky:

  • Ramdisk deployment interface. For scientific and large-scale ephemeral workloads, this offers the ability to boot up machines in a flash.
  • BIOS setting management. This allows operators to enable virtualization/hyperthreading, SR-IOV and DPDK and “is going to be very useful in the future for operators, they won’t have to go one by one through all their servers and verify the BIOS settings and change them.”
  • Functionality to recover machines from power faults. This solves “another operator headache. It’s a minor thing with a huge impact,” says Kreger.


Alex Schultz, project team lead for the Queens and Rocky cycles, offered this update on the TripleO fast forward upgrade (FFU) front, starting at the 9:53 mark. “We provide tooling for planning, deployment and day-two operations,” says Schultz by way of introduction. “We support advanced network topologies, allowing operators and deployers to configure their services as necessary, as well as offer support for updates and upgrades. FFU was primarily targeted in Queens, but we continue to work on the effort in Rocky. We’re going to continue this path, from Queens to the T-cycle.” FFU is driven mainly by Ansible.

An overview of what’s new:


Chris Hoge of the OSF talked about Airship, a collection of loosely coupled, interoperable open-source tools that provide for automated cloud provisioning and life cycle management in a declarative, predictable way.
“It’s in the early stages, we’re still trying it out, but so far it’s been really fantastic,” says Hoge. He encourages folks to get involved in the weekly meetings and learn more about the project by joining the mailing lists or on Freenode IRC: #airshipit.

Kata Containers

Eric Ernst, one of the technical leads, talked about what’s new for the Kata Containers project, starting at the 26:04 mark. “A typical container is just a great process running on your Linux host and it’s isolated from all the other processes running on your host,” he says, adding that it’s constrained on its CPU utilization and memory utilization. “In a sense, that’s all containers are, a couple of features on a Linux kernel, so they’re super lightweight and very fast and the design patterns that result from this are very strong.” Kata is looking to provide a different kind of isolation – not just software but also hardware isolation.

Looking forward, there’s a lot of work to do, Ernst says. A 1.3 release is in the works for mid-September, including features like OpenTracing support, “a big usability improvement,” as well as full network and storage hot plug, which allows users to run a lot of things in parallel, he adds. There are a lot of other features in the works, including general security enhancements. Stay tuned for the next round.

Eye of the tiger

A post shared by Mark Collier (@sparkycollier)


“It’s all singing, all dancing, very cool stuff,” says project lead Bruce Jones by way of introduction to the edge cloud software stack project at the 32:50 mark. Currently on the Pike release with plans to move to Queens as soon as possible, “we’re very much looking forward to going to the Denver PTG and talking to the community about how we align much more closely with upstream OpenStack,” he says.


Updates on the continuous integration and delivery program were offered by the OSF’s Clark Boylan at around 36 minutes in. The tool has GitHub support, and more recent work includes in-line commenting on code changes and improvements to the web dashboards. Though the project was born in-house about six years ago, the OSF has been running Zuul v3 since the beginning of the year, along with other users. “This is a brand new thing for us and for them and we’ve spent a lot of time improving it based on what we’ve learned as we go along.”


Cover image via Lego Ideas

The post What you need to know about the OpenStack Rocky release appeared first on Superuser.

by Nicole Martinelli at August 31, 2018 02:08 PM

Rackspace Developer Blog

Using Terraform with Rackspace Public Cloud

Managing infrastructure at scale requires automation and infrastructure as code. Terraform is a tool that helps manage a wide variety of systems, including dynamic server lifecycles, configuration of source code repositories, databases, and even monitoring services. Terraform uses text configuration files to define the desired state of infrastructure. From those files, Terraform reports the changes to be made based on the current state of that infrastructure, and can then make those changes.


Terraform can be downloaded for a variety of systems from the Terraform downloads page. After downloading, extract the archive and move the terraform binary to a location on your PATH. That's it: because it's written in Go, it is a single binary with all dependencies included.


Terraform does not officially support Rackspace Cloud as a provider, but the OpenStack provider does work on the Rackspace Cloud and only needs a bit of configuration to get started. We use the following configuration for this simple example:

variable "rax_pass" {}
variable "rax_user" {}
variable "rax_tenant" {}

# Configure the OpenStack Provider
provider "openstack" {
  user_name   = "${var.rax_user}"
  tenant_id = "${var.rax_tenant}"
  password    = "${var.rax_pass}"
  auth_url    = ""
  region      = "DFW"

resource "openstack_compute_instance_v2" "terraform-test" {
  name      = "terraform-test"
  region    = "DFW"
  image_id  = "8f47cf87-1e90-4370-b59d-730256265dce"
  flavor_id = "2"
  key_pair  = "mykey"

  network {
    uuid = "00000000-0000-0000-0000-000000000000"
    name = "public"
  network {
    uuid = "11111111-1111-1111-1111-111111111111"
    name = "private"

There are a few things to point out in the configuration. When there are many variables to manage, they are typically kept in separate files; since this example only has a few, the variables are declared right at the top of our configuration file. Another useful thing to know is that Terraform reads environment variables of the form TF_VAR_<variable>. That means that, to avoid writing secrets into configuration files, you can create the environment variables TF_VAR_rax_user, TF_VAR_rax_pass, and TF_VAR_rax_tenant, and Terraform reads them at runtime.

To create an environment variable, use the following method to enter the value without that value being recorded, as it would be when entering it directly on the command line.

read -s TF_VAR_rax_pass && export TF_VAR_rax_pass

Another note on the configuration: image_id and flavor_id need to be updated to the values for what you want to create. Most OpenStack command-line clients that interface with Rackspace have a method to list available images and flavors. If you don't have one installed and ready, take a look at the image and flavor API documentation to see how to get these directly from the Cloud Server API.

Finally, the network configuration in this example specifies the default public and private networks on the server that we create. Using the OpenStack plugin on Rackspace Cloud does require specifying networks, and not including specific networks in the configuration might result in creation taking longer than expected and in errors when attempting to destroy the server.

Terraform steps

Now that we have created our desired configuration in a simple text file, there are just a few steps to push that to a real environment. We will use the following commands to interact with our environment.

  1. terraform init
  2. terraform plan
  3. terraform apply
  4. terraform destroy

Terraform init

Running terraform init prepares the working directory: it downloads the provider plugins the configuration requires and sets up where Terraform records state. By default, state information is recorded in a local file. There are other methods for storing this information, and Terraform can be configured to store the state remotely. Using a remote store ensures that all members of a team share the same environment state when using Terraform. For this example, we use the default local state file.
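As a sketch of what a remote store looks like, here is an S3 backend block; the bucket and key names are placeholders, and other supported backends (Swift, Consul, and so on) follow the same pattern:

```hcl
# Example only: keep state in an S3 bucket instead of a local
# terraform.tfstate file. Bucket name and key are placeholders.
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "rackspace/terraform.tfstate"
    region = "us-east-1"
  }
}
```

With a backend block present, terraform init configures (and, if needed, migrates) the state storage.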

Use terraform init to initialize the working directory.

$ terraform init

Initializing provider plugins...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.openstack: version = "~> 1.8"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
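Acting on that suggestion is a one-line change. As a sketch, here is the provider block from earlier with the constraint added; everything else is unchanged:

```hcl
# Pin the provider to the 1.x series suggested by the init output so a
# future major release can't introduce breaking changes unnoticed.
provider "openstack" {
  version   = "~> 1.8"
  user_name = "${var.rax_user}"
  tenant_id = "${var.rax_tenant}"
  password  = "${var.rax_pass}"
  auth_url  = ""
  region    = "DFW"
}
```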

Terraform plan

I would argue that this next step is the most important. Running terraform plan compares the current state of the environment with the changes required to make the environment match the written configuration. This is important to ensure that what we expect to happen is what is going to happen. This simple example probably won't run into any collisions in a normal environment, but always double-check that any destroy actions listed in the plan output are intended.

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

openstack_compute_instance_v2.terraform-test: Refreshing state... (ID: a3fd1c5a-5d01-4434-a673-223cc3266696)


An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + openstack_compute_instance_v2.terraform-test
      id:                       <computed>
      access_ip_v4:             <computed>
      access_ip_v6:             <computed>
      all_metadata.%:           <computed>
      availability_zone:        <computed>
      flavor_id:                "2"
      flavor_name:              <computed>
      force_delete:             "false"
      image_id:                 "8f47cf87-1e90-4370-b59d-730256265dce"
      image_name:               <computed>
      key_pair:                 "mykey"
      name:                     "terraform-test"
      network.#:                "2"
      network.0.access_network: "false"
      network.0.fixed_ip_v4:    <computed>
      network.0.fixed_ip_v6:    <computed>
      network.0.floating_ip:    <computed>
      network.0.mac:            <computed>
      network.0.name:           "public"
      network.0.port:           <computed>
      network.0.uuid:           "00000000-0000-0000-0000-000000000000"
      network.1.access_network: "false"
      network.1.fixed_ip_v4:    <computed>
      network.1.fixed_ip_v6:    <computed>
      network.1.floating_ip:    <computed>
      network.1.mac:            <computed>
      network.1.name:           "private"
      network.1.port:           <computed>
      network.1.uuid:           "11111111-1111-1111-1111-111111111111"
      power_state:              "active"
      region:                   "DFW"
      security_groups.#:        <computed>
      stop_before_destroy:      "false"

Plan: 1 to add, 0 to change, 0 to destroy.


Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

Terraform apply

With the backend initialized and the changes this configuration will cause confirmed, we now run terraform apply. This command first shows the same output as terraform plan and requires confirmation to continue. It then prints progress every ten seconds while the changes are made, and a summary on completion.

$ terraform apply
openstack_compute_instance_v2.terraform-test: Refreshing state... (ID: a3fd1c5a-5d01-4434-a673-223cc3266696)

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + openstack_compute_instance_v2.terraform-test
      id:                       <computed>
      access_ip_v4:             <computed>
      access_ip_v6:             <computed>
      all_metadata.%:           <computed>
      availability_zone:        <computed>
      flavor_id:                "2"
      flavor_name:              <computed>
      force_delete:             "false"
      image_id:                 "8f47cf87-1e90-4370-b59d-730256265dce"
      image_name:               <computed>
      key_pair:                 "mykey"
      name:                     "terraform-test"
      network.#:                "2"
      network.0.access_network: "false"
      network.0.fixed_ip_v4:    <computed>
      network.0.fixed_ip_v6:    <computed>
      network.0.floating_ip:    <computed>
      network.0.mac:            <computed>
      network.0.name:           "public"
      network.0.port:           <computed>
      network.0.uuid:           "00000000-0000-0000-0000-000000000000"
      network.1.access_network: "false"
      network.1.fixed_ip_v4:    <computed>
      network.1.fixed_ip_v6:    <computed>
      network.1.floating_ip:    <computed>
      network.1.mac:            <computed>
      network.1.name:           "private"
      network.1.port:           <computed>
      network.1.uuid:           "11111111-1111-1111-1111-111111111111"
      power_state:              "active"
      region:                   "DFW"
      security_groups.#:        <computed>
      stop_before_destroy:      "false"

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

openstack_compute_instance_v2.terraform-test: Creating...
  access_ip_v4:             "" => "<computed>"
  access_ip_v6:             "" => "<computed>"
  all_metadata.%:           "" => "<computed>"
  availability_zone:        "" => "<computed>"
  flavor_id:                "" => "2"
  flavor_name:              "" => "<computed>"
  force_delete:             "" => "false"
  image_id:                 "" => "8f47cf87-1e90-4370-b59d-730256265dce"
  image_name:               "" => "<computed>"
  key_pair:                 "" => "mykey"
  name:                     "" => "terraform-test"
  network.#:                "" => "2"
  network.0.access_network: "" => "false"
  network.0.fixed_ip_v4:    "" => "<computed>"
  network.0.fixed_ip_v6:    "" => "<computed>"
  network.0.floating_ip:    "" => "<computed>"
  network.0.mac:            "" => "<computed>"
  network.0.name:           "" => "public"
  network.0.port:           "" => "<computed>"
  network.0.uuid:           "" => "00000000-0000-0000-0000-000000000000"
  network.1.access_network: "" => "false"
  network.1.fixed_ip_v4:    "" => "<computed>"
  network.1.fixed_ip_v6:    "" => "<computed>"
  network.1.floating_ip:    "" => "<computed>"
  network.1.mac:            "" => "<computed>"
  network.1.name:           "" => "private"
  network.1.port:           "" => "<computed>"
  network.1.uuid:           "" => "11111111-1111-1111-1111-111111111111"
  power_state:              "" => "active"
  region:                   "" => "DFW"
  security_groups.#:        "" => "<computed>"
  stop_before_destroy:      "" => "false"
openstack_compute_instance_v2.terraform-test: Still creating... (10s elapsed)
openstack_compute_instance_v2.terraform-test: Still creating... (20s elapsed)
openstack_compute_instance_v2.terraform-test: Still creating... (30s elapsed)
openstack_compute_instance_v2.terraform-test: Still creating... (40s elapsed)
openstack_compute_instance_v2.terraform-test: Still creating... (50s elapsed)
openstack_compute_instance_v2.terraform-test: Still creating... (1m0s elapsed)
openstack_compute_instance_v2.terraform-test: Creation complete after 1m9s (ID: 2c1537b4-0bfa-4293-a6fb-b708553263f7)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Once the apply is complete, the server is accessible and can be viewed in the Rackspace Cloud portal, the API, or your command-line interface of choice.

Terraform destroy

Once a resource is no longer needed, it can be destroyed just as easily as it was created. Running terraform destroy displays what will be destroyed and requires confirmation before removing the resources defined in the configuration.

$ terraform destroy
openstack_compute_instance_v2.terraform-test: Refreshing state... (ID: 2c1537b4-0bfa-4293-a6fb-b708553263f7)

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  - openstack_compute_instance_v2.terraform-test

Plan: 0 to add, 0 to change, 1 to destroy.

Do you really want to destroy?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

openstack_compute_instance_v2.terraform-test: Destroying... (ID: 2c1537b4-0bfa-4293-a6fb-b708553263f7)
openstack_compute_instance_v2.terraform-test: Still destroying... (ID: 2c1537b4-0bfa-4293-a6fb-b708553263f7, 10s elapsed)
openstack_compute_instance_v2.terraform-test: Destruction complete after 16s

Destroy complete! Resources: 1 destroyed.

Next steps

I have covered the basics of how to use Terraform with the Rackspace Cloud, but I have barely scratched the surface of what Terraform can be used for. Check out the docs to see the huge number of things Terraform can manage to increase your productivity, as well as the consistency and reliability of your infrastructure.

August 31, 2018 07:00 AM

August 30, 2018

OpenStack Superuser

Making open source work from the inside out at your company

Isabel Drost-Fromm is a true open-source believer who also knows how easy it is for companies to stall on the road to collaboration.

She’s a director of the Apache Software Foundation, co-founder of Apache Mahout and an active supporter of the Free Software Foundation Europe who describes herself as “just another humble software designer.” At the recent Free and Open Source Software Conference (FrOSCon), she gave a clear-eyed talk about the problems people typically encounter when trying to adopt “inner source” and how to get past them.

For years, she says, people active in OSS projects have tried to bring the collaboration practices to their in-house teams. The resulting practice has been called by a number of names: open development, internal open source, inner source.

Now an open-source strategist at Europace, for the past year she’s been piloting inner-source efforts at the Berlin-based fintech startup. As part of that campaign, Drost-Fromm has been following and contributing to a global, cross-organization initiative to further practices that “apply the lessons of open source to all software engineering, using collaboration and transparency to increase quality, speed and developer joy.”

Don’t touch my cheese

But before joy, she says, often comes pain. For starters, even what seems the most obvious step can backfire. It’s good practice to make everything public, for example: you might use GitHub Enterprise or hosted Git so that everything is discoverable. You need a means to submit code and make changes to projects, so the solution seems to be to put it all on GitHub or whatever your source forge is.

“Except that’s not exactly how it works. It’s a big cheese problem,” she says, attributing the concept to fellow inner-source thinker Danese Cooper.

Let’s say you have two teams, A and B. Team B wants to reuse what Team A created. They find some issue, start fixing it and then submit a pull request. Team A says, “Hey, this pull request doesn’t follow our coding guidelines… And we don’t need this pull request, it’s not in our prioritization chain.” Suddenly everything has to be rewritten. Then Teams A and B, as happens in large companies, escalate to management.

“So suddenly you’re back to where you were before, maybe even the inverse situation where everyone thinks that inner source is bullshit,” she adds.

Uncover the assumptions

Behind the code, there’s a hidden framework that Drost-Fromm says can also create roadblocks and misunderstandings. A couple of important ones include different working styles and undocumented best practices within teams. “Even if you have rolled out a common coding standard across your company, there are different structures and those structures are often not documented,” she explains. “New team members enter a mentorship phase, but there’s no mentorship phase for someone from another team contributing just a single patch.” There can be a fear of contributing, she says, because suddenly you’re no longer in your “safe team” where everyone knows your quirks.

Reach out and ping someone

To short-circuit misunderstandings and shorten time-to-merge, try a ping. “We have found that pull requests take ages; pull request reviews, however, are faster after a direct ping,” she says, adding that for those working in open source this sounds like familiar advice: “You want to publish your work very early and make transparent what you want to work on, to get early feedback.”

Mi casa es su casa

There is a way to deal with the “I-don’t-know-how-these-people-are-working” problem, she says: create some house rules in the form of a public document. “I have to accept the rules of the other person if I’m visiting you at home. You tell me how to behave. If I’m visiting, you’ll tell me what kind of problems to fix, what to expect. Like if you stay in my flat for a while, I will tell you not to turn on the dishwasher and the washing machine at the same time or the fuse will blow,” she says.

It can be a bit more complicated than that, because the community guidelines could potentially cover coding conventions, testing conventions, branching conventions, commit message conventions, steps for creating good pull requests and so on. “If you write all of that down, will anyone ever read it? Probably not,” she admits. That’s why the inner source guidelines were written whenever something went really wrong, “when something fell on the floor,” she says.

Catch the whole 37-minute talk on YouTube.

Photo // CC BY NC

The post Making open source work from the inside out at your company appeared first on Superuser.

by Superuser at August 30, 2018 02:03 PM


Community Blog Round Up: 30 August

There have only been four articles in the past month? YES! But brace yourself. Release is HERE! Today is the official release day of OpenStack’s latest version, Rocky. And, sure, while we only have four articles for today’s blogroll, we’re about to get a million more posts as everyone installs, administers, uses, reads, inhales, and embraces the latest version of OpenStack. Please enjoy John’s personal system for running TripleO Quickstart at home as well as how to update ceph-ansible in a containerized undercloud, inhale Gonéri’s introduction to distributed CI and InfraRed, a tool to deploy and test OpenStack, and experience Jiří’s instructions to upgrade ceph and OpenShift Origin with TripleO.

Photo by Anderson Aguirre on Unsplash

PC for tripleo quickstart by John

I built a machine for running TripleO Quickstart at home.


Distributed-CI and InfraRed by Gonéri Le Bouder

Red Hat's OpenStack QE team maintains a tool to deploy and test OpenStack. This tool can deploy different types of topologies and is very modular; you can extend it to cover new use cases. The tool is called InfraRed; it is free software and is available on GitHub.


Updating ceph-ansible in a containerized undercloud by John

In Rocky the TripleO undercloud will run containers. If you're using TripleO to deploy Ceph in Rocky, this means that ceph-ansible shouldn't be installed on your undercloud server directly, because your undercloud server is a container host. Instead, ceph-ansible should be installed in the mistral-executor container because, as per config-download, that is the container which runs Ansible to configure the overcloud.


Upgrading Ceph and OKD (OpenShift Origin) with TripleO by Jiří Stránský

In OpenStack’s Rocky release, TripleO is transitioning towards a method of deployment we call config-download. Basically, instead of using Heat to deploy the overcloud end-to-end, we’ll be using Heat only to manage the hardware resources and Ansible tasks for individual composable services. Execution of software configuration management (which is Ansible at the top level) will no longer go through Heat; it will be done directly. If you want to know the details, I recommend watching James Slagle’s TripleO Deep Dive about config-download.


by Rain Leander at August 30, 2018 11:00 AM


The Network is the Computer. Part 3 – Cloud Computing is Born

Aptiira: The Network is the Computer. Part 3 – Cloud Computing is born

We saw at the end of the last post that the concept of cloud computing emerged in the late 90’s. However, to bring the Cloud into viable existence still required technological step-changes. That’s what we’ll look at in this post. 

We start at the turn of the 21st century, in the wreckage of the dot-com bust. Even though most of the speculative capital had been lost, large investments had been made in building a high-speed network backbone and server infrastructure, which was largely intact. 

And there were still many large and viable internet-based businesses with growing markets.  After a relatively short contraction, innovation brought new business ideas to market every day. 

These organisations were building super-dense computing platforms and were intimately familiar with internet connectivity, but building out infrastructure in this way propagated the operational problems we outlined last post. 

Two capabilities emerged to help resolve these problems: the convergence of server virtualisation and hyperconnectivity, and the creation of the Representational State Transfer (REST) architecture for Application Programming Interfaces (APIs). 

The concept of server virtualisation is one physical machine supporting multiple virtualised machines. But if you abstract the concept to its essence, virtualisation can be described as: 

An abstraction layer that turns all resources a computer needs into resource queues … Storage, memory, networking, and CPUs are all presented to compute via managed queues

Patrick Hubbard

With the availability of ubiquitous and high-speed networking, we can easily conceive of architectures where these resource queue endpoints are almost anywhere. They certainly can exist on other computing nodes in a dense platform. But with low network latency these resource endpoints could also be placed at remote locations. 

From this perspective, Cloud Computing is just server virtualisation writ large by managing a broader set of dense and modular infrastructure components. 

To produce a viable Cloud, it turns out we need one more thing: a modular and decoupled software architecture. (We’ll cover software in future posts, but it’s important to mention here in the context of enabling Cloud computing). 

In 2000, the concept of an API was fragmented and challenging. Alongside many proprietary interfaces, the two “standards” were SOAP and CORBA, both notoriously difficult to implement and maintain. 

In his doctoral dissertation, Roy Fielding proposed the REST architecture, which promoted easy-to-use APIs based on HTTP. This approach was rapidly adopted, and the number of public REST APIs exploded. 

Not only did the introduction of REST promote simplicity of interconnection, it also increased the focus on decoupling applications from infrastructure. 

The ingredients for Cloud Computing are in place, and along comes Amazon, Inc. 

In around 2000, Amazon realises that it has developed core skills in delivering infrastructure to service its internal needs, but that infrastructure delivery is still a major bottleneck for itself and its customers. 

Amazon realises the need to deliver infrastructure services much faster and more efficiently at global web scale. It also realises that utilising this infrastructure to deliver capabilities to partners and customers means decoupling its code from infrastructure much more cleanly, with well-defined interfaces and access APIs. Amazon is a very early adopter of REST. 

Multiple initiatives emerge at Amazon that finally produce the initial offerings of Elastic Compute Cloud and Simple Storage Service. Developers flock to the platform, and AWS fully launches in 2006. 

From Amazon, the public cloud is born, enabling companies to build infrastructure services quickly and scale globally for a fraction of what it cost during the dot-com boom. 

Amazon takes Sun’s vision to a global level: to leverage the internet to integrate web-scale applications. 

If you believe developers will build applications from scratch using web services as primitive building blocks, then the operating system becomes the Internet

Andy Jassy, AWS Lead and SVP, and one of the original founders of AWS

The post The Network is the Computer. Part 3 – Cloud Computing is Born appeared first on Aptira.

by Adam Russell at August 30, 2018 05:01 AM

Michael Still

What’s missing from the ONAP community — an open design process


I’ve been thinking a fair bit about ONAP and its future releases recently. This is in the context of trying to implement a system for a client which is based on ONAP. It’s really hard though, because it’s hard to determine how the various components of ONAP are intended to work, or interoperate.

It took me a while, but I’ve realised what’s missing here…

OpenStack has an open design process. If you want to add a new feature to Nova, for example, the first step is to write down what the feature is intended to do, how it integrates with the rest of Nova, and how people might use it. The target audience for that document is both the Nova development team and the people who operate OpenStack deployments.

ONAP has no equivalent that I can find. So, for example, they say that in Casablanca they are going to implement an “AAI Enricher” to ease lookup of data from external systems in their inventory database, but I can’t find anywhere an explanation of how the integration between arbitrary external systems and ONAP AAI will work.

I think ONAP would really benefit from a good hard look at their design processes and how approachable they are for people outside their development teams. The current use case proposal process (videos, conference talks, and powerpoint presentations) just isn’t great for people who are trying to figure out how to deploy their software.


The post What’s missing from the ONAP community — an open design process appeared first on Made by Mikal.

by mikal at August 30, 2018 01:43 AM

August 29, 2018

John Likes OpenStack

PC for tripleo quickstart

I built a machine for running TripleO Quickstart at home.

My complete part list is on pcpart picker, with the exception of the extra Noctua NM-AM4 Mounting Kit and the video card (which I only used to install the OS).

I also have photos from when I built it.

My nodes.yaml gives me:

  • Three 9GB 2CPU controller nodes
  • Three 6GB 2CPU ceph storage nodes
  • One 3GB 2CPU compute node (that's enough to spawn one nested VM for a quick test)
  • One 13GB 8CPU undercloud node
That leaves less than 2GB of RAM for the hypervisor, and all 16 vCPUs (8 cores * 2 threads) are allocated to VMs, so I'm pushing it a little.
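The allocations in nodes.yaml can be tallied with a quick sketch (the arithmetic only; the actual hypervisor headroom depends on the host's total RAM):

```python
# Tally the RAM reserved for the quickstart VMs listed above.
nodes = [
    (3, 9),   # three controller nodes, 9 GB each
    (3, 6),   # three ceph storage nodes, 6 GB each
    (1, 3),   # one compute node, 3 GB
    (1, 13),  # one undercloud node, 13 GB
]
vm_ram_gb = sum(count * gb for count, gb in nodes)
print(vm_ram_gb)  # → 61
```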

When using this system with the same nodes.yaml, my run times for Rocky RC1 are as follows:

  • undercloud install of rocky: 43m44.118s
  • overcloud install of rocky: 49m51.369s

by John at August 29, 2018 09:09 PM

OpenStack Superuser

How to manage OpenStack security and networks with Ansible 2

Learn how to automate OpenStack security and networks with Ansible 2 in this tutorial from Aditya Patawari, a systems engineer and DevOps practitioner, and Vikas Aggarwal, an infrastructure engineer.

OpenStack for cloud

One of the biggest advantages of using a third-party cloud provider is that you can get started in minutes; provisioning the actual hardware is no longer your problem. However, there is a major drawback: the cloud is not customized to your needs. It may lack the flexibility you would gain by controlling the underlying hardware and network. OpenStack can help you address this problem.

OpenStack is software that can help you build a system similar to popular cloud providers, such as Amazon Web Services or Google Cloud Platform. OpenStack provides an API and a dashboard to manage the resources that it controls. Basic operations, such as creating and managing virtual machines, block storage, object storage, identity management, and so on, are supported out of the box.

With OpenStack, you can control the underlying hardware and network, which comes with its own pros and cons.
Keep the following points in mind while managing the OpenStack setup:

  • You need to maintain the underlying hardware. Consequently, you have to do capacity planning, since your ability to scale resources is limited by the underlying hardware. However, since you control the hardware, you can get customized resources in virtual machines.
  • You can use custom network solutions. You can use economical equipment or high-end devices, depending upon the actual need. This can help you get the features that you want and may end up saving money.
  • You need to regularly update the hypervisor and other OpenStack dependencies, especially in the case of a security-related issue. This can be a time-consuming task because you might need to move the running virtual machines around to ensure that users do not face a lot of trouble.
  • OpenStack can be helpful in cases where strict compliance requirements might not allow you to use a third-party cloud provider. A typical example of this is that certain countries require financial and medical data to stay inside their jurisdiction. If any third-party cloud is not able to fulfill this condition, then OpenStack is a great choice.

This article will focus on the security and network solutions aspect of OpenStack and help you manage them. It’s also worth noting that although OpenStack can be hosted on premises, several cloud providers provide OpenStack as a service. Sometimes these cloud providers may choose to turn off certain features or provide add-on features. Sometimes, even while configuring OpenStack on a self-hosted environment, you may choose to toggle certain features or configure a few things differently. Therefore, inconsistencies may occur. All the code examples in this article are tested on a self-hosted OpenStack released in August 2017, named Pike. The underlying operating system was CentOS 7.4.

Managing OpenStack security groups

Security groups are the firewalls that can be used to allow or disallow the flow of traffic. They can be applied to virtual machines. Security groups and virtual machines have a many-to-many relationship. A single security group can be applied to multiple virtual machines and a single virtual machine can have multiple security groups.

  • To get started, create a security group as follows:
 - name: create a security group for web servers
   os_security_group:
     name: web-sg
     state: present
     description: security group for web servers

The name parameter has to be unique. The description parameter is optional, but it is recommended to use it to state the purpose of the security group. The preceding task will create a security group for you, but there are no rules attached to it. A firewall without any rules is of little use. So, go ahead and add a rule to allow access to port 80 as follows:

 - name: allow port 80 for http
   os_security_group_rule:
     security_group: web-sg
     protocol: tcp
     port_range_min: 80
     port_range_max: 80
     remote_ip_prefix:
  • You’ll also need SSH access to this server, so you should allow port 22 as well:
 - name: allow port 22 for SSH
   os_security_group_rule:
     security_group: web-sg
     protocol: tcp
     port_range_min: 22
     port_range_max: 22
     remote_ip_prefix:

You need to specify the name of the security group for this module; the rule that you create will be associated with that group. You have to supply the protocol and the port range information; if you want to whitelist only one port, it is both the upper and lower bound of the range. Lastly, you need to specify the allowed addresses in the form of a CIDR. The address signifies that port 80 is open to everyone. This task will add an ingress-type rule and allow traffic on port 80 to reach the instance. Similarly, you have to add a rule to allow traffic on port 22 as well.

In the next section, you’ll learn how to manage network resources

Managing OpenStack network resources

A network is a basic building block of the infrastructure. Most cloud providers will supply a sample or default network. While setting up a self-hosted OpenStack instance, a single network is typically created automatically. However, if the network is not created, or if you want to create another network for the purpose of isolation or compliance, you can do so using the os_network module.

  • To get started, create an isolated network and name it private, as follows:
 - name: creating a private network
   os_network:
     state: present
     name: private
  • Now, a logical network with no subnets has been created. A network with no subnets is of little use, so the next step would be to create a subnet:
 - name: creating a private subnet
   os_subnet:
     state: present
     network_name: private
     name: app
     cidr:           # illustrative value; use your own addressing plan
     dns_nameservers:
       -                  # Google DNS, used here as an example
     host_routes:
       - destination:  # illustrative values only; obtain real
         nexthop:          # routes from your IT department

The preceding task will create a subnet named app in the network called private. You’ll also need to supply a CIDR for the subnet. Google DNS has been used as a nameserver in the example here, but this information should be obtained from the IT department of the organization. Similarly, the host routes shown are only examples; this information should be obtained from the IT department as well.

After you’ve successfully completed this step, your network is ready to use.

If you found this article helpful, explore Ansible 2 Cloud Automation Cookbook to be able to deploy an application to demonstrate various usage patterns and utilities of resources. This book gives a recipe-based approach to install and configure cloud resources using Ansible.

The post How to manage OpenStack security and networks with Ansible 2 appeared first on Superuser.

by Superuser at August 29, 2018 02:21 PM

August 28, 2018

Chris Dent

TC Report 18-35

I didn't do a TC report last week because we spent much of the time discussing the issues surrounding extracting the placement service from nova. I've been working on making that happen—because that's been the intent from the start—for a couple of years now, so tend to be fairly central to those discussions. It felt inappropriate to use these reports as a bully pulpit and in any case I was exhausted, so took a break.

However, the topic was still a factor in the recent week's discussion so I guess we're stuck with it: leaving it out would be odd given that it has occupied such a lot of TC attention and these reports are expressly my subjective opinion.

Placement is in the last section, in case you're of a mind to skip it.

The Tech Vision

There's been a bit of discussion on the Draft Technical Vision. First, generally, what it is trying to do and how dependencies fit in. This eventually flowed into questioning how much voice and discretion individual contributors have with regard to OpenStack overall, as opposed to merely doing what their employers say. There were widely divergent perspectives on this.

The truth is probably that everyone has a different experience on a big spectrum.

TC Elections and Campaigning

As announced, TC Election Season approaches. We had some discussion Friday about making sure that the right skills were present in candidates and that any events we held with regard to campaigning, perhaps at the PTG, were not actively exclusionary.

That Placement Thing

The links below are for historical reference, for people who want to catch up. The current state of affairs and immediate plans are being worked out in this thread, based on a medium term plan of doing the technical work to create a separate and working repo and then get that repo working happily with nova, devstack, grenade and tempest. Technical consensus is being reached, sometimes slowly, sometimes quickly, but discussion is working and several people are participating. The questions about governance are not yet firmly resolved, but the hope is that before the end of the Stein cycle placement ought to be its own official project.

In case you're curious about why the TC is involved in this topic at all, there are two reasons: a) Eric asked for advice, b) it is relevant to the TC's role as ultimate appeals board.

The torrid story goes something like this: While working on a PTG planning etherpad for extracting placement from nova, there were some questions about the eventual disposition of placement: a project within or beside nova. That resulted in a huge email thread.

In the midst of that thread, the nova scheduler meeting raised the question of how do we decide? That got moved to the TC IRC channel and mutated from "how do we decide" to many different topics and perspectives. Thus ensued several hours of argument. Followed by a "wow" reprise on Tuesday morning.

By Thursday a potential compromise was mooted in the nova meeting. However, in the intervening period, several people in the TC, notably Doug and Thierry had expressed a desire to address some of the underlying issues (those that caused so much argument Monday and elsewhere) in a more concrete fashion. I wanted to be sure that they had a chance to provide their input before the compromise deal was sealed. The conversation was moved back to the TC IRC channel asking for input. This led to yet more tension. It's not yet clear if that is resolved.

All this must seem pretty ridiculous to observers. As is so often the case in community interactions, the tensions that are playing out are not directly tied to any specific technical issues (which, thankfully, are resolving in the short term for placement) but are from the accumulation and aggregation over time of difficulties and frustrations associated with unresolved problems in the exercise and distribution of control and trust, unfinished goals, and unfulfilled promises. When changes like the placement extraction come up, they can act as proxies for deep and lingering problems that we have not developed good systems for resolving.

What we do instead of investigating the deep issues is address the immediate symptomatic problems in a technical way and try to move on. People who are not satisfied with this have little recourse. They can either move elsewhere or attempt to cope. We've lost plenty of good people as a result. Some of those that choose to stick around get tetchy.

If you have thoughts and feelings about these (or any other) deep and systemic issues in OpenStack, anyone in the TC should be happy to speak with you about them. For best results you should be willing to speak about your concerns publicly. If for some reason you are not comfortable doing so, that is itself an issue that needs to be addressed, but starting out privately is welcomed.

The big goal here is for OpenStack to be good, as a technical production and as a community.

by Chris Dent at August 28, 2018 06:04 PM

OpenStack Superuser

How AI and IoT drive open infrastructure in Taiwan

As artificial intelligence becomes more common in industrial applications and everyday life, it could provide momentum to the tech industry in Taiwan just as PCs did in the past.

That’s the prediction Taiwan’s Minister of Science and Technology Chen Liang-gee made when signing the contract to build a national artificial intelligence cloud-computing platform that will pack seven petaflops of computing power with a storage capacity of 50 petabytes.

The National Center for High Performance Computing (NCHC) is at the center of these efforts. At OpenInfra Days Taiwan, the team shared their progress building the HPC datacenter for AI workloads. It’s one of the most exciting projects happening for OpenStack and open infrastructure in Taiwan and across Asia.

Organizing committee chairman Brian Chen agreed that AI and IoT are the big trends in Taiwan. The vast majority of the technology industry in Taiwan consists of hardware companies, and the new compute-intensive workloads that require GPUs and specialized hardware are a strong fit for the local industry.

The NCHC is a government-funded project that serves more than 150 universities across Taiwan with one of the largest AI environments, running 2,000 nodes with the latest Nvidia GPUs. The organization’s goals are to accelerate AI technology development and cultivate tech talent in Taiwan.

Dr. August Chao from NCHC speaking at OpenInfra Days Taiwan.

The NCHC project is a unique collaboration among vendors in the open infrastructure ecosystem, including Gemini Open Cloud, SUSE and QCT. The data center is being constructed this year; the first workloads are expected to come online early in 2019.

Chen said the OpenInfra Days branding helped them bring new organizations and participants to the event who are working on a wider range of open infrastructure technologies. One example was Xu Wang from who gave a keynote on Kata Containers, a new secure containers project that had its first release in May.

Hung-Ming Chen, professor of computer science and information engineering at the National Taichung University of Science and Technology brought 12 students on a high speed train to attend the event. They operate a 100-square meter data center on campus powered by OpenStack. He’s brought a group of students to the event four years in a row and many of his students have now started working in the OpenStack ecosystem.

The future looks bright: Students at OpenInfra Days Taiwan.


Photo // CC BY NC

The post How AI and IoT drive open infrastructure in Taiwan appeared first on Superuser.

by Lauren Sell at August 28, 2018 01:05 AM

August 27, 2018

OpenStack Superuser

Expanding the horizons of the Community Contributor Awards

So many folks work tirelessly behind the scenes to make the community great, whether they’re fixing bugs, contributing code, helping newbies on IRC or just making everyone laugh at the right moment.

Now these quirky honors are open to anyone involved in Airship, Kata Containers, OpenStack, StarlingX and Zuul.

You can help them get recognized (with a very shiny medal!) by nominating them for the next Contributor Awards given out at the upcoming Berlin Summit.

Winners in previous editions included the “Duct Tape” award and the “Don’t Stop Believin’ Cup,” shining a light on the extremely valuable work that makes the larger community excel.

There are so many different areas worthy of celebration, but there are a few kinds of community members who deserve a little extra love:

  • They are undervalued
  • They don’t know they are appreciated
  • They bind the community together
  • They keep it fun
  • They challenge the norm
  • Other: (write-in)

As in previous editions, rather than starting with a defined set of awards the community is asked to submit names.

The OSF community team then has a little bit of fun on the back end, massaging the award titles to devise something worthy of their underappreciated efforts.

Get your nominations in today!

Photo // CC BY NC

The post Expanding the horizons of the Community Contributor Awards appeared first on Superuser.

by Superuser at August 27, 2018 02:05 PM

August 24, 2018

OpenStack Superuser

How to run a Kubernetes cluster in OpenStack

Learn how to run a Kubernetes cluster to deploy a simple application server running WordPress in this tutorial by Omar Khedher.

Kubernetes in OpenStack

Kubernetes is a container deployment and management platform that aims to strengthen the Linux container orchestration ecosystem. Its growth builds on a long journey of experience, led by Google for several years before it was offered to the open source community, where it has become one of the fastest-growing container-based application platforms. Kubernetes is packed with compelling features, including scaling, automated deployment, and resource management across multiple clusters of hosts.

Magnum makes Kubernetes available in the OpenStack ecosystem. As with Swarm, users can use the Magnum API to manage and operate Kubernetes clusters, objects, and services. Here’s a summary of the major components in the Kubernetes architecture:

The Kubernetes architecture is modular and exposes several services that can be spread across multiple nodes. Unlike Swarm, Kubernetes uses different terminologies as follows:

  • Pods: A collection of containers forming the application unit and sharing the networking configuration, namespaces, and storage
  • Service: A Kubernetes abstraction layer exposing a set of pods as a service, typically, through a load balancer

From an architectural perspective, Kubernetes essentially defines the following components:

  • Master node: This controls and orchestrates the Kubernetes cluster. A master node can run the following services:
  1. API server: This provides API endpoints to process RESTful API calls to control and manage the cluster.
  2. Controller manager: This embeds several management services, including:
     - Replication controller: This manages pods in the cluster by creating and removing failed pods.
     - Endpoint controller: This joins pods by providing cluster endpoints.
     - Node controller: This manages node initialization and discovery information within the cloud provider.
     - Service controller: This maintains service backends in Kubernetes running behind load balancers, configuring the load balancers based on service state updates.
  3. Scheduler: This decides which node each pod should run on. Based on node resource capacity, the scheduler decides whether the pods backing a service land on the same node or are spread across different ones.
  4. Key-value store: This stores REST API objects such as node and pod states, scheduled jobs, service deployment information, and namespaces. Kubernetes uses etcd as its main key-value store to share configuration information across the cluster.
  • Worker node: This manages the Kubernetes pod and container runtime environment. Each worker node runs the following components:
  1. Kubelet: This is the primary node agent that takes care of the containers running in its associated pods. The kubelet process periodically reports the health status of pods and nodes to the master node.
  2. Docker: This is the default container runtime engine used by Kubernetes.
  3. Kube-proxy: This is a network proxy that forwards requests to the right container. Kube-proxy routes traffic across pods within the same service.
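The replication controller's job of reconciling desired state against observed state can be sketched in a few lines of Python (a conceptual illustration only; the real controller watches the etcd-backed API server, and the pod names here are made up):

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions a replication controller would take to
    converge the observed pods toward the desired replica count."""
    healthy = [p for p in running_pods if p["status"] == "Running"]
    failed = [p for p in running_pods if p["status"] != "Running"]

    # Failed pods are removed, then the healthy count is topped up
    # (or trimmed) to match the desired replica count.
    actions = [("delete", p["name"]) for p in failed]
    deficit = desired_replicas - len(healthy)
    if deficit > 0:
        actions += [("create", f"pod-{i}") for i in range(deficit)]
    elif deficit < 0:
        actions += [("delete", p["name"]) for p in healthy[deficit:]]
    return actions

pods = [
    {"name": "web-1", "status": "Running"},
    {"name": "web-2", "status": "CrashLoopBackOff"},
]
print(reconcile(3, pods))
# → [('delete', 'web-2'), ('create', 'pod-0'), ('create', 'pod-1')]
```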

Example – Application server

With the following example, you can set up a new Magnum bay running the Kubernetes cluster as COE. The Kubernetes cluster will deploy a simple application server running WordPress and listening on port 8080 as follows:

  • Create a new Magnum cluster template to deploy a Kubernetes cluster:
# magnum cluster-template-create --name coe-k8s-template \
                                  --image fedora-latest \
                                  --keypair-id pp_key \
                                  --external-network-id pub-net \
                                  --dns-nameserver <dns-server-ip> \
                                  --flavor-id m1.small \
                                  --docker-volume-size 4 \
                                  --network-driver flannel \
                                  --coe kubernetes

Note that the previous command line assumes the existence of an image, fedora-latest, in the Glance image repository, a key pair named pp_key, and an external Neutron network named pub-net. Ensure that you adjust the parameters with the correct command-line arguments.

  • Initiate a Kubernetes cluster with one master and two worker nodes by executing the following magnum command:
# magnum cluster-create --name kubernetes-cluster \
--cluster-template coe-k8s-template \
--master-count 1 \
--node-count 2
| Property            | Value 
| status              | CREATE_IN_PROGRESS 
| cluster_template_id | 258e44e3-fe33-8892-be3448cfe3679822 
| uuid                | 3c3345d2-983d-ff3e-0109366e342021f4 
| stack_id            | dd3e3020-9833-477c-cc3e012ede5f5f0a 
| status_reason       | - 
| created_at          | 2017-12-11T16:20:08+01:00 
| name                | kubernetes-cluster 
| updated_at          | - 
| api_address         | - 
| coe_version         | - 
| master_addresses    | [] 
| create_timeout      | 60 
| node_addresses      | [] 
| master_count        | 1 
| container_version   | - 
| node_count          | 2 
  • Verify the creation of the new Kubernetes cluster. You can check the progress of the cluster deployment from the Horizon dashboard by pointing to the stack section, which visualizes the events of each provisioned resource per stack:
# magnum cluster-show kubernetes-cluster
| Property            | Value 
| status              | CREATE_COMPLETE 
| cluster_template_id | 258e44e3-fe33-8892-be3448cfe3679822 
| uuid                | 3c3345d2-983d-ff3e-0109366e342021f4 
| stack_id            | dd3e3020-9833-477c-cc3e012ede5f5f0a 
| status_reason       | Stack CREATE completed successfully 
| created_at          | 2017-12-11T16:20:08+01:00 
| name                | kubernetes-cluster 
| updated_at          | 2017-12-11T16:23:22+01:00 
| discovery_url       | 
| api_address         | tcp:// 
| coe_version         | 1.0.0 
| master_addresses    | [''] 
| create_timeout      | 60 
| node_addresses      | ['', ''] 
| master_count        | 1 
| container_version   | 1.9.1 
| node_count          | 2 
  • Generate the signed client certificates to log into the deployed Kubernetes cluster. The following command line will place the necessary TLS files and key in the cluster directory, kubernetes_cluster_dir:
# magnum cluster-config kubernetes-cluster \
--dir kubernetes_cluster_dir
{'tls': True, 'cfg_dir': 'kubernetes_cluster_dir', 'docker_host': u'tcp://'}
  • The generated certificates are in the kubernetes_cluster_dir directory:
    # ls kubernetes_cluster_dir
    ca.pem   cert.pem   key.pem
  • Access the master node through SSH and check the new Kubernetes cluster info:
# ssh fedora@
fedora@...rage8z2 ~]$ kubectl cluster-info 
Kubernetes master is running at
KubeUI is running at http://
  • To get your WordPress up and running, you’ll need to deploy your first pod in the Kubernetes worker nodes. For this purpose, you can use a Kubernetes package installer called Helm. The brilliant concept behind Helm is to provide an easy way to install, release, and version packages. Make sure to install Helm in the master node by downloading the latest version as follows:
$ wget
  • Unzip the Helm archive and move the executable file to the bin directory:
$ tar -zxvf helm-v2.7.2-linux-amd64.tar.gz
$ mv linux-amd64/helm /usr/local/bin
  • Initialize the Helm environment and install the server component. The server portion of Helm is called Tiller; it runs within the Kubernetes cluster and manages the releases:
$ helm init
Creating /root/.helm 
Creating /root/.helm/repository 
Creating /root/.helm/repository/cache 
Creating /root/.helm/repository/local 
Creating /root/.helm/plugins 
Creating /root/.helm/starters 
Creating /root/.helm/repository/repositories.yaml 
Writing to /root/.helm/repository/cache/stable-index.yaml
$HELM_HOME has been configured at /root/.helm.
Tiller (the helm server side component) has been installed into your Kubernetes Cluster.
Happy Helming!
  • Once installed, you can enjoy installing applications and managing packages by firing simple command lines. You can start by updating the chart repository:
$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
Writing to /root/.helm/repository/cache/stable-index.yaml
...Successfully got an update from the "stable" chart repository
Update Complete. Happy Helming!
  • Installing WordPress is straightforward with Helm command line tools:
$ helm install stable/wordpress 

NAME: wodd-breest
LAST DEPLOYED: Fri Dec 15 16:15:55 2017
NAMESPACE: default

The helm install command line output shows the successful deployment of the default release. The items listed during the deployment refer to the configurable resources of the WordPress chart. The application install is configurable through template files defined by the chart, and the default WordPress release can be customized by overriding the default parameters on the helm install command line.

The default templates for your deployed WordPress application are defined in the chart itself. Specifying a different set of values for the WordPress install is just a matter of issuing the same command line, specifying the name of the new WordPress release and the updated values as follows:

$ helm install --name private-wordpress \
-f values.yaml \

NAME: private-wordpress
LAST DEPLOYED: Fri Dec 15 17:17:33 2017
NAMESPACE: default

You can find the stable charts ready to deploy with the Kubernetes Helm package manager in the official charts repository.
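As a concrete illustration, a small values.yaml override for the stable/wordpress chart might look like the following. The key names here are assumptions based on common chart conventions; check them against the values.yaml shipped with your chart version before relying on them:

```yaml
## Hypothetical override file for:
##   helm install --name private-wordpress -f values.yaml stable/wordpress
## Key names must match the chart's own values.yaml for your chart version.
wordpressUsername: admin            # admin account for the WordPress dashboard
wordpressEmail: admin@example.com   # placeholder address, change it
wordpressBlogName: Private Blog     # blog title shown after install
mariadb:
  mariadbRootPassword: changeme     # root password for the bundled MariaDB
```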

  • Now that you’ve deployed the first release of WordPress in Kubernetes in the blink of an eye, you can verify the Kubernetes pod status:
# kubectl get pods
NAME                      READY STATUS RESTARTS  AGE
my-priv-wordpress-…89501  1/1 Running 0  6m
my-priv-mariadb-…..25153  1/1 Running 0  6m
  • To provide access to the WordPress dashboard, you’ll need to expose it locally by forwarding default port 8080 in the master node to port 80 of the instance running the pod:
$ kubectl port-forward my-priv-wordpress-42345-ef89501 8080:80
  • WordPress can be made accessible by pointing to the Kubernetes IP address and port 8080:

You have now successfully run a Kubernetes cluster to deploy a simple application server running WordPress.
If you enjoyed reading this article, explore OpenStack with Omar Khedher’s
Extending OpenStack. This book will guide you through new features of the latest OpenStack releases and how to bring them into production in an agile way.

The post How to run a Kubernetes cluster in OpenStack appeared first on Superuser.

by Superuser at August 24, 2018 02:06 PM

Michael Still

Learning from the mistakes that even big projects make


The following is a blog post version of a talk presented at pyconau 2018. Slides for the presentation can be found here (as Microsoft powerpoint, or as PDF), and a video of the talk (thanks NextDayVideo!) is below:


OpenStack is an orchestration system for setting up virtual machines and other associated virtual resources, such as networks and storage, on clusters of computers. At a high level, OpenStack is just configuring existing facilities of the host operating system — there isn’t really a lot of difference between OpenStack and a room full of system admins frantically resolving tickets requesting that virtual machines be set up. The only real difference is scale and predictability.

To do its job, OpenStack needs to be able to manipulate parts of the operating system which are normally reserved for administrative users. This talk is the story of how OpenStack has done that thing over time, what we learnt along the way, and what I’d do differently if I had my time again. Lots of systems need to do these things, so even if you never use OpenStack hopefully there are things to be learnt here.

That said, someone I respect suggested last weekend that good conference talks are actionable. A talk full of OpenStack war stories isn’t actionable, so I’ve spent the last week re-writing this talk to hopefully be more of a call to action than just an interesting story. I apologise for any mismatch between the original proposal and what I present here that might therefore exist.

Back to the task in hand though — providing control of virtual resources to untrusted users. OpenStack has gone through several iterations of how it thinks this should be done, so perhaps it’s illustrative to start by asking how other similar systems achieve this. There are lots of systems that have a requirement to configure privileged parts of the host operating system. The most obvious example I can think of is Docker. How does Docker do this? Well… it’s actually not all that pretty. Docker presents its API over a unix domain socket by default in order to limit control to local users (you can of course configure this). So to provide access to Docker, you add users to the docker group, which owns that domain socket. The Docker documentation warns that “the docker group grants privileges equivalent to the root user“. So that went well.

Docker is really an example of the simplest way of solving this problem — by not solving it at all. That works well enough for systems where you can tightly control the users who need access to those privileged operations — in Docker’s case by making them have an account in the right group on the system and logging in locally. However, OpenStack’s whole point is to let untrusted remote users create virtual machines, so we’re going to have to do better than that.

The next level up is to do something with sudo. The way we all use sudo day to day, you allow users in the sudoers group to become root and execute any old command, with a configuration entry that probably looks a little like this:

# Allow members of group sudo to execute any command
%sudo   ALL=(ALL:ALL) ALL

Now that config entry is basically line noise, but it says “allow members of the group called sudo, on any host, to run any command as root”. You can of course embed this into your python code using or similar. On the security front, it’s possible to do a little bit better than a “nova can execute anything” entry. For example:

%sudo ALL=/bin/ls

This says that the sudo group on all hosts can execute /bin/ls with any arguments. OpenStack never actually specified the complete list of commands it executed. That was left as a job for packagers, which of course meant it wasn’t done well.


So there’s our first actionable thing — if you assume that someone else (packagers, the ops team, whoever) is going to analyse your code well enough to solve the security problem that you can’t be bothered solving, then you have a problem. Now, we weren’t necessarily deliberately punting here. It’s obvious to me how to grep the code for commands run as root to add them to a sudo configuration file, but that’s unfair: I wrote some of this code, and I am much closer to it than a system admin who just wants to get the thing deployed.


We can of course do better than just raw sudo. Next we tried a thing called rootwrap, which was mostly an attempt to provide a better boundary around exactly what commands you can expect an OpenStack binary to execute. So for example, maybe it’s ok for me to read the contents of a configuration file specific to a virtual machine I am managing, but I probably shouldn’t be able to read /etc/shadow or whatever. We can do that by doing something like this:

sudo nova-rootwrap /etc/nova/rootwrap.conf /bin/ls /etc

Where nova-rootwrap is a program which takes a configuration file and a command line to run. The contents of the configuration file are used to determine if the command line should be executed.

Now we can limit the sudo configuration file to only needing to be able to execute nova-rootwrap. I thought about putting in a whole bunch of slides about exactly how to configure rootwrap, but then I realised that this talk is only 25 minutes and you can totally google that stuff.
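Still, for reference, the rootwrap configuration boils down to simple INI filter files. A minimal sketch, with an illustrative file path and entry, following the oslo.rootwrap filter format:

```ini
# Illustrative filter file, e.g. /etc/nova/rootwrap.d/example.filters.
# Each entry maps a command name to a filter class, the allowed binary,
# and the user to run it as.
[Filters]
ls: CommandFilter, /bin/ls, root
```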


So instead, here’s my second actionable thing… Is there a trivial change you can make which will dramatically improve security? I don’t think anyone would claim that rootwrap is rocket science, but it improved things a lot — deployers didn’t need to grep out the command lines we executed any more, and we could do things like specify what paths we were allowed to do things in. Are there similarly trivial changes that you can make to improve your world?


But wait! Here’s my third actionable thing as well — what are the costs of your design? Some of these are obvious — for example, with this design executing something with escalated permissions causes us to pay to fork a process. In fact it’s worse with rootwrap, because we pay to fork, start a python interpreter to parse a configuration file, and then fork again for the actual binary we wanted in the first place. That cost adds up if you need to execute many small commands, for example when plugging in a new virtual network interface. At one point we measured this for network interfaces and the costs were in the tens of seconds per interface.
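You can get a feel for that overhead with a toy measurement (my own sketch, not OpenStack code): write a file once per child process, then write it directly from the current process, and compare.

```python
import subprocess
import sys
import tempfile
import time


def time_exec_write(path, n=20):
    # The raw-sudo/rootwrap style: pay a fork+exec (here, a whole Python
    # interpreter start-up) for every single write.
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run(
            [sys.executable, "-c",
             "import sys; open(sys.argv[1], 'w').write('0')", path],
            check=True)
    return time.perf_counter() - start


def time_direct_write(path, n=20):
    # The privsep-daemon style: the already-running process just writes.
    start = time.perf_counter()
    for _ in range(n):
        with open(path, "w") as f:
            f.write("0")
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print("per-child-process writes: %.3fs" % time_exec_write(tmp.name))
        print("direct writes:            %.3fs" % time_direct_write(tmp.name))
```

On a typical machine the child-process loop is orders of magnitude slower, which is exactly the cost being described here.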

There is another cost though which I think is actually more important. The only way we have with this mechanism to do something with escalated permissions is to execute it as a separate process. This is a horrible interface and forces us to do some really weird things. Let’s check out some examples…


Which of the following commands are reasonable?

shred -n3 -sSIZE PATH
touch PATH
rm -rf PATH
mkdir -p PATH

These are just some examples, there are many others. The first is probably the most reasonable. It doesn’t seem wise to me for us to implement our own data shredding code, so using a system command for that seems reasonable. The other examples are perhaps less reasonable — the rm one is particularly scary to me. But none of these are the best example…


How about this one?

utils.execute('tee',
              ('/sys/class/net/%s/bridge/multicast_snooping' %
               br_name),
              process_input='0',
              run_as_root=True,
              check_exit_code=[0, 1])

Some commentary first. This code existed in the middle of a method that does other things. It’s one of five command lines that method executes. What does it do?

It’s actually not too bad. Using root permissions, it writes a zero to the multicast_snooping sysctl for the network bridge being set up. It then checks the exit code and raises an exception if it’s not 0 or 1.

That said, it’s also horrid. In order to write a single byte to a sysctl as root, we are forced to fork, start a python process, read a configuration file, and then fork again. For an operation that in some situations might need to happen hundreds of times for OpenStack to restart on a node.


This is how we get to the third way that OpenStack does escalated permissions. If we could just write python code that ran as root, we could write this instead:

with open(('/sys/class/net/%s/bridge/multicast_snooping' %
           br_name), 'w') as f:
    f.write('0')
It’s not perfect, but it’s a lot cheaper to execute and we could put it in a method with a helpful name like “disable multicast snooping” for extra credit. Which brings us to…


Hire Angus Lees and make him angry. Angus noticed this problem well before the rest of us. We were all lounging around basking in our own general cleverness. What Angus proposed is that instead of all this forking and parsing and general mucking around, that we just start a separate process at startup with special permissions, and then send it commands to execute.

He could have done that with a relatively horrible API, for example just sending command lines down the pipe and getting their responses back to parse, but instead he implemented a system of python decorators which let us call a method which is marked up as saying “I want to run as root!”.
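The shape of that design is easy to sketch in plain Python. The following toy is entirely my own and deliberately unprivileged; it is not the oslo.privsep API, just the decorator pattern: calls to a marked function are shipped over a pipe to a single long-lived helper process.

```python
import multiprocessing
import os

# Use the fork start method (Linux) so the helper inherits the registry.
_mp = multiprocessing.get_context("fork")


class ToyPrivContext:
    """Toy stand-in for a privsep-style context (not the oslo.privsep API)."""

    def __init__(self):
        self._registry = {}
        self._conn = None

    def entrypoint(self, func):
        # Remember the real function and return a stub that forwards the
        # call to the helper process instead of running it locally.
        self._registry[func.__name__] = func

        def wrapper(*args):
            if self._conn is None:
                self._start()
            self._conn.send((func.__name__, args))
            return self._conn.recv()
        return wrapper

    def _serve(self, conn):
        # Helper-process loop: in a real system this process would have
        # been started with escalated privileges.
        while True:
            name, args = conn.recv()
            conn.send(self._registry[name](*args))

    def _start(self):
        self._conn, child = _mp.Pipe()
        _mp.Process(target=self._serve, args=(child,), daemon=True).start()


ctx = ToyPrivContext()


@ctx.entrypoint
def helper_pid():
    # Executes in the helper, so its pid differs from the caller's.
    return os.getpid()
```

Calling helper_pid() from the parent returns the helper’s pid, demonstrating that the body ran in the other process. oslo.privsep adds the parts that matter for security: capabilities, a privileged launch path, and strict serialization rules.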


So here’s the destination in our journey, how we actually do that thing in OpenStack now:

@nova.privsep.sys_admin_pctxt.entrypoint
def disable_multicast_snooping(bridge):
    path = ('/sys/class/net/%s/bridge/multicast_snooping' %
            bridge)
    if not os.path.exists(path):
        raise exception.FileNotFound(file_path=path)
    with open(path, 'w') as f:
        f.write('0')
The decorator before the method definition is a bit opaque, but basically says “run this thing as root”, and the rest is a method which can be called from anywhere within our code.

There are a few things you need to do to setup privsep, but I don’t have time in this talk to discuss the specifics. Effectively you need to arrange for the privsep helper to start with escalated permissions, and you need to move the code which will run with one of these decorators to a sub path of your source tree to stop other code from accidentally being escalated. privsep is also capable of running with more than one set of permissions — it will start a helper for each set. That’s what this decorator is doing, specifying what permissions we need for this method.


And here we land at my final actionable thing. Make it easy to do the right thing, and hard to do the wrong thing. Rusty Russell used to talk about this when he was going through a phase of trying to clean up kernel APIs — it’s important that your interfaces make it obvious how to use them correctly, and make it hard to use them incorrectly.

In the example used for this talk, having command lines executed as root meant that the prevalent example of how to do many things was a command line. So people started doing that even when they didn’t need escalated permissions — for example calling mkdir instead of using our helper function to recursively make a set of directories.

We’ve cleaned that up, but we’ve also made it much, much harder to just drop a command line into our code base to run as root, which will hopefully stop some of this problem recurring in the future. I don’t think OpenStack has reached perfection in this regard yet, but we continue to improve a little each day and that’s probably all we can hope for.


privsep can be used for non-OpenStack projects too. There’s really nothing OpenStack-specific about most of OpenStack’s underlying libraries, and there are probably things there which are useful to you. In fact the real problem is working out what is where, because there’s so much of it.

One final thing — privsep makes it possible to specify the exact permissions needed to do something. For example, setting up a network bridge probably doesn’t need “read everything on the filesystem” permissions. We originally did that, but stepped back to using a single escalated permission set that maps to what you get with sudo, because working out what permissions a single operation needed was actually quite hard. We were trying to lower the barrier for entry for doing things the right way. I don’t think I really have time to dig into that much more here, but I’d be happy to chat about it sometime this weekend or on the Internet later.


So in summary:

  • Don’t assume someone else will solve the problem for you.
  • Are there trivial changes you can make that will drastically improve security?
  • Think about the costs of your design.
  • Hire smart people and let them be annoyed about things that have always “just been that way”. Let them fix those things.
  • Make it easy to do things the right way and hard to do things the wrong way.

I’d be happy to help people get privsep into their code, and it’s very usable outside of OpenStack. There are a couple of blog posts about that on my site, but feel free to contact me if you’d like to chat.


The post Learning from the mistakes that even big projects make appeared first on Made by Mikal.

by mikal at August 24, 2018 04:35 AM

August 23, 2018


Distributed-CI and InfraRed


Red Hat’s OpenStack QE team maintains a tool to deploy and test OpenStack. This tool can deploy different types of topologies and is very modular: you can extend it to cover new use-cases. The tool is called InfraRed; it is free software, available on GitHub.

The purpose of Distributed-CI (or DCI) is to help OpenStack partners test new Red Hat OpenStack Platform (RHOSP) releases before they are published. This allows them to train on new releases, identify regressions or prepare new drivers ahead of time. In this article, we will explain how to integrate InfraRed with Distributed-CI.


InfraRed has been designed to be flexible and it can address numerous different use-cases. In this article, we will use it to prepare a virtual environment and drive a regular Red Hat OpenStack Platform 13 (OSP13) deployment on it.

InfraRed is covered by complete documentation that we won’t copy-paste here. To summarize: once it’s installed, InfraRed exposes a CLI. This CLI gives the user the ability to create a workspace that will track the state of the environment. The user can then trigger all the required steps to ultimately get a running OpenStack. In addition, InfraRed offers additional features through a plug-in system.


Global diagram of DCI

The partners use DCI to validate OpenStack in their labs. It’s a way to validate that they will still be able to use their gear with the next release. A DCI agent runs the deployment and is in charge of the communication with Red Hat. The partners have to provide a set of scripts to deploy OpenStack automatically; these scripts will be used during the deployment.

DCI can be summarized with the following list of actions:

  1. Red Hat exposes the latest internal snapshot of the product on DCI.
  2. The partner’s DCI agent pulls the latest snapshot and deploys it internally using the local configuration and deployment scripts.
  3. The partner’s DCI agent runs the tests and sends the final result back to DCI.

Deployment of the lab

For this article, we will use a libvirt hypervisor to virtualize our lab. The hypervisor can be based on either RHEL 7 or CentOS 7.

The network configuration

In this tutorial, we will rely on libvirt’s ‘default’ network. This network uses the range. is our hypervisor. The IP addresses of the other VMs will be dynamic, and InfraRed will create some additional networks for you. We also use the hypervisor’s public IP address.

Installation of the Distributed-CI agent for OpenStack

The installation of the DCI agent is covered by its own documentation. The steps are rather simple, as long as the partner has a host matching the DCI requirements to run the agent. This host is called the jumpbox in DCI jargon. In this document, the jumpbox is also the hypervisor host.

In the rest of this document, we will assume you have admin access to a DCI project, that you have created the remoteci, and that you have deployed the agent on your jumpbox with the help of its installation guide. To validate everything, you should be able to list the remoteci of your tenant with the following command.

# source /etc/dci-ansible-agent/
# dcictl remoteci-list
|                  id                  |     name     | state  |                            api_secret                            | public |               role_id                |               team_id                |
| e86ab5ba-695c-4437-b163-261e20b20f56 | FutureTown | active | something |  None  | e5e20d68-bbbe-411c-8be4-e9dbe83cc74e | 2517154c-46b4-4db9-a447-1c89623cc00a |

So far so good, we can now start the agent for the very first time with:

# systemctl start dci-ansible-agent --no-block
# journalctl -exf -u dci-ansible-agent

The agent pulls the bits from Red Hat and uses the jumpbox to expose them. Technically speaking, it’s a Yum repository in /var/www/html and an image registry on port 5000. These resources will be consumed during the deployment. Since we don’t have any configuration yet, the run will fail. It’s time to fix that and prepare our integration with InfraRed.

One of the crucial requirements is the set of scripts that will be used to deploy OpenStack. Those scripts are maintained by the user. They will be called by the agent through a couple of Ansible playbooks:

  • hooks/pre-run.yml: This playbook is the very first one to be called on the jumpbox. It’s the place where the partner can, for instance, fetch the latest copy of the configuration.
  • hooks/running.yml: This is the place where the automation will be called. Most of the time, it’s a couple of extra Ansible tasks that will call a script or include another playbook.

Preliminary configuration

Security, firewall and SSH keypair

Some services like Apache will be exposed without any restriction. This is why we assume the hypervisor is on a trusted network.

We take the liberty of disabling firewalld to simplify the whole process. Please do:

# systemctl stop firewalld
# systemctl disable firewalld

InfraRed interacts with the hypervisor using SSH. Just a reminder, in our case, the hypervisor is the local machine. To keep the whole setup simple, we share the same SSH key for the root and dci-ansible-agent users:

# ssh-keygen
# mkdir -p /var/lib/dci-ansible-agent/.ssh
# cp -r /root/.ssh/. /var/lib/dci-ansible-agent/.ssh
# chown -R dci-ansible-agent:dci-ansible-agent /var/lib/dci-ansible-agent/.ssh
# chmod 700 /var/lib/dci-ansible-agent/.ssh
# chmod 600 /var/lib/dci-ansible-agent/.ssh/*
# restorecon /var/lib/dci-ansible-agent/.ssh

You can validate that everything works fine with:

# su - dci-ansible-agent
$ ssh root@localhost id


We will deploy OpenStack on our libvirt hypervisor with the Virsh provisioner.

# yum install libvirt
# systemctl start libvirtd
# systemctl enable libvirtd

Red Hat Subscription Manager configuration (RHSM)

InfraRed uses RHSM during the deployment to register the nodes and pull the latest RHEL updates. It loads the credentials from a small YAML file that you can store in the /etc/dci-ansible-agent directory with the other files:

# cat /etc/dci-ansible-agent/cdn_creds.yml
password: 9328878db3ea4519912c36525147a21b
autosubscribe: yes

RHEL guest image

InfraRed needs a RHEL guest image to prepare the nodes. It tries hard to download it by itself (thanks, InfraRed…), but the default location is unlikely to match your environment. Download the latest RHEL guest image; the file should be stored on your hypervisor as /var/lib/libvirt/images/rhel-guest-image-7.5-146.x86_64.qcow2. The default image name will probably change in the future; you can list the default values for the driver with the infrared (or ir) command:

# su - dci-ansible-agent
$ source .venv/bin/activate
$ ir virsh --help

Configure the agent for InfraRed

All the configuration files of this example are available on GitHub.

Run bootstrap (pre-run.yml)

First, we want to install the InfraRed dependencies and prepare a virtual environment. These steps are done in pre-run.yml.

- name: Install the RPMs that InfraRed depends on
  yum:
    name: '{{ item }}'
    state: present
  with_items:
  - git
  - python-virtualenv
  become: True

We pull InfraRed directly from its Git repository using Ansible’s git module:

- name: Wipe any existing InfraRed checkout
  file:
    path: ~/infrared
    state: absent

- name: Pull the latest InfraRed version
  git:
    repo: https://github.com/redhat-openstack/infrared.git
    dest: /var/lib/dci-ansible-agent/infrared
    version: master

Finally, we prepare a Python virtual environment to preserve the integrity of the system and we install InfraRed in it.

- name: Wipe any existing infrared virtualenv
  file:
    path: ~/infrared/.venv
    state: absent
- name: Install InfraRed in a fresh virtualenv
  shell: |
    cd ~/infrared
    virtualenv .venv && source .venv/bin/activate
    pip install --upgrade pip
    pip install --upgrade setuptools
    pip install .

As mentioned above, the agent runs as the dci-ansible-agent user, so we have to ensure everything is done in its home directory.

- name: Enable the InfraRed plugins that we will use during the deployment
  shell: |
    cd ~/infrared
    source .venv/bin/activate
    infrared plugin add plugins/virsh
    infrared plugin add plugins/tripleo-undercloud
    infrared plugin add plugins/tripleo-overcloud

Before we start anything, we do a cleanup of the environment. For that, we rely on InfraRed: its virsh plugin can remove all the existing resources thanks to the --cleanup argument:

- name: Clean the hypervisor
  shell: |
    cd ~/infrared
    source .venv/bin/activate
    infrared virsh \
      --host-address \
      --host-key $HOME/.ssh/id_rsa \
      --cleanup True

Be warned: InfraRed removes all the existing VMs, networks and storage volumes from your hypervisor.

Hosts deployment (running.yml)

As mentioned before, running.yml is where the deployment is actually done. We ask InfraRed to prepare our hosts:

- name: Prepare the hosts
  shell: |
    cd ~/infrared
    source .venv/bin/activate
    infrared virsh \
      --host-address \
      --host-key $HOME/.ssh/id_rsa \
      --topology-nodes undercloud:1,controller:1,compute:1

Undercloud deployment (running.yml)

We can now deploy the Undercloud:

- name: Install the undercloud
  shell: |
    cd ~/infrared
    source .venv/bin/activate
    infrared tripleo-undercloud \
      --version 13 \
      --images-task rpm \
      --cdn /etc/dci-ansible-agent/cdn_creds.yml \
      --repos-skip-release True \

At this stage, our libvirt virtual machines are ready and one of them hosts the undercloud. All these machines have a floating IP. InfraRed keeps the machine names up to date in /etc/hosts. We rely on that to get the undercloud IP address:

- name: Register InfraRed's undercloud-0 IP
  set_fact: undercloud_ip="{{ lookup('pipe', 'getent hosts undercloud-0').split()[0]}}"
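The lookup above simply takes the first whitespace-separated field of the getent hosts output. In plain Python the same extraction looks like this (the helper name and sample line are mine, for illustration):

```python
def first_address(getent_line):
    """Mirror the split()[0] in the set_fact above: return the IP field
    of a `getent hosts <name>` output line."""
    return getent_line.split()[0]


# Hypothetical getent output line for undercloud-0.
print(first_address("192.0.2.42   undercloud-0 undercloud-0.example"))  # → 192.0.2.42
```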

You can also use InfraRed to interact with all these hosts despite their dynamic IPs:

# su - dci-ansible-agent
$ cd ~/infrared
$ source .venv/bin/activate
$ ir ssh undercloud-0

Here ir is an alias for the infrared command. In both cases it’s pretty cool: InfraRed did all the voodoo for us.

Overcloud deployment (running.yml)

It’s time to run the final step of our deployment.

- name: Deploy the overcloud
  shell: |
    cd ~/infrared
    source .venv/bin/activate
    infrared tripleo-overcloud \
      --deployment-files virt \
      --version 13 \
      --introspect yes \
      --tagging yes \
      --deploy yes \
      --post yes \
      --containers yes \
      --registry-skip-puddle yes \
      --registry-undercloud-skip yes \
      --registry-mirror \
      --registry-tag latest \
      --registry-namespace rhosp13 \
      --registry-prefix openstack- \
      --vbmc-host undercloud \

Here we pass some extra arguments to accommodate InfraRed:

  • --registry-mirror: we don’t want to use the images from Red Hat. Instead, we will pick the ones delivered by DCI through the first IP address of our jumpbox; it’s the one the agent uses when it deploys the image registry. Use the following command to validate that you are using the correct address: cat /etc/docker-distribution/registry/config.yml|grep addr
  • --registry-namespace and --registry-prefix: our image names start with /rhosp13/openstack-.
  • --vbmc-host undercloud: during the overcloud installation, TripleO uses Ironic for node provisioning. Ironic interacts with the nodes through a Virtual BMC server. By default InfraRed installs it on the hypervisor; in our case we prefer to keep the hypervisor clean, so we target the undercloud instead.

The Virtual BMC instances look like this on the undercloud:

[stack@undercloud-0 ~]$ ps aux|grep bmc
stack     4315  0.0  0.0 426544 15956 ?        Sl   13:19   0:00 /usr/bin/python2 /usr/bin/vbmc start controller-2
stack     4383  0.0  0.0 426544 15952 ?        Sl   13:19   0:00 /usr/bin/python2 /usr/bin/vbmc start controller-1
stack     4451  0.0  0.0 426544 15952 ?        Sl   13:19   0:00 /usr/bin/python2 /usr/bin/vbmc start controller-0
stack     4520  0.0  0.0 426544 15936 ?        Sl   13:19   0:00 /usr/bin/python2 /usr/bin/vbmc start compute-1
stack     4590  0.0  0.0 426544 15948 ?        Sl   13:19   0:00 /usr/bin/python2 /usr/bin/vbmc start compute-0
stack    10068  0.0  0.0 112708   980 pts/0    S+   13:33   0:00 grep --color=auto bmc

DCI lives

Let’s start the beast!

Ok, at this stage, we can start the agent. The standard way to trigger a DCI run is through systemd:

# systemctl start dci-ansible-agent --no-block

A full run takes more than two hours; the --no-block argument above tells systemctl to give control back to the shell even if the unit’s start-up is not complete yet.

You can follow the progress of your deployment either on the web interface or with journalctl:

# journalctl -exf -u dci-ansible-agent


DCI also comes with a CLI that you can use directly on the hypervisor.

# source /etc/dci-ansible-agent/
# dcictl job-list

This command can also give you output in JSON format, which is handy when you want to reuse the DCI results in a script:

# dcictl --format json job-list --limit 1 | jq .jobs[].status

To conclude

I hope you enjoyed the article and that it helps you prepare your own configuration. Please don’t hesitate to contact me if you have any questions.

I would like to thank François Charlier and the InfraRed team. François started the DCI InfraRed integration several months ago. He did a great job resolving all the issues one by one with the help of the InfraRed team.

by Gonéri Le Bouder at August 23, 2018 05:13 PM

OpenStack Superuser

Contribution—not just consumption—is the message from OpenStack Days Tokyo

Akihiro Hasegawa kicked off OpenStack Days Tokyo with a strong message about how to engage with open source: it’s about contribution, not just consumption.

He credits the strong community of contributors in Japan for organizing their most successful event to date, with more than 1,000 registrants despite the fact that it was their first-ever paid event.

Another success factor was co-locating a Cloud Native Days event and focusing on the larger picture of how open infrastructure supports cloud native applications. Open source is the ethos that brings these communities together says Hasegawa, a board member of the Japan OpenStack User Group and a chairman/founder of the Tokyo event.

That sentiment was echoed in a keynote from Abby Kearns, executive director of the Cloud Foundry Foundation, who talked about how open source is a choice that delivers control for her users. That means that users can get involved and contribute directly to the project roadmap or the software itself, hitting on the theme of gaining more value from contribution than simple consumption. Kearns also highlighted several joint Cloud Foundry and OpenStack case studies, including Rakuten and Yahoo! Japan.

Deepak Kumar Gupta, general manager of NEC’s open-source office in India, continued the theme of contribution by touting his company’s technical leadership, commits and reviews in OpenStack, as well as many other open source projects. A major focus has been enabling smart cities using the FIWARE IoT platform, an open-source project of which NEC is a major backer. Gupta’s team has put significant work into integrating FIWARE with OpenStack and Kubernetes, benefiting their customers as well as upstream communities.

Shintaro Mizuno moderated the sixth OpenStack ops meetup in Japan, where users running clouds got together for unconference-style discussions ranging from GPUs to serverless to the changing storage landscape. Twenty-nine attendees from 17 companies attended the 12 ops sessions over two days. Topics included storage, serverless tech, OpenStack+GPU, containers, edge computing and how to integrate ops into the larger community.

The ops meetup sessions are another great example of contribution in the form of knowledge sharing among users, as well as providing feedback to the upstream community to help guide the future roadmap.  The group took extensive notes in Etherpads which will be summarized and shared back to the ops mailing list.

The 8th birthday cake is unveiled at the booth crawl!

At the evening reception, community members celebrated the 8th birthday of OpenStack and many early community members reflected on how OpenStack has grown and the impact it’s had on the industry and open source.

When asked about their wish for the future, many said they hope the open source values and collaboration systems are studied and adopted by more projects and teams.

Bringing together the open infrastructure and cloud native communities was a big success in Japan. In addition to OpenStack, Cloud Foundry and Fiware, many other open source projects were featured at the event, including Ansible, Ceph, KubeFlow, Prometheus, Spinnaker, TensorFlow and Zuul.


It’s clear that open source is a positive sum game and that users care about the value of integrating these open technologies together.



NEC demonstrates FIWARE smart city application running on OpenStack and Kubernetes

The post Contribution—not just consumption—is the message from OpenStack Days Tokyo appeared first on Superuser.

by Lauren Sell at August 23, 2018 02:05 PM


What’s new in Mirantis Cloud Platform

Today we released the newest version of Mirantis Cloud Platform. Let me tell you why we think MCP is great.

by Boris Renski at August 23, 2018 12:47 PM

SUSE Conversations

SUSE CaaS Platform 3 Kubernetes Cloud Provider Integration

  Introduction   The efforts to manage the fleet of nodes powering a Kubernetes cluster are significantly reduced by SUSE CaaS Platform, however there’s always room for improvement. By taking advantage of the flexibility provided by modern Infrastructure as a Service platforms (also known as IaaS or  “clouds”), it is possible to automate even more […]

The post SUSE CaaS Platform 3 Kubernetes Cloud Provider Integration appeared first on SUSE Communities.

by mjura at August 23, 2018 06:46 AM

John Likes OpenStack

Updating ceph-ansible in a containerized undercloud

In Rocky the TripleO undercloud will run containers. If you're using TripleO to deploy Ceph in Rocky, this means that ceph-ansible shouldn't be installed on your undercloud server directly, because your undercloud server is a container host. Instead, ceph-ansible should be installed in the mistral-executor container because, as per config-download, that is the container which runs Ansible to configure the overcloud.

If you install ceph-ansible on your undercloud host, it will only lead to confusion about which version of ceph-ansible is in use when you try to debug it. Install it in the mistral-executor container instead.

So this is the new normal in Rocky on an undercloud that can deploy Ceph:

[root@undercloud-0 ~]# rpm -q ceph-ansible
package ceph-ansible is not installed
[root@undercloud-0 ~]#

[root@undercloud-0 ~]# docker ps | grep mistral
0a77642d8d10 "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_api
c32898628b4b "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_engine
c972b3e74cab "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_event_engine
d52708e0bab0 "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_executor
[root@undercloud-0 ~]#

[root@undercloud-0 ~]# docker exec -ti d52708e0bab0 rpm -q ceph-ansible
[root@undercloud-0 ~]#

So what happens if you're in a situation where you want to try a different ceph-ansible version on your undercloud?

In the next example I'll update my mistral-executor container from ceph-ansible rc18 to rc21. These commands are just variations of the upstream documentation, but with a focus on updating the undercloud container, not an overcloud one. Here's the image I want to update:

[root@undercloud-0 ~]# docker images | grep mistral-executor
2018-08-20.1 740bb6f24755 2 days ago 1.05 GB
[root@undercloud-0 ~]#
I have a copy of ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm in my current working directory:

[root@undercloud-0 ~]# mkdir -p rc21
[root@undercloud-0 ~]# cat > rc21/Dockerfile <<EOF
> USER root
> COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
> RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
> USER mistral
> EOF
[root@undercloud-0 ~]#
So again that file is (for copy/paste later):

[root@undercloud-0 ~]# cat rc21/Dockerfile
USER root
COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
USER mistral
[root@undercloud-0 ~]#
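Note that the build output below reports five steps while the file above shows four lines: the Dockerfile also needs a FROM line naming the existing mistral-executor image. A hedged sketch of the complete file follows; the image reference is a placeholder, not the one from the original environment:

```dockerfile
# The FROM line must name the existing mistral-executor image;
# this reference is a placeholder for your undercloud registry's copy.
FROM undercloud.example:8787/openstack-mistral-executor:2018-08-20.1
USER root
COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
USER mistral
```

Switching to root and back means the layer installing the RPM has the needed privileges while the running container keeps the unprivileged mistral user.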
Build the new container

[root@undercloud-0 ~]# docker build --rm -t ~/rc21
Sending build context to Docker daemon 221.2 kB
Step 1/5 : FROM
---> 740bb6f24755
Step 2/5 : USER root
---> Using cache
---> 8d7f2e7f9993
Step 3/5 : COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
---> 54fbf7185eec
Removing intermediate container 9afe4b16ba95
Step 4/5 : RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
---> Running in e80fce669471

Examining ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm: ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch
Marking ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm as an update to ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch
Resolving Dependencies
--> Running transaction check
---> Package ceph-ansible.noarch 0:3.1.0-0.1.rc18.el7cp will be updated
---> Package ceph-ansible.noarch 0:3.1.0-0.1.rc21.el7cp will be an update
--> Finished Dependency Resolution

Dependencies Resolved

Arch Version Repository Size
noarch 3.1.0-0.1.rc21.el7cp /ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch 1.0 M

Transaction Summary
Upgrade 1 Package

Total size: 1.0 M
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Updating : ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch 1/2
Cleanup : ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch 2/2
Verifying : ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch 1/2
Verifying : ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch 2/2

ceph-ansible.noarch 0:3.1.0-0.1.rc21.el7cp

---> 41a804e032f5
Removing intermediate container e80fce669471
Step 5/5 : USER mistral
---> Running in bc0db608c299
---> f5ad6b3ed630
Removing intermediate container bc0db608c299
Successfully built f5ad6b3ed630
[root@undercloud-0 ~]#
Upload the new container to the registry:

[root@undercloud-0 ~]# docker push
The push refers to a repository []
606ffb827a1b: Pushed
fc3710ffba43: Pushed
4e770d9096db: Layer already exists
4d7e8476e5cd: Layer already exists
9eef3d74eb8b: Layer already exists
977c2f6f6121: Layer already exists
00860a9b126f: Layer already exists
366de6e5861a: Layer already exists
2018-08-20.1: digest: sha256:50aae064d930e8d498702673c6703b70e331d09e966c6f436b683bb152e80337 size: 2007
[root@undercloud-0 ~]#
Now we see the new f5ad6b3ed630 image in addition to the old one:

[root@undercloud-0 ~]# docker images | grep mistral-executor
2018-08-20.1 f5ad6b3ed630 4 minutes ago 1.09 GB
740bb6f24755 2 days ago 1.05 GB
[root@undercloud-0 ~]#
The old container is still running though:

[root@undercloud-0 ~]# docker ps | grep mistral
373f8c17ce74 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_api
4f171deef184 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_engine
8f25657237cd "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_event_engine
a7fb6df4e7cf 740bb6f24755 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_executor
[root@undercloud-0 ~]#
Merely updating the image doesn't restart the container, and neither does `docker restart a7fb6df4e7cf`. Instead I need to stop it and start it, but there's a lot that goes into starting these containers with the correct parameters.

The upstream docs section on Debugging with Paunch shows a command to print the exact command that was used to start my container. I just needed to run `paunch list | grep mistral` first to learn that I should look at tripleo_step4.

[root@undercloud-0 ~]# paunch debug --file /var/lib/tripleo-config/docker-container-startup-config-step_4.json --container mistral_executor --action print-cmd
docker run --name mistral_executor-glzxsrmw --detach=true --env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS --net=host --health-cmd=/openstack/healthcheck --privileged=false --restart=always --volume=/etc/hosts:/etc/hosts:ro --volume=/etc/localtime:/etc/localtime:ro --volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro --volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume=/etc/pki/tls/certs/ --volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume=/dev/log:/dev/log --volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro --volume=/etc/puppet:/etc/puppet:ro --volume=/var/lib/kolla/config_files/mistral_executor.json:/var/lib/kolla/config_files/config.json:ro --volume=/var/lib/config-data/puppet-generated/mistral/:/var/lib/kolla/config_files/src:ro --volume=/run:/run --volume=/var/run/docker.sock:/var/run/docker.sock:rw --volume=/var/log/containers/mistral:/var/log/mistral --volume=/var/lib/mistral:/var/lib/mistral --volume=/usr/share/ansible/:/usr/share/ansible/:ro --volume=/var/lib/config-data/nova/etc/nova:/etc/nova:ro
[root@undercloud-0 ~]#
Now that I know the command, I can see my six-hour-old container:

[root@undercloud-0 ~]# docker ps | grep mistral_executor
a7fb6df4e7cf 740bb6f24755 "kolla_start" 6 hours ago Up 12 minutes (healthy) mistral_executor
[root@undercloud-0 ~]#
Stop it:

[root@undercloud-0 ~]# docker stop a7fb6df4e7cf
[root@undercloud-0 ~]#
Ensure it's gone:

[root@undercloud-0 ~]# docker rm a7fb6df4e7cf
Error response from daemon: No such container: a7fb6df4e7cf
[root@undercloud-0 ~]#
Then run the command I got from `paunch debug` above to start the container, and finally see my new container:

[root@undercloud-0 ~]# docker ps | grep mistral-executor
d8e4073441c0 "kolla_start" 14 seconds ago Up 13 seconds (health: starting) mistral_executor-glzxsrmw
[root@undercloud-0 ~]#
Finally I confirm that my container has the new ceph-ansible package:

(undercloud) [stack@undercloud-0 ~]$ docker exec -ti d8e4073441c0 rpm -q ceph-ansible
(undercloud) [stack@undercloud-0 ~]$
I was then able to deploy my overcloud and see that the rc21 version fixed a bug.

by John at August 23, 2018 03:30 AM

August 22, 2018

Aija Jauntēva

Outreachy: Redfish Message registry III and license for standard registry files

Outreachy ended last week, here is what I worked on in those last days.

I continued working on the Redfish Message Registry. A quick recap: a message registry is necessary to determine severity and to provide additional information about an error and possible ways to resolve it when updating the BIOS (and potentially other resources in the future). It was clear that it would not be possible to finish by the end of the internship, especially as it all had to go through code review. Nevertheless, I wanted to have at least an initial version of the code working from A to Z, to see whether there were any more unexpected obstacles to getting it to a working stage.

I ended up splitting the Message Registry functionality into 8 patches, where each patch adds something new to the sushy library. At first I was avoiding creating too many patches, but I think in the end this makes review easier for everyone: from my experience so far there can be long discussions in reviews, and it is definitely easier when each patch has only one (main-ish) thing to discuss, rather than creating long threads about, say, three things and trying to follow what's done and what's not.

The patches toward the end of the chain I have set to 'Work in Progress' mode; while they are functionally working, it will be necessary to take a closer look at some edge cases and unhappy paths once the patches they depend on have been reviewed. I did this to avoid committing too much while the base of the chain might still need a rewrite or a different approach, which would force these patches to be rewritten too. But with some code working I can be more confident that there are no unexpected limitations or other things that would require changes in sushy to implement Message Registry support.

At the moment 3 patches have received one +2 each, so there is some likelihood that they will not require major further changes. While the internship has ended, I like to finish what I have started, so I'm keeping an eye on the patches and making updates in my now-hobbyist capacity. I don't mind if anyone else takes over; however, at the moment what these patches are missing most is reviews.

Given that this is the third post where the Message Registry is the main topic, it may be obvious that while I started out implementing BIOS support in sushy, my biggest addition to sushy turns out to be the Message Registry. This was not anticipated when I started; in other words, the scope and effort were not evaluated up front but discovered along the way. This is how it went: started working on BIOS -> encountered @Redfish.Settings -> encountered the Message Registry -> encountered the licensing issue.

Speaking of the licensing issue, there are updates there too. I had already implemented a fall-back scenario where the user has to download the standard message registry files themselves, and, if they have not, a fall-back for the fall-back that gives very limited information about the BIOS update: only whether it was successful, with no detail about which attributes failed or hints on how to resolve the failure. I implemented this so that at least something would work while the license question was being resolved. On the legal mailing list the preference seemed to be that no third-party files be included in the repository, while DMTF expected that the CC BY license would be sufficient to include the files in the sushy repository. For that, however, the OpenStack Technical Committee has to do a case-by-case review: CC BY compatibility with OpenStack projects is not automatically granted, because OpenStack automatically allows only OSI-approved licenses, and as CC BY is not a software license it is not covered by OSI. It still sounded like a serious matter requiring extra effort, but I had no idea how big that effort might be, or whether sushy really needed the files included. I asked my mentor whether he had experience with the Technical Committee and whether we should ask it to review this case. As he did not, I asked in the #openstack-ironic IRC channel, where other contributors strongly preferred that the files be included in the sushy repository and in the installation package: there are many customers who test Ironic without access to the Internet, and making them package or download the files manually would hurt user-friendliness. I then asked on the mailing list whether the case could be reviewed, and it was done very quickly: the initial review happened within hours in the #openstack-tc channel, and final approval came the next day, resolving this in less than 24 hours.
I very much appreciate the fast response, and now I don't even know why I was concerned about the effort involved. The only two things left are for DMTF to release the files under CC BY (which was necessary in the fall-back scenario anyway, so that users can download and use the files) and to update the patch to use the packaged files for the standard registries and add license notices.

That's all, but stay tuned for another blog post summarizing my Outreachy internship (no sooner than next week though).

by ajya at August 22, 2018 07:30 PM

OpenStack Superuser

Getting to know StarlingX: The high-performance edge cloud software stack

StarlingX is both a development project and an integration project that combines new services with many other open source projects into an overall edge cloud software stack. Based on code contributed by Intel and Wind River and hosted by the OpenStack Foundation, it combines components of its own with leading open-source projects including OpenStack, Ceph and OVS.

StarlingX aims to support the most demanding applications in edge, industrial IoT and telecom use cases. And there are already a lot of cutting-edge ideas for it. According to a recent talk from Intel dev-ops lead Hazzim Anaya, they include small-cell services for stadiums and other high-density locations, augmented reality, virtual reality and tactile Internet applications, vehicle-to-everything (V2X) communication, virtual network functions (VNFs) and more.

At the Vancouver Summit, Ian Jolliffe, Brent Rowsell and Dean Troyer offered a deep dive into the StarlingX architecture and features. “OpenStack is the base layer, but there’s a number of extra things that we layer on top of that,” said Jolliffe. “These are going to form the basis of some new projects within StarlingX, so things like fault management, service management — managing and monitoring all the different services on the platform so that you can have a synchronous process failure and automatic recovery software management.”

The small, agile structure is definitely by design. “We want to be able to have a very small footprint edge solution — storage, control and compute — with virtual machines on a single server and we call that the gross configuration…this is not going to provide you hardware-level redundancy, but if a VM fails you can automatically recover it and it goes back into service very quickly, in less than a second,” Jolliffe adds.

There are two more “sizes” available, called Grouse and Robson after mountains in the Vancouver area. The next scale up is a two-node configuration giving users hardware redundancy and allowing them to move virtual machines from one node to another, while still offering storage, control and compute in a single node. Lastly, the largest scale, Robson, offers dedicated nodes for the control plane, for compute and for storage. Check out the whole 38-minute session on YouTube.

What’s next

At the upcoming Berlin Summit, there are three sessions focusing on StarlingX.

  • “Comparing Open Edge Projects” offers a detailed look into the architecture of Akraino, StarlingX and OpenCord and compares them with the ETSI MEC RA. Speakers include 99cloud’s Li Kai and Shuquan Huang and Intel’s Jianfeng JF Ding.
  • “StarlingX CI, from zero to Zuul”: Intel’s Hazzim Anaya and Elio Martinez will go over how the CI works and how to create new automated environments to extend functionality and cover new features and test cases that are not yet covered inside the OSF.
  • “StarlingX Enhancements for Edge Networking”: this session will cover the current state of the art and the gaps in edge networking, and go over StarlingX’s core projects, core networking features and enhancements for the edge.

Get involved

Keep up with what’s happening with the mailing lists:
There are also weekly calls you can join:
Or for questions hop on Freenode IRC: #starlingx.
You can also read up on project documentation:

Photo // CC BY NC

The post Getting to know StarlingX: The high-performance edge cloud software stack appeared first on Superuser.

by Nicole Martinelli at August 22, 2018 02:04 PM

Galera Cluster by Codership

Codership is hiring Galera Cluster product specialist/community manager

Galera Cluster is the leading high availability solution for MySQL and MariaDB. We are always looking for bright and driven people who have a passion for technology, especially for database replication and clustering. If you are interested in challenging work and being part of an innovative team at the world’s leading MySQL clustering company, then Codership is the right place for you! This is a remote job.


Job Description and Responsibilities

  • Write Galera Cluster technical marketing material (blogs, white papers, benchmarks)
  • Prepare presentations and present at webinars and conferences
  • Be active in social media channels and forums
  • Consult customers for Galera Cluster best practices
  • Pre-sales support
  • Interface with the development team to provide user feedback
  • Assist in writing Galera documentation


Galera product specialist/community manager qualifications

An ideal candidate must possess the following skills:

  • Deep understanding of Galera Cluster, replication technologies, databases generally
  • Experienced speaker at webinars and conferences
  • Ability to write technical blogs and white papers
  • Technical customer consulting/presales experience
  • Excellent verbal and written communication skills
  • Strong networking and social media skills
  • Ability to engage with a wide range of audiences from customers up to senior management
  • Experience working across different cultures and business environments
  • Organized, detail-oriented, and self-driven
  • Willingness to travel to conferences and customer meetings


Send your applications to or

by Sakari Keskitalo at August 22, 2018 01:09 PM

August 21, 2018

StackHPC Team Blog

BeeGFS - High Performance Storage for Research Platforms

Monasca provides monitoring as a service for OpenStack. It’s scalable, fault tolerant and supports multi-tenancy with Keystone integration. You can bolt it on to your existing OpenStack distribution and it will happily go about collecting logs and metrics, not just for your control plane, but for tenant workloads too.

So how do you get started? Errr... well, one of the drawbacks of Monasca’s microservice architecture is the complexity of deploying and managing the services within it. Sound familiar? On the other hand this microservice architecture is one of Monasca’s strengths. The deployment is flexible and you can horizontally scale out components as your ingest rate increases. But how do you do all of this?

Enter OpenStack Kolla. Back in 2017, Steven Dake, the founder of the Kolla project, wrote about the significant human resource costs of running an OpenStack managed cloud, and how the Kolla project offers a pathway to reduce them. By providing robust deployment and upgrade mechanisms, Kolla helps to keep OpenStack competitive with proprietary offerings, and at StackHPC we want to bring the same improvements in operational efficiency to the Monasca project. In doing so we’ve picked up the baton for deploying Monasca with Kolla, and we don't expect to put it down until the job is finished. Indeed, since Kolla already provides many of the required services, and support for deploying the APIs has just been merged, we're hoping that day isn't too far off.

So what else is new in the world of Monasca? One of the key things that we believe differentiates Monasca is support for multi-tenancy. By allowing a single set of infrastructure to be used for monitoring both the control plane and tenant workloads, operational efficiency is increased. Furthermore, because the data is all in one place, it becomes easy to augment tenant data with what are typically admin only metrics. We envisage a tenant being able to log in and see something like this:

Cluster overview

By providing a suitable medium for thought, the tenant no longer has to sift through streams of data to understand that their job was running slow because Ceph was heavily loaded, or the new intern had saturated the external gateway. Of course, exposing such data needs to be done carefully and we hope to expand more upon this in a later blog post.

So how else can we help tenants? A second area that we've been looking at is logging. Providing a decent logging service which can quickly and easily offer insight into the complex, distributed jobs that tenants run can save them a lot of time. To this end we've been adding support for querying tenant logs via the Monasca Log API. After all, tenants can POST logs in, so why not support getting them out? One particular use case that we've had is monitoring jobs orchestrated by Docker Swarm. As part of this work we knocked up a proof-of-concept Compose file which deploys the Monasca Agent and Fluentd as global services across the Swarm cluster. With a local instance of Fluentd running the Monasca plugin, container stdout can be streamed directly into Monasca by selecting the Fluentd Docker log driver. The tenant can then go to Grafana and see both container metrics and logs all in one place, with proper tenant isolation. Of course, we don't see this as a replacement for Kibana, but it has its use cases.
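That wiring can be sketched in a minimal Compose fragment. This is illustrative only (the service name, image and address are assumptions, not the proof-of-concept file from the post); it shows the Fluentd Docker log driver routing a container's stdout to a node-local Fluentd:

```yaml
version: "3.2"
services:
  worker:                    # placeholder tenant job
    image: alpine:3.8
    command: echo "container stdout is shipped to Monasca via Fluentd"
    logging:
      driver: fluentd        # Docker's built-in Fluentd log driver
      options:
        fluentd-address: localhost:24224    # the node-local Fluentd instance
        tag: "job.{{.Name}}" # tag template so logs can be traced to a container
```

Deployed across a Swarm cluster alongside a global Fluentd service, each node's Fluentd can then forward the tagged stdout on to the Monasca Log API.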

Thirdly, a HPC blog post wouldn't be complete without mentioning Slurm. As part of our work to provide intuitive visualisations we've developed a Monasca plugin which integrates with the Discrete plugin for Grafana. By using the plugin to harvest Slurm job data we can present the overall state of the Slurm cluster to anyone with access to see it:

Slurm Ganntt chart

The coloured blocks map to Slurm jobs, and as a cluster admin I can immediately see that there’s been a fair bit of activity. So as a user running a Slurm job, can I easily get detailed information on the performance of my job? It’s a little bit clunky at the moment, but this is something we want to work on, both in the scale of the visualisation (we’re talking thousands of nodes, not 8) and in the quality of the interface. As an example of what we have today, here’s the CPU usage and some Infiniband stats for 3 jobs running on nodes 0 and 1:

Slurm drill down

Finally, we'll finish up with a summary. We've talked about helping to drive forward progress in areas such as deployment, data visualisation and logging within the Monasca project. Indeed, we're far from the only people with a goal for bettering Monasca, and we're very grateful for the others that share it with us. However, we don't want you to think that we're living in a bubble. In fact, speaking of driving, we see Monasca as an old car. Not a bad one, rather a potential classic. One where you can still open the bonnet and easily swap in and out parts. It's true that there is a little rust. The forked version of Grafana with Keystone integration prevents users from getting their hands on shiny new Grafana features. The forked Kafka client means that we can't use the most recent version of Kafka, deployable out of the box with Kolla. Similar issues exist with InfluxDB. And whilst the rust is being repaired (and it is being repaired) newer, more tightly integrated cars are coming out with long life servicing. One of these is Prometheus, which compared to Monasca is exceptionally easy to deploy and manage. But with tight integration comes less flexibility. One size fits all doesn't fit everyone. Prometheus doesn't officially support multi-tenancy, yet. We look forward to exploring other monitoring and logging frameworks in future blog posts.

by Doug Szumski at August 21, 2018 10:30 PM

Doug Hellmann

Planting Acorns

This post is based on the closing keynote I gave for PyTennessee in February 2018, where I talked about how the governance of an open source project impacts  the health of the project, and some lessons we learned in building the OpenStack community that can be applied to other projects. OpenStack is a cloud computing system …

by doug at August 21, 2018 04:05 PM

OpenStack Superuser

Expanding the horizons of the Community Contributor Awards

So many folks work tirelessly behind the scenes to make the community great, whether they’re fixing bugs, contributing code, helping newbies on IRC or just making everyone laugh at the right moment.

Now these quirky honors are open to anyone involved in Airship, Kata Containers, OpenStack, StarlingX and Zuul.

You can help them get recognized (with a very shiny medal!) by nominating them for the next Contributor Awards given out at the upcoming Berlin Summit.

Winners in previous editions included the “Duct Tape” award and the “Don’t Stop Believin’ Cup,” shining a light on the extremely valuable work that makes the larger community excel.

There are so many different areas worthy of celebration, but there are a few kinds of community members who deserve a little extra love:

  • They are undervalued
  • They don’t know they are appreciated
  • They bind the community together
  • They keep it fun
  • They challenge the norm
  • Other: (write-in)

As in previous editions, rather than starting with a defined set of awards the community is asked to submit names.

The OSF community team then has a little bit of fun on the back end, massaging the award titles to devise something worthy of their underappreciated efforts.

Photo // CC BY NC

The post Expanding the horizons of the Community Contributor Awards appeared first on Superuser.

by Superuser at August 21, 2018 02:11 PM

Superuser Award nominations open for the Berlin Summit

Nominations for the OpenStack Summit Berlin Superuser Awards are open and will be accepted through midnight (Pacific Daylight Time) Friday, October 5.

All nominees will be reviewed by the community and the Superuser editorial advisors will determine the winner that will be announced onstage at the Summit in November.

The Superuser Awards recognize teams using open source to meaningfully improve business and differentiate in a competitive industry, while also contributing back to the community.

This is the ninth edition of the Superuser Awards and teams of all sizes are encouraged to apply. If you fit the bill, or know a team that does, we encourage you to submit a nomination here.

The program was launched at the Paris Summit in 2014, and the community has continued to award winners at every Summit to users who show how OpenStack and open source overall are making a difference and providing strategic value in their organization. Past winners include CERN, Comcast, NTT Group, AT&T and Tencent TStack. The most recent team to take the honors was the Ontario Institute for Cancer Research (OICR) at the Vancouver Summit.

The community then has the chance to review the list of nominees, how they are running OpenStack, what open source technologies they are using and the ways they are contributing back to the OpenStack community.

Then, the Superuser editorial advisors will review the submissions, narrow the nominees down to four finalists and review the finalists to determine the winner based on the submissions.

When evaluating winners for the Superuser Award, judges take into account the unique nature of use case(s), as well as integrations and applications of OpenStack performed by a particular team.

Additional selection criteria include how the workload has transformed the company’s business, including quantitative and qualitative results of performance, as well as community impact in terms of code contributions, feedback, knowledge sharing and the number of Certified OpenStack Administrators (COAs) on staff.

Winners will take the stage at the OpenStack Summit in Berlin. Submissions are open now until Friday, October 5, 2018. You’re invited to nominate your team or nominate a Superuser here.

For more information about the Superuser Awards, please visit

The post Superuser Award nominations open for the Berlin Summit appeared first on Superuser.

by Ashlee Ferguson at August 21, 2018 02:07 PM

SUSE Conversations

SUSE Manager Technical Overview – Part 1 Discover SUSE Manager

In this 3-part blog series we take a deeper look into SUSE Manager 3.2: what the product does, how it does it and how it is set up. Blog Part 1 – Discover SUSE Manager; Blog Part 2 – How SUSE Manager Works; Blog Part 3 – SUSE Manager Configuration Management. Discover SUSE […]

The post SUSE Manager Technical Overview – Part 1 Discover SUSE Manager appeared first on SUSE Communities.

by Jason Phippen at August 21, 2018 09:59 AM

August 20, 2018

OpenStack Superuser

Igniting the great serverless debate

Everyone wants to go serverless these days.
Whether you define serverless as dynamically shifting resources as needed, outsourcing the data center or simply as a no maintenance strategy, the term gets lobbed around a lot in today’s tech circles.

For Clay Smith, New Relic developer advocate, the term is both at peak hype cycle and worth keeping an eye on.

“I think what makes serverless interesting and where the debate gets heated is that it effectively stamps, in a big, big way, ‘deprecated!’ on a ton of technology,” he says, speaking with Fredric Paul on the Modern Software podcast. “I empathize with people who may have spent the past three years building out a container platform because when you consider the emergence of serverless, it definitely puts that kind of investment at risk. So I understand it, but I don’t think that’s an excuse to ignore it.”

Looking at Google searches in the United States over the last five years, it would be hard to ignore the uptick. But going sans servers is also a question of the bottom line, says Lee Atchison, author of “Architecting for Scale.”


“The cost of running an infrastructure is a real cost and the promise of serverless is to reduce or eliminate that cost,” he says. “But that promise is a hype promise. It’s never going to go away. There’s always going to be infrastructure costs.” The real benefit of serverless is about more management and control over the inevitable investments in infrastructure plus the capacity to scale independently from that infrastructure.

While Smith believes that going serverless frees devs up to focus on code and scaling, Atchison says that those aren’t necessarily advantages brought about by serverless, but simply by the cloud.

In any case, serverless won’t be sounding the death knell for data centers — or virtual machines — any time soon. “The virtual machine industry is still in the billions of dollars. The life of this stuff seems relatively long and I don’t think serverless fundamentally changes that,” Smith says. “I think when someone makes a tribute music video to their favorite server we’ll know we’ve hit peak nostalgia, but we’re stuck with them for a long time.” Amen?

Catch the entire 24-minute episode here.

The post Igniting the great serverless debate appeared first on Superuser.

by Superuser at August 20, 2018 02:12 PM

August 19, 2018

Maish Saidel-Keesing

Installing OpenStack CLI clients on Mac OSX

I usually have a Linux VM that I use to perform some of my remote management tasks, such as running OpenStack CLI commands.

But since I now have a Mac (and yes, I am enjoying it!!) I thought why not do it natively on my Mac. The official documentation on installing the clients is on the OpenStack site.

This is how I got it done.

Firstly install pip

easy_install pip

Now to install the clients (keystone, glance, heat, nova, neutron, cinder, swift and the new OpenStack client)

pip install python-keystoneclient python-novaclient python-heatclient python-swiftclient python-neutronclient python-cinderclient python-glanceclient python-openstackclient

First problem – no permissions.

Yes, you do need sudo for some things…

sudo -H pip install python-keystoneclient python-novaclient python-heatclient python-swiftclient python-neutronclient python-cinderclient python-glanceclient python-openstackclient



Or so I thought…

Google led me here -

sudo -H pip uninstall six

And then

sudo -H easy_install six

And all was good

nova list
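One caveat worth adding: the clients also need credentials before nova list will return anything. A common approach is to source an rc file first – a minimal sketch, with placeholder values standing in for your own cloud's details:

```shell
# openrc - placeholder values, substitute your own cloud's details
export OS_AUTH_URL="http://controller:5000/v2.0"
export OS_USERNAME="demo"
export OS_PASSWORD="secret"
export OS_TENANT_NAME="demo"
```

Then source openrc in the same shell before running any of the client commands.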

Quick and Simple!! Hope this is useful!

by Maish Saidel-Keesing at August 19, 2018 09:19 AM

OpenStack Israel CFP Voting is Open

I would like to bring to your attention that the voting for the sessions for the upcoming OpenStack Israel Summit on June 15th, 2015 is now open.

Make your voice heard and participate in setting the agenda for the event!


You can find more information and the presentation that I gave last year in this post Recap - Openstack Israel 2014 #OpenStackIL and for your convenience I have embedded the recording below.

by Maish Saidel-Keesing at August 19, 2018 09:18 AM

Why I Decided to Run for the OpenStack Technical Committee

As of late I have been thinking long and hard about whether there is some way I can contribute more effectively to the OpenStack community.

Almost all of my focus today is on OpenStack – on its architecture and how to deploy certain solutions on top of such an infrastructure.

What is the Technical Committee?

It is a group of 13 people elected by the OpenStack ATCs (Active Technical Contributors – i.e. the people who have actively contributed code to the projects over the last year). There are seven spots up for election this term, in addition to the six TC members who were chosen six months ago for a one-year term.

The TC’s Mission is defined as follows:

The Technical Committee (“TC”) is tasked with providing the technical leadership for OpenStack as a whole (all official projects, as defined below). It enforces OpenStack ideals (Openness, Transparency, Commonality, Integration, Quality...), decides on issues affecting multiple projects, forms an ultimate appeals board for technical decisions, and generally has technical oversight over all of OpenStack.

On Thursday I decided to take the plunge. Here is the email where I announced my candidacy.

This is not a paid job; if anything it is more of a “second” part-time job – a voluntary one. There are meetings and email discussions on a regular basis.

There are a number of reasons that I am running for a spot on the TC.


In my post The OpenStack Elections - Another Look, I noted that no operators were chosen for the board. This is something that I think is lacking in the OpenStack community today. The influence of the people who are actually using and deploying the software is minimal, if it exists at all. What influence they do have comes mostly after the fact (at best), with little input into what they would like to see put into the product.

I am hoping to bring a new perspective to the TC, to help them understand the needs of those who actually deploy the software and have to deal with it day in and day out. They have valid pain points, and in my honest opinion they feel they are not being heard or not being taken into consideration – at least not enough in their eyes.

Acceptance of others

The people who vote are only those who contribute code. Those who have committed a patch to the OpenStack code repositories. That is the definition of an ATC.

It is not easy to get a patch committed. Not at all (at least that is my opinion). You have to learn how to use the tools that the OpenStack community has in place. That takes time. I tried to ease the process with a Docker container to help you along. But even with that, it still seems (to me) that to get into this group of contributors takes time.

It is understandable. There is a standard way of doing things (and rightfully so), so the chances of getting your change accepted the first time are slim, for a number of reasons that I will not go into in this post.

I think that the definition of contributor should be expanded, and not limited only to those who write the code. There are a number of other ways to contribute.

I know that this will not be an easy “battle to win”. I am essentially asking the people to relinquish the way they have been doing things for the past 5 years and allow those who are not developers, those who do not write the code, to steer the technical direction of OpenStack.

I do think it will be in everyone's best interest to extend the reach of the OpenStack community, to branch out.

More information on the actual election, which will run until April 30th, can be found here. If you are one of the approximately 1,800 people who are ATCs, you should have received a ballot for voting.

It will be interesting to see the results which should be out in a few days.

As always your thoughts and comments are appreciated, please feel free to leave them below.

by Maish Saidel-Keesing at August 19, 2018 09:18 AM

Get Ready for the OpenStack Summit

The OpenStack community is converging on Vancouver next week for the bi-annual summit for all things OpenStack.


I am glad to be joining the event and I would like to share a short outline of the public events and activities I will be participating in.

The rest of my time will be spread out over the Cross-Project workshops, the Ops sessions, other sessions and activities.

I am really looking forward to this event and please feel free to come and say hello.

by Maish Saidel-Keesing at August 19, 2018 09:18 AM

Some Vendors I Will Visit at the OpenStack Summit

At every technology conference I like to go onto the Expo Floor / Marketplace / Solutions Exchange – where vendors try to get your attention and market their products.

Going over the list of vendors on the Summit site, the list below includes some of the less known companies (at least to me) that caught my eye and that I would like to visit during the summit to see what they have to say.

** The blurb I posted is something that I found on each of the respective sites, and does not necessarily provide a comprehensive overview of what each company offers **


Stackato allows agile enterprises to develop and deploy software solutions faster than ever before and manage them more effectively. Stackato provides development teams with built-in languages, frameworks and services on one single cloud application platform, while providing enterprise-level security and world-class support.

Akanda is the only open source network virtualization solution built by OpenStack operators for real OpenStack clouds. Akanda eliminates the need for complex SDN controllers, overlays and multiple plugins for cloud networking by providing a simple integrated networking stack (routing, firewall, load balancing) for connecting and securing multi-tenant OpenStack environments.

Appcito Cloud Application Front-End™ (CAFE) is an easy-to-deploy, unified and cloud-native service that enables cloud application teams to innovate faster and improve user experiences with their applications.

Operators and developers can use AppFormix’s versatile software to remove and prevent resource contention among applications from the infrastructure without being invasive to applications. The real-time, state driven control provided by AppFormix’s intuitive dashboard allows efficient management of all I/O resources. For deeper control and customization, access to API driven controls are also easily accessible. Plan infrastructure intelligently and remove the guess work involved in managing finite server resources to create fully optimized data center infrastructure.

Caringo Swarm leverages simple and emergent behavior with decentralized coordination to handle any rate, flow or size of data. Swarm turns standard hardware into a reliable pool of resources that adapts to any workload or use case while offering a foundation for new data services.

Cleversafe’s decentralized, shared-nothing storage architecture enables performance and capacity to scale independently, reaching petabyte levels and beyond.

Covering all the traffic inside datacenters, GuardiCore offers the only solution combining real-time detection of threats based on deep analysis of actual traffic, real time understanding, mitigation and remediation.

One Convergence Network Virtualization and Service Delivery (NVSD) Solution takes a policy driven approach and brings in the innovative concept of “Service Overlays” to go along with “Network Overlays” to virtualize networks and services. The solution innovates and extends SDN with Service Overlays for delivering L4 to L7 services with higher-level abstractions that are application friendly.

Quobyte turns your servers into a horizontal software-defined storage infrastructure. It is a complete storage product that can host any application out-of-the-box. Through fault-tolerance, flexible placement and integrated automation, Quobyte decouples configuration and operations from hardware.

The RING is a software-based storage that is built to scale to petabytes with performance, scaling and protection mechanisms appropriate for such scale. It enables your business to grow without limitations and extra overhead, works across 80% of your applications, and protects your data over 200% more efficiently at 50–70% lower cost.

The Scalr Cloud Management Platform packages all the cloud best practices in an extensible piece of software, giving your engineers the head start they need to finally focus on creating customer value, not on solving cloud problems.

StorPool Storage
StorPool is storage software. It runs on standard hardware – servers, drives, network – and turns them into high-performance storage system. StorPool replaces traditional storage arrays, all-flash arrays or other inferior storage software (SDS 1.0 solutions).

Stratoscale’s software transforms standard x86 servers into a hyper-converged infrastructure solution
combining high-performance storage with efficient cloud services, while supporting both
containers and virtualization on the same platform.

Core, storage and compute nodes connecting via Extreme Networks

Security Policy Orchestration for the World's Largest Enterprises.
Managing security policies on multi-vendor firewalls & cloud platforms.

by Maish Saidel-Keesing at August 19, 2018 09:17 AM

Integrating OpenStack into your Jenkins workflow

This is a re-post of my interview with Jason Baker of

Continuous integration and continuous delivery are changing the way software developers create and deploy software. For many developers, Jenkins is the go-to tool for making CI/CD happen. But how easy is it to integrate Jenkins with your OpenStack cloud platform?

Meet Maish Saidel-Keesing. Maish is a platform architect for Cisco in Israel focused on making OpenStack serve as a platform upon which video services can be deployed. He works to integrate a number of complementary solutions with the default out-of-the-box OpenStack project and to adapt Cisco's projects to have a viable cloud deployment model.

At OpenStack Summit in Vancouver next week, Maish is giving a talk called: The Jenkins Plugin for OpenStack: Simple and Painless CI/CD. I caught up with Maish to learn a little more about his talk, continuous integration, and where OpenStack is headed.


Without giving too much away, what can attendees expect to learn from your talk?

The attendees will learn about the journey that we went through 6-12 months ago, when we looked at using OpenStack as our compute resource for the CI/CD pipeline for several of our products. I'll cover the challenges we faced, why other solutions were not suitable, and how we overcame these challenges with a Jenkins plugin that we developed for our purposes, which we are open sourcing to the community at the summit.


What effects has CI/CD had on the development of software in recent years?

I think that CI/CD has allowed software developers to provide a better product for their customers. In allowing them to continuously deploy and test their software, they can provide better code. In addition, it has brought the developers closer to the actual deployments in the field. In the past, there was a clear disconnect between the people writing the software and those who deployed and supported it at the customer.

How can a developer integrate OpenStack into their Jenkins workflow?

Using the plugin we developed, it is very simple to integrate an OpenStack cloud as part of the resources consumed in your Jenkins workflow. All users need to do is provide a few parameters, such as endpoints, credentials, etc., and they will be able to start deploying to their OpenStack cloud.
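As a rough sketch of the kind of parameterization involved (the actual plugin configuration differs – every name below, from the image to the key, is a hypothetical placeholder), the command a build job ends up composing boils down to something like this:

```shell
# Hypothetical sketch: compose the nova boot command a build job might run.
# Image, flavor, key and instance names are placeholders, not real values.
boot_cmd() {
  image="$1"; flavor="$2"; name="$3"
  echo "nova boot --image $image --flavor $flavor --key-name jenkins-key $name"
}

boot_cmd jenkins-slave-image m1.medium build-slave-01
```

The point is simply that once the endpoint and credentials are configured, spinning up a build slave reduces to filling in a handful of parameters.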

How is the open source nature of this workflow an advantage for the organizations using it?

An open source project always has the benefit of having multiple people contributing and improving the code. It is always a good thing to have another view on a project with a fresh outlook. It improves the functionality, the quality and the overall experience for everyone.

Looking more broadly to the OpenStack Summit, what are you most excited about for Vancouver?

First and foremost, I look forward to networking with my peers. It is a vibrant and active community.

I would also like to see some tighter collaboration between the operators, the User Committee, the Technical Committee, and the projects themselves to understand what the needs are of those deploying and maintaining OpenStack in the field and to help them to achieve their goals.

One of the major themes I think we will see from this summit will be the spotlight on companies, organizations and others using the products. We'll see why they moved, and how OpenStack solves their problems. Scalability is no longer in question: scaling is a fact.

Where do you see OpenStack headed, in the Liberty release and beyond?

The community has undergone a big change in the last year, trying to define itself in a clearer way: what is OpenStack, and what it is not.

I hope that all involved continue to contribute, and that the projects focus more on features and problems that are fed to them from the field. It is a fine line to define, and usually not a clear one, but it is something that OpenStack (and all those who consider themselves part of the OpenStack community) has to address and solve, together.

by Maish Saidel-Keesing at August 19, 2018 09:17 AM

The OpenStack Kilo Summit - Recap

I have been home for just over a week from my trip to Vancouver for the OpenStack Kilo Summit (or Liberty Design Summit – take your pick).

It was a whirlwind of a week, jam-packed with sessions, conversations, meetings, presentations and community events.

There were a number of insights that I took with me, which I would like to share with you in this post and in some upcoming posts.

1. Ops is a real thing

There was a dedicated Ops track at the summit. Before we go into what actually happened there, I would like to clarify what I mean by Ops, and whether this differs from how it is defined by the OpenStack community.

For me, an operator is primarily the person who has to actually maintain the cloud infrastructure. That could mean a number of things:
  • Create packages for installing OpenStack
  • Actually installing OpenStack
  • Making sure that OpenStack keeps up and running
  • Monitors the infrastructure
  • Upgrades the infrastructure

There are also end-users, and these are the people that actually use OpenStack:
  • Provision instances
  • Deploy applications on top of those instances
  • Create tenants/users/projects

For the OpenStack community, these two groups are sometimes treated as one and the same, and in my honest opinion they definitely are not – and should not be. It is taking time, but I think the community is starting to understand that there are two distinct groups here, with very different sets of needs, which should be catered to quite differently.

There was a significant amount of discussion about how operators can get more involved, and honestly I must say that the situation has improved – drastically – compared to 12 months ago.

There are a number of working groups, the Win The Enterprise WG, the Product WG, the Monitoring and Logging WG, all of these have been meeting over the last year to try and hash out ways of getting more involved.

One of the interesting discussions that came out as a result of me running for the OpenStack TC (well, perhaps not directly, but I assume I had something to do with it) was how the OpenStack community wants to acknowledge people who are not committing code but are contributing back to the community. For more background, I refer you to these threads on the mailing list.

Something that kept on coming up again and again in sessions was that the Project groups are looking for more and more feedback from the people operating and deploying OpenStack, but the process of getting that feedback is broken/not working/problematic.

I do understand that the TC (and OpenStack) would like to protect the most valued resource that OpenStack has – and that of course is the people writing the code.

But there has to be an easier way of allowing people to submit the feedback – and perhaps there is…
A way for Operators/Users to submit feature requests.

2. Vendors are involved in OpenStack – and they are here to stay

They are not there out of the goodness of their hearts. They are there because they want to make money, and a lot of it. That is one (but not the only) reason why they contribute to open source projects.

OpenStack is no different. Each and every one of the vendors involved (and I will not name companies – because the sheer size of the list is just too long) are there to increase their market share, their revenue, their influence.

And that is a difficult dance to master. They are the ones providing the resources to commit code, and there are times when the agenda behind that is not purely community driven. This post sums it up pretty well.

As OpenStack has grown, he says, it’s turned into a corporate open source project, not a community-driven one. He spent a day walking around the show floor at the recent OpenStack Summit in Vancouver and said he didn’t find anyone talking about the original mission of the project. "Everyone’s talking about who’s making money, whose career is advancing, how much people get paid, how many workloads are in production," McKenty says. "The mission was to do things differently."

OpenStack is not a small community project any more – where everyone knows each other by name/face/IRC handle. It has grown up, come of age.

For better or for worse. Stay tuned for more.

As always please feel free to add your thoughts and comments below.

by Maish Saidel-Keesing at August 19, 2018 09:17 AM

Downloading all sessions from the #OpenStack Summit

A question was just posted to the OpenStack mailing list – and this is not the first time I have seen this request.

Can openstack conference video files be downloaded?

A while back I wrote a post about how you can download all the vBrownBag sessions from the past OpenStack summit.

Same thing applies here, with a slight syntax change.

You can use the same tool – youtube-dl (just the version has changed since that post – and therefore some of the syntax is different as well).

Download youtube-dl and make the file executable

curl \
-o /usr/local/bin/youtube-dl
chmod a+rx /usr/local/bin/youtube-dl

The videos are available on the OpenStack Youtube channel.

What you are looking for is all the videos that were uploaded from the Summit, that would mean between May 18th, 2015 and May 30th, 2015.

The command to do that would be

youtube-dl -ci -f best --dateafter 20150518 \
--datebefore 20150529

The options I have used are:

-c – Force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
-i – Continue on download errors, for example to skip unavailable videos in a playlist.
-f best – Download the best quality video available.
--dateafter – Only download videos uploaded on or after the given date.
--datebefore – Only download videos uploaded on or before the given date.
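One handy property of the YYYYMMDD format that --dateafter and --datebefore use: such dates order correctly as plain integers, which makes it easy to script similar range checks yourself. A quick pure-shell illustration (not youtube-dl itself):

```shell
# YYYYMMDD dates order correctly when compared as plain integers
upload=20150522   # a video's upload date
after=20150518    # the --dateafter bound
before=20150529   # the --datebefore bound

if [ "$upload" -ge "$after" ] && [ "$upload" -le "$before" ]; then
  echo "in range: would be downloaded"
else
  echo "out of range: skipped"
fi
```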

Be advised… this will take a while – and will use up a decent amount of disk space.

Happy downloading !!

by Maish Saidel-Keesing at August 19, 2018 09:16 AM


Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.


Last updated:
September 18, 2018 07:37 PM
All times are UTC.
