September 22, 2017

OpenStack Blog

OpenStack Developer Mailing List Digest September 16-22

Summaries

PTG

Survey/polls

  • Should we have Upstream Institutes at the PTG? yay or nay

Summaries

Gerrit Upgrade Update From Infra

  • Gerrit emails are slow because Gerrit sends them one at a time.
  • Web UI File Editor
    • Behaving oddly, possibly because of API timeouts. Users have also reported problems with Gertty.
  • Message

Install Guide vs. Tutorial

  • Since the doc-migration, people have had questions about the use of “Install Tutorial” versus “Install Guide” in the OpenStack manuals repository and project-specific repos.
    • The documentation team agrees this should be consistent.
    • The literal definition of “tutorial” is “a paper, book, film, or computer program that provides practical information about a specific subject.”
  • From the PTG discussions, one distinction made was that an install guide provides one of many possible ways to install the components.
  • Consistency is more important than bike shedding over the name.
    • Industry wise, what’s the trend?
  • Thread

Garbage Patches for simple typo fixes

  • Previous thread from 2016 on this
  • Various contributors are submitting many patches that are merely typo or style changes.
    • It has been expressed that this can cause CI resource starvation.
  • The TC created the Top 5 Help Wanted list so contributors know where the community most needs help.
  • This is a social issue, not a technical issue. Arguing about what is useful and what isn’t is probably not worth the effort here.
  • Communication and education are probably the best solutions here. For repeat offenders, an off-list email can make sure the communication is clear. Communicating this in the new contributor portal and at Upstream Institute would be helpful.
  • Thread

by Mike Perez at September 22, 2017 11:35 PM

OpenStack Superuser

Four ways to build open source community in your company

LOS ANGELES — The Open Community Conference was one of five tracks at the Linux Foundation’s recent Open Source Summit North America. Chaired by Jono Bacon, the track covered community management, establishing open source offices and open source culture in organizations.

If you missed it, here are four takeaways worth applying to your work.

Find the right community manager

Community managers seem to come from all walks of life: the engineer, the marketer, the team lead. But what makes the best community manager? Alan Clark, director of industry initiatives, emerging standards and open source at SUSE, has been involved in open source communities for more than 15 years. He’s seen trends in community management come and go, but the constant in successful communities is a manager who reflects the community’s values.

“Community management is a little different everywhere. It depends on the personalities of the managers and the personalities of the community.” Clark has been involved with both openSUSE and OpenStack, and offered the two as examples. “In the openSUSE community there’s very little marketing. It’s very developer-focused and less about promotion, social outreach and marketing. Most [community managers in openSUSE] come from engineering backgrounds, and that’s very appropriate, but in OpenStack, that wouldn’t reflect the needs of the overall community.”

“OpenStack has set core values that it has embraced, particularly transparency. The successful community manager has to personify those and be the enforcer of those. Community managers are the front line and they set the tone and the attitude of the community.”

Delete the bias against marketing

Marketing is often perceived as a lower-status role in the tech industry, as Deirdré Straughan called out in her presentation, which means that emerging open source projects or organizations working with open source often think they don’t need or want marketing. This is a naive mistake: good marketing (not the kind that insults your intelligence or abuses your information) is what communicates your open source work to the world. Toolkits let developers access your project, blueprints and roadmaps help people evaluate it, training gives them the skills to use and contribute, and yes, this is all “marketing.” Straughan challenged the audience to reconsider their bias against marketing and to recognize that great marketing is part of a successful open source strategy.

Identify the invested parties in your organization

Nithya Ruff, senior director of the open source practice at Comcast, and Duane O’Brien, open source programs evangelist at PayPal, urged people trying to start open source program offices in their organization to spend time identifying who at their organization has a vested interest in formally engaging with open source communities.

Is the legal team getting overwhelmed with requests for contribution information? Is engineering eagerly consuming open source? Is marketing desperate for help to clearly communicate the open-source basis for their products? Finding who has this special interest can act as the impetus to kickstart more formal engagement with open source communities.

Start with “inner source”

Shilla Saebi, open source community leader at Comcast, has found success by establishing open source practices internally, “inner source” as she’s dubbed it, which helps the organization and individuals develop policies around open source and become comfortable with contributing so they can successfully engage with open source communities.

Inner source looks similar to externally facing open source: licensing discussions with the legal team and meetups. At Comcast there are internal Slack channels dedicated to open source projects: the OpenStack channel has more than 1,200 members; Cloud Foundry, 900. Inner source acts as practice for engaging with open source communities, growing familiarity and confidence that translate to external engagement.

Share your thoughts on what makes open source communities tick with @WhyHiAnnabelle and @Superuser on Twitter.

Cover Photo // CC BY NC

The post Four ways to build open source community in your company appeared first on OpenStack Superuser.

by Anne Bertucio at September 22, 2017 03:35 PM

September 21, 2017

Ed Leafe

Queens PTG Recap

Last week was the second-ever OpenStack Project Teams Gathering, or PTG. It’s still an awkward name for a very productive conference. This time the PTG was held in Denver, Colorado, at a hotel several miles outside of downtown Denver. It was clear that the organizers from the OpenStack Foundation took the comments from the attendees … Continue reading "Queens PTG Recap"

by ed at September 21, 2017 08:13 PM

Rich Bowen

Event report: OpenStack PTG

Last week I attended the second OpenStack PTG, in Denver. The first one was held in Atlanta back in February.

This is not an event for everyone, and isn’t your standard conference. It’s a working meeting – a developers’ summit at which the next release of the OpenStack software is planned. The website is pretty blunt about who should, and should not, attend. So don’t sign up without knowing what your purpose is there, or you’ll spend a lot of time wondering why you’re there.

I went to do the second installment of my video series, interviewing the various project teams about what they did in the just-released version, and what they anticipate coming in the next release.

The first of these, at the PTG in Atlanta, featured only Red Hat engineers. (Those videos are HERE.) However, after reflection, I decided that it would be best to not limit it, but to expand it to the entire community, focusing on cross-company collaboration, and trying to get as many projects represented as I could.

So, in Denver I asked the various project PTLs (project team leads) to do an interview, or to assemble a group of people from the project to do one. I did 22 interviews, and I’m publishing them on the RDO YouTube channel – http://youtube.com/RDOCommunity – just as fast as I can get them edited.

I also hosted an RDO get-together to celebrate the Pike release, and we had just over 60 people show up for that. Thank you all so much for attending! (Photos and video from that coming soon to a blog near you!)

So, watch my YouTube channel, and hopefully by the end of next week I’ll have all of those posted.

I love working with the OpenStack community because they remind me of Open Source in the old days, when developers cared about the project and the community at least as much as, and often more than, the company that happens to pay their paycheck. It’s very inspiring to listen to these brilliant men and women talking about their projects.

by rbowen at September 21, 2017 06:35 PM

OpenStack Superuser

Paths to autonomous workload management at the edge

It’s been estimated that in the next three to five years the number of connected devices will reach a staggering 50 billion globally.

Even if that number sounds extreme, it’s undeniable that advances in silicon technology (e.g. the shrinking of computing and sensor components) and the evolution of 5G networks will drive the rise of capable edge devices and create many more relevant use cases.

Given that scenario, several associated challenges in the technology needed to support it must be identified and addressed first.

The recent OpenDev conference aimed to raise awareness and foster collaboration in this domain. Autonomous workload management at the edge was the topic of one of the working sessions, focused on the technical constraints, requirements and operational considerations when an application or workload has to be orchestrated at the edge. The main assumption is that several edge computing use cases (e.g. micro edge/mobile edge such as set-top boxes, terminals, etc.) will demand highly autonomous behavior due to connectivity constraints, latency, cost and so on. The sheer scale of edge infrastructure might also drive this autonomous behavior, since centrally managing all of these edge devices would be an enormous task. To this end, several operational considerations were discussed, as summarized below.

Workload orchestration

When it comes to autonomous workload management at the edge, effective orchestration is the most important issue. Whether the workload runs on bare metal, virtual machines or application containers, the need for automation and software-defined methodologies is apparent. In the NFVi world there is nowadays a clear tendency toward model-driven, declarative approaches to orchestration, such as TOSCA. The expectation is that the edge platform should include a functional component responsible not only for the orchestration and management of resources (e.g. VMs or containers) but also of the running applications or VNFs. Such an entity takes care of runtime healing and scaling as well as provisioning (edge-side or centrally triggered). Even if the goal is autonomous operation of workload orchestration, it’s expected that some kind of central orchestration entity (or regional manager) will still keep a reference of the edge state or drive provisioning of the edge. Fully autonomous behavior of a mesh-like edge network feels a bit futuristic and difficult to achieve in the short term, at least at large scale.

State and policy management

Autonomous workload orchestration also implies autonomous state management. To orchestrate effectively, not only must the state of the hosting platform (e.g. the virtual machine or container) be captured, but the state of the services or applications should be monitored as well. Today, most state management operations are handled by orchestration components (at the resource level) and by the application/VNF vendors themselves. However, there is no combined view of the state, which results in pretty primitive fault handling: when the state of a workload is faulty, the whole resource is restarted. In addition, Service Function Chaining (SFC) and the applications’ microservices paradigm introduce a composable state concept which potentially has to be considered. State abstraction is also important: a regional orchestrator might not need to keep the state of all components of an SFC, just an abstracted state of the whole chain. The edge orchestrator, on the other hand, must know the state of each service in the chain. Policy enforcement should follow the same pattern as state propagation. All in all, the above points suggest the need for a more capable state management structure that can tackle these new requirements. Whether the existing orchestrators take on these features or new modules and/or protocols have to be invented is up for discussion.

Managing packages of local repositories

Autonomous operation of workload orchestration at the edge requires local repositories of images and software packages. Assuming the connection from the edge device to the core network is either unreliable or very thin, only control plane operations should be transferred over the air. It was suggested that orchestration systems should explore cached or local repositories, synchronized on demand by some central component. Multicast features for pushing updates to the edge repositories should be considered too, especially if the number of edge devices increases exponentially.

Even if most of the issues and ideas discussed here are neither new nor unique, we can’t assume that the technologies and systems developed for data center operations can directly solve edge computing use cases. Perhaps it’s best to look at the problem with a fresh perspective and try to architect an edge computing platform that could serve all use cases (from micro to large edge), leveraging existing technology where and when possible but also investing in new technology where needed.

About the author

Gregory Katsaros is a services, platforms and cloud computing expert who has been working for several years in research and development activities and projects related to the adoption of such technologies and transformation of the industry. Katsaros has special interest in services and resources orchestration as well as network function virtualization and software defined networking technologies.

He holds a Ph.D. from the National Technical University of Athens on “Resource monitoring and management in Service Oriented Infrastructures and Cloud Computing” and has been contributing to research communities with research and code.

In the last few years, he’s been leading projects related to services and cloud computing, distributed systems, interoperability, orchestration, SDN, NFV transformation and more. He’s interested in transforming the telco and enterprise sectors by embracing automation and orchestration technologies, as well as investigating edge computing.

Cover Photo // CC BY NC

The post Paths to autonomous workload management at the edge appeared first on OpenStack Superuser.

by Gregory Katsaros at September 21, 2017 03:19 PM

StackHPC Team Blog

Upgrade to Pike using Kolla and Kayobe

OpenStack Pike

We have previously described a new kind of OpenStack infrastructure, built to combine polymorphic flexibility with HPC levels of performance, in the context of our project with the Square Kilometre Array. To take advantage of OpenStack's latest capabilities, this week we upgraded that infrastructure from Ocata to Pike.

Early on, we took a design decision to base our deployments on Kolla, which uses Docker to containerise the OpenStack control plane, transforming it into something approximating a microservice architecture.

Kolla is in reality several projects. There is the project to define the composition of the Docker containers for each OpenStack service, and then there are the projects to orchestrate the deployment of Docker containers across one or more control plane hosts. This could be done using Kolla-Kubernetes, but our preference is for Kolla-Ansible.

Kolla-Ansible builds upon a set of hosts already deployed and configured up to a baseline level where Ansible can drive the Docker deployment. Given we are typically starting from pallets of new servers in a loading dock, there is a gap to be filled to get from one to the other. To fill that role, we created Kayobe, loosely defined as "Kolla on Bifrost", and intended to perform a similar role to TripleO, but using only Ironic for the undercloud seed and driven by Ansible throughout. This approach has enabled us to incorporate some compelling features, such as Ansible-driven configuration of BIOS and RAID firmware parameters and network switch configuration.

There is no doubt that Kayobe has been a huge enabler for us, but what about Kolla? One of the advantages claimed for a containerised control plane is how it simplifies the upgrade process by severing the interlocking package dependencies of different services. This week we put this to the test, by upgrading a number of systems from Ocata to Pike.

This is a short guide to how we did it, and how it worked out...

Have a Working Test Plan

It may seem obvious, but it's an easily overlooked starting point: make a set of tests to ensure that your OpenStack system is working before you start. Then repeat those tests at any convenient point. By starting with a test plan that you know passes, you'll know for sure if you've broken something.

Otherwise in the depths of troubleshooting you'll have a lingering doubt that perhaps your cloud was broken in this way all along...
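
A minimal sketch of such a test plan, assuming the openstack CLI against the deployed cloud (the image, flavor and network names here are illustrative, not from any particular deployment):

# Boot and delete a tiny instance end to end; re-run at each stage.
openstack server create --image cirros --flavor m1.tiny \
    --network demo-net --wait smoke-test-vm
openstack server show smoke-test-vm -f value -c status   # expect ACTIVE
openstack server delete smoke-test-vm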

Preparing the System for Upgrade

We brought the system to the latest on the stable/ocata branch. This in itself shakes out a number of issues. Just how healthy are the kernel and OS on the controller hosts? Are the Neutron agent containers spinning, looking for lost namespaces? Is the kernel blocking on most cores before spewing out reams of kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! messages?

A host in this state is unlikely to succeed in moving one patchset forward, let alone a major OpenStack release.

One of Kolla's strengths is the elimination of dependencies between services. It makes it possible to deploy different versions of OpenStack services without worrying about dependency conflicts. This can be a very powerful advantage.

The ability to update a Kolla container forward along the same stable release branch establishes that the basic procedure works as expected. Getting the control plane migrated to the tip of the current release branch is a good precursor to making the version upgrade.
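
We drive this through Kayobe, but the underlying Kolla-Ansible steps look roughly like the following (the inventory path is illustrative):

# Fetch the latest images for the same stable branch, then redeploy
# so the containers are recreated from the refreshed images.
kolla-ansible -i ./multinode pull
kolla-ansible -i ./multinode deploy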

Staging the Upgrade

Take the leap on a staging or development system and you'll be more confident of landing in one piece on the other side. In tests on a development system, we identified and fixed a number of issues that would each have become a major problem on the production system upgrade.

Even a single-node staging system will find problems for you.

For example:

  • During the Pike upgrade, the Docker Python bindings package renames from docker_py to docker. The two are mutually exclusive. The Python environment we use for Kolla-Ansible must start the process with docker_py installed and, at the appropriate point, transition to docker. We found a way through and developed Kayobe to perform this orchestration (see the sketch after this list).
  • We carried forward a piece of work to enable our Kolla logs to go via Fluentd to Monasca, which just made its way upstream.
  • We hit a problem with Kolla-Ansible's RabbitMQ containers generating duplicate entries in /etc/hosts, which we worked around while the root cause is investigated.
  • We found and fixed some more issues with Kolla-Ansible pre-checks for both Ironic and Murano.
  • We hit a bug generating config for mariadb - easily fixed once the problem was identified.
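
For reference, the docker_py to docker transition boils down to an ordered swap, something like the following sketch of the step Kayobe now orchestrates for us (pip treats docker_py and docker-py as the same name):

# The two bindings conflict, so the old one must be removed first.
pip uninstall -y docker-py
pip install docker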

Performing the Upgrade

On the day, at a production scale, new problems can occur that were not exposed at the scale of a staging system.

In a production upgrade, the best results come from bringing all the technical stakeholders together while the upgrade progresses. This enables a team to draw on all the expertise it needs to work through issues encountered.

In production upgrades, we worked through new issues:

That final point should have been found by our test plan, but was not covered (this time). Arguably it should have been found by Kolla-Ansible's CI testing too.

The Early Bird Gets The Worm

Being an early adopter has both benefits and drawbacks. Kolla, Ansible and Kayobe have made it possible to do what we did - successfully - with a small but talented team.

Our users have scientific work to do, and our OpenStack projects exist to support that.

We are working to deliver infrastructure with cutting-edge capabilities that exploit OpenStack's latest features. We are proud to take some credit for our upstream contributions, and excited to make the most of these new powers in Pike.

by Stig Telfer at September 21, 2017 11:00 AM

SUSE Conversations

Nagrania zajęć i prezentacje z Letniej Akademii SUSE (Recordings and presentations from the SUSE Summer Academy)

The second edition of the SUSE Summer Academy (Letnia Akademia SUSE) is behind us. During the sessions we showed the most interesting features of our solutions, as well as how to start using them. Among other topics, we covered containerization (Kubernetes/MicroOS/Salt), cloud deployment (OpenStack), software-defined storage (Ceph), business continuity (Live Patching) and proper configuration of Linux servers (Salt). SUSE consultants joined in organizing this year's Academy, …

+read more

The post Nagrania zajęć i prezentacje z Letniej Akademii SUSE appeared first on SUSE Blog.

by Rafal Kruschewski at September 21, 2017 10:37 AM

September 20, 2017

NFVPE @ Red Hat

Ghost Riding The Whip — A complete Kubernetes workflow without Docker, using CRI-O, Buildah & kpod

It is my decree that whenever you are using Kubernetes without using Docker you are officially “ghost riding the whip”, maybe even “ghost riding the kube”. (Well, I’m from Vermont, so I’m more like “ghost riding the combine”). And again, we’re running Kubernetes without Docker, but this time? We’ve got an entire workflow without Docker. From image build, to running container, to inspecting the running containers. Thanks to the good folks from the OCI project and Project Atomic, we’ve got kpod for working with running containers, and we’ve got buildah for building our images. And of course, don’t leave out CRI-O which makes the magic happen to get it all running in Kube without Docker. Fire up your terminals, because you’re about to ghost ride the kube.

by Doug Smith at September 20, 2017 08:00 PM

Major Hayden

Import RPM repository GPG keys from other keyservers temporarily

I’ve been working through some patches to OpenStack-Ansible lately to optimize how we configure yum repositories in our deployments. During that work, I ran into some issues where pgp.mit.edu was returning 500 errors for some requests to retrieve GPG keys.

Ansible was returning this error:

curl: (22) The requested URL returned error: 502 Proxy Error
error: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x61E8806C: import read failed(2)

How does the rpm command know which keyserver to use? Let’s use the --showrc argument to show how it is configured:

$ rpm --showrc | grep hkp
-14: _hkp_keyserver http://pgp.mit.edu
-14: _hkp_keyserver_query   %{_hkp_keyserver}:11371/pks/lookup?op=get&search=0x

How do we change this value temporarily to test a GPG key retrieval from a different server? There’s an argument for that as well: --define:

$ rpm --help | grep define
  -D, --define='MACRO EXPR'        define MACRO with value EXPR

We can assemble that on the command line to set a different keyserver temporarily:

# rpm -vv --define="%_hkp_keyserver http://pool.sks-keyservers.net" --import 0x61E8806C
-- SNIP --
D: adding "63deac79abe7ad80e147d671c2ac5bd1c8b3576e" to Sha1header index.
-- SNIP --

Let’s verify that our new key is in place:

# rpm -qa | grep -i gpg-pubkey-61E8806C
gpg-pubkey-61e8806c-5581df56
# rpm -qi gpg-pubkey-61e8806c-5581df56
Name        : gpg-pubkey
Version     : 61e8806c
Release     : 5581df56
Architecture: (none)
Install Date: Wed 20 Sep 2017 10:17:11 AM CDT
Group       : Public Keys
Size        : 0
License     : pubkey
Signature   : (none)
Source RPM  : (none)
Build Date  : Wed 17 Jun 2015 03:57:58 PM CDT
Build Host  : localhost
Relocations : (not relocatable)
Packager    : CentOS Virtualization SIG (http://wiki.centos.org/SpecialInterestGroup/Virtualization) <security@centos.org>
Summary     : gpg(CentOS Virtualization SIG (http://wiki.centos.org/SpecialInterestGroup/Virtualization) <security@centos.org>)
Description :
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: rpm-4.11.3 (NSS-3)

mQENBFWB31YBCAC4dFmTzBDOcq4R1RbvQXLkyYfF+yXcsMA5kwZy7kjxnFqBoNPv
aAjFm3e5huTw2BMZW0viLGJrHZGnsXsE5iNmzom2UgCtrvcG2f65OFGlC1HZ3ajA
8ZIfdgNQkPpor61xqBCLzIsp55A7YuPNDvatk/+MqGdNv8Ug7iVmhQvI0p1bbaZR
0GuavmC5EZ/+mDlZ2kHIQOUoInHqLJaX7iw46iLRUnvJ1vATOzTnKidoFapjhzIt
i4ZSIRaalyJ4sT+oX4CoRzerNnUtIe2k9Hw6cEu4YKGCO7nnuXjMKz7Nz5GgP2Ou
zIA/fcOmQkSGcn7FoXybWJ8DqBExvkJuDljPABEBAAG0bENlbnRPUyBWaXJ0dWFs
aXphdGlvbiBTSUcgKGh0dHA6Ly93aWtpLmNlbnRvcy5vcmcvU3BlY2lhbEludGVy
ZXN0R3JvdXAvVmlydHVhbGl6YXRpb24pIDxzZWN1cml0eUBjZW50b3Mub3JnPokB
OQQTAQIAIwUCVYHfVgIbAwcLCQgHAwIBBhUIAgkKCwQWAgMBAh4BAheAAAoJEHrr
voJh6IBsRd0H/A62i5CqfftuySOCE95xMxZRw8+voWO84QS9zYvDEnzcEQpNnHyo
FNZTpKOghIDtETWxzpY2ThLixcZOTubT+6hUL1n+cuLDVMu4OVXBPoUkRy56defc
qkWR+UVwQitmlq1ngzwmqVZaB8Hf/mFZiB3B3Jr4dvVgWXRv58jcXFOPb8DdUoAc
S3u/FLvri92lCaXu08p8YSpFOfT5T55kFICeneqETNYS2E3iKLipHFOLh7EWGM5b
Wsr7o0r+KltI4Ehy/TjvNX16fa/t9p5pUs8rKyG8SZndxJCsk0MW55G9HFvQ0FmP
A6vX9WQmbP+ml7jsUxtEJ6MOGJ39jmaUvPc=
=ZzP+
-----END PGP PUBLIC KEY BLOCK-----

Success!

If you want to override the value permanently, create a ~/.rpmmacros file and add the following line to it:

%_hkp_keyserver http://pool.sks-keyservers.net
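
To confirm the override is being picked up, rpm can expand the macro for you; with the line above in ~/.rpmmacros, the expansion should reference the new keyserver:

$ rpm --eval '%_hkp_keyserver_query'
http://pool.sks-keyservers.net:11371/pks/lookup?op=get&search=0x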

Photo credit: Wikipedia

The post Import RPM repository GPG keys from other keyservers temporarily appeared first on major.io.

by Major Hayden at September 20, 2017 03:24 PM

OpenStack Superuser

Takeaways from the first Diversity Empowerment Summit

LOS ANGELES — The expo hall booths were already packed up, but a few hundred Open Source Summit attendees returned to the JW Marriott to talk about some of the most persistent problems in tech at the first Diversity Empowerment Summit (DES).

The daylong get-together had a broad scope: gender, race, nationality and religion are all focus areas where initiatives aim to raise awareness and take action to increase representation in open source communities.

From game theory to chaos theory

Kate Ertmann kicked off DES by setting the stage for what’s happening when new generations enter the workplace.

“Millennials are making chaos in the workplace,” she said. “Generation Y and Generation Z will not tolerate when there is no diversity in the workplace and inauthentic attempts around being diverse. They will not tolerate when they are not reflected in the workplace.”

Ertmann said this illustrates a shift from game theory to chaos theory and the need to break down the 20th-century workplace to build anew.

Steps include reevaluating family leave, coverage for longer medical situations—particularly among all configurations of families—and flexible work hours. She applauded the onsite childcare offered at the Open Source Summit, which extended an opportunity to parents—and mothers in particular—who previously may have been unable to attend.

When asked by an attendee how to create a more welcoming workplace environment, Ertmann said: “Just ask. Creating feedback loops is the biggest opportunity to create a more inclusive workplace.”

Chasing Grace

A trailer for the upcoming documentary series “Chasing Grace” was played to a captivated audience. The series was created to share the stories of women who faced adversity and became change agents in the tech workplace. Inspired by pioneering computer scientist Grace Hopper, early sponsors include the Linux Foundation and the Cloud Foundry Foundation.

Producer Jennifer Cloer was joined in a panel by Comcast’s Nithya Ruff, the Cloud Foundry Foundation’s Abby Kearns and ScoutSavvy’s Kathryn Brown.

“My hope is that these stories will surface how women are navigating adversity and provide a blueprint for women who want to join the tech industry,” Cloer said.

When asked about the current culture of the tech industry, both Ruff and Brown encouraged attendees to include women in workplace conversations around benefits, abandoning assumptions and processes that don’t engage women in the discussion.

Ruff did suggest that there has been a shift in conversation.

“The dialogue has changed from trying to change the woman to fit the workplace to changing the workplace to be welcoming for women,” she said.

Research from Chasing Grace indicates that women are twice as likely as men to leave the tech industry due to its lack of inclusivity, so Cloer asked panelists what has kept them in it.

Emphasizing the impact of having diverse voices represented, Ruff referenced a previous presentation by Amy Chen of Rancher Labs, who said that when she didn’t have a seat at the table, she brought her own table.

“I work with incredibly bright people and I get to change the world instead of just being a consumer of the world,” Ruff said. “Some of the biggest changes will come if our voices are just heard.”

Pay it back, pay it forward

Munira Tayabji introduced a concept early in the day that was repeated by several speakers and in the hallway track—remember to pay it back and pay it forward through mentoring and networking events.

Chen credited a female mentor who had helped her land an interview that changed her career.

“I would not be standing up here if I didn’t have others who publicly supported me,” said Chen, echoing Tayabji’s sentiment.

During the “Chasing Grace” panel, Brown credited this notion of mentorship and the opportunity to mentor as the driving force that keeps her dedicated to the tech industry.

“Getting into tech made me realize that I can be that person that can do good things at scale,” she said. “I feel the responsibility to the women who have invested in me, to younger women to be a role model and a responsibility to do something big that positively impacts the world.”

Join the conversation – get involved in mentoring activities through the Linux Foundation or the OpenStack Foundation and learn more about how you can get involved in the conversations around tech diversity.

Cover Photo // CC BY NC

The post Takeaways from the first Diversity Empowerment Summit appeared first on OpenStack Superuser.

by Allison Price at September 20, 2017 01:42 PM

James Page

OpenStack Charms @ Denver PTG

Last week, myself and a number of the OpenStack Charms team had the pleasure of attending the OpenStack Project Teams Gathering in Denver, Colorado.

The first two days of the PTG were dedicated to cross-project discussions, with the last three days focused on project-specific discussion and work in dedicated rooms.

Here’s a summary of the charm related discussion over the week.

Cross Project Discussions

Skip Level Upgrades

This topic was discussed at the start of the week, in the context of supporting upgrades across multiple OpenStack releases for operators. What was immediately evident was that this was really a discussion about ‘fast-forward’ upgrades, rather than actually skipping any specific OpenStack series as part of a cloud upgrade. Deployments would still need to step through each OpenStack release series in turn, so the discussion centred on how to make this much easier for operators and deployment tools to consume than it has been to date.

There was general agreement on the principle that all steps required to update a service between series should be supported whilst the service is offline – i.e. all database migrations can be completed without the services actually running. This would allow multiple upgrade steps to be completed without having to start services up on interim steps. Note that a lot of projects already support this approach, but it has never been agreed as general policy; folding it into the ‘supports-upgrade‘ tag was one of the actions resulting from this discussion.

In the context of the OpenStack Charms, we already follow something along these lines for minimising the amount of service disruption in the control plane during OpenStack upgrades; with implementation of this approach across all projects, we can avoid having to start up services on each series step as we do today, further optimising the upgrade process delivered by the charms for services that don’t support rolling upgrades.
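
As a purely conceptual sketch of that fast-forward flow (the release names, unit globs and commands below are illustrative only, not a supported procedure):

# Services stay down for the whole walk; only per-release database
# migrations run at each interim step.
systemctl stop 'openstack-nova-*'
for release in newton ocata pike; do
    # ...switch the installed code/images to $release here, then
    # apply its schema migrations with the services still offline:
    nova-manage db sync
done
systemctl start 'openstack-nova-*'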

Policy in Code

Most services in OpenStack rely on a policy.{json,yaml} file to define the policy for role-based access to API endpoints – for example, which operations require admin-level permissions for the cloud. Moving all policy default definitions to code rather than a configuration file is a goal for the Queens development cycle.

This approach will make adapting policies as part of an OpenStack Charm based deployment much easier, as we only have to manage the delta on top of the defaults, rather than having to manage the entire policy file for each OpenStack release.  Notably Nova and Keystone have already moved to this approach during previous development cycles.
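
To illustrate the delta idea: with defaults in code, a deployment only needs to render the rules an operator actually overrides. A hypothetical policy.yaml fragment for Keystone (the rule is a real policy target, but the override value here is invented):

# Only overridden rules appear; everything else falls back to the
# defaults registered in the service code.
cat > /etc/keystone/policy.yaml <<'EOF'
"identity:list_projects": "role:admin or role:support"
EOF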

Deployment (SIG)

During the first two days, some cross-deployment-tool discussions were held on a variety of topics; of specific interest for the OpenStack Charms was the discussion around health/status middleware for projects, so that the general health of a service can be assessed via its API – this would cover in-depth checks such as access to database and messaging resources, as well as access to other services that the checked service might depend on – for example, can Nova access Keystone’s API for authentication of tokens? There was general agreement that this was a good idea, and it will be proposed as a community goal for the OpenStack project.

OpenStack Charms Devroom

Keystone: v3 API as default

The OpenStack Charms have optionally supported the Keystone v3 API for some time. The Keystone v2 API is officially deprecated, so we discussed the approach for switching the default API deployed by the charms going forward. In summary:

  • New deployments should default to the v3 API and associated policy definitions
  • Existing deployments that get upgraded to newer charm releases should not switch automatically to v3, limiting the impact on services built around v2-based deployments already in production.
  • The charms already support switching from v2 to v3, so v2 deployments can upgrade as and when they are ready to do so.

At some point in time, we’ll have to automatically switch v2 deployments to v3 on OpenStack series upgrade, but that does not have to happen yet.
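
As an illustrative one-liner, switching a deployment over is a single charm configuration change; the option name below reflects our understanding of the keystone charm, so verify it against your charm release:

# Assumed option name -- confirm with "juju config keystone" first.
juju config keystone preferred-api-version=3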

Keystone: Fernet Token support

The charms currently only support UUID-based tokens (since PKI support was dropped from Keystone). The preferred format is now Fernet, so we should implement this in the charms – we should be able to leverage the existing PKI key management code to an extent to support Fernet tokens.
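
As a sketch of the primitives involved, keystone-manage already provides the key repository setup and rotation that a charm implementation would need to orchestrate (in a charm this would likely run on the lead unit, with keys distributed to peers):

keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
keystone-manage fernet_rotate --keystone-user keystone --keystone-group keystone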

Stable Branch Life-cycles

Currently the OpenStack Charms team actively maintains two branches – the current development focus in the master branch, and the most recent stable branch – which right now is stable/17.08.  At the point of the next release, the stable/17.08 branch is no longer maintained, being superseded by the new stable/XX.XX branch.  This is reflected in the promulgated charms in the Juju charm store as well.  Older versions of charms remain consumable (albeit there appears to be some trimming of older revisions which needs investigating). If a bug is discovered in a charm version from an inactive stable branch, the only course of action is to upgrade to the latest stable version for fixes, which may also include new features and behavioural changes.

There are some technical challenges with regard to consumption of multiple stable branches from the charm store – we discussed using a different team namespace for an ‘old-stable’ style consumption model, which is not that elegant but would work.  Maintaining more branches means more resource effort for cherry-picks and reviews, which is not feasible with the amount of time the development team currently has for these activities, so no change for the time being!

Service Restart Coordination at Scale

tl;dr no one wants enabling debug logging to take out their rabbits

When running the OpenStack Charms at scale, parallel restarts of daemons for services with large numbers of units (we specifically discussed hundreds of compute units) can generate a high load on underlying control plane infrastructure as daemons drop and re-connect to message and database services potentially resulting in service outages. We discussed a few approaches to mitigate this specific problem, but ended up with focus on how we could implement a feature which batched up restarts of services into chunks based on a user provided configuration option.
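
Purely to illustrate the batching idea (the option name below is hypothetical; the real interface is defined in the spec referenced below):

# Hypothetical option: restart at most ten units' daemons at a time.
juju config nova-compute restart-batch-size=10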

You can read the full details in the proposed specification for this work.

We also had some good conversation around how unit-level overrides for some configuration options would be useful – supporting the use case where a user wants to enable debug logging for a single unit of a service (maybe it’s causing problems) without having to restart services across all units.  This is not directly supported by Juju today – but we’ll make the request!

Cross Model Relations – Use Cases

We brainstormed some ideas about how we might make use of the new cross-model relation features being developed for future Juju versions; some general ideas:

  • Multiple Region Cloud Deployments
    • Keystone + MySQL and Dashboard in one model (supporting all regions)
    • Each region (including region-specific control plane services) deployed into a different model and controller, potentially using different MAAS deployments in different DCs.
  • Keystone Federation Support
    • Use of Keystone deployments in different models/controllers to build out federated deployments, with one lead Keystone acting as the identity provider to other peon Keystones in different regions or potentially completely different OpenStack Clouds.

We’ll look to use the existing relations for some of these ideas, so as the implementation of this feature in Juju becomes more mature we can be well positioned to support its use in OpenStack deployments.

Deployment Duration

We had some discussion about the length of time taken to deploy a fully HA OpenStack Cloud onto hardware using the OpenStack Charms and how we might improve this by optimising hook executions.

There was general agreement that scope exists in the charms to improve general hook execution time – specifically in charms such as RabbitMQ and Percona XtraDB Cluster which create and distribute credentials to consuming applications.

We also need to ensure that we’re tracking any improvements made with good baseline metrics on charm hook execution times on reference hardware deployments so that any proposed changes to charms can be assessed in terms of positive or negative impact on individual unit hook execution time and overall deployment duration – so expect some work in CI over the next development cycle to support this.

As a follow up to the PTG, the team is looking at whether we can use the presence of a VIP configuration option to signal to the charm to postpone any presentation of access relation data to the point after which HA configuration has been completed and the service can be accessed across multiple units using the VIP.  This would potentially reduce the number (and associated cost) of interim hook executions due to pre-HA relation data being presented to consuming applications.

Mini Sprints

On the Thursday of the PTG, we held a few mini-sprints to get some early work done on features for the Queens cycle; specifically we hacked on:

Good progress was made in most areas with some reviews already up.

We had a good turnout with 10 charm developers in the devroom – thanks to everyone who attended and a special call-out to Billy Olsen who showed up with team T-Shirts for everyone!

We have some new specs already up for review, and I expect to see a few more over the next two weeks!

EOM


by JavaCruft at September 20, 2017 10:51 AM

Lee Yarwood

OpenStack - Fast-forward upgrades - Report

http://lists.openstack.org/pipermail/openstack-dev/2017-September/122347.html

My thanks again to everyone who attended and contributed to the skip-level upgrades track over the first two days of last week’s PTG. I’ve included a short summary of our discussions below, with a list of agreed actions for Queens at the end.

tl;dr s/skip-level/fast-forward/g

https://etherpad.openstack.org/p/queens-PTG-skip-level-upgrades

Monday

“Busy morning in the skip-level upgrades room! #OpenStackPTG #BOFF” — Lee Yarwood (@lyarwood_), September 11, 2017

During our first session we briefly discussed the history of the skip-level upgrades effort within the community and the various misunderstandings that have arisen from previous conversations around this topic at past events.

September 20, 2017 10:41 AM

September 19, 2017

RDO

Recent blog posts

It's been a few weeks since I did one of these blog wrapups, and there's been a lot of great content by the RDO community recently.

Here's some of what we've been talking about recently:

Project Teams Gathering (PTG) report - Zuul by tristanC

The OpenStack infrastructure team gathered in Denver (September 2017). This article reports some of Zuul's topics that were discussed at the PTG.

Read more at http://rdoproject.org/blog/2017/09/PTG-report-zuul/

Evaluating Total Cost of Ownership of the Identity Management Solution by Dmitri Pal

Increasing Interest in Identity Management: During the last several months I’ve seen a rapid growth of interest in Red Hat’s Identity Management (IdM) solution. This might have been due to different reasons.

Read more at http://rhelblog.redhat.com/2017/09/18/evaluating-total-cost-of-ownership-of-the-identity-management-solution/

Debugging TripleO Ceph-Ansible Deployments by John

Starting in Pike it is possible to use TripleO to deploy Ceph in containers using ceph-ansible. This is a guide to help you if there is a problem. It asks questions, somewhat rhetorically, to help you track down the problem.

Read more at http://blog.johnlikesopenstack.com/2017/09/debug-tripleo-ceph-ansible.html

Make a NUMA-aware VM with virsh by John

Grégory showed me how he uses virsh edit on a VM to add something like the following:

Read more at http://blog.johnlikesopenstack.com/2017/09/make-numa-aware-vm-with-virsh.html

Writing a SELinux policy from the ground up by tristanC

SELinux is a mechanism that implements mandatory access controls in Linux systems. This article shows how to create a SELinux policy that confines a standard service:

Read more at http://rdoproject.org/blog/2017/09/SELinux-policy-from-the-ground-up/

Trick to test external ceph clusters using only tripleo-quickstart by John

TripleO can stand up a Ceph cluster as part of an overcloud. However, if all you have is a tripleo-quickstart env and you want to test an overcloud feature which uses an external Ceph cluster, then you can have quickstart stand up two heat stacks: one to make a separate ceph cluster and the other to stand up an overcloud which uses that ceph cluster.

Read more at http://blog.johnlikesopenstack.com/2017/09/trick-to-test-external-ceph-clusters.html

RDO Pike released by Rich Bowen

The RDO community is pleased to announce the general availability of the RDO build for OpenStack Pike for RPM-based distributions, CentOS Linux 7 and Red Hat Enterprise Linux. RDO is suitable for building private, public, and hybrid clouds. Pike is the 16th release from the OpenStack project, which is the work of more than 2300 contributors from around the world (source).

Read more at http://rdoproject.org/blog/2017/09/rdo-pike-released/

OpenStack Summit Sydney preview: Red Hat to present at more than 40 sessions by Peter Pawelski, Product Marketing Manager, Red Hat OpenStack Platform

The next OpenStack Summit will take place in Sydney, Australia, November 6-8. And despite the fact that the conference will only run three days instead of the usual four, there will be plenty of opportunities to learn about OpenStack from Red Hat’s thought leaders.

Read more at http://redhatstackblog.redhat.com/2017/08/31/openstack-summit-fall2017-preview/

Scheduled snapshots by Tim Bell

While most of the machines on the CERN cloud are configured using Puppet with state stored in external databases or file stores, there are a few machines where this has been difficult, especially for legacy applications. Doing a regular snapshot of these machines would be a way of protecting against failure scenarios such as hypervisor failure or disk corruptions.

Read more at http://openstack-in-production.blogspot.com/2017/08/scheduled-snapshots.html

Ada Lee: OpenStack Security, Barbican, Novajoin, TLS Everywhere in Ocata by Rich Bowen

Ada Lee talks about OpenStack Security, Barbican, Novajoin, and TLS Everywhere in Ocata, at the OpenStack PTG in Atlanta, 2017.

Read more at http://rdoproject.org/blog/2017/08/ada-lee-openstack-security-barbican-novajoin-tls-everywhere-in-ocata/

Octavia Developer Wanted by assafmuller

I’m looking for a Software Engineer to join the Red Hat OpenStack Networking team. I am presently looking to hire in Europe, Israel and US East. The candidate may work from home or from one of the Red Hat offices. The team is globally distributed and comprised of talented, autonomous, empowered and passionate individuals with a healthy work/life balance. The candidate will work on OpenStack Octavia and LBaaS. The candidate will write and review code while working with upstream community members and fellow Red Hatters. If you want to do open source, Red Hat is objectively where it’s at. We have an institutional culture of open source at all levels and this has a ripple effect on your day to day and your career at the company.

Read more at https://assafmuller.com/2017/08/18/octavia-developer-wanted/

by Rich Bowen at September 19, 2017 03:50 PM

OpenStack Superuser

Building a collaborative community around open source

LOS ANGELES — In an ant colony, members work together to create a thriving superorganism that’s exponentially more powerful than any single contributor. The same can be said for open source, where vast networks of global members are transforming technology, science and culture. This is one of the most striking takeaways from the first Open Source Summit, held recently in Los Angeles and hosted by the Linux Foundation.

Hundreds of open source enthusiasts, developers and end users gathered to discuss the principles of open source and how they can be applied to foster collaboration, spur innovation and solve real-world problems, like the number of water bottles that end up in the ocean. Under the Open Source Summit umbrella, the conference organized content into five sub-events: CloudOpen, ContainerCon, LinuxCon, the Open Community Conference and the Diversity Empowerment Summit.

Jim Zemlin, executive director of the Linux Foundation, kicked off the event and acted as emcee for the keynotes. Setting the stage for the Summit’s theme “inspiration everywhere,”  Zemlin hit on the explosive growth of open source.

“Open source isn’t growing – it’s actually accelerating exponentially in terms of its influence in technology and society,” Zemlin said, citing 23 million open source developers worldwide and 41 billion lines of code to back up his point.

You can’t have such an explosive growth of open source without a few challenges, however. Zemlin said the biggest bottleneck to open source growth in organizations is that they don’t know how to participate in open source. Employees need training on how to thrive in the “colony.”

“Projects with sustainable ecosystems are the ones that matter,” Zemlin said. “Successful projects depend on members, developers, standards and infrastructure to develop products that the market will adopt.”

How to build and measure success in open source 

After establishing the what—organizations must be involved and contributing to open source in order for the project to be successful—Zemlin answered the how by introducing the Open Source Guides for the Enterprise. Developed with the TODO Group, this resource includes best practices on starting with open source in your organization, including how to create an open source program and tools for managing open source.

To gauge the success of these open source initiatives, Zemlin also announced the Linux Foundation’s newest project, CHAOSS, Community Health Analytics Open Source Software.

Achieved through collaboration of multiple organizations and communities like the OpenStack Foundation, Bitergia, Mozilla, the University of Missouri and more, CHAOSS will focus on creating analytics and metrics to help define community health.

Grow a community, not a crowd

On Tuesday, Joseph Gordon-Levitt, director, entrepreneur and most familiar to tech folks for his recent role in “Snowden,” took the keynote stage to talk about the parallels between open source culture and his online collaborative production company HitRecord.

Although some of his insights were more focused on the entertainment industry, Gordon-Levitt made two key relatable points about growing a functioning community: Instead of thinking about the crowd, think about community, and instead of socializing, collaborate.

“Every member of a community is a unique individual,” he said. “The strength of a community is less about the quantity of people and more about the quality of people and their interactions.”

On collaboration, he pointed to the strength of people connecting online to work toward a common goal, whether on HitRecord or GitHub—echoing the importance of collaboration that Zemlin had discussed the previous day.

“As we move forward in the future, what kind of impact will community have?” Gordon-Levitt said. “If people can do more than just connect but also collaborate, I start to feel optimistic about the future. It’s not about the capabilities of the technology, it’s what the technology can help us do as human beings.”

Zemlin and Gordon-Levitt set the stage for a week centered around the power of open source, collaboration and community, and breakouts throughout the week shared real examples of what open source collaboration can achieve.

Nithya Ruff from Comcast and Duane O’Brien from PayPal presented the value of building open source departments within their own companies; Joe Leaver from Mercedes-Benz Research and Development talked about the power of collaborating with a new open source partner, composure.ai, to do research around autonomous driving; and Chris Price from Ericsson discussed the value and impact of working with open source communities like ONAP, OPNFV and OpenStack to accelerate the shift from 4G to 5G networks.

For more about the inspiration of collaboration from the Open Source Summit, you can catch the lineup of keynote videos online and stay tuned on Superuser for more event coverage.

Cover Photo // CC BY NC

The post Building a collaborative community around open source appeared first on OpenStack Superuser.

by Allison Price at September 19, 2017 03:36 PM

RDO

Project Teams Gathering (PTG) report - Zuul

The OpenStack infrastructure team gathered in Denver (September 2017). This article reports some of Zuul's topics that were discussed at the PTG.

For your reference, I highlighted some of the new features coming in Zuul version 3 in this article.

Cutover and jobs migration

The OpenStack community has grown a complex set of CI jobs over the past several years that now needs to be migrated. A zuul-migrate script has been created to automate the migration from the Jenkins Job Builder format to the new Ansible-based job definitions. The migrated jobs are prefixed with "-legacy" to indicate they still need to be manually refactored to fully benefit from the ZuulV3 features.

The team couldn't finish the migration and disable the current ZuulV2 services at the PTG because the job migration took longer than expected. However, a new cutover attempt will occur in the next few weeks.

Ansible devstack job

The devstack job has been completely rewritten as a fully fledged Ansible job. This is a good example of what a job looks like in the new Zuul:

A project that needs a devstack CI job needs this new job definition:

- job:
    name: shade-functional-devstack-base
    parent: devstack
    description: |
      Base job for devstack-based functional tests
    pre-run: playbooks/devstack/pre
    run: playbooks/devstack/run
    post-run: playbooks/devstack/post
    required-projects:
      # These jobs will DTRT when shade triggers them, but we want to make
      # sure stable branches of shade never get cloned by other people,
      # since stable branches of shade are, well, not actually things.
      - name: openstack-infra/shade
        override-branch: master
      - name: openstack/heat
      - name: openstack/swift
    roles:
      - zuul: openstack-infra/devstack-gate
    timeout: 9000
    vars:
      devstack_localrc:
        SWIFT_HASH: "1234123412341234"
      devstack_local_conf:
        post-config:
          "$CINDER_CONF":
            DEFAULT:
              osapi_max_limit: 6
      devstack_services:
        ceilometer-acentral: False
        ceilometer-acompute: False
        ceilometer-alarm-evaluator: False
        ceilometer-alarm-notifier: False
        ceilometer-anotification: False
        ceilometer-api: False
        ceilometer-collector: False
        horizon: False
        s-account: True
        s-container: True
        s-object: True
        s-proxy: True
      devstack_plugins:
        heat: https://git.openstack.org/openstack/heat
      shade_environment:
        # Do we really need to set this? It's cargo culted
        PYTHONUNBUFFERED: 'true'
        # Is there a way we can query the localconf variable to get these
        # rather than setting them explicitly?
        SHADE_HAS_DESIGNATE: 0
        SHADE_HAS_HEAT: 1
        SHADE_HAS_MAGNUM: 0
        SHADE_HAS_NEUTRON: 1
        SHADE_HAS_SWIFT: 1
      tox_install_siblings: False
      tox_envlist: functional
      zuul_work_dir: src/git.openstack.org/openstack-infra/shade

This new job definition greatly simplifies the devstack integration tests, and projects now have much finer-grained control over their integration with the other OpenStack projects.

Dashboard

I have been working on the new zuul-web interfaces to replace the scheduler webapp so that we can scale out the REST endpoints and prevent direct connections to the scheduler. Here is a summary of the new interfaces:

  • /tenants.json : return the list of tenants,
  • /{tenant}/status.json : return the status of the pipelines,
  • /{tenant}/jobs.json : return the list of jobs defined, and
  • /{tenant}/builds.json : return the list of builds from the sql reporter.
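
As a quick sketch, assuming ZUUL_URL points at the zuul-web endpoint and a tenant named "openstack" exists, the new routes can be exercised directly:

$ curl "${ZUUL_URL}/tenants.json" | python -mjson.tool
$ curl "${ZUUL_URL}/openstack/status.json" | python -mjson.tool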

Moreover, the new interfaces enable new use cases, for example, users can now:

  • Get the list of available jobs and their description,
  • Check the results of post and periodic jobs, and
  • Dynamically list jobs' results using filters, for example, the last tripleo periodic jobs can be obtained using:
$ curl "${TENANT_URL}/builds.json?project=tripleo&pipeline=periodic" | python -mjson.tool
[
    {
        "change": 0,
        "patchset": 0,
        "id": 16,
        "job_name": "periodic-tripleo-ci-centos-7-ovb-ha-oooq",
        "log_url": "https://logs.openstack.org/periodic-tripleo-ci-centos-7-ovb-ha-oooq/2cde3fd/",
        "pipeline": "periodic",
		...
    },
    ...
]

OpenStack health

The openstack-health service is likely to be modified to better interface with the new Zuul design. It is currently connected to an internal gearman bus to receive job completion events before running the subunit2sql process.

This processing could be rewritten as a post playbook to do the subunit processing as part of the job. The data could then be pushed to the SQL server with credentials stored in a Zuul secret.
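
A rough sketch of what such a post playbook could end up invoking on the job node (the database URI and file name are illustrative):

# Push the job's subunit results into the health database.
subunit2sql --database-connection \
    "mysql://subunit:secret@health-db.example.org/subunit" \
    testrepository.subunit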

Roadmap

On the last day, even though most of us were exhausted, we spent some time discussing the roadmap for the upcoming months. While the roadmap is still being defined, here are some highlights:

  • Based on new users' walkthroughs, the documentation will be greatly improved; for example, see this nodepool contribution.
  • Jobs will be able to return structured data to improve the reporting. For example a pypi publisher may return the published url. Similarly, a rpm-build job may return the repository url.
  • Dashboard web interface and javascript tooling,
  • Admin interface to manually trigger unique build or cancel a buildset,
  • Nodepool quota to improve performances,
  • Cross source dependencies, for example a github change in Ansible could depends-on a gerrit change in shade,
  • More Nodepool drivers such as Kubernetes or AWS, and
  • Fedmsg and mqtt zuul driver for message bus repporting and trigger source.

In conclusion, the ZuulV3 efforts were extremely fruitful, and this article only covers a few of the design sessions. Once again, we have made great progress and I'm looking forward to further developments. Thank you all for the great team gathering event!

by tristanC at September 19, 2017 12:42 PM

September 18, 2017

OpenStack Blog - Swapnil Kulkarni

OpenStack PTG Denver – Day 5

Day 5 of the PTG was a day of hackathons and general project/cross-project discussion for most teams, with many people already gone from the PTG and a few preparing for their travel plans or sightseeing in Colorado. The kolla team started the day with a review of an alternate Dockerfile build tool. Later in the day came something everyone in the OpenStack and containers community was looking forward to: the OpenStack–Kubernetes SIG, with Chris Hoge leading the effort to get everyone interested into the same room. Some key targets for the release were identified, along with the contributors interested in them. We then took the most pressing issue for all container-based deployment projects, the build and publishing pipeline for kolla images, to the openstack-infra team. Most of the current requirements, needs, and blocking points for rolling out this feature were identified. The kolla and openstack-infra teams will work together to get this rolling in the early phase of this cycle, once the zuul v3 rollout stabilizes. The kolla team ended the day early for some much-needed buzz at Station 26 after the whole week's work.

 

That's all from this edition of the PTG. See you next in Dublin.

by Swapnil Kulkarni at September 18, 2017 05:02 PM

OpenStack Superuser

Interested in reading up on OpenStack? Check out these books

The OpenStack Marketplace — your one-stop shop for training, distros, private-cloud-as-a-service and more — is now offering a selection of technical publications, too. The listings are not affiliate links, but offered as a way to highlight the efforts of community members.

Under the new “books” heading, you’ll find titles by Stackers including “Mastering OpenStack,” “OpenStack for Architects,” “OpenStack Networking Essentials,” and “OpenStack: Building a Cloud Environment.”

Are you the author of a book on OpenStack? To have your book included in the Marketplace, email ecosystemATopenstack.org and remember to reach out to editorATopenstack.org for a profile of your work — we love to feature books!

The post Interested in reading up on OpenStack? Check out these books appeared first on OpenStack Superuser.

by Superuser at September 18, 2017 11:02 AM

September 15, 2017

OpenStack Superuser

How to rock dirty clouds done cheap

Matthew Treinish is part of IBM's developer advocacy team and has been an active contributor to OpenStack since the Folsom cycle. He was previously a member of the OpenStack TC (technical committee) and an OpenStack QA program PTL (project technical lead). This post first appeared on his blog.

I gave a presentation at the OpenStack Summit in Boston with the same title. You can find a video of the talk here: https://www.openstack.org/videos/boston-2017/dirty-clouds-done-dirt-cheap.

This blog post will cover the same project, but it will go into a lot more detail, which I couldn’t cover during the presentation.

Just a heads up, this post is long!  I try to cover every step of the project with all the details I could remember. It probably would have made sense to split things up into multiple posts, but I wrote it in a single sitting and doing that felt weird. If you’re looking for a quicker overview, I recommend watching the video instead.

The project scope

When I was in college, I had a part-time job as a sysadmin at an HPC research lab in the aerospace engineering department. I was responsible for all aspects of the IT in the lab, from the workstations and servers to the HPC clusters, and I often had to deploy new software with no prior knowledge about it. I managed to muddle through most of the time by reading the official docs and frantically searching Google when I encountered issues.

Since I started working on OpenStack I often think back to my work in college and wonder: if I had been tasked with deploying an OpenStack cloud back then, would I have been able to do it? As a naive college student with no knowledge of OpenStack, would I have been successful trying to deploy it by myself? Since I had no knowledge of configuration management (like puppet or chef) back then, I would have gone about it by installing everything by hand. The open question from that idea is: how hard is it actually to install OpenStack by hand, using just the documentation and Google searches?

Aside from the interesting thought exercise, I have also wanted a small cloud at home for a couple of reasons. I maintain a number of servers at home that run a bunch of critical infrastructure, and for some time I've wanted to virtualize that infrastructure, mainly for the increased flexibility and potential reliability improvements; running things off a residential ISP and residential power isn't the best way to run a server with decent uptime. Besides virtualizing some of my servers, it would be nice to have the extra resources for my upstream OpenStack development. I often do not have the resources available for running devstack or integration tests locally and have to rely on upstream testing.

So after the Ocata release I decided to combine these 2 ideas and build myself a small cloud at home. I would do it by hand (i.e. no automation or config management) to test out how hard it would be. I set myself a strict budget of $1500 USD (the rough cost of my first desktop computer in middle school, an IBM Netvista A30p that I bought with my Bar Mitzvah money) to acquire hardware. This was mostly a fun project for me, so I didn't want to spend an obscene amount of money. $1500 USD is still a lot of money, but it seemed like a reasonable amount for the project.

However, I decided to take things a step further than I originally planned and build the cloud using the release tarballs from http://tarballs.openstack.org/. My reasoning was to test out how hard it would be to take the raw code we release as a community and turn it into a working cloud. This basically invalidated the project as a test of my thought exercise of deploying a cloud back in college (since I definitely would have just used my Linux distro's packages back then), but it made the exercise more relevant for me personally as an upstream OpenStack developer. It would give me insight into where there are gaps in our released code and how we could start to fix them.

Building the Cloud

Acquiring the Hardware

The first step for building the cloud was acquiring the hardware. I had a very tight budget, and it basically precluded buying anything new; the cheapest servers you can buy from a major vendor would pretty much eat up my budget for a single machine. I also considered building a bunch of cheap desktops and putting those together as a cloud (I didn't actually need server-class hardware for this), but for the cost the capacity was still limited. Since I was primarily building a compute cloud to provide me with a pool of servers to allocate, my first priority was the number of CPU cores in the cloud, which would give me the flexibility to scale any applications I was running on it. With that in mind I decided on the following priority list for the hardware:

  1. Number of Cores
  2. Amount of RAM
  3. Speed of CPU

The problem with building with desktop CPUs was that (at the time I was assembling the pieces) the core count per USD was not really that high for any of the desktop processors. Another popular choice for home clouds is the Intel NUC, but these suffer from the same problem: NUCs use laptop processors, and while reasonably priced, you're still only getting a dual or quad core CPU for a few hundred dollars.

It turns out the best option I found for my somewhat bizarre requirements was to buy used hardware. A search of eBay shows a ton of servers from 8 or 9 years ago that are dirt cheap. After searching through my various options I settled on the old Dell PowerEdge R610, a dual-socket machine. The ones I ordered came with 2 Intel Xeon E5540 CPUs each, giving me a total of 8 physical cores (or 16 virtual cores if you count HyperThreading/SMT) per machine. The machines also came with 32 GB of RAM and 2x 149GB SAS hard drives. The best part though was that each machine was only $215.56 USD. This gave me plenty of room in the budget, so I bought 5 of them. After shipping this ended up costing only $1,230.75, which left me enough wiggle room for the other parts I'd need to make everything work. The full hardware specs from the eBay listing were:

Although, the best part about these servers was that I actually had a rack full of basically the same exact servers at the lab in college. The ones I had back in 2010 were a little bit slower and had half the RAM, but were otherwise the same. I configured those servers as a small HPC cluster my last year at college, so I was very familiar with them. Back then, though, those servers were over 10x the cost of what I was paying for them on eBay now.

The only problem with this choice was the age of the hardware: the Xeon E5540 is incredibly slow by today's standards. But because of my limited budget, speed was something I couldn't really afford.

Assembling the Hardware

After waiting a few days the servers were delivered. That was a fun day: the FedEx delivery person didn't bother to ring the door bell. Instead I heard a big thud outside and found that they had left all the boxes in front of my apartment door. Fortunately I was home and heard them throw the boxes on the ground, because it was raining that day; leaving my “new” servers out in the rain all day would have been a less than ideal way to start the project. It also made quite the commotion, and several of my neighbors came out to see what was going on and watched me as I took the boxes inside.

After getting the boxes inside my apartment and unboxed, I stacked them on my living room table:

My next problem was where to put the servers and how to run them. I looked at buying a traditional rack, however they were a bit too pricey, even on eBay. Just a 10U rack looked like it would cost over $100 USD, and after shipping that wouldn't leave me much room if I needed something else, so I decided not to go that route. Then I remembered hearing about something called a LackRack a few years ago. It turns out the IKEA Lack table has a 19 inch width between the legs, which is the same as a rack. They also only cost $9.99, which made it a much more economical choice compared to a more traditional rack. While I could just put the table on the floor and be done with it, I was planning to put the servers in my “data closet” (which is just a colorful term for my bedroom closet where I store servers and clothing), and I didn't want to deal with having to pick up the “rack” every time I needed to move it. So I decided to get some casters and mount them to the table so I could just push the servers around.

Once I got the table delivered, which took a surprisingly long time, I was able to mount the casters and rack the servers. As I put each server on the table I was able to test it out (I only had a single power cable at the time, so I went one at a time). It turns out that each server was slightly different from the description and had several issues:

  • 4x8GB of RAM not 8x4GB
  • Memory installed in wrong slots
  • Dead RAID controller battery
  • Came with 15k RPM hard drives not 10k RPM

Also, the company that is “refurbishing” these old servers from whatever datacenter threw them away totally strips them down to the minimum possible unit. For example, the management controller was removed, as was the redundant power supply. Both of these were standard features from Dell when these servers were new. Honestly, it makes sense: the margins on reselling old servers can't be very high, so the company is trying to make a little profit. I also didn't really need anything they took out, as long as the servers still booted (although that management controller would have been nice).

Once I put all 5 servers on the rack:
After getting everything mounted on the rack it turns out I also needed a bunch of cables and another power strip to power all 5 at once. So I placed an order with Monoprice for the necessary bits and once they arrived I wired everything up in the data closet:

After everything was said and done, the final bill of materials for all the hardware was:

Installing the Operating System

After getting a working set of hardware, the next step was to install the operating system on the servers. As I decided in the original project scope, I was planning to follow the official install guide as much as possible. My operating system choice would therefore be dictated by those covered in the guide; the 3 Linux distributions documented are openSUSE/SLES, RHEL/CentOS, and Ubuntu. Of those 3, my personal choice was Ubuntu, which I find the easiest to deal with. Although, looking back on it now, if I were to do an install back in college I definitely would have used RHEL: Georgia Tech had a site license for RHEL, and a lot of software we had commercial licenses for was only supported on RHEL. But my preference today between those 3 options is Ubuntu.

I created a boot USB stick for Ubuntu Server 16.04 and proceeded to do a basic install on each server (one at a time). The install itself just used the defaults; the only option I made sure was present was the OpenSSH server. This way, once I finished the initial install I didn't have to sit in front of the server to do anything; I could just install any other packages I needed from the comfort of my home office. For the hostname I picked altocumulus, because I think clouds should be named after clouds. Although, after I finished the project I got a bunch of better suggestions for the name, like closet-cloud or laundry-cloud.

It’s worth pointing out that if the servers had come with the management controller installed this step would have been a lot easier. I could have just used that to mount the installer image and ran everything from the virtual console. I wouldn’t have had to sit in front of each server to start the install. But despite this it only took an hour or so to perform the install on all the servers. With the installs complete it was time to start the process of putting OpenStack on each server and creating my cloud.

Installing OpenStack

With the operating system installed it was time to start the process of building the servers out. Given my limited hardware capacity, just 40 physical cores and 160GB of RAM, I decided that I didn't want to sacrifice 1/5 of that capacity for a dedicated controller node, so I was going to set up the controller as a compute node as well. My goal for this project was to build a compute cloud, so all I was concerned about was installing the set of OpenStack projects required to achieve this. I didn't have a lot of storage (the 2 149GB disks came configured out of the box with RAID 1, and I never bothered to change that) so providing anything more than ephemeral storage for the VMs wasn't really an option.

OpenStack is a large project with a ton of different sub-projects (the complete list of official projects can be found here), and I find some people have trouble figuring out exactly where to get started, or which projects they need for use case X. The OpenStack Foundation actually has a page with a bunch of sample service selections by application, and the OpenStack Technical Committee also maintains a list of projects needed for the compute starter kit, which was exactly what I was looking for. The only potential problem is the discoverability of that information; it kinda feels like a needle in a haystack if you don't know where to look.

It also turns out the install guide is mostly concerned with building a basic compute cloud (it also includes using cinder for block storage, but I just skipped that step), so even if I didn't know the components I needed I would have been fine just reading the docs. The overview section of the docs covers this briefly, but doesn't go into much detail.

The basic service configuration I was planning to go with was:

With a rough idea of how I was planning to set up the software, I started following the install guide on setting up the servers. https://docs.openstack.org/ocata/install-guide-ubuntu/environment.html# walks you through setting up all the necessary operating system level pieces, like configuring the network interfaces and NTP. It also goes over installing and configuring the service prerequisites like MySQL, RabbitMQ, and memcached. For this part I actually found the docs really easy to follow and very useful. Things were explained clearly, and mostly it was just copy and paste the commands to set things up. But I never felt like I was blindly doing anything for the base setup.

Installing and Configuring the OpenStack Components

After getting the environment for running OpenStack configured, it was time to start installing the OpenStack components. Keystone is a requirement for all the other OpenStack services, so you install it first. This is where I hit my first issue, because I decided to use the release tarballs for the install. The install guide assumes you're using packages from your Linux distribution to install OpenStack, so when I got to the second step in the Installing Keystone section it said to run “apt install keystone”, which I didn't want to do (although it definitely would have made my life easier if I had).

Installing From Tarballs

It turns out there isn't actually any documentation anywhere that concisely explains the steps required to install an OpenStack component on your system from source. I started searching Google to try and find a guide. The first hit was a series of blog posts on the Rackspace developer blog on installing OpenStack from source. However, a quick look showed it was quite out of date, especially for the latest version of OpenStack, Ocata, which I was deploying; some of the steps documented there also conflicted with the configuration recommended in the install guide. The other searches I found recommended that you look at devstack or use automation project X to accomplish this goal. Both of these were outside the scope of what I wanted to do for this project. So for the tarball install step I decided to ignore the premise of just following the install guide, and instead used my experience working on OpenStack to do the following steps to install the projects:

  1. Download the service tarballs. I found the releases page has a good index by project to get the latest tarball for each project. Then extract the tarball to an easy-to-remember location. (I created a tarballs directory in my home directory to store them all.)
  2. Create the service user for each project. I ran:
    useradd -r -M $service
  3. Create the /etc and /var/lib directories for the service. For example on the controller node I used the following for loop in bash to do this for all the services:
    for proj in keystone glance nova neutron ; do
        sudo mkdir /etc/$proj
        sudo mkdir /var/lib/$proj
        sudo chown -R $proj:$proj /etc/$proj /var/lib/$proj

    done

  4. Install the binary package requirements for the project. These are things like libvirt for nova, or libssl: basically anything you need to have installed either to build the python packages or to run the service. The problem here is that for most projects this is not documented anywhere. Most of the projects include a bindep.txt which can be used with the bindep project (or just manually read, like I did) to show the distro package requirements on several distributions, but few projects use it for this purpose; instead it's often used just for the requirements for setting up a unit (or functional) test environment. I also didn't find this in any of the developer documentation for the projects. This means you're probably stuck with trial and error here: when you get to step 6 below it will likely fail with an error that a library header is missing, and you'll need to find the package and install it. Or, when you run the service, something it's calling out to will be missing and you'll have errors in the service log until you install that missing dependency and restart the service.
  5. Copy the data files from etc/ in the tarball into /etc/$service for the project. The python packaging ecosystem does not provide a way for packages to install anything outside of the python lib/ directories. This means that the required configuration files (like policy.json files or api paste.ini files) have to be copied manually from the tarball.
  6. After you do all of those steps you can use pip to install the tarball. One thing to note here is that you want to use constraints when you run pip. This is something I forgot when installing the first few services, and it caused me a ton of headaches later in the process. You can avoid all of those potential problems up front by just running:

    pip install -U -c "https://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/ocata" $PATH_TO_EXTRACTED_TARBALL

If you’re using a different OpenStack release just replace “ocata” at the end of the url with that release name.

I wrote down these steps after I did the install, mostly based on all of the issues I had during the install process. As you read through the rest of this post, note that most of the issues I encountered could have been completely avoided if I had done all of these up front.

It’s also worth noting that all of these steps are provided by the distro packages for OpenStack. This is exactly the role that packaging plays for users, and I was just going through the motions here because I decided to use tarballs. Python packages aren’t really designed for use in systems software and have a lot of limitations beyond the basic case of: put my python code in the place where python code lives. If you want more details on this Clark Boylan gave a good talk on this topic at the OpenStack summit in Boston.

I have also been trying to make a push to start documenting these things in the project developer docs, so it's not an exercise in misery for anyone else wanting to install from source. But I've been getting pushback on this, because most people seem to feel like it's a low priority and most users will just use packages (and packagers seem to have already figured out the pattern for building things).

Creating systemd unit files

One thing that isn't strictly a requirement when installing from source is creating systemd unit files (or init scripts, if you're lucky enough to have a distro that still supports SysV init). Creating a systemd unit file for each daemon process you'll be running is helpful so you don't have to manually run the command for each daemon. When I built the cloud I created a unit file for each daemon I ran, on the controller as well as all of the compute nodes. This let me configure each service to start automatically on boot, but also encoded the command for starting the daemons, so I could treat them like any other service running on the system. This is another thing that distro packages provide for you, but that you'll have to do yourself when building from source.

As an example, these are the contents of my nova-api systemd unit file, which I put in /etc/systemd/system/nova-api.service:

[Unit]
Description=OpenStack Nova API
After=network.target
[Service]
ExecStart=/usr/local/bin/nova-api --config-file /etc/nova/nova.conf
User=nova
Group=nova
[Install]
WantedBy=multi-user.target
All the other services follow this same format, except for anything running under uwsgi (like keystone; more on that in the next section), for which you can refer to the uwsgi docs for more information.

Configuring Keystone

With the formula worked out for how to install from tarballs, I was ready to continue following the install guide. The only other issue I had was getting the wsgi script running under apache. By default keystone ships as a wsgi script that requires a web server to run it. The install guide doesn't cover this because the distro packages do the required setup for you; because I was installing from tarballs I had to figure out how to do it myself. Luckily the keystone docs provide a guide on how to do this, and include sample config files in the tarball. The rest of configuring keystone was really straightforward: keystone.conf only required 2 configuration options (one for the database connection info and the other for the token type). After setting those I had to run a handful of commands to update the database schema and then populate it with some initial data. It's not worth repeating all the commands here, since you can just read the keystone section of the install guide. In my case I did encounter one issue when I first started the keystone service: I hit a requirements mismatch which prevented keystone from starting:

2017-03-29 15:27:01.478 26833 ERROR keystone Traceback (most recent call last):
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/bin/keystone-wsgi-admin", line 51, in <module>
2017-03-29 15:27:01.478 26833 ERROR keystone     application = initialize_admin_application()
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/keystone/server/wsgi.py", line 132, in initialize_admin_application
2017-03-29 15:27:01.478 26833 ERROR keystone     config_files=_get_config_files())
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/keystone/server/wsgi.py", line 69, in initialize_application
2017-03-29 15:27:01.478 26833 ERROR keystone     startup_application_fn=loadapp)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/keystone/server/common.py", line 50, in setup_backends
2017-03-29 15:27:01.478 26833 ERROR keystone     res = startup_application_fn()
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/keystone/server/wsgi.py", line 66, in loadapp
2017-03-29 15:27:01.478 26833 ERROR keystone     'config:%s' % find_paste_config(), name)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/keystone/version/service.py", line 53, in loadapp
2017-03-29 15:27:01.478 26833 ERROR keystone     controllers.latest_app = deploy.loadapp(conf, name=name)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
2017-03-29 15:27:01.478 26833 ERROR keystone     return loadobj(APP, uri, name=name, **kw)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
2017-03-29 15:27:01.478 26833 ERROR keystone     return context.create()
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 15:27:01.478 26833 ERROR keystone     return self.object_type.invoke(self)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2017-03-29 15:27:01.478 26833 ERROR keystone     **context.local_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 55, in fix_call
2017-03-29 15:27:01.478 26833 ERROR keystone     val = callable(*args, **kw)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/urlmap.py", line 31, in urlmap_factory
2017-03-29 15:27:01.478 26833 ERROR keystone     app = loader.get_app(app_name, global_conf=global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2017-03-29 15:27:01.478 26833 ERROR keystone     name=name, global_conf=global_conf).create()
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 362, in app_context
2017-03-29 15:27:01.478 26833 ERROR keystone     APP, name=name, global_conf=global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 450, in get_context
2017-03-29 15:27:01.478 26833 ERROR keystone     global_additions=global_additions)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 559, in _pipeline_app_context
2017-03-29 15:27:01.478 26833 ERROR keystone     APP, pipeline[1], global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 454, in get_context
2017-03-29 15:27:01.478 26833 ERROR keystone     section)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 476, in _context_from_use
2017-03-29 15:27:01.478 26833 ERROR keystone     object_type, name=use, global_conf=global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 406, in get_context
2017-03-29 15:27:01.478 26833 ERROR keystone     global_conf=global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 296, in loadcontext
2017-03-29 15:27:01.478 26833 ERROR keystone     global_conf=global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 328, in _loadegg
2017-03-29 15:27:01.478 26833 ERROR keystone     return loader.get_context(object_type, name, global_conf)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 620, in get_context
2017-03-29 15:27:01.478 26833 ERROR keystone     object_type, name=name)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 640, in find_egg_entry_point
2017-03-29 15:27:01.478 26833 ERROR keystone     pkg_resources.require(self.spec)
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 943, in require
2017-03-29 15:27:01.478 26833 ERROR keystone     needed = self.resolve(parse_requirements(requirements))
2017-03-29 15:27:01.478 26833 ERROR keystone   File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 834, in resolve
2017-03-29 15:27:01.478 26833 ERROR keystone     raise VersionConflict(dist, req).with_context(dependent_req)
2017-03-29 15:27:01.478 26833 ERROR keystone ContextualVersionConflict: (requests 2.13.0 (/usr/local/lib/python2.7/dist-packages), Requirement.parse('requests!=2.12.2,!=2.13.0,>=2.10.0'), set(['oslo.policy']))

This was caused solely because I forgot to use pip constraints when I first started installing the controller node (I remembered later). Pip doesn't have a dependency solver and just naively installs packages in the order it's told, which causes all sorts of conflicts if 2 packages have the same requirement with different versions (even if there is overlap and a correct version could be figured out). Using constraints like I recommended before would have avoided this. After resolving the conflict, keystone worked perfectly and I was ready to move on to the next service.

Installing Glance

The next service to install, following the install guide, is Glance. The process for configuring glance was pretty straightforward, and just as with keystone it's not worth repeating all the steps from the install guide section on Glance. But at a high level, you create the database in MySQL, configure glance with the details for connecting to MySQL, connecting to Keystone, and how to store images, then run the DB schema migrations to set the schema for the MySQL database, and create the endpoint and service users in keystone. After going through all the steps I did encounter one problem in Glance when I first started it up. The glance log had this traceback:

2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data Traceback (most recent call last):
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/api/v2/image_data.py", line 116, in upload
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data image.set_data(data, size)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/domain/proxy.py", line 195, in set_data
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data self.base.set_data(data, size)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/notifier.py", line 480, in set_data
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data _send_notification(notify_error, 'image.upload', msg)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data self.force_reraise()
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data six.reraise(self.type_, self.value, self.tb)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/notifier.py", line 427, in set_data
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data self.repo.set_data(data, size)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/api/policy.py", line 192, in set_data
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data return self.image.set_data(*args, **kwargs)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/quota/__init__.py", line 304, in set_data
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data self.image.set_data(data, size=size)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance/location.py", line 439, in set_data
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data verifier=verifier)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance_store/backend.py", line 453, in add_to_backend
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data verifier)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance_store/backend.py", line 426, in store_add_to_backend
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data verifier=verifier)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data File "/usr/local/lib/python2.7/dist-packages/glance_store/capabilities.py", line 223, in op_checker
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data raise op_exec_map[op](**kwargs)
2017-03-29 16:21:52.038 29647 ERROR glance.api.v2.image_data StoreAddDisabled: Configuration for store failed. Adding images to this store is disabled.

I forgot to create the /var/lib/glance dir, so there was no directory to store the images in. Again, this is something that would have been avoided if I had followed the steps I outlined in the installing-from-tarballs section. But after creating the directory everything worked.

One thing I do want to note here is that I have a small issue with the verification steps for Glance outlined in the install guide. The steps there don't really go far enough to verify the image uploaded was actually stored properly, just that glance created the image. This was a problem I had later in the installation, and I could have caught it earlier if the verification steps had instructed you to download the image from glance and compare it to the source image.

Installing Nova

The next service in the install guide is Nova. Nova was a bit more involved compared to Glance or Keystone, but it has more moving parts so that's understandable. Just as with the other services, refer to the install guide section for Nova for all the step-by-step details; there are enough steps that it's not worth even outlining the high level flow here. One thing you'll need to be aware of is that Nova includes 2 separate API services that you'll be running: the Nova API and the Placement API. The Placement API is a recent addition since Newton which is used to provide data for scheduling logic, and is a completely self-contained service. Just like keystone, the placement API only ships as a wsgi script. But unlike keystone there was no documentation about the install process (this has changed, or is at least in progress) and no example config files were provided. It's pretty straightforward to adapt what you used for keystone, but this was another thing I had to figure out on my own.

After getting everything configured according to the install guide I hit a few little things that I needed to fix. The first was that I forgot to create a state directory that I specified in the config file:

2017-03-29 17:46:28.176 32263 ERROR nova Traceback (most recent call last):
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/bin/nova-api", line 10, in <module>
2017-03-29 17:46:28.176 32263 ERROR nova sys.exit(main())
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/cmd/api.py", line 59, in main
2017-03-29 17:46:28.176 32263 ERROR nova server = service.WSGIService(api, use_ssl=should_use_ssl)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/service.py", line 311, in __init__
2017-03-29 17:46:28.176 32263 ERROR nova self.app = self.loader.load_app(name)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/wsgi.py", line 497, in load_app
2017-03-29 17:46:28.176 32263 ERROR nova return deploy.loadapp("config:%s" % self.config_path, name=name)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
2017-03-29 17:46:28.176 32263 ERROR nova return loadobj(APP, uri, name=name, **kw)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
2017-03-29 17:46:28.176 32263 ERROR nova return context.create()
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 17:46:28.176 32263 ERROR nova return self.object_type.invoke(self)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2017-03-29 17:46:28.176 32263 ERROR nova **context.local_conf)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 55, in fix_call
2017-03-29 17:46:28.176 32263 ERROR nova val = callable(*args, **kw)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/urlmap.py", line 160, in urlmap_factory
2017-03-29 17:46:28.176 32263 ERROR nova app = loader.get_app(app_name, global_conf=global_conf)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2017-03-29 17:46:28.176 32263 ERROR nova name=name, global_conf=global_conf).create()
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 17:46:28.176 32263 ERROR nova return self.object_type.invoke(self)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2017-03-29 17:46:28.176 32263 ERROR nova **context.local_conf)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 55, in fix_call
2017-03-29 17:46:28.176 32263 ERROR nova val = callable(*args, **kw)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/auth.py", line 57, in pipeline_factory_v21
2017-03-29 17:46:28.176 32263 ERROR nova return _load_pipeline(loader, local_conf[CONF.api.auth_strategy].split())
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/auth.py", line 38, in _load_pipeline
2017-03-29 17:46:28.176 32263 ERROR nova app = loader.get_app(pipeline[-1])
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2017-03-29 17:46:28.176 32263 ERROR nova name=name, global_conf=global_conf).create()
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 17:46:28.176 32263 ERROR nova return self.object_type.invoke(self)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 146, in invoke
2017-03-29 17:46:28.176 32263 ERROR nova return fix_call(context.object, context.global_conf, **context.local_conf)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 55, in fix_call
2017-03-29 17:46:28.176 32263 ERROR nova val = callable(*args, **kw)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/__init__.py", line 218, in factory
2017-03-29 17:46:28.176 32263 ERROR nova return cls()
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/__init__.py", line 31, in __init__
2017-03-29 17:46:28.176 32263 ERROR nova super(APIRouterV21, self).__init__()
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/__init__.py", line 243, in __init__
2017-03-29 17:46:28.176 32263 ERROR nova self._register_resources_check_inherits(mapper)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/__init__.py", line 259, in _register_resources_check_inherits
2017-03-29 17:46:28.176 32263 ERROR nova for resource in ext.obj.get_resources():
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/cloudpipe.py", line 187, in get_resources
2017-03-29 17:46:28.176 32263 ERROR nova CloudpipeController())]
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/cloudpipe.py", line 48, in __init__
2017-03-29 17:46:28.176 32263 ERROR nova self.setup()
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/cloudpipe.py", line 55, in setup
2017-03-29 17:46:28.176 32263 ERROR nova fileutils.ensure_tree(CONF.crypto.keys_path)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/local/lib/python2.7/dist-packages/oslo_utils/fileutils.py", line 40, in ensure_tree
2017-03-29 17:46:28.176 32263 ERROR nova os.makedirs(path, mode)
2017-03-29 17:46:28.176 32263 ERROR nova File "/usr/lib/python2.7/os.py", line 157, in makedirs
2017-03-29 17:46:28.176 32263 ERROR nova mkdir(name, mode)
2017-03-29 17:46:28.176 32263 ERROR nova OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/keys'

This was simple to fix and all I had to do was create the directory and set the owner to the service user. The second issue was my old friend the requirements mismatch:

2017-03-29 18:33:11.433 1155 ERROR nova Traceback (most recent call last):
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/bin/nova-api", line 10, in <module>
2017-03-29 18:33:11.433 1155 ERROR nova sys.exit(main())
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/cmd/api.py", line 59, in main
2017-03-29 18:33:11.433 1155 ERROR nova server = service.WSGIService(api, use_ssl=should_use_ssl)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/service.py", line 311, in __init__
2017-03-29 18:33:11.433 1155 ERROR nova self.app = self.loader.load_app(name)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/wsgi.py", line 497, in load_app
2017-03-29 18:33:11.433 1155 ERROR nova return deploy.loadapp("config:%s" % self.config_path, name=name)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
2017-03-29 18:33:11.433 1155 ERROR nova return loadobj(APP, uri, name=name, **kw)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
2017-03-29 18:33:11.433 1155 ERROR nova return context.create()
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 18:33:11.433 1155 ERROR nova return self.object_type.invoke(self)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2017-03-29 18:33:11.433 1155 ERROR nova **context.local_conf)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 55, in fix_call
2017-03-29 18:33:11.433 1155 ERROR nova val = callable(*args, **kw)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/urlmap.py", line 31, in urlmap_factory
2017-03-29 18:33:11.433 1155 ERROR nova app = loader.get_app(app_name, global_conf=global_conf)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2017-03-29 18:33:11.433 1155 ERROR nova name=name, global_conf=global_conf).create()
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 18:33:11.433 1155 ERROR nova return self.object_type.invoke(self)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 203, in invoke
2017-03-29 18:33:11.433 1155 ERROR nova app = context.app_context.create()
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
2017-03-29 18:33:11.433 1155 ERROR nova return self.object_type.invoke(self)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 146, in invoke
2017-03-29 18:33:11.433 1155 ERROR nova return fix_call(context.object, context.global_conf, **context.local_conf)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 55, in fix_call
2017-03-29 18:33:11.433 1155 ERROR nova val = callable(*args, **kw)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/wsgi.py", line 270, in factory
2017-03-29 18:33:11.433 1155 ERROR nova return cls(**local_config)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/metadata/handler.py", line 49, in __init__
2017-03-29 18:33:11.433 1155 ERROR nova expiration_time=CONF.api.metadata_cache_expiration)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/cache_utils.py", line 58, in get_client
2017-03-29 18:33:11.433 1155 ERROR nova backend='oslo_cache.dict'))
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/cache_utils.py", line 96, in _get_custom_cache_region
2017-03-29 18:33:11.433 1155 ERROR nova region.configure(backend, **region_params)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/dogpile/cache/region.py", line 413, in configure
2017-03-29 18:33:11.433 1155 ERROR nova backend_cls = _backend_loader.load(backend)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/dogpile/util/langhelpers.py", line 40, in load
2017-03-29 18:33:11.433 1155 ERROR nova return impl.load()
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2301, in load
2017-03-29 18:33:11.433 1155 ERROR nova self.require(*args, **kwargs)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2324, in require
2017-03-29 18:33:11.433 1155 ERROR nova items = working_set.resolve(reqs, env, installer, extras=self.extras)
2017-03-29 18:33:11.433 1155 ERROR nova File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 859, in resolve
2017-03-29 18:33:11.433 1155 ERROR nova raise VersionConflict(dist, req).with_context(dependent_req)
2017-03-29 18:33:11.433 1155 ERROR nova ContextualVersionConflict: (pbr 1.10.0 (/usr/local/lib/python2.7/dist-packages), Requirement.parse('pbr>=2.0.0'), set(['oslo.i18n', 'oslo.log', 'oslo.context', 'oslo.utils']))

In this instance it was a pretty base requirement, pbr, that was at the wrong version. When I saw this I realized that I had forgotten to use constraints (because pbr is used by everything in OpenStack), and I quickly reran pip install for nova with the constraints argument to correct the issue.

The final thing I hit was a missing sudoers file:

2017-03-29 18:29:47.844 905 ERROR nova Traceback (most recent call last):
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/bin/nova-api", line 10, in <module>
2017-03-29 18:29:47.844 905 ERROR nova sys.exit(main())
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/cmd/api.py", line 59, in main
2017-03-29 18:29:47.844 905 ERROR nova server = service.WSGIService(api, use_ssl=should_use_ssl)
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/service.py", line 309, in __init__
2017-03-29 18:29:47.844 905 ERROR nova self.manager = self._get_manager()
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/service.py", line 364, in _get_manager
2017-03-29 18:29:47.844 905 ERROR nova return manager_class()
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/api/manager.py", line 30, in __init__
2017-03-29 18:29:47.844 905 ERROR nova self.network_driver.metadata_accept()
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/network/linux_net.py", line 606, in metadata_accept
2017-03-29 18:29:47.844 905 ERROR nova iptables_manager.apply()
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/network/linux_net.py", line 346, in apply
2017-03-29 18:29:47.844 905 ERROR nova self._apply()
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-03-29 18:29:47.844 905 ERROR nova return f(*args, **kwargs)
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/network/linux_net.py", line 366, in _apply
2017-03-29 18:29:47.844 905 ERROR nova attempts=5)
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/network/linux_net.py", line 1167, in _execute
2017-03-29 18:29:47.844 905 ERROR nova return utils.execute(*cmd, **kwargs)
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/utils.py", line 297, in execute
2017-03-29 18:29:47.844 905 ERROR nova return RootwrapProcessHelper().execute(*cmd, **kwargs)
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/nova/utils.py", line 180, in execute
2017-03-29 18:29:47.844 905 ERROR nova return processutils.execute(*cmd, **kwargs)
2017-03-29 18:29:47.844 905 ERROR nova File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 400, in execute
2017-03-29 18:29:47.844 905 ERROR nova cmd=sanitized_cmd)
2017-03-29 18:29:47.844 905 ERROR nova ProcessExecutionError: Unexpected error while running command.
2017-03-29 18:29:47.844 905 ERROR nova Command: sudo nova-rootwrap /etc/nova/rootwrap.conf iptables-save -c
2017-03-29 18:29:47.844 905 ERROR nova Exit code: 1
2017-03-29 18:29:47.844 905 ERROR nova Stdout: u''
2017-03-29 18:29:47.844 905 ERROR nova Stderr: u'sudo: no tty present and no askpass program specified\n'

Nova needs root privileges to perform some operations. To do this it leverages a program called rootwrap to do the privilege escalation, but it needs sudo to be able to invoke rootwrap. I was able to fix this by creating a sudoers file for nova like:

nova ALL=(root) NOPASSWD: /usr/local/bin/nova-rootwrap /etc/nova/rootwrap.conf *

After correcting those 3 issues I got Nova running without any errors (at least against the verification steps outlined in the install guide).

Installing Neutron

The last service I installed from the install guide (I skipped cinder because I'm not using block storage) was Neutron. By far this was the most complicated and most difficult service to install and configure; I had the most problems with neutron and networking in general, both during the install phase and later when I was debugging the operation of the cloud. As with the other services, I started by reading the install guide section for neutron, but I also often needed to read the OpenStack Networking Guide to get a better grasp on the underlying concepts the install guide was trying to explain, especially after getting to the section in the install guide where it asks you to pick between “Provider Networks” and “Self Service Networking”.

After reading all the documentation I decided that I wanted to use provider networks, because all I wanted was all my guests on a flat Layer 2, with the guests coming up on my home network with an IP address I could reach from any of the other computers I have at home. When I saw this diagram in the Networking Guide:

it made my decision simple. This networking topology was exactly what I wanted. I didn't want to have to deal with creating a network, subnet, and router in neutron for each tenant just to be able to access my guests. With this decision made, I went about following the configuration guide like for the previous services.

Unfortunately I hit an issue pretty early on, related to Neutron's default configuration being spread across multiple files, which makes the install guide very confusing to follow. For example, it says to write one set of config options into /etc/neutron/neutron.conf, then a second set into /etc/neutron/plugins/ml2/ml2_conf.ini, a third set into /etc/neutron/plugins/ml2/linuxbridge_agent.ini, etc. This process continues for another 2 or 3 config files without any context on how these separate files are used. What makes it worse is when you actually go to launch the neutron daemons: neutron itself consists of 4-5 different daemons running on the controller and compute nodes, but there is no documentation anywhere on how all of these different config files are leveraged by the different daemons. For example, when launching the linuxbridge-agent daemon, which config files are you supposed to pass in? I ended up having to cheat here and look at the devstack source code to see how it launched neutron. After that I realized neutron is just leveraging oslo.config's ability to specify multiple config files and have them concatenated together at runtime, as shown below. This means that, because there are no overlapping options, none of this complexity is required and a single neutron.conf could be used for everything. This is something I think we must change in Neutron, because things as they are now are just too confusing.

After finally getting everything configured I encountered a number of other issues. The first was around rootwrap: just like nova, neutron needs root privileges to perform some operations, and it leverages rootwrap to perform the privilege escalation. However, neutron runs rootwrap as a separate daemon and calls it over a socket interface. (this is done to reduce the overhead of creating a separate python process on each external call, which can slow things down significantly) When I first started neutron I hit a similar error to nova's about sudo permissions, so I needed to create a sudoers file for neutron; in my case it looked like this:

neutron ALL=(root) NOPASSWD: /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf *
neutron ALL=(root) NOPASSWD: /usr/local/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

But it also turned out I needed to tell neutron how to call rootwrap. I found a bug on Launchpad when I did a Google search on my error, and it told me about the config options I needed to set in addition to creating the sudoers file. These weren't in the install documentation; I expect the neutron distro packages set these config options by default. After creating the sudoers file and setting the config flags I was able to get past this issue.
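For anyone hitting the same thing, the options in question live in the [agent] section of neutron.conf and point neutron at the rootwrap commands (the paths here match my tarball install; adjust for yours):

[agent]
root_helper = sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
root_helper_daemon = sudo /usr/local/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf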

The next problem was also fairly cryptic. When I first started neutron after fixing the rootwrap issue I was greeted by this error in the logs:

2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent Traceback (most recent call last):
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 453, in daemon_loop
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent sync = self.process_network_devices(device_info)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 153, in wrapper
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent return f(*args, **kwargs)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 203, in process_network_devices
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent device_info.get('updated'))
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 277, in setup_port_filters
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent self.prepare_devices_filter(new_devices)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 131, in decorated_function
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent *args, **kwargs)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 139, in prepare_devices_filter
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent self._apply_port_filter(device_ids)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 157, in _apply_port_filter
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent security_groups, security_group_member_ips)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 173, in _update_security_group_info
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent remote_sg_id, member_ips)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/iptables_firewall.py", line 163, in update_security_group_members
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent self._update_ipset_members(sg_id, sg_members)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/iptables_firewall.py", line 169, in _update_ipset_members
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent sg_id, ip_version, current_ips)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/ipset_manager.py", line 83, in set_members
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent self.set_members_mutate(set_name, ethertype, member_ips)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent return f(*args, **kwargs)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/ipset_manager.py", line 93, in set_members_mutate
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent self._create_set(set_name, ethertype)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/ipset_manager.py", line 139, in _create_set
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent self._apply(cmd)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/ipset_manager.py", line 149, in _apply
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent check_exit_code=fail_on_errors)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 128, in execute
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent execute_rootwrap_daemon(cmd, process_input, addl_env))
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 115, in execute_rootwrap_daemon
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent return client.execute(cmd, process_input)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 129, in execute
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent res = proxy.run_one_command(cmd, stdin)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "<string>", line 2, in run_one_command
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/lib/python2.7/multiprocessing/managers.py", line 774, in _callmethod
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent raise convert_to_error(kind, result)
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent RemoteError:
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent ---------------------------------------------------------------------------
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent Unserializable message: ('#ERROR', ValueError('I/O operation on closed file',))
2017-03-30 11:57:05.182 4158 ERROR neutron.plugins.ml2.drivers.agent._common_agent ---------------------------------------------------------------------------

This isn't helpful at all. It turns out that this error means neutron can't find the ipset command, but that's not at all clear from the traceback. I was only able to figure this out by tracing through the neutron source code (following the calls in the traceback), where I realized this error is emitted after neutron calls the rootwrap daemon. I had to turn the debug log level on in the separate rootwrap.conf (which is something packaged in the tarball) to get the rootwrap daemon to log the error it was encountering, which in this case was that ipset could not be found. After installing ipset this was corrected.
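If you need to do the same, the knobs I mean are the standard oslo.rootwrap syslog options in rootwrap.conf; with these set, the daemon logs the commands it is actually trying to run:

[DEFAULT]
use_syslog = True
syslog_log_level = DEBUG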

After all of these headaches I finally got neutron running. But I quickly found that my choice of provider networks was causing issues with DHCP on my home network. I only have a single 24-port unmanaged switch at home, and the bridge interfaces for the guests were on the same Layer 2 network as the rest of my home infrastructure, including my DHCP server. This meant that when I created a server in the cloud, the DHCP request from the guest would go out and be received by both the neutron DHCP agent and my home DHCP server, because being on the same Layer 2 meant they shared a broadcast domain. Luckily neutron's default security group rules blocked the DHCP response from my home server, but a lease record was still being created on my home server. And if I ever loosened the security group rules so that DHCP traffic was allowed, there would be a race condition between my server and the neutron agent. It turns out there is a small note (see step 3) on this potential problem in the networking guide. So my solution was to disable DHCP in neutron and also stop running the DHCP agent on my cloud. This had a ripple effect: I couldn't use the metadata service either, because it depends on DHCP to set the route to the hard-coded IP address of the metadata server. (this will come up later) Luckily I was able to leverage the force_config_drive option in Nova to make sure the metadata service wasn't necessary.
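Concretely, the workaround boiled down to a subnet-level switch plus one nova option (the subnet name below is a placeholder for whatever yours is called):

openstack subnet set --no-dhcp provider-subnet

and in nova.conf on the compute nodes:

[DEFAULT]
force_config_drive = True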

I modified the network diagram above for what I ended up with in my cloud:

(note I’m terrible at art, so you can clearly tell where I made changes)

If all of the above didn't make it clear, I still find Neutron the roughest part of the OpenStack user experience. Besides the complexity of its configuration, it also presumes a decent understanding of networking concepts. I fully admit networking is hard, especially for clouds, because you're dealing with a lot of different pieces, but this is somewhere I feel we need to make improvements, especially for a use case like mine where the requirements were pretty straightforward: I just wanted a server to come up on my home network when it booted so I could log into it right away. In my opinion this is what the majority of cloud consumers (people using the API) care about: just getting an IP address (v4 or v6, it doesn't really matter) and being able to connect to it from their personal machines. After going through this process I'm pretty sure that my college-student self, who had a much more limited understanding of networking than I do now, would have had a very difficult time figuring this out.

Booting the first server

After getting everything running on a single node it was time to boot my first server. I eagerly typed in the

openstack server create

command with all the parameters for my credentials, the flavor, and the image I had uploaded, and waited for the server to go into the ACTIVE state by running:

openstack server list

a few times. Once the server went into the ACTIVE state I tried to log into the guest with ssh, and got nothing. The ssh connection just timed out without any indication why. Having debugged a ton of issues like this over the years, my first guess was "OK, I screwed up the networking," so I looked at the console log by running:

openstack console log show testserver

and it returned nothing. I was a bit lost as to why; the console log should show the operating system booting. I figured I had made a configuration mistake in nova, so to double check I logged into the compute node, checked the libvirt state directory, and confirmed that the console log file was empty. But this left me at an impasse: why would the guest not log anything to the console on boot? So I just started sanity checking everything I could find. When I looked at Nova's local image cache I saw the cirros image was 0 bytes in size. A cirros image should be about 13MB, so 0 bytes was clearly wrong. From there I started tracing through the glance logs to figure out where the data was getting lost (was nova downloading an empty image from glance, or did glance have an empty image?) when I found:

DEBUG glance_store._drivers.filesystem [req-3163a1a74ca947e89444cd8b865055fb 20f283024ffd4bf4841a8d33bdb4f385 6c3fc6392e0c487e85d57afe5a5ab2b7 default default] Wrote 0 bytes to /var/lib/glance/images/e673563643d94fb0a302f3710386b689 with checksum d41d8cd98f00b204e9800998ecf8427e add /usr/local/lib/python2.7/dist-packages/glance_store/_drivers/filesystem.py:706

This was the only hint I could find in the glance logs, and it wasn't even that useful: all it said was that glance wrote 0 bytes to disk for the uploaded image. That at least confirmed glance wasn't storing any data from the image upload, but I couldn't find any other information about the problem. So I decided to re-upload the image to glance and run tcpdump on both my desktop and the server to make sure the data was actually being sent over the wire. The tcpdump output showed all the data being sent and received. This at least meant the data was getting to the glance API server, but it didn't really help me figure out where the data was going.

With no other ideas I decided to “instrument” the glance code by manually adding a bunch of log statements to the installed python code in

/usr/local/lib/python2.7/site-packages/glance

by hand to trace the data flow through the glance code and find where the image went from 13MB to 0 bytes. Doing this, I was able to figure out that the image data was being lost outside of the glance code, in one of its requirement libraries: webob, paste, or something like that. When I saw that, I realized I had forgotten to use constraints when installing glance. I quickly rushed to reinstall glance from the tarball using the constraints parameter and restarted the service. After doing this and re-uploading the image, everything worked!

My only mistake in that process was that, in my over-eagerness to fix the problem, I forgot to take notes of exactly what I reinstalled, so I can't say for sure where the actual problem was. All I can say is: make sure you use constraints whenever you install from source, because clearly there was an issue with just using pip install by itself.
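For anyone following along, installing from a tarball with constraints looks roughly like this; the upper-constraints.txt file comes from the openstack/requirements repo for the release you are deploying (the exact URL and tarball name below are illustrative):

wget https://raw.githubusercontent.com/openstack/requirements/stable/pike/upper-constraints.txt
pip install -c upper-constraints.txt ./glance-15.0.0.tar.gz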

After getting glance working I was able to re-run the openstack command to create a server, and this time I got a console log, but ssh still didn't work.

Networking woes

At this point I had the servers booting, but I wasn't able to log into them. I've personally had to debug this kind of issue many times, so my first step was to ping the IP address of the guest, just to rule out an issue with the ssh daemon on the server. Since the ping didn't work, I wanted to see if there was an entry in my ARP table for the IP address. Again, there was nothing for that IP after running the arp command. So either there was an issue with Layer 2 connectivity to the guest from my desktop, or the guest didn't know its IP address. (I've personally seen both failure conditions) My next step was to check the console log to see if an IP address was being set correctly. When I got to the cloud-init section of the console log it showed that the IP address was never getting assigned; instead the server was timing out waiting for a DHCP lease. If you remember the neutron section above, I had to disable DHCP for the guests because it was conflicting with my home's DHCP server, so this clearly wasn't right.
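The triage loop itself is nothing fancy; something like the following, with the address being whatever neutron assigned the guest:

ping -c 3 192.168.1.50
arp -n | grep 192.168.1.50
openstack console log show testserver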

It turns out that cloud-init doesn't know how to deal with static networking configuration from a config drive. (it might work with a metadata server, but I was not able to check this) So when the guest boots it just ignores the static networking information in the config drive and then tries to get a DHCP lease. This meant that cirros, the recommended image for testing and what the install guide tells you to use, wasn't going to work, and neither would the majority of cloud images you can download. The only cloud image I was able to get working was the official Ubuntu cloud image, and that was because Nova was doing file injection to write the networking information directly into the guest file system. I found a useful blog post on this in my searching: http://blog.oddbit.com/2015/06/26/openstack-networking-without-dhcp/ (although the translation didn't work on RHEL like that post indicates) But even with Ubuntu working, a cloud that can only boot a single type of image isn't really that useful.

Luckily the OpenStack Infrastructure team has a similar problem on some of the public OpenStack clouds they run things on, and they created the Glean project as an alternative to cloud-init that can properly use the static networking information from a config drive. All I had to do was leverage the Disk Image Builder project to create the images I uploaded into my cloud with glean instead of cloud-init. While not an ideal solution, because you can't take anyone's random pre-existing cloud image, this worked well enough for me because, as the primary user of my cloud, I can remember to do this.
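Building such an image is essentially a one-liner with diskimage-builder; the simple-init element is what swaps cloud-init out for glean (the output name is arbitrary):

pip install diskimage-builder
disk-image-create -o ubuntu-glean ubuntu-minimal simple-init vm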

It's also worth pointing out that all of these networking issues would have been completely avoided if I had chosen self-service networking back in the neutron setup section (because it creates a separate Layer 2 network for each tenant). But given my goals for the cloud and the way the documentation lays out the options, I had no way to know this. This connects back to my earlier complaint that neutron is too complex and presumes too much prior knowledge.

But, at this point I had a working single node cloud and could successfully boot guests. All that was left before I finished the cloud deployment was to replicate the installation on the remaining 4 servers.

Setting up the compute nodes

Once I had confirmed a working configuration and had all the services figured out on the controller node (which included nova-compute and the necessary neutron services for a compute node, because it was an all-in-one), it was time to set up the compute nodes. This was pretty straightforward and just involved configuring nova-compute and the neutron services. It was pretty formulaic, basically just copy and paste. The exact procedure I wrote down in my notes for this process was:

  1. add provider network interface config
  2. disable apparmor
  3. reboot
  4. download tarballs
  5. create system users
  6. add nova user to libvirt group
  7. install all binaries (libvirt, qemu, ipset, mkisofs, libssl-dev, pip)
  8. make service dirs /etc/ /var/lib for both neutron and nova
  9. copy etc dirs from tarballs to /etc
  10. pip install code with upper-constraints
  11. write config files (basically just copy from another compute node)
  12. set permissions on /etc and /var/lib
  13. create sudoers files for nova and neutron
  14. create systemd unit files (see the sketch after this list)
  15. start services
  16. run nova-manage cell_v2 discover_hosts

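For step 14, here's a minimal sketch of one of the unit files (this one for nova-compute; the paths match my tarball install and the user is the service account created in step 5):

[Unit]
Description=OpenStack Nova Compute
After=network.target

[Service]
User=nova
ExecStart=/usr/local/bin/nova-compute --config-file /etc/nova/nova.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target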
This is basically just copying and pasting things across the remaining 4 servers, but a couple of lessons I learned from the initial install are reflected in these steps. The only one I haven't talked about before is disabling apparmor. (or SELinux on other linux distros) I learned the hard way that the default apparmor rules on Ubuntu prevent nova and libvirt from doing the operations necessary to boot a guest. The proper way to fix this issue (especially for better security) would be to create your own apparmor rules to allow the operations being blocked. But I have always been confused by this, especially on SELinux, and didn't even bother trying; I just disabled apparmor and moved on.
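For completeness, disabling apparmor on Ubuntu is just the following (the lazy, less secure route, as noted):

sudo systemctl stop apparmor
sudo systemctl disable apparmor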

After repeating these steps across the 4 compute nodes I had a fully operational cloud. Nova was showing me the full capacity of 80 vCPUs and I could interact with the cloud and launch guests across all of them. My project was complete! (at least for the first phase)

Conclusion

So after writing all of this down, I came to the realization that I've likely given the impression that installing OpenStack by hand is an impossibly complex task. But honestly, it wasn't that bad an experience. Sure, OpenStack is complex software with a lot of moving pieces, but in total I got everything working in two to three days. (and I wasn't dedicating all my time during those days, either) The majority of the issues were caused solely by my insistence on installing everything from tarballs. If I had actually followed my original thought experiment and just followed the install guide, the only issue I probably would have hit was with networking. Once you understand what OpenStack is doing under the covers, the install is pretty straightforward. After doing my first OpenStack install a few years ago, I found I had a better understanding of how OpenStack works, which really helped me in my work on the project. It's something I recommend everyone do at least once if they're planning to work on OpenStack in any capacity, even just in a VM for playing around. (devstack doesn't count)

For comparison, that rack of similar Dell servers I deployed back in college took much longer to get running. In that case I used xCAT for deployment automation, but it still took me over a month to get the InfiniBand cards working with RDMA using OFED, set up SLURM for MPI job scheduling, connect everything to our central LDAP server, and have users able to launch jobs across all the nodes. It's not entirely a fair comparison, since I have almost a decade more experience now, but I think it helps put into perspective that this was far from the most grueling experience I've had installing software.

After going through the whole exercise I don't actually run this cloud 24/7, mostly because it heats up my apartment too much and I can't sleep at night when it's running. The power consumption of the servers is also pretty high, and I don't really want to pay the power bill. This means I basically failed the second half of the experiment, virtualizing my home infrastructure, since I can't rely on the cloud for critical infrastructure if it's not always running. But I have found some uses for the cloud, both for development tasks and for running highly parallel CPU tasks across the entire cloud at once.

Moving forward I intend to continue working on the cloud and upgrading it to future releases as they occur. One of my goals for this entire exercise was also to come back to the OpenStack community with feedback on how to improve things, and to submit patches and/or bugs for some of the issues. This will be an ongoing process as I find time to work on them and as I encounter more issues.

Cover Photo // CC BY NC

The post How to rock dirty clouds done cheap appeared first on OpenStack Superuser.

by Matthew Treinish at September 15, 2017 02:17 PM

OpenStack Blog - Swapnil Kulkarni

OpenStack Queens PTG – Day 4

Day 4 of the PTG started with more Kolla discussions, this time related to kolla-ansible. Discussion began with the kolla dev_mode effort started by pbourke, covering the currently missing pieces in dev_mode, like installing clients, libs, and the virtualenv bindmount. The goal for the cycle is to fill in the missing pieces, verify options for multinode dev_mode, investigate options for remote debugging, and also consider using PyCharm.

One of the important topics in kolla is gating. Currently kolla has around 14 different gates for deployment testing, and this has to be improved by testing the deployment for sanity with Tempest, which will help validate the entire deployment in the gates. Upgrade testing is also a key requirement; the kolla team will model something like grenade testing for it. The key is to maximize the testing of scenarios that kolla supports in the gate, but we are constrained by OpenStack infra resources as well as the time each test takes to run. It was agreed that team members will create a list of scenarios, assign them to everyone to verify, and record the results in a central location like a Google sheet. This will also help evaluate the stability of kolla deployments in each release.

Skip-level upgrades were one of the major talking points at this PTG. The kolla team will evaluate fast-forward upgrades for each service deployed with kolla to decide on skip-level upgrade support in kolla. This will be a PoC in the current cycle.

The second half of the discussion was around kolla-kubernetes, where the team discussed the roadmap for the current cycle. That will include upgrade prototyping for z-stream and x-stream services, validating the logging solution with fluent-bit, automated deployment, removing deprecated components, and improving the documentation.

Most of the teams wrapped up their design discussions on Thursday and will be having hackathons on the last day.

by Swapnil Kulkarni at September 15, 2017 01:58 PM

September 14, 2017

OpenStack Blog - Swapnil Kulkarni

OpenStack Queens PTG – Day 3

Day 3 of the Queens PTG started with project-specific design discussions. I joined the Kolla team, where we began with a topic that has come up at all the design summits we have had and is very important for the community: "Documentation". We broke the discussion down into documentation for quick-starting with kolla, contributors, operators, and reference documentation. The documentation available currently is scattered across projects after the project split, and it is essential that it have a common landing page in the OpenStack Deployment Guides that everyone can refer to. We had representatives from the Documentation team, Alex, Doug, and Petr, who are working on improving the doc experience by migrating the docs to a common format across the community. They understood the problem kolla is facing, and we had a working discussion in which we created the table of contents for all available and required documentation for kolla.

The Kolla team then joined the TripleO team, which consumes the kolla images for OpenStack deployment, for a discussion about collaborating on this effort. The teams will work together to improve the build and publish pipeline for kolla images, improve and add more CI jobs for kolla/kolla-ansible/kolla-kubernetes, and work on configuration management post-deployment of containers. The TripleO team has come up with basic healthchecks for containerized deployments; the kolla team will help get those checks into the current kolla images and improve on them to better monitor a containerized OpenStack deployment. The teams will also collaborate on improving the orchestration steps, container testing, upgrades, and creating metrics for OpenStack deployments.

During lunch we had an extended discussion with Lars and kfox around monitoring for OpenStack, Prometheus, and other monitoring tools.

Post lunch, the kolla team started with a discussion key to the heart of operators: OpenStack plugin deployment with kolla. There are multiple open issues around plugins, such as when the ideal time is to make them available, during build or during deployment? Plugins might have dependencies that don't match the OpenStack components, and so forth. The team came up with multiple permutations of options, which will need to be PoCed during the release.

Since the inception of the loci project there has been discussion around kolla image size, and the team had an interesting discussion on how to reduce it. The important part is to remove things like the apt/yum cache and the fat base image, and so forth. The team also discussed utilizing alternate container build tooling, up to writing our own image build tool. The team will hack on Friday on removing the fat base images to see if that improves the image size.

External tools like Ceph are a common pain point when doing OpenStack deployments. When the kolla community evaluated the options for Ceph as a storage backend for containerized OpenStack deployment, there was no such thing as containerized Ceph, so the team built it from scratch and got it working. The Ceph team has since come up with ceph-docker and ceph-ansible. It would be useful for operators if kolla used the tools directly available from vendors. We had a discussion with a representative from Ceph to initiate collaboration on deprecating the current Ceph deployment in kolla and using the combination of ceph-docker and ceph-ansible. Both communities will benefit from exchanging the things done better at each end.

I got a surprise gift of vintage OpenStack swag from the PTG team

and I had another photo with the marketing team along with the TSP members.

The day ended with hanging out with kolla teammates at Famous Dave's.

by Swapnil Kulkarni at September 14, 2017 02:16 PM

Mark McLoughlin

September 10th OpenStack Foundation Board Meeting

The OpenStack Foundation met in Denver on September 10th for a Joint Leadership Meeting involving the foundation Board of Directors, the Technical Committee, and the User Committee.

The usual disclaimer applies - this is my informal recollection of the meeting. It's not an official record.

Foundation Events Update

We began with an update from Lauren, Jonathan, and Mark on the events that have happened so far this year, the Project Teams Gathering (PTG) in Denver this week, and the coming OpenStack Summit in Sydney.

Lauren outlined some details of the recent Pike release, emphasizing the positive media coverage of the release, with the "composable infrastructure services" messaging resonating.

Jonathan talked about the many OpenStack Days events that happened over the summer, including Melbourne, Tel Aviv, Budapest, Korea, Taiwan, Japan, and China. Jonathan has attended all of these, covering 13 countries since the OpenStack Summit in Boston, and he spoke about the many new users and new use cases he learned about over the course of these events. More OpenStack Days are coming this year, including Benelux, UK, Italy, Turkey, Nordic, Canada, France, and Germany.

Mark spoke about the OpenDev event held the previous week in San Francisco. The goal was to bring in people who are experts in different domains, and the important and emerging use case of "Edge Computing" was chosen for this first event. The keynote from Dr Satya of Carnegie Mellon University was mentioned as one particularly inspiring contribution.

A particularly interesting conclusion from one of the sessions was a simple definition of what Edge Computing actually is:

Edge is the furthest boundary that separates application-agnostic scheduled computing workloads within the same operator's domain of control, from applications or devices that can't schedule workloads, and are outside the same operator's control.

(Thanks to Dan Sneddon for pointing this out!)

Another interesting development is the collection of edge use cases, which will be published to an Edge Computing mini-site on openstack.org.

The PTG was touched on next - more than 400 contributors in attendance from 35 project teams, with the first two days focused on the strategic goals of simplification, adjacent technologies, onboarding new contributors, etc.

Jonathan also talked about the coming OpenStack Summit in Sydney. We are aiming for 2500+ attendees, and the amazing work by the program committee to shape the 1100 speaking submissions into an awesome three-day schedule has been completed. There will be a Hackathon focused on Cloud Applications the weekend before the event.

Finally, we looked forward to the OpenStack Summits in Vancouver in May, and Berlin the following November.

The Strategic Focus Areas

Back in March, at the Strategic Planning Workshop in Boston, we developed a set of 5 strategic focus areas and formed working groups around each of these. For each of those focus areas, the working group presented their findings and progress, followed by some discussion.

Better communicate about OpenStack

Thierry Carrez and Lauren Sell lead the discussion of this topic with a set of slides.

We began by discussing progress on developing a map of OpenStack deliverables. The idea is for the map to make it easy for users of the software to make sense of what OpenStack has to offer, and one key part of this mapping effort is to categorize deliverables into buckets:

  • openstack-user: Things an end user installs to consume the IaaS stack
  • openstack-iaas: Primary compute, storage & networking services
  • openstack-operations: Things an operator uses to manage an openstack cloud once installed
  • openstack-lifecyclemanagement: Things that help deploy/upgrade OpenStack or standalone components
  • openstack-adjacentenablers: Things that other infrastructure stacks can use to leverage individual OpenStack components

Some of the outstanding questions include how to represent projects which are coming down the line, where various types of plugins should live, and whether Glance is tied to Compute or should be represented as a Shared Service.

After the meeting, Lauren sent out a request for everyone to contribute their feedback on the draft of the map. Please do join in!

Next, we discussed at some length how OpenStack has been affected by the "Big Tent" concept, where we welcome collaboration, experimentation, and innovation on "infrastructure things" beyond the core OpenStack technology. We know that users have found it difficult to make sense of the breadth of project teams, and we have created further confusion around "what is OpenStack".

Our discussion on this revolved around the idea of separating the technologies directly related to the deliverables map above (which we could call "OpenStack IaaS and friends"), the "software forge" infrastructure project, and the free-for-all project hosting area previously known as Stackforge. There was broad consensus that we should give each of those its own identity, which is particularly exciting when you think of the potential for "Infra" to have an identity that isn't so closely tied with OpenStack. We also discussed the potential to extend this model to other projects in the future, but also our desire to not become a "Foundation of Foundations" or a collection of entirely unrelated projects.

Requirements: Close the feedback loop

Melvin Hillsman and Thierry Carrez talked us through the unanswered requirements strategic focus area.

The focus of this discussion was on the creation of OpenStack Special Interest Groups (SIGs) as a mechanism to have cross-community collaboration on a given topic, without the work being under the umbrella of any one governance body.

The SIGs created so far are:

  • a Meta SIG to discuss how to improve SIG processes
  • an API SIG, which is an evolution of the API Working Group already formed, and
  • a still-forming Ansible SIG, with the goal of facilitating collaboration between Ansible and OpenStack projects.

Community Health

On this topic, Steve Dake talked us through some efforts to help grow the next generation of leaders in the OpenStack community, supporting people who wish to become a core contributor or PTL. Steve particularly highlighted efforts along these lines within the Kolla project.

Increase complementarity with adjacent technologies

Steve Dake again took the lead on presenting this topic, focusing on success stories of collaboration between OpenStack and other communities - Ansible and Helm, in particular.

For Ansible, it was observed that OpenStack has built upon Ansible's highly reusable technology in many ways, and OpenStack members have contributed significantly to the Ansible modules for OpenStack based on Shade. The conclusion was that the success was down to (a) building relationships between the communities, (b) leadership endorsement, and (c) the simplified collaboration process adopted by Ansible.

For Helm, the collaboration has been focused in areas where Helm is being used to deploy OpenStack services on Kubernetes.

Finally, Dims gave a read-out on collaboration on OpenStack within the Kubernetes community, mostly with work in the OpenStack SIG focused on the OpenStack cloud provider.

Technology changes: Simplify OpenStack

For our final strategic focus area, Mike Perez gave an update on progress. He described projects which have recently been retired, the OpenStack manuals project migration, and the status of a number of projects that are seeing low levels of contribution activity.

Clarifying and communicating where help is needed

Next up, Thierry walked us through the TC's mechanism for exposing areas where help is needed in the community. We talked through this "top 5 help wanted list" and had a good discussion on the two items currently on the list - Documentation and Glance.

Interoperability Working Group

As our final topic, Egle Sigler gave an update from the Interoperability Working Group.

The first item of business was to approve the 2017.09 guideline. Both the compute and object components gained some new capabilities in this update.

As discussed in the previous meeting, the working group proposed the creation of "add-on" programs which would focus on interoperability between different implementations of a given service, without having to add that service as a requirement in the core OpenStack Powered programs. As a starting point, it was proposed to create advisory add-ons for DNS (Designate) and Orchestration (Heat). After some discussions on the implications of these additions, they were formally approved by the board.

Next Meeting

The board's next meeting is a 2 hour conference call on Tuesday, October 10. Our next in-person meeting will be in Sydney on Sunday, November 5.

by markmc at September 14, 2017 10:00 AM

September 13, 2017

OpenStack @ NetApp

My Name Is Pike

NetApp is thrilled to have contributed to and congratulates the OpenStack Community on the Pike release of OpenStack.  The OpenStack Foundation’s Pike press release mentions a lot of new and exciting features across the board.  In this blog, we call out a few that are of particular interest to us. Cinder: Volume Groups Cinder now supports the new ... Read more

The post My Name Is Pike appeared first on thePub.

by Chad Morgenstern at September 13, 2017 04:59 PM

OpenStack Superuser

It’s that time again! Cast your vote for the Sydney Superuser Awards

Voting is now closed. Stay tuned to find out who wins!

The OpenStack Summit kicks off in less than eight weeks and seven deserving organizations have been nominated to be recognized during the opening keynotes. These organizations are competing to win the Superuser Award that will be presented by the most recent winner from the OpenStack Summit Boston.

For this cycle, the community (that means you!) will review the candidates before the Superuser editorial advisors select the finalists and ultimate winner. Finalists will be recognized onstage during the OpenStack Summit Sydney keynotes.

Check out the nominations in alphabetical order below and click through to see each organization’s full application. Then, rate the nominees to select who you think should be recognized at the OpenStack Summit Sydney.

You have until Wednesday, September 20 at 11:59 p.m. PT to rate them. Cast your ratings here.

China Railway Corporation

China Railway's deployment of an OpenStack cloud has helped them save millions of dollars, and their application launch cycle has been shortened from several months to one to two days, with much quicker response to business requirements. Cloud computing has improved their resource utilization rates, and measurements have shown that energy consumption has been reduced by approximately 50 percent. They are also adjusting their architecture for large-scale deployment, verifying that 800 servers hosting 100,000 VMs in the same region can operate stably, supporting high availability of control nodes.

China UnionPay

China UnionPay is a pivotal element in China's bankcard industry and has extended its card acceptance to 162 countries and regions. They have a scale of 1,200 compute nodes across two data centers, which make up 10,000 cores and about 2.0 petabytes of storage, all running on OpenStack. Some 80 percent of UnionPay's production applications, including many critical ones, have been migrated onto UnionPay Cloud, accounting for over 150 applications. Their cloud is now supporting 500 million users, averaging 50 million transactions per day and 3,000 transactions per second at peak, with 100 billion RMB per day.

City Network

City Network runs their public OpenStack-based cloud in eight regions across three continents. All of their data centers are interconnected via private networks. Apart from their public cloud, they run a pan-European cloud for finance verticals, solving all regulatory challenges. Over 2,000 users of their Infrastructure-as-a-Service (IaaS) solutions run over 10,000 cores in production.

Insurance Australia Group Data Team

IAG is the highly trusted name behind many of the APAC region's leading insurance groups. Their adoption of OpenStack has enabled the move to a real-time data architecture that allows for continuous integration and building of data products in days instead of months, changing their culture and their way of delivering both internal and external customer value. Data workloads see four times the performance of their VMware environment at one-fifth the cost, currently supporting workloads from micro to 8xlarge with volumes from 40G to 18TB per node.

Memset Hosting

OpenStack Swift has been part of Memset's infrastructure since the Essex release. It forms the core of their multi-award-winning distributed cloud storage platform, Memstore, which serves about 6.7 million requests a day. Memset operates multiple geographically diverse Swift platforms in production, each with varying capacity, totaling around 0.75PB. Alongside this, their OpenStack IaaS deployment has approximately 2TB of production compute available over two geographically diverse regions, advertising 300TB of raw Ceph storage through Cinder.

Tencent TStack Team

Tencent is a leading provider of Internet value added services in China. Their OpenStack-based private cloud platform cuts server costs by 30 percent and operator and maintainer costs by 55 percent, and saves them RMB100 million+ each year. It shortens resource delivery from two weeks to 0.5 hours and supports the development teams (such as QQ, WeChat, and Game) for services that generate tens of billions of revenues.

VEXXHOST

VEXXHOST is a leading Canadian public, private and hybrid cloud provider with an infrastructure powered by 100 percent vanilla OpenStack. Since the migration to an OpenStack infrastructure in 2011, VEXXHOST has been offering infrastructure-as-a-service without any vendor lock-in or proprietary technology. OpenStack compute, network and storage are the backbone powering all of their managed solutions.

The post It’s that time again! Cast your vote for the Sydney Superuser Awards appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 04:47 PM

Sydney Superuser Award Nominee: China Railway Corporation

Voting is now closed. Stay tuned to find out who wins!

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

The China Railway Corporation is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

Please specify the team and organization for nomination.

The team for nomination: China Railway Information Technology Center (CRITC)
The joint team: CRITC, Beijing SinoRail Information Technology Co. Ltd. (BJSRIT), Beijing T2Cloud Technology Co. Ltd., with a total of about 200 engineers.

CRITC:
Mingxing Gao/Project Director
Yang Liu/System Architect
Liang Liu/Senior Engineer

BJSRIT:
Wei Rao/R&D Manager
Guangqian Li/Technical Support Manager
Yahong Du/OpenStack Leader
Minhong Wang/IAD of OMS Group Leader
Gang Xu/MON & AMS of OMS Group Leader

T2Cloud:
Jinyang Zhao/R&D Manager
Yahui Hou/CMP Group Leader
Chao Xie/OpenStack Leader
Hanchen Lin/Testing Leader
Tony Xu/Software Architect

How has OpenStack transformed the organization’s business?

OpenStack is the key element in transforming the construction model of China railway information systems from the traditional project-driven approach to a platform-driven one. It also supports high-speed rail as a foundation of China's growth strategy. The adoption of OpenStack marked the first time CRITC has fully embraced open source technology with an open mind. The deployment of the OpenStack cloud saved us millions of dollars, and the application launch cycle has been shortened from several months to 1-2 days, with much quicker response to business requirements. Cloud computing has improved resource utilization rates, and measurements have shown that energy consumption has been reduced by approximately 50 percent.

How has the organization participated in or contributed to the OpenStack community?

In 2014, CRITC started developing an open source cloud solution based on OpenStack, and from 2016 on started to actively participate in OpenStack community activities. This included sharing a topic at 2016 OpenStack Days China, two topics at the 2017 Boston Summit, two topics at the upcoming 2017 Sydney Summit, and giving a keynote presentation at 2017 OpenStack Days China. In 2016 and 2017 we also hosted OpenStack meetups in Beijing and Nanjing respectively. We have contributed 734 patch sets and 5,979 lines of code, and submitted and resolved 47 bugs for the OpenStack community. We have shared practical experience with stress testing and performance optimization analysis at the 2017 Boston Summit and 2017 OpenStack Days China, and written whitepapers for the community.

What open source technologies does the organization use in its IT environment?

In addition to OpenStack, the China Railway Cloud depends on KVM, OpenVSwitch/LinuxBridge, Hadoop, Kafka, Flume, Spark, CentOS, LXC, Docker, Kubernetes, OpenShift, CEPH, GlusterFS, Redis, MongoDB, MySQL/MariaDB, Ansible, Open-Falcon, ELK, ZeroMQ/RabbitMQ, etc.

What is the scale of the OpenStack deployment?

Deployed about 5,000 physical server nodes, including about 800 KVM nodes and about 730 VMware nodes; 20PB SAN storage, 3PB distributed storage (Ceph). An additional 2,000 physical server nodes are to be deployed at the end of 2017.

Our OpenStack cloud platform, with a scale of 800 physical nodes, hosts thousands of VMs and a dozen mission-critical applications, covering 18 railway bureaus and over 2,000 railway stations and powering production. The OpenStack cloud platform has also stood up to the huge load the Spring Festival peak puts on the system, over 31 billion page views per day on average, and has supported the stable, safe, 24/7 uninterrupted operation of real-time dispatch management for all trains, locomotives, and vehicles.

What kind of operational challenges have you overcome during your experience with OpenStack?

Suspension of OpenStack services: By monitoring the service process and the status of log generation, we can check whether an OpenStack service is hung or not.

Version upgrade problems: Previously, we upgraded the cloud software from Essex to Juno and from Juno to Liberty. With lots of changes made on top of the community edition, we are now upgrading the software from Liberty to Ocata, and we will upgrade the production system online after the above work is done.

High availability of the cloud: We implemented an automatic failover function to achieve high availability of VMs. Compute nodes are isolated and their VMs evacuated to other nodes in the same zone if a certain network is down or unresponsive for two minutes.

How is this team innovating with OpenStack?

Comprehensive cloud solution plan: Based on open source components, we developed the Operation Management System (OMS) complementary to our cloud software. Currently OMS consists of monitoring, automation, and analysis.

Multidimensional and customer-oriented improvements: We modified front-end functions to optimize the customer experience, added operation-type logs for easier archiving by administrators, added permissions control, added a failover function, etc.

Architecture adjustment for large-scale deployment: We verified that 800 servers hosting 100,000 VMs in the same region can operate stably, supporting high availability of control nodes.

How many Certified OpenStack Administrators (COAs) are on your team?

There is currently one COA on our team. Our team plans to be more involved in the OpenStack community, and cultivate more talent, aiming to obtain 3-5 COAs in 2018 and 5-10 COAs in 2019.

The post Sydney Superuser Award Nominee: China Railway Corporation appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 04:45 PM

Sydney Superuser Awards Nominee: China UnionPay

Voting is now closed. Stay tuned to find out who wins!

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

China UnionPay is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

Please specify the team and organization for nomination. 

UnionPay plays a pivotal role in China's bankcard industry and has extended its card acceptance to 162 countries and regions. Deployed in 2012, UnionPay Cloud was the first financial cloud powered by OpenStack in China's financial industry. UnionPay has a dedicated OpenStack team with more than 50 members, many of them with COA training and upstream skills.

How has OpenStack transformed the organization’s business? 

With the introduction of OpenStack, UnionPay formed a cloud computing team focusing on computing, storage, networking, etc. OpenStack significantly shortened the time for UnionPay's data centers to provide resources for applications, going from a few days to a few minutes. OpenStack's resiliency has helped UnionPay support its users with greater business agility. UnionPay Cloud has supported applications in cooperation with China Eastern Airlines; meanwhile, UnionPay also introduced OpenStack to the Bank of Shanghai and helped with its deployment.

How has the organization participated in or contributed to the OpenStack community? 

We are frequent participants in OpenStack Days China and have also participated in most of the meetups in Shanghai and Beijing. We have delivered three keynote speeches at OpenStack Days China to share our experience as a typical OpenStack financial user in China. UnionPay's request to create the Financial Working Group has been approved by the User Committee, and we also participate in the Large Contributing OpenStack Operators (LCOO) working group to make more contributions to the community. UnionPay works closely with EasyStack and Intel on community work, and we will give a speech with EasyStack, titled "UnionPay's five years in the Financial Cloud Powered by OpenStack," at the OpenStack Sydney Summit.

What open source technologies does the organization use in its IT environment?

In terms of cloud computing, we use OpenStack, CentOS, KVM, Xen, Libvirt, Qemu, Open vSwitch, Ceph, Pacemaker, Corosync, HAProxy, Cobbler, Puppet, Ansible, Rabbitmq, Memcached, Mongodb, Mysql, Apache2, Zabbix, etc.
As for OpenStack, we use many components such as Nova, Cinder, Neutron, Keystone, Glance, Ceilometer, Ironic.
For Big Data, we use Hadoop, Spark, Impala, Kudu, etc.

We also use many other open source projects in applications such as Kafka, JBoss Application Server, Netty, Spring, Hibernate, Struts, etc.

What is the scale of the OpenStack deployment? 

We have a scale of 1,200 compute nodes across two data centers, which make up 10,000 cores and about 2.0 petabytes of storage, all running on OpenStack. 80 percent of UnionPay's production applications, including many critical ones, have been migrated onto UnionPay Cloud, accounting for over 150 applications. Our cloud is now supporting 500 million users, averaging 50 million transactions per day and 3,000 transactions per second at peak, with 100 billion RMB per day. In order to support more new functions and more powerful management of UnionPay Cloud, we have completed the deployment of UnionPay Cloud 2.0 powered by OpenStack Liberty, with many applications already running on it.

What kind of operational challenges have you overcome during your experience with OpenStack? 

We designed several different network data planes and set corresponding QoS restrictions in both switches and servers to make sure the different kinds of network traffic do not affect each other.
We integrated Neutron with Huawei SDN solutions through a driver to improve overall network performance; it is the first mature case of this put into critical production in China's financial industry. We modified Keystone's code to integrate it with UnionPay's existing SSO system to unify the authentication mechanism. Because it is risky and complicated to upgrade the existing old OpenStack version directly, we deployed the new UnionPay Cloud 2.0 in parallel with the old one and made a detailed plan to migrate the applications gradually.

How is this team innovating with OpenStack? 

We are able to manage F5 LTM devices through the LBaaS v2 API and have implemented the advanced functions of LTM beyond LBaaS directly on the portal through the F5 SDK. We are able to manage the Cisco ASA devices that sit between different OpenStack regions through the FWaaS API and a custom-developed ASA SDK to automate firewall management. We developed a unified portal to manage all OpenStack regions for more convenient operation and maintenance. We extended Nova to implement live resize for virtual machines, which is very important for critical applications in the financial industry. We ensure host high availability by deploying a custom daemon process on every controller node to detect the status of compute nodes.

How many Certified OpenStack Administrators (COAs) are on your team?

None, but dozens of engineers on our team have taken COA training and are preparing for the COA tests in 2018.


Cover photo courtesy UnionPay.

The post Sydney Superuser Awards Nominee: China UnionPay appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 04:36 PM

Sydney Superuser Award Nominee: City Network

Voting is now closed. Stay tuned to find out who wins!

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

City Network is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

Please specify the team and organization for nomination.

City Network’s Operations and DevOps team: Marcus Murwall, Tobias Rydberg, Magnus Bergman, Tobias Johansson, Alexander Roos, Johan Hedberg, Joakim Olsson, Emil Sundstedt, Kriss Andsten.

How has OpenStack transformed the organization’s business?

Shifting to a 100 percent focus on OpenStack has been key to the global expansion of our organization in general and our cloud offerings in particular. With OpenStack, with all its features, ease of use, and popularity, as the catalyst, we have added value through multiple data centers in Europe, the US, Asia, and the UAE, as well as a clear strategy and implementation for data protection and regulatory compliance.

With OpenStack we also get the benefit of providing open APIs, which prevents our customers from being locked in with us as their only IaaS provider. It's easy for a customer to start and easy to leave, and we are convinced that this is a must for all providers in order to stay relevant going forward.

How has the organization participated in or contributed to the OpenStack community?

Our CEO Johan Christenson was recently elected as a member of the OpenStack Foundation Board. His goal is to help the community and the ecosystem leverage this open platform and drive the transformation toward a more open and future-proof IT infrastructure.

In 2016, we initiated and organized the first ever OpenStack Days Nordic event in Stockholm and this year we are once again leading the way to take the event to Copenhagen.

We also participate in the public cloud user group at the Summits and in the security project.

What open source technologies does the organization use in its IT environment?

We are very pro open source and use it wherever it is a viable option.

A selection of the open source technologies we are currently using: CentOS, OpenBSD, Ubuntu, Nginx, Apache, PHP, Python, Ansible, MySQL, MariaDB, MongoDB and Ceph.

What is the scale of the OpenStack deployment?

We run our public OpenStack based cloud in eight regions across three continents. All of our data centers are interconnected via private networks. Apart from our public cloud, we run a Pan-European cloud for the finance vertical solving all regulatory challenges. Over 2,000 users of our infrastructure-as-a-service (IaaS) solutions run over 10,000 cores in production.

What kind of operational challenges have you overcome during your experience with OpenStack?

Since we run OpenStack as a public IaaS, there have been a lot of hurdles to overcome, as OpenStack is not yet fully adapted for public clouds. We had to build our own APIs to get network connectivity across several sites to work, and we had to add features such as volume copy and the ability to move volumes between sites. We have also had our fair share of issues with upgrading to new OpenStack versions; however, we do feel this process has been getting better with each upgrade.

How is this team innovating with OpenStack?

We innovate with OpenStack on two main focus areas:

One of them is figuring out how we can interconnect all our OpenStack data centers over a global, private network and all the benefits that come from doing so—one of which is being able to provide our customers with direct, private access to our cloud services.

The other focus area is helping regulated companies, mainly in the financial and healthcare industries, with their digital transformation and cloud adoption. By building completely separated cloud services compliant with regulations and standards such as ISO 9001, 27001, 27015, 27018, Basel, Solvency and HIPAA, we allow these industries to move to the cloud with a pay-as-you-go model and be truly agile.

How many Certified OpenStack Administrators (COAs) are on your team?

Three.

The post Sydney Superuser Award Nominee: City Network appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 04:20 PM

Sydney Superuser Award Nominee: Insurance Australia Group (IAG) Data Team

Voting is now closed. Stay tuned to find out who wins!

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

The Insurance Australia Group (IAG) Data Team is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

How has OpenStack transformed the organization’s business? 

OpenStack has enabled the move to a real-time data architecture that allows for continuous integration and building data products in days instead of months. This work is all tightly integrated with GitHub for pair programming and peer review, automatically deploying new data and analytics products to serve our internal users better and faster. OpenStack has also allowed us to enable new ways to interact with our external customers and make their world a safer place. The OpenStack solution has also improved our performance stats and enabled new ways to deliver open source software solutions just in time for our internal teams. Overall it has changed our culture and our way of delivering both internal and external customer value.

How has the organization participated in or contributed to the OpenStack community? 

IAG is active in many open source communities: we are on the mailing lists, attend community events and have presented at several conferences on our transformative solution. In addition to partnering with external companies to commit code that enables others to leverage our solutions, IAG will be open sourcing our first piece of software, built entirely on our OpenStack solution, in early September.

What open source technologies does the organization use in its IT environment?

IAG leverages a number of open source data and web solutions today and contributes back to a number of them as well through our digital and data teams.

What is the scale of the OpenStack deployment? 

OpenStack is deployed in three separate tenants within IAG: pre-production, production and analytics. The analytics tenant consists of 12 servers with high-performance and archival storage leveraging the commercial ScaleIO SDS solution. The pre-production tenant has 10 nodes and uses NL-SAS storage with the open source ScaleIO SDS solution. The production tenant has 18 nodes of mixed SSD, SAS and NL-SAS storage; the latter two leverage the commercial ScaleIO SDS solution. Data workloads run at four times the performance and one-fifth the cost of our VMware environment. Currently we support workloads from micro to 8xlarge and have volumes from 40 GB to 18 TB per node.

What kind of operational challenges have you overcome during your experience with OpenStack? 

The ability to stand up a complete new environment end to end, quickly validate a proof of concept and fail fast has been highly valuable, and the upgrade and migration of the platform has been seamless.

How is this team innovating with OpenStack? 

The data workloads and the adoption of quick, fail-fast prototypes to deliver data products have aided our pricing, customer service, marketing and analytics spaces. The solution has also allowed IAG to focus on core customer-facing capability using developers and code, while not worrying about the infrastructure underneath.

How many Certified OpenStack Administrators (COAs) are on your team?

Five.

The post Sydney Superuser Award Nominee: Insurance Australia Group (IAG) Data Team appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 04:15 PM

Sydney Superuser Award Nominee: Memset Hosting – OpenStack Public Cloud Team

Voting is now closed. Stay tuned to find out who wins!

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

Memset Hosting is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

Please specify the team and organization for nomination.

Ross Martyn
George Stamoulis
Simon Weald
Stewart Perrygrove
Nick Craig-Wood

Memset Hosting has been focused on open source since our inception in 2002. Given the open nature of OpenStack, Memset naturally became heavily interested early in OpenStack’s history.

We had a clear focus from day one: deploy a highly secure, highly available public cloud platform that meets the strict requirements of the UK Government, then offer that same rock-solid level of security to the remainder of our customer base.

How has OpenStack transformed the organization’s business?

OpenStack Swift has been part of Memset’s infrastructure since the Essex release. It forms the core of our multi-award winning distributed cloud storage platform, Memstore, serving ~6.7 million requests a day.

Memset has since developed a fully self-service OpenStack IaaS platform, offering our customers highly performant and tightly-secured public cloud, perfect for UK public and private sector needs.

OpenStack’s rapid development and fantastic community support enable us to implement new features, performance improvements and security updates quicker than ever before, and, excitingly, OpenStack is increasingly working its way into other parts of our business, most recently being used to breathe new life into our older proprietary Xen-based Cloud VPS platform.

How has the organization participated in or contributed to the OpenStack community?

Alongside our presence in the IRC community, Memset regularly attends a variety of OpenStack events around the globe, including Summits, working groups and operators meetups. We also visit other open source events including Ceph, Kubernetes meet-ups and Gophercon. We are a regular sponsor of events including the London OpenStack Meetup and the upcoming OpenStack Days London conference.

As well as the social aspect of the community, Memset’s team strives to feed back bugs, patches and reviews to the community wherever possible, with a string of contributions to multiple OpenStack projects since early 2012.

Memset recently became a corporate sponsor of the OpenStack Foundation, offering our highly secure public IaaS and award winning Cloud Storage (Memstore) from the OpenStack Marketplace.

What open source technologies does the organization use in its IT environment?

Open source is built into the very core of Memset. We extensively utilize an open-source approach wherever possible. This is evident in every layer of our business, from the Ubuntu powered Operations and Development team, through to the heavy Python development and Django management backend that powers Memset’s business critical systems. We strive to build and maintain great public services, using brilliant people, community driven software and commodity hardware.

Memset’s co-owner and technical director Nick Craig-Wood has personally rocked the open source object storage community with his pet project rclone (an rsync for cloud storage) and the widely used Go Swift library, which provides an easy way for new Go projects to interact with Swift environments.

Whilst not an exhaustive list, the most commonly used technologies at Memset include: OpenStack, Puppet, Ansible, Terraform, Jenkins, Gerrit, Git, SVN, Python, Django, Packer, Docker, HAProxy, Nginx and much more…

What is the scale of the OpenStack deployment?

Memset operates multiple geographically diverse Swift platforms in production, each with varying capacity, totaling around 0.75 PB. Alongside this, our OpenStack IaaS deployment has approximately 2 TB of production compute available over two geographically diverse regions, advertising 300 TB of raw Ceph storage through Cinder. We integrate NVMe, SSD and HDD storage, all built on latest-generation Dell equipment.

Memset operates at two sites in the UK, including its privately owned data center located in Dunsfold. Memset strives to operate in an environmentally friendly manner, regularly winning awards for doing so, aided by our utilization of a large local solar farm to offset a portion of our high power requirements.

What kind of operational challenges have you overcome during your experience with OpenStack?

OpenStack has been fairly easy to integrate into our mature business, mostly due to the great documentation, administration and security guides. Whilst times can become stressful during upgrades and patching cycles, and especially when dealing with customers in production, OpenStack gives us easy access to intelligent management capabilities that allow us to maintain customer uptime and in turn, their satisfaction.

OpenStack’s APIs give us the ability to dynamically schedule migrations of workloads in order to attempt zero-downtime patching and software and hardware upgrades. We strive to minimize downtime even further in the future, especially as recent OpenStack releases have focused so heavily on improving these processes.

How is this team innovating with OpenStack?

Memset’s OpenStack team is currently moving its production environments towards a container-based control plane. This will allow us to keep pace with the fast OpenStack release cycle more easily than ever before.

Utilizing what we have learned, we aim to be able to stay closer to trunk, and plan to offer Magnum and Octavia as self-service offerings to our loyal customers very soon.

How many Certified OpenStack Administrators (COAs) are on your team?

At Memset we have more than half a dozen Mirantis certified OpenStack Administrators and currently have two engineers studying for the COA exam.

The post Sydney Superuser Award Nominee: Memset Hosting – OpenStack Public Cloud Team appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 04:09 PM

Sydney Superuser Awards Nominee: Tencent TStack Team

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

The Tencent TStack Team is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

Please specify the team and organization for nomination.

The Tencent TStack team comprises 76 members who developed the OpenStack-based TStack private cloud platform. It consists of four sub-teams:

  • Product design: Responsible for requirement analysis and interaction design
  • Technical architecture: Responsible for solution design and technology research
  • Product development: Responsible for feature design and implementation
  • Operations support: Responsible for deployment, monitoring and troubleshooting.

We built a private cloud to provide services for internal IT and testing environments (e.g. QQ, WeChat). We also provide complete hybrid cloud services for government departments and enterprises in China.

How has OpenStack transformed the organization’s business?

The OpenStack-based private cloud platform cuts server costs by 30% and O&M costs by 55%, saving Tencent more than RMB 100 million each year. It shortens resource delivery from two weeks to half an hour and supports the development teams (such as QQ, WeChat and games) behind services that generate tens of billions in revenue for Tencent. It optimizes global resource scheduling; for example, the deployment duration of the global mail system has been cut from 10 days to one day. It is an important cloud computing platform for Tencent’s “Internet plus” strategy. It is deployed in multiple provinces in China, including the Sichuan, Guangdong, Xiamen and Yunnan government clouds, and serves more than 100 million users.

How has the organization participated in or contributed to the OpenStack community?

Tencent actively participates in community activities such as OSCAR and OpenStack Days China; as a participant and sponsor, Tencent has shared its experience in using OpenStack. Currently, the TStack team is preparing to participate in the OpenStack Sydney Summit. We compiled an OpenStack usage white paper, which we shared with our customers. We created a WeChat official account to share our experience with OpenStack users, and we have submitted bugs and blueprints to the community. We cooperated with Intel to apply the latest features (such as RDT, DPDK, SPDK, FPGA and Clear Containers) to accelerate production. Tencent plans to join the OpenStack Foundation and is already a member of the CNCF, the Linux Foundation and the MariaDB Foundation.

What open source technologies does the organization use in its IT environment?

Technology stacks used by the TStack team are built with open source tools, including OpenStack, KVM, CentOS, Ironic, HAProxy, Keepalived, Docker, Clear Containers, Kubernetes, RabbitMQ, MariaDB, Nginx, Ansible, Jenkins, Git, ELK, Zabbix, Grafana, InfluxDB, Tempest and Rally. These tools are used in development, testing and CI/CD.

What is the scale of the OpenStack deployment?

TStack has 6,000+ nodes, including 2,000 Ironic nodes. It manages more than 12,000 VMs and provides 80,000+ cores, 360+ TB of memory and 20+ PB of disk. It covers 14 clusters in seven data centers across four regions, with more than 1,000 nodes managed in a single region. It carries 300+ online services, including OA, the WeChat gateway, email and ERP. For example, the email system on TStack provides 24/7 service for 40,000 employees. It provides a CI/CD environment for 20,000 developers and a testing environment for large applications (such as QQ and WeChat), supporting concurrent access by more than 100 million users. Tencent has promoted TStack to Chinese government and enterprise customers, signing cooperation agreements with 15 provinces and 50 cities; one example is the TStack-based Xiamen government cloud used during the BRICS Summit.

What kind of operational challenges have you overcome during your experience with OpenStack?

Many different types of VMs, such as Xen and KVM, are deployed at Tencent, and it was a huge challenge to manage them seamlessly on the OpenStack platform; the TStack team developed a set of tools to manage VMs on existing heterogeneous virtualization platforms without interrupting services. Earlier versions of OpenStack had bottlenecks in large-scale deployments, including the message queue and Keystone; after extensive optimization, a single region can now support more than 1,000 compute nodes. VxLAN performance was also a bottleneck at scale, so Tencent developed an OpenStack-based technology, compatible with SDN hardware and software from multiple vendors, to accelerate VxLAN. An adaptive compression technology is used to cut VM migration duration by 50 percent.

How is this team innovating with OpenStack?

  • Strict real-time rate limiting to control CPU usage time and ensure VM QoS.
  • VM scheduling policies customized based on service tags, with VMs dispatched to different hosts to ensure high availability.
  • An online resize function that resizes VMs without interruption.
  • Adaptive compression technology that cuts VM migration duration by 50%.
  • Cooperation with Intel to provide Clear Containers on TStack for better security.
  • A Neutron-based SDN controller that supports heterogeneous networks and performs unified management of legacy networks, SDNs and NFVs.
  • Integration of Tencent cloud security technologies with OpenStack to provide complete cloud security services.
  • Unified management of OpenStack-based private cloud, public cloud and VMware.

How many Certified OpenStack Administrators (COAs) are on your team?

One team member is COA-certified, 28 have received COA training and nine will take the COA examination in October.


Cover image: Shenzen’s Tencent building, courtesy Tencent.

The post Sydney Superuser Awards Nominee: Tencent TStack Team appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 03:59 PM

Sydney Superuser Award Nominee: VEXXHOST

It’s time for the community to determine the winner of the Superuser Award to be presented at the OpenStack Sydney Summit. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

VEXXHOST is among the seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Wednesday, September 20 at 11:59 p.m. Pacific Time Zone.

Please specify the team and organization for nomination.

VEXXHOST is a leading Canadian public, private and hybrid cloud provider with an infrastructure powered by 100 percent vanilla OpenStack. Since the migration to an OpenStack infrastructure in 2011, VEXXHOST has been offering infrastructure-as-a-service without any vendor lock-in or proprietary technology. The VEXXHOST team, led by CEO Mohammed Naser, delivers a high level of expertise to help users optimize cloud infrastructure so they can focus on their core competencies.

How has OpenStack transformed the organization’s business?

OpenStack has helped VEXXHOST speed up the delivery of new services to customers by accelerating our ability to innovate. We can focus on delivering better customer solutions instead of dealing with infrastructure issues, which OpenStack solves for us. Every release of OpenStack has offered new features that we have utilized to deliver higher performance.

How has the organization participated in or contributed to the OpenStack community?

VEXXHOST has been contributing to the OpenStack community since the project’s second release in 2011. We have had a presence in the community by regularly attending OpenStack Summits and taking part in the Interop Challenge during the Boston Summit in 2017. We have hosted OpenStack Canada Day and help organize the Montreal OpenStack meetup. Our co-founder Mohammed Naser, who is the PTL for Puppet OpenStack, has given talks at the Montreal, Ottawa and Toronto OpenStack meetups.

We also play a part in the community by actively contributing code upstream and sharing feedback with PTLs and developers. When we encounter bugs, we report them, diagnose them and work with the community to get a full fix in. We are also active on the mailing list, where we provide feedback and fixes.

What open source technologies does the organization use in its IT environment?

We run OpenStack services exclusively across our entire infrastructure. Our offering is fully open source, without any proprietary licensed technology. Among many others, our IT environment uses Nova with KVM and libvirt, Ceph for centralized storage, Pacemaker for high availability, MySQL Galera for the database and Puppet for configuration management.

What is the scale of the OpenStack deployment?

As a public cloud provider, we cannot disclose metrics regarding the scale of our users’ consumption. Our public cloud handles production-grade, enterprise-scale workloads, with private and hybrid cloud solutions delivering the same level of stability and robustness. Both our infrastructure and our users’ production workloads are powered by OpenStack: compute, network and storage are the backbone powering all our managed solutions.

What kind of operational challenges have you overcome during your experience with OpenStack?

Initially we faced some challenges with rolling upgrades, as they were difficult, though they have become much easier with new releases. After upgrading our infrastructure to Pike, we found a bug in the code, which we reported. The developers at OpenStack were very responsive and happy to cooperate — as they always are — to help fix it. The bug was fixed in less than 24 hours in trunk and less than 48 hours in the stable branches. This increases our trust in the OpenStack CI and grows our confidence in the software.

How is this team innovating with OpenStack?

As a public and private cloud provider, we are heavily invested in improving and extending our list of managed services. Using OpenStack has helped us innovate in our managed services. In August 2017, we launched Kubernetes services using Magnum on the Pike release. We worked with the Magnum project team to ensure delivery of the best possible Kubernetes and OpenStack experience. VEXXHOST is currently one of the very few cloud providers to offer Magnum. OpenStack has also facilitated the delivery of big data solutions with the help of Sahara integration. We were also able to speed up the deployment of clusters with the help of transient clusters which provide huge cost savings.

How many Certified OpenStack Administrators (COAs) are on your team?

The VEXXHOST team does not currently have any Certified OpenStack Administrators.

The post Sydney Superuser Award Nominee: VEXXHOST appeared first on OpenStack Superuser.

by Ashlee Ferguson at September 13, 2017 03:52 PM

OpenStack Blog - Swapnil Kulkarni

OpenStack Queens PTG, Denver – Day 2

Sep 12, 2017

The day started with meeting people in the registration desk hallway. I joined the TC discussion around Q&A for new projects in OpenStack. The discussion covered emerging projects like Blazar, Glare, Gluon, Masakari, Stackube, Cyborg and Mogan. Each project representative presented TC members with a general project overview, objectives, current status and any queries related to the requirements for new OpenStack project applications. TC members also asked for details on the maturity of the projects with respect to contributor diversity, the potential to increase collaboration and any overlap with current projects in OpenStack. We also had a discussion around potential removals from the official OpenStack projects, due to very low contributor activity during the cycle, no project goals being achieved or a change of focus at the participating organisations. The particular projects highlighted in this area are Searchlight, Solum, Designate and CloudKitty.

inc0 and I were chasing the infra/TC members for a resolution on publishing Kolla images to a public registry like quay.io under an openstack-kolla namespace. We did not get any direct resolution and need to initiate a thread on the openstack-dev mailing list for further discussion.

I met with Saad Zaher, PTL of Freezer, regarding the current status of the project. I had a very informative discussion and now have a clear path for the work ahead.

We had a few hallway discussions related to kolla-kubernetes, openstack-helm, Kolla with Ironic and Kolla with Neutron.

In the meantime we also had team photos taken for the Kolla and requirements teams. Looking forward to getting them from the OpenStack marketing team.

This ends the first section of the schedule, the inter-project discussions. The day ended with an IBM-sponsored happy hour in the hotel, where random discussions happened regarding projects, OpenStack, etc.

by Swapnil Kulkarni at September 13, 2017 02:38 PM

Opensource.com

5 great new OpenStack tips and guides

Learning the ins and outs of OpenStack doesn't have to be complicated. Here are guides to help.

by Jason Baker at September 13, 2017 07:00 AM

September 12, 2017

James Page

OpenStack Charms 17.08 release!

The OpenStack Charms team is pleased to announce that the 17.08 release of the OpenStack Charms is now available from jujucharms.com!

In addition to 204 bug fixes across the charms and support for OpenStack Pike, this release includes a new charm for Gnocchi, support for Neutron internal DNS, Percona Cluster performance tuning and much more.

For full details of all the new goodness in this release please refer to the release notes.

Thanks go to the following people who contributed to this release:

Nobuto Murata
Mario Splivalo
Ante Karamatić
zhangbailin
Shane Peters
Billy Olsen
Tytus Kurek
Frode Nordahl
Felipe Reyes
David Ames
Jorge Niedbalski
Daniel Axtens
Edward Hope-Morley
Chris MacNaughton
Xav Paice
James Page
Jason Hobbs
Alex Kavanagh
Corey Bryant
Ryan Beisner
Graham Burgess
Andrew McLeod
Aymen Frikha
Hua Zhang
Alvaro Uría
Peter Sabaini

EOM



by JavaCruft at September 12, 2017 09:59 PM

Alessandro Pilotti

Easily deploy a Kubernetes cluster on OpenStack

Platform and cloud interoperability has come a long way. IaaS and unstructured PaaS options such as OpenStack and Kubernetes can be combined to create cloud-native applications. In this post we’re going to show how Kubernetes can be deployed on an OpenStack cloud infrastructure.


Setup

My setup is quite simple: an Ocata all-in-one deployment with KVM compute. The OpenStack infrastructure was deployed with Kolla. The deployment method is not important here, but Magnum and Heat need to be deployed alongside other OpenStack services such as Nova and Neutron. To do this, enable those two services from the /etc/kolla/globals.yml file, as sketched below. If you are using Devstack, here is a local.conf that deploys Heat and Magnum.
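
For reference, a minimal sketch of enabling the two services with kolla-ansible; the option names follow kolla-ansible conventions and should be verified against your release:

# Append to kolla-ansible's globals file, then reconfigure the deployment
cat >> /etc/kolla/globals.yml <<'EOF'
enable_heat: "yes"
enable_magnum: "yes"
EOF
kolla-ansible reconfigure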


Kubernetes deployment

The Kubernetes cluster will consist of one master node and two minion nodes. I’m going to use Fedora Atomic images for the VMs. One useful piece of information: I used a flavor with 1 vCPU, 2 GB of RAM and a 7 GB disk for the VMs. Below are the commands used to create the necessary environment setup. Please make sure to change IPs and other configuration values to suit your environment.

# Download the cloud image
wget  https://ftp-stud.hs-esslingen.de/pub/Mirrors/alt.fedoraproject.org/atomic/stable/Fedora-Atomic-25-20170512.2/CloudImages/x86_64/images/Fedora-Atomic-25-20170512.2.x86_64.qcow2

# If using Hyper-V, convert it to VHDX format
qemu-img convert -f qcow2 -O vhdx Fedora-Atomic-25-20170512.2.x86_64.qcow2 fedora-atomic.vhdx

# Provision the cloud image, I'm using KVM so using the qcow2 image
openstack image create --public --property os_distro='fedora-atomic' --disk-format qcow2 \
--container-format bare --file /root/Fedora-Atomic-25-20170512.2.x86_64.qcow2 \
fedora-atomic

# Create a flavor
nova flavor-create cloud.flavor auto 2048 7 1 --is-public True

# Create a key pair
openstack keypair create --public-key ~/.ssh/id_rsa.pub kolla-controller

# Create Neutron networks
# Public network
neutron net-create public_net --shared --router:external --provider:physical_network \
physnet2 --provider:network_type flat

neutron subnet-create public_net 10.7.15.0/24 --name public_subnet \
--allocation-pool start=10.7.15.150,end=10.7.15.180 --disable-dhcp --gateway 10.7.15.1

# Private network
neutron net-create private_net_vlan --provider:segmentation_id 500 \
--provider:physical_network physnet1 --provider:network_type vlan

neutron subnet-create private_net_vlan 10.10.20.0/24 --name private_subnet \
--allocation-pool start=10.10.20.50,end=10.10.20.100 \
--dns-nameserver 8.8.8.8 --gateway 10.10.20.1

# Create a router
neutron router-create router1
neutron router-interface-add router1 private_subnet
neutron router-gateway-set router1 public_net

Before the Kubernetes cluster is deployed, a cluster template must be created. The nice thing about this process is that Magnum does not require long config files or definitions for this. A simple cluster template creation can look like this:

magnum cluster-template-create --name k8s-cluster-template --image fedora-atomic \
--keypair kolla-controller --external-network public_net --dns-nameserver 8.8.8.8 \
--flavor cloud.flavor --docker-volume-size 3 --network-driver flannel --coe kubernetes

Based on this template the cluster can be deployed:

magnum cluster-create --name k8s-cluster --cluster-template k8s-cluster-template \
--master-count 1 --node-count 2


The deployment status can be checked and viewed from Horizon. There are two places where this can be done: the first in the Container Infra -> Clusters tab and the second in the Orchestration -> Stacks tab. This is because Magnum relies on Heat templates to deploy the user-defined resources. I find the Stacks option better because it allows the user to see all the resources and events involved in the process. If something goes wrong, the issue can easily be identified by a red mark.
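
The same status can also be watched from the CLI, which is handy on headless setups; a small sketch assuming the magnum and heat clients are installed (the stack name is a placeholder):

# Cluster status as Magnum reports it
magnum cluster-list

# The underlying Heat stack and its per-resource events
openstack stack list
openstack stack event list <stack-name-or-id>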


In the end my cluster should look something like this:

root@kolla-ubuntu-cbsl:~# magnum cluster-show 2ffb0ea6-d3f6-494c-9001-c4c4e01e8125
+---------------------+------------------------------------------------------------+
| Property            | Value                                                      |
+---------------------+------------------------------------------------------------+
| status              | CREATE_COMPLETE                                            |
| cluster_template_id | 595cdb6c-8032-43c8-b546-710410061be0                       |
| node_addresses      | ['10.7.15.112', '10.7.15.113']                             |
| uuid                | 2ffb0ea6-d3f6-494c-9001-c4c4e01e8125                       |
| stack_id            | 91001f55-f1e8-4214-9d71-1fa266845ea2                       |
| status_reason       | Stack CREATE completed successfully                        |
| created_at          | 2017-07-20T16:40:45+00:00                                  |
| updated_at          | 2017-07-20T17:07:24+00:00                                  |
| coe_version         | v1.5.3                                                     |
| keypair             | kolla-controller                                           |
| api_address         | https://10.7.15.108:6443                                   |
| master_addresses    | ['10.7.15.108']                                            |
| create_timeout      | 60                                                         |
| node_count          | 2                                                          |
| discovery_url       | https://discovery.etcd.io/89bf7f8a044749dd3befed959ea4cf6d |
| master_count        | 1                                                          |
| container_version   | 1.12.6                                                     |
| name                | k8s-cluster                                                |
+---------------------+------------------------------------------------------------+

SSH into the master node to check the cluster status:

[root@kubemaster ~]# kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeUI is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-ui

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

So there it is, a fully functioning Kubernetes cluster with 1 master and 2 minion nodes.


A word on networking

Kubernetes networking is not the easiest thing to explain, but I’ll do my best to cover the essentials. After an app is deployed, the user will need to access it from outside the Kubernetes cluster. This is done with Services. To achieve this, each minion node runs a kube-proxy service that allows the Service to do its job. A Service can work in multiple ways: for example, via a LoadBalancer VIP provided by the cloud underneath Kubernetes, or with a port-forward on the minion node IP.
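
As a small illustration of the Service mechanism, here is a sketch that exposes a toy deployment through a NodePort; the names are illustrative, and on the Kubernetes version deployed here kubectl run creates a deployment:

# Run two nginx replicas and expose them outside the cluster on a node port
kubectl run hello-web --image=nginx --replicas=2
kubectl expose deployment hello-web --port=80 --type=NodePort
kubectl get svc hello-web   # shows the node port that was allocated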


Deploy an app

Now that all is set up, an app can be deployed. I am going to install WordPress with Helm. Helm is the package manager for Kubernetes: it installs applications with charts, which are basically application definitions written in YAML. Here is the documentation on how to install Helm.
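
For completeness, a minimal sketch of preparing Helm on the master node, assuming a Helm 2 era client binary is already installed:

helm init          # installs Tiller, Helm 2's in-cluster component
helm repo update   # refreshes the index of the stable chart repository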


I am going to install WordPress.

[root@kubemaster ~]# helm install stable/wordpress

The pods can be seen:

[root@kubemaster ~]# kubectl get pods
NAME                                    READY     STATUS    RESTARTS   AGE
my-release-mariadb-2689551905-56580     1/1       Running   0          10m
my-release-wordpress-3324251581-gzff5   1/1       Running   0          10m

There are multiple ways of accessing the contents of a pod. I am going to forward port 8080 on the master node to port 80 of the pod.

kubectl port-forward my-release-wordpress-3324251581-gzff5 8080:80

Now WordPress can be accessed via the Kubernetes node IP on port 8080:

http://K8S-IP:8080
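
A quick sanity check from any machine that can reach the node (K8S-IP remains a placeholder):

curl -I http://K8S-IP:8080   # expect an HTTP response from WordPress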

Kubernetes on OpenStack is not only possible, it can also be easy!

The post Easily deploy a Kubernetes cluster on OpenStack appeared first on Cloudbase Solutions.

by Dan Ardelean at September 12, 2017 09:40 PM

OpenStack Superuser

Collaboration and code beat hype and ego every time

Tech may come and go in hype cycles but some models endure.

Take the LAMP stack, which gets its name from its original four open-source components: the Linux operating system, the Apache HTTP Server, the MySQL relational database management system (RDBMS) and the PHP programming language. It drove the creation and meteoric rise of an entire industry on open source.

LAMP provided a model that involved multiple open-source projects, each with their own communities. This was not one community creating a monolith and that’s why it worked. Each community was able to focus on what they were good at, which wasn’t feasible in just one community.


Although it was somewhat opinionated as a concept, it did get the operational burden under control and yet it was still modular enough so that each layer could be swapped out. These communities seemed to have struck the right balance. We have every opportunity to replicate this success in the open infrastructure/open cloud area, but only if we work across communities and listen to users.

It’s clear from every user I talk to that they struggle with pulling all the pieces together, particularly in terms of operating the beast once it’s assembled. That’s what they want, but we just aren’t there yet. Within OpenStack itself, we have started to cull options and projects to essentially make it more opinionated, which is helping.

Our opportunity, which is far from trivial, is to rise to the occasion across communities to actually serve the needs of the users, rather than succumbing to ego-driven development. If we can’t put aside the egos, AWS will absolutely eat everyone’s lunch. In every industry. Period. Skeptical? We can meet at Whole Foods to discuss it.

To offer a concrete example, eBay is perhaps the largest OpenStack user in the world. It also happens to have what may be one of the largest deployments of Kubernetes. They deploy them together and it’s very powerful!

What works well about that opinionated combination and what doesn’t? How are the participants in each community doing when it comes to listening to what eBay actually needs and wants out of the combination? It will take more than Kubernetes+OpenStack to form the LAMP stack of the cloud, but they’re a great place to start.

That’s why we recently organized the OpenDev event in San Francisco, with members of many open-source communities looking to build the open edge computing stack. We heard from over 30 organizations, including users like eBay, Verizon, NTT, AT&T, Walmart, and Inmarsat, and industry leaders from Carnegie Mellon, Intel, VMware, Ericsson, Red Hat and Huawei.

They’re taking open infrastructure to some of the harshest environments imaginable: from ships at sea to appliances next to the ovens heating your bagels at coffee shops to massive retail stores handling hundreds of billions of dollars in commerce. For each of these use cases, the old cloud model simply doesn’t work and so it falls upon us to assemble an open infrastructure stack that will.

This event was one concrete step we’ve taken to build something that is both diverse and open, while also being opinionated enough for each use case to actually operate at scale.

Kelsey Hightower, a prominent leader in the Kubernetes community, is taking the next step by organizing a joint Kubernetes-OpenStack event during KubeCon in Austin in December. This is precisely the kind of leadership we need.

There are many steps left, so join us!

Mark Collier, COO of the OpenStack Foundation, can be found at @sparkycollier or your local Whole Foods.


The post Collaboration and code beat hype and ego every time appeared first on OpenStack Superuser.

by Mark Collier at September 12, 2017 04:56 PM

SUSE Conversations

Hybrid Cloud vs. Multi-Cloud vs. Mixed-Cloud: What’s the Difference?

Albert Einstein reportedly said, “if you can’t explain it simply, you don’t understand it well enough.” When it comes to hybrid cloud, it appears many of us would struggle to meet that lofty standard. According to a recent research study, 4 out of 5 IT professionals believe hybrid cloud is misunderstood by customers and …

+read more

The post Hybrid Cloud vs. Multi-Cloud vs. Mixed-Cloud: What’s the Difference? appeared first on SUSE Blog.

by Terri Schlosser at September 12, 2017 02:41 PM

OpenStack Blog - Swapnil Kulkarni

OpenStack Queens PTG, Denver – Day 1

This is my first time joining the Project Teams Gathering since it started in Atlanta. The location of the event is pretty unique in its own way; I had never been to this part of the USA, and you can feel the difference. The event is being held at the Renaissance Denver for five days, September 11-15.

I arrived here on September 10 and could already see active contributors in the hotel lobby discussing one thing or another. I met some friends and rested for the remainder of the day.

Sep 11, the day started with registration for the event. I joined the #openstack-ptg channel to get updates about the day, and there I got introduced to ptgbot. Initially many people, including me, were a bit confused about how it works, but as we got familiar with it, we relied on it more and more to track the day’s events.

As per schedule, the first two days of the event are dedicated to inter-project discussions.

I headed directly to the infra/stable/release/requirements room for the requirements team’s discussions. We discussed the topics to be worked on in Queens, including per-project/independent/divergent requirements, OpenStack client testing and Python 3. The discussion was pretty good, with insights provided by tonyb, prometheanfire, dirk, mordred and notmyname.

After lunch I joined the Kolla team for discussions around collaboration across the different deployment tooling in OpenStack. We had discussions around architecture, health monitoring, the role of containers, Kubernetes and security.

I also attended the TC meeting on rebooting the Stewardship WG and onboarding new community members.

The day ended with unofficial PTG happy hour at the elevated lounge in Renaissance Denver.

by Swapnil Kulkarni at September 12, 2017 11:27 AM

September 11, 2017

OpenStack Superuser

Developing edge computing use cases, reference architectures

SAN FRANCISCO — The edge will soon be everywhere — telecoms, retail, internet of things, supply chains – and an all-star group of industry experts has pledged to build use cases and reference architectures for it.

That’s one of the major outcomes of OpenDev, a recent two-day event sponsored by Ericsson, Intel and the OpenStack Foundation. OpenDev was devised as more of a workshop than a traditional conference, with the first day featuring sessions run more like working groups based on key topics including “Thin Control Plane,” “Deployment Considerations” and “Zero-Touch Provisioning.”

Reference architecture was one of the “meatiest of all the sessions,” said Jonathan Bryce, executive director of the OpenStack Foundation, who facilitated the 90-minute closing session of the September 7-8 event. The takeaways from all sessions are summarized together on a single Etherpad and you can also check the event schedule for Etherpads from the individual sessions.

Participants from the reference architecture session said the next action is “to discuss and determine what functions are managed by OpenStack and what is managed by the layer above when managing edge nodes. The outcome is potentially a set of whitepapers (one per use case) with straw man deployment designs.”

Volunteers to push forward efforts here include veteran OpenStack members, employees of multinational tech conglomerates and telcos. Industries identified as solid terrain for edge include: Retail, supply chain, utilities, industrial telematics, autonomous vehicles, industrial IoT, agriculture and medical tech.

There were five major edge use cases identified to work on:

  • Micro Edge Device (Remote Radio Head / Remote Radio Unit, CPE / set top box, runs a single instance, instance changes infrequently)
  • Small Edge Device (coffee shop / POS for a store / cell tower site / FTTN cab, multiple instances, instances change occasionally)
  • Medium Edge Backhaul Critical Deployment (C-RAN / big cell site / NBN POI, multiple instances, instances change daily)
  • Medium Edge Backhaul Non-Critical Deployment (big box retail / cloudlet, multiple instances, instances change daily)
  • Large Edge Deployment (region DC, thousands of instances, instances changing constantly)

Verizon’s Beth Cohen noted in the closing session that “there isn’t really an ecosystem of vendors for edge, what we have now is coming out of internal requirements, there’s nobody to go to.” When another participant asked her if it should stay that way, she said “No! It’s very early and this is an opportunity.”

Stay tuned for more on how you can get involved and on these emerging edge computing reference architectures.

Cover Photo // CC BY NC

The post Developing edge computing use cases, reference architectures appeared first on OpenStack Superuser.

by Nicole Martinelli at September 11, 2017 04:30 PM

September 10, 2017

Amrith Kumar

Reflections on the (first annual) OpenDev Conference, SFO

Earlier this week, I attended the OpenDev conference in San Francisco, CA. The conference was focused on the emerging “edge computing” use cases for the cloud. This is an area that is of particular interest, not just from the obvious applicability to my ‘day job’ at Verizon, but also from the fact that it opens … Continue reading "Reflections on the (first annual) OpenDev Conference, SFO"

by amrith at September 10, 2017 05:19 PM

Dragonflow Team

Openstack-Vagrant - Bringing Vagrant, Ansible, and Devstack together to deploy for developers

Introduction

OpenStack developers in general, and Dragonflow developers in particular, find themselves in need of setting up many OpenStack deployments (for testing, troubleshooting, developing, and what not). Every change requires testing on a 'real' environment.

Doing this manually is impossible. This is a task that must be automated. If a patch is ready, setting up a server to test it should take seconds.

This is where Openstack-Vagrant comes in.

More Details

In essence, Openstack-Vagrant (https://github.com/omeranson/openstack-vagrant) is a Vagrantfile (read: Vagrant configuration file) that sets up a virtual machine, configures it and installs all the necessary dependencies (using Ansible), and then runs devstack.

In effect, Openstack-Vagrant allows you to create a new OpenStack deployment by simply updating a configuration file, and running vagrant up.

Vagrant

Vagrant (https://www.vagrantup.com/) allows you to easily manage your virtual machines. They can be deployed on many hosts (e.g. your personal PC or several lab servers), with many backends (e.g. libvirt or VirtualBox) and many distributions (e.g. Ubuntu, Fedora). I am sticking to Linux here, because that's what's relevant to our deployment.

Vagrant also lets you automatically provision your virtual machines, using e.g. shell scripts or Ansible.

Ansible

Ansible (https://www.ansible.com/) allows you to easily provision your remote devices. It was selected for OpenStack-Vagrant for two main reasons:
  1. It is agent-less. No prior installation is needed.
  2. It works over SSH - out of the box for Linux cloud images.
Like many provisioning tools, Ansible is idempotent - you state the outcome (e.g. file exists, package installed) rather than the action. This way the same playbook (Ansible's list of tasks) can be replayed safely in case of errors along the way.
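
In practice this means a failed or interrupted provisioning run can simply be retried and the tasks that already completed become no-ops. A minimal sketch, assuming a machine named one as in the example below:

vagrant provision one   # replays the Ansible playbook; safe to re-run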

Devstack

Every developer in OpenStack should know devstack (https://docs.openstack.org/devstack/latest/). That's how testing setups are deployed.

Really In-Depth

Let's review how to set-up an OpenStack and Dragonflow deployment on a single server using Openstack-Vagrant.

  1. Grab a local.conf file. The Dragonflow project has some with healthy defaults (https://github.com/openstack/dragonflow/tree/master/doc/source/single-node-conf). At the time of writing, redis and etcd are gated. I recommend etcd, since it's now an OpenStack base service (https://github.com/openstack/dragonflow/blob/master/doc/source/single-node-conf/etcd_local_controller.conf)
    • wget https://raw.githubusercontent.com/openstack/dragonflow/master/doc/source/single-node-conf/etcd_local_controller.conf
  2. Create a configuration for your new virtual machine. A basic example exists in the project's repository (https://github.com/omeranson/openstack-vagrant/blob/master/directory.conf.yml).
    • machines:
        - name: one
          hypervisor:
            name: localhost
            username: root
          memory: 8192
          vcpus: 1
          box: "fedora/25-cloud-base"
          local_conf_file: etcd_local_controller.conf
  3. Run vagrant up <machine name>
    • vagrant up one
  4. Go drink coffee. You have an hour. 
  5. Once Ansible finishes its thing, you can log into the virtual machine with vagrant ssh, or vagrant ssh -p -- -l stack to log in directly as the stack user. Once logged in as the stack user, devstack progress is available in a tmux session.
    • vagrant ssh -p -- -l stack
    • tmux attach

How Can It Be Better? 

 There are many ways we can still improve Openstack-Vagrant. Here are some thoughts that come to mind:
  1. A simple CLI interface that creates the configuration file and fires up the virtual machine.
  2. Use templates to make the local.conf file more customisable.

Conclusion

With Openstack-Vagrant, it is much easier to create new devstack deployments. A deployment can be fired off in under a minute, and it will automatically boot the virtual machine, update it, install any necessary software and run devstack.

by Omer Anson (noreply@blogger.com) at September 10, 2017 01:54 PM

Kubernetes container services at scale with Dragonflow SDN Controller

The cloud native ecosystem is getting very popular, but VM-based workloads are not going away. Enabling developers to connect VMs and containers to run hybrid workloads means shorter time to market, a more stable production environment and the ability to leverage the maturity of the VM ecosystem.

Dragonflow is a distributed, modular and extendable SDN controller that makes it possible to connect cloud network instances (VMs, containers and bare metal servers) at scale. Kuryr allows you to use Neutron networking to connect the containers on your OpenStack cloud. Combining them lets you use the same networking solution for all workloads.

In this post I will briefly cover both Dragonflow and Kuryr, explain how Kubernetes cluster networking is supported by Dragonflow and provide details about various Kubernetes cluster deployment options.

Introduction

Dragonflow Controller in a nutshell

Dragonflow adopts a distributed approach to solve the scaling issues of large deployments. With Dragonflow the load is distributed to the compute nodes, each running a local controller. Dragonflow manages the network services for the OpenStack compute nodes by distributing network topology and policies to the compute nodes, where they are translated into OpenFlow rules and programmed into the Open vSwitch datapath. Network services are implemented as applications in the local controller. OpenStack can use Dragonflow as its network provider through the Modular Layer 2 (ML2) plugin.
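
As a rough sketch, pointing Neutron at Dragonflow amounts to selecting its mechanism driver in the ML2 configuration. The driver name df and the use of crudini are assumptions here; check the Dragonflow documentation for your release:

# Select the Dragonflow ML2 mechanism driver, then restart the Neutron server
crudini --set /etc/neutron/plugins/ml2/ml2_conf.ini ml2 mechanism_drivers df
systemctl restart neutron-server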

Kuryr

Project Kuryr uses OpenStack Neutron to provide networking for containers. With kuryr-kubernetes, the Kuryr project enables native Neutron-based networking for Kubernetes.
Kuryr provides a solution for hybrid workloads, enabling bare metal servers, virtual machines and containers to share the same Neutron network or to choose different routable network segments.


Kubernetes - Dragonflow Integration

To leverage Dragonflow SDN Controller as Kubernetes network provider, we use Kuryr to act as the container networking interface (CNI) for Dragonflow.


Diagram 1: Dragonflow-Kubernetes integration


The Kuryr Controller watches the Kubernetes API for events and translates them into Neutron models. Dragonflow translates Neutron model changes into a network topology that gets stored in its distributed DB, and propagates network policies to its local controllers, which apply the changes to the Open vSwitch pipeline.
The Kuryr CNI driver binds Kubernetes pods on worker nodes into Dragonflow logical ports, ensuring the requested level of isolation.
As you can see in the diagram above, there is no kube-proxy component: Kubernetes services are implemented with the help of Neutron load balancers. The Kuryr Controller translates a Kubernetes service into a load balancer, listener and pool, and service endpoints are mapped to members in the pool. See the following diagram:
Diagram 2: Kubernetes service translation

Currently either Octavia or HAProxy can be used as the Neutron LBaaS v2 provider. In the Queens release, Dragonflow will provide a native LBaaS implementation, as drafted in the following specification.
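
Once a Kubernetes service exists, the Neutron objects that Kuryr created for it can be inspected with the LBaaS v2 CLI of that era, for example:

# The load balancer, listener and pool backing the Kubernetes service
neutron lbaas-loadbalancer-list
neutron lbaas-listener-list
neutron lbaas-pool-list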

Deployment Scenarios

With Kuryr-Kubernetes it’s possible to run both OpenStack VMs and Kubernetes pods on the same network provided by Dragonflow, if your workloads require it, or to use different network segments and, for example, route between them. Below you can see the details of the various scenarios, including devstack recipes.

Bare Metal deployment

A Kubernetes cluster can be deployed on bare metal servers. Logically there are three different types of servers.

OS Controller hosts - run the required control services, such as the Neutron server, Keystone and the Dragonflow northbound database. Of course, these can be distributed across a number of servers.

K8s Master hosts - run the components that provide the cluster’s control plane. The Kuryr Controller is part of the cluster control plane.

K8s Worker nodes - host the components that run on every node, maintaining running pods and providing the Kubernetes runtime environment.

Kuryr-CNI is invoked by the kubelet. It binds pods into the Open vSwitch bridge that is managed by the Dragonflow controller.


If you want to try a bare metal deployment with devstack, you should enable the Neutron, Keystone, Dragonflow and Kuryr components. You can use this local.conf:
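
In case the linked file is unavailable, the heart of such a local.conf is the plugin enablement. A minimal sketch (repository URLs assumed; a real deployment needs more options):

cat > local.conf <<'EOF'
[[local|localrc]]
enable_plugin dragonflow https://git.openstack.org/openstack/dragonflow
enable_plugin kuryr-kubernetes https://git.openstack.org/openstack/kuryr-kubernetes
EOF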


Nested (Containers in VMs) deployment

Another deployment option is nested-VLAN, where containers are created inside OpenStack VMs by using the trunk ports support. The undercloud OpenStack environment has all the components needed to create VMs (e.g. Glance, Nova, Neutron, Keystone), as well as the needed Dragonflow configuration, such as enabling the trunk support the VMs need so that the containers running inside them can use undercloud networking. The overcloud deployment inside the VM contains the Kuryr components along with the Kubernetes control plane components.


If you want to try the nested-VLAN deployment with devstack, you can use the Dragonflow Kuryr bare metal config with the following changes:
  1. Do not enable the kuryr-kubernetes plugin and the Kuryr-related services, as they will be installed inside the VM.
  2. The Nova and Glance components need to be enabled so that the VM hosting the overcloud can be created.
  3. The Dragonflow Trunk service plugin needs to be enabled to ensure trunk ports support.
Then create a trunk and spawn the overcloud VM on the trunk port, as sketched below.
Install the overcloud, following the instructions as listed here.
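
The trunk step might look something like this sketch (network, image and flavor names are illustrative):

# Create a parent port, build a trunk on it, then boot the VM on the parent port
openstack port create --network private_net parent-port
openstack network trunk create --parent-port parent-port vm-trunk
openstack server create --image overcloud-image --flavor m1.large \
  --nic port-id=parent-port overcloud-vm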


Hybrid environment

A hybrid environment enables diverse use cases where containers, regardless of whether they are deployed on bare metal or inside virtual machines, are in the same Neutron network as other co-located VMs.
To bring up such an environment with devstack, just follow the instructions in the nested deployment section.

Testing the cluster
Once the environment is ready, we can test that network connectivity works among Kubernetes pods and services. You can check the cluster configuration according to this default configuration guide, run a simple example application and verify the connectivity and the configuration reflected in the Neutron and Dragonflow data models. Just follow the instructions to try the sample kuryr-kubernetes application.
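
A few basic sanity checks once everything is up (a sketch; the exact pod IPs to look for are whatever kubectl reports):

kubectl get nodes                  # workers registered and Ready
kubectl get pods --all-namespaces  # pods scheduled and Running
neutron port-list                  # pod IPs should appear as Neutron ports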

by Irena Berezovsky (noreply@blogger.com) at September 10, 2017 11:19 AM

September 08, 2017

OpenStack Blog

OpenStack Developer Mailing List Digest September 2 – 8

Successbot Says!

Summaries:

  • Notifications Update Week 36 [3]

Updates:

  • Summit Free Passes [4]
    • People that have attended the Atlanta PTG or will attend the Denver PTG will receive 100% discount passes for the Sydney Summit
    • They must be used by October 27th
  • Early Bird Deadline for Summit Passes is September 8th [5]
    • Expires at 6:59 UTC
    • Discount Saves you 50%
  • Libraries Published to pypi with YYYY.X.Z versions [6]
    • Moving forward with deleting the libraries from Pypi
    • Removing these libraries:
      • python-congressclient 2015.1.0
      • python-congressclient 2015.1.0rc1
      • python-designateclient 2013.1.a8.g3a2a320
      • networking-hyperv 2015.1.0
    • Still waiting on approval from PTLs about the others
      • mistral-extra
      • networking-odl
      • murano-dashboard
      • networking-midonet
      • sahara-image-elements
      • freezer-api
      • murano-agent
      • mistral-dashboard
      • sahara-dashboard
  • Unified Limits work stalled [7]
    • Need for new leadership
    • Keystone merged a spec [8]
  • Should we continue providing FQDNs for instance hostnames? [9]
    • Nova network has deprecated the option that the domain in the FQDN is based on
    • Working on getting the domain info from Neutron instead of Nova, but this may not be the right direction
    • Do we want to use a FQDN as the hostnames inside the guest?
      • The Infra servers are built with the FQDN as the instance name itself
  • Cinder V1 API Removal[10]
    • Patch here[11]
  • Removing Screen from Devstack - RSN
    • It’s been merged
    • A few people are upset that they don’t have screen for debugging anymore
    • Systemd docs are being updated to include pdb path so as to be able to debug in a similar way to how people used screen [12] [13] [14]

PTG Planning

  • Video Interviews [15]

 

[1] http://eavesdrop.openstack.org/irclogs/%23storyboard/%23storyboard.2017-09-06.log.html#t2017-09-06T22:03:06

[2] http://eavesdrop.openstack.org/irclogs/%23openstack-chef/%23openstack-chef.2017-09-08.log.html#t2017-09-08T13:48:14

[3] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121769.html

[4] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121843.html

[5] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121847.html

[6] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121705.html

[7] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121944.html

[8] https://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/unified-limits.html

[9] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121762.html

[10] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121956.html

[11] https://review.openstack.org/#/c/499342/

[12] https://review.openstack.org/#/c/501834/

[13] https://pypi.python.org/pypi/remote-pdb

[14] https://review.openstack.org/#/c/501870/

[15] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121901.html

by Kendall Nelson at September 08, 2017 05:37 PM

Chris Dent

OpenDev 2017

I've spent the last couple of days at OpenDev. Here are some quick and dirty thoughts from the two days.

The event described itself as:

OpenDev is an annual event focused at the intersection of composable open infrastructure and modern applications. The 2017 gathering will focus on edge computing, bringing together practitioners along with the brightest minds in the industry to collaborate around use cases, reference architectures and gap analysis.

The gist of that is "get some people who may care about edge computing and edge cloud in the same room and see what happens".

As is often the case with this sort of thing: when talking about something relatively new there is very little agreement about what is even being talked about. People came to the event with their own ideas of what "edge" means, ideas that didn't always match everyone else's. This can lead to a bit of friction, but it is a productive friction; areas of commonality are revealed, and presumed areas of overlap that turn out not to exist are exposed.

There were several different edges floating around the meetup. The two standouts I heard (not necessarily using these names):

  • Edge Cloud: A cloud (sometimes in single piece of hardware, sometimes not) that is closer (in terms of network latency or partitioning risk) to where the applications and/or data that could use that cloud is located. In this case a cloud is some compute, network, or storage infrastructure. VM or container agnostic. Orchestration of workloads on this thing happen on it.

  • Edge Compute: A piece or suite of hardware, often but not always small, which is closer to the consumer of the compute resources available on that hardware and where the applications on the hardware are managed in a composable fashion (for example a home networking device running its services as VMs). Orchestration of workloads on this thing happens from something else.

The locality of orchestration represents a sort of spectrum but there are also other dimensions. One example is the types of applications. One can imagine an edge service that is managing a bunch of virtualized network functions (VNF) for a customer of a large telco at the edge of the telco's network (between the customer and the telco). Or an edge service which does deep learning analysis of images with low latency for a mobile device doing augmented reality that does not have the compute capacity itself, but needs very low latency processing to provide a good experience.

The edge compute model is currently a poor match with Nova. High latency (or partitioning) between a nova-compute node and the rest of nova is not something that nova really tolerates. It would be interesting to explore the idea of a remote-resilient (and maybe "mini") nova-compute.

In typical OpenStack fashion, the event resulted in a bunch of etherpads, a summary of which has been collected.

My interest for the two days was to see to what extent the Placement service can help with these use cases or will need to be extended to cope with them. It's already quite well known that the NFV world is driving a lot of the work to make placement provide some of the pieces of the pie for what's called Enhanced Platform Awareness (aka give me NUMA, SR-IOV and huge pages). Until DPDK is able to provide a suitable alternative, VNFs at the edge will demand these sorts of things. Certain high performance edge applications (perhaps using GPUs) will have similar requirements.

Another concern for both styles of use case, but especially edge applications (cloudlets, as Satya calls them) on limited hardware, may be scheduling needs related to the time dimension and to dynamic evaluation of the state of the hardware. Being able to reserve capacity for a short time, or to use otherwise unreserved capacity during periods of intense activity, will be important.

The Blazar project hopes to address some of the issues with time, something neither nova nor placement wish to address.

The nova scheduler has always had the capacity to represent dynamic aspects of host state ("Is my compute node on fire?") but it is perhaps not as consistently represented and available across deployments as it could be. Nor is it fully developed, so if it is something that people truly want there is an opportunity. The placement service made a design decision up front that it would not track data that may vary independently of individual workload requirements (e.g. the temperature of a CPU), but scheduling has become a two-step process: get some candidates from placement, then further refine them in the nova-scheduler.

Edge looks like it is going to grow to become a major driver in the cloud environment. Hopefully continued events like OpenDev will ensure that people create interoperable implementations.

by Chris Dent at September 08, 2017 04:00 PM

John Likes OpenStack

Debugging TripleO Ceph-Ansible Deployments

Starting in Pike it is possible to use TripleO to deploy Ceph in containers using ceph-ansible. This is a guide to help you if there is a problem. It asks questions, somewhat rhetorically, to help you track down the problem.

What does this error from openstack overcloud deploy... mean?

If TripleO's new Ceph deployment fails, then you'll see an error like the following:


Stack overcloud CREATE_FAILED

overcloud.AllNodesDeploySteps.WorkflowTasks_Step2_Execution:
resource_type: OS::Mistral::ExternalResource
physical_resource_id: bb9e685c-fbe9-4573-8d74-2c053bc5de0d
status: CREATE_FAILED
status_reason: |
resources.WorkflowTasks_Step2_Execution: ERROR
Heat Stack create failed.

TripleO installs the OS and configures networking and other base services for OpenStack on the nodes during step 1 of its five-step deployment. During step 2, a new type of Heat resource, OS::Mistral::ExternalResource, is created, which calls a new Mistral workflow, which in turn uses a new Mistral action to call an Ansible playbook. The playbook that is called is site-docker.yml.sample from ceph-ansible. Giulio covers this in more detail in Understanding ceph-ansible in TripleO. The above error message indicates that Heat was able to call Mistral, but that the Mistral workflow failed. So, the next place to look is the Mistral logs on the undercloud to see if the ceph-ansible site-docker.yml.sample playbook ran.

Did the ceph-ansible playbook run?

The most helpful file for debugging TripleO ceph-ansible deployments is:


/var/log/mistral/ceph-install-workflow.log
If it doesn't exist or is empty, then the ceph-ansible playbook run did not happen.

If it does exist, then it's the key to solving the problem! Read it as it will contain the output of the ceph-ansible run which you can use to debug ceph-ansible as you normally would. The ceph-ansible docs should help. Once you think the environment has been changed so that you won't have the problem (details on that below), then re-run the `openstack overcloud deploy ...` command, and after TripleO does its normal checks, it will re-run the playbook. Because ceph-ansible and TripleO are idempotent, this process may be repeated as necessary.

Why didn't the ceph-ansible playbook run?

The following will show the playbook call to ceph-ansible:


cd /var/log/mistral/
grep site-docker.yml.sample executor.log | grep ansible-playbook

If there's an error during the playbook run, then it should look something like this...


2017-09-06 12:13:22.181 20608 ERROR mistral.executors.default_executor Command:
ansible-playbook -v /usr/share/ceph-ansible/site-docker.yml.sample --user tripleo-admin --become ...

If you don't see a playbook call like the above, then the Mistral tasks that set up the environment for a ceph-ansible run failed.

What does Mistral do to prepare the environment to run ceph-ansible?

A copy of the Mistral workbook which prepares the overcloud and undercloud to run ceph-ansible, and then runs it, is in:


/usr/share/tripleo-common/workbooks/ceph-ansible.yaml

The Mistral tasks do the following:

  • Configure the SSH key-pairs so the undercloud can run Ansible tasks on the overcloud nodes as the tripleo-admin user
  • Create a temporary fetch directory for ceph-ansible to use to copy configs between overcloud nodes
  • Build a temporary Ansible inventory in a file like /tmp/ansible-mistral-actionSYRh6Q/inventory.yaml
  • Set the Ansible fork count to the number of nodes (but not >100)
  • Run the ceph-ansible site-docker.yml.sample playbook
  • Clean up temporary files

To check the details of the Mistral tasks used by ceph-ansible, extract the workflow's UUID with the following:


WORKFLOW='tripleo.storage.v1.ceph-install'
UUID=$(mistral execution-list | grep $WORKFLOW | awk {'print $2'} | tail -1)

Then use the ID to examine each task:


for TASK_ID in $(mistral task-list $UUID | awk {'print $2'} | egrep -v 'ID|^$'); do
mistral task-get $TASK_ID
mistral task-get-result $TASK_ID | jq . | sed -e 's/\\n/\n/g' -e 's/\\"/"/g'
done

If you really need to update the workbook itself, you can modify a copy and upload it with the following, but please see if your problem can instead be solved by simply overriding the default values in a Heat environment file as per the documentation.


source ~/stackrc
cp /usr/share/tripleo-common/workbooks/ceph-ansible.yaml .
vi ceph-ansible.yaml
mistral workbook-update ceph-ansible.yaml

I already know ceph-ansible; how do I edit the files in group_vars?

Please don't. It will break the TripleO integration. Instead please use TripleO as usual, and override the default values in a Heat environment file like ceph.yaml which you then use -e to add to your openstack overcloud deploy command as described in the documentation.
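Such an environment file might look like the following hedged sketch; CephAnsibleExtraConfig is the tripleo-heat-templates parameter for passing extra variables through to ceph-ansible, and the value shown is illustrative:

parameter_defaults:
  CephAnsibleExtraConfig:
    journal_size: 512

The file is then passed with `-e ceph.yaml` on the `openstack overcloud deploy` command line.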

What changes does the TripleO ceph-ansible integration make to the files in ceph-ansible's group_vars?

None. Instead, YAQL within tripleo-heat-templates builds a Mistral environment which the ceph-ansible.yaml Mistral workbook may access when it calls ceph-ansible. The workbook then passes those parameters as JSON with the ansible-playbook command's --extra-vars option. To see what parameters were passed using this method, grep the executor.log as above to see the ceph-ansible playbook call. The sample file, site-docker.yml.sample, is used directly because that file is shipped by ceph-ansible; this allows TripleO to avoid maintaining its own fork of ceph-ansible.

What does a usual ceph-ansible playbook call look like when run by TripleO?


ansible-playbook -v /usr/share/ceph-ansible/site-docker.yml.sample
--user tripleo-admin
--become
--become-user root
--extra-vars
{"monitor_secret": "***",
"ceph_conf_overrides":
{"global": {"osd_pool_default_pg_num": 32,
"osd_pool_default_size": 1}},
"osd_scenario": "non-collocated",
"fetch_directory": "/tmp/file-mistral-action3_a1Cb",
"user_config": true,
"ceph_docker_image_tag": "tag-build-master-jewel-centos-7",
"ceph_release": "jewel",
"containerized_deployment": true,
"public_network": "192.168.24.0/24",
"copy_admin_key": false,
"journal_collocation": false,
"monitor_interface": "eth0",
"admin_secret": "***",
"raw_journal_devices": ["/dev/vdd", "/dev/vdd"],
"keys": [{"mon_cap": "allow r",
"osd_cap": "allow class-read object_prefix rbd_children, allow rwx pool=volumes, ... ],
"openstack_keys": [{"mon_cap": "allow r", ... ],
"generate_fsid": false,
"osd_objectstore": "filestore",
"monitor_address_block": "192.168.24.0/24",
"ntp_service_enabled": false,
"ceph_docker_image": "ceph/daemon",
"docker": true,
"fsid": "2d87a5e8-8e72-11e7-a223-003da9b9b610",
"journal_size": 256,
"cephfs_metadata": "manila_metadata",
"openstack_config": true,
"ceph_docker_registry": "docker.io",
"pools": [],
"cephfs_data": "manila_data",
"ceph_stable": true,
"devices": ["/dev/vdb", "/dev/vdc"],
"ceph_origin": "distro",
"openstack_pools": [
{"rule_name": "", "pg_num": 32, "name": "volumes"},
{"rule_name": "", "pg_num": 32, "name": "backups"},
{"rule_name": "", "pg_num": 32, "name": "vms"},
{"rule_name": "", "pg_num": 32, "name": "images"},
{"rule_name": "", "pg_num": 32, "name": "metrics"}],
"ip_version": "ipv4",
"ireallymeanit": "yes",
"cluster_network": "192.168.24.0/24",
"cephfs": "cephfs",
"raw_multi_journal": true
}
--forks 6
--ssh-common-args "-o StrictHostKeyChecking=no"
--ssh-extra-args "-o UserKnownHostsFile=/dev/null"
--inventory-file /tmp/ansible-mistral-actiontrguE1/inventory.yaml
--private-key /tmp/ansible-mistral-actiontrguE1/ssh_private_key
--skip-tags package-install,with_pkg

You can get an unformatted version of the above from a grep of /var/log/mistral/executor.log as described above.

How can I re-run only the ceph-ansible playbook?

Careful. This should not be done on a production deployment: if you re-run the Mistral workflow directly after getting the error posted under the first question, the Heat stack will not be updated, so Heat will still believe the OS::Mistral::ExternalResource resource has status CREATE_FAILED. If you are doing a practice deployment or development, then you can use Mistral's task-rerun, but this only works if the task has failed.

First get the Task ID


WORKFLOW='tripleo.storage.v1.ceph-install'
UUID=$(mistral execution-list | grep $WORKFLOW | awk {'print $2'} | tail -1)
mistral task-list $UUID | grep ERROR
For example:

(undercloud) [stack@undercloud workbooks]$ mistral task-list $UUID | grep ERROR
| 31257437-c877-40f8-872f-2576da89a8ea | ceph_install | tripleo.storage.v1.ceph-install | a5287f5c-f781-40cf-8fce-c56c21c52918 | ERROR | Failed to run action [act... | 2017-09-07 15:31:43 | 2017-09-07 15:31:46 |
(undercloud) [stack@undercloud workbooks]$
Then re-run the task

(undercloud) [stack@undercloud workbooks]$ mistral task-rerun 31257437-c877-40f8-872f-2576da89a8ea
+---------------+--------------------------------------+
| Field | Value |
+---------------+--------------------------------------+
| ID | 31257437-c877-40f8-872f-2576da89a8ea |
| Name | ceph_install |
| Workflow name | tripleo.storage.v1.ceph-install |
| Execution ID | a5287f5c-f781-40cf-8fce-c56c21c52918 |
| State | RUNNING |
| State info | None |
| Created at | 2017-09-07 15:31:43 |
| Updated at | 2017-09-08 16:24:04 |
+---------------+--------------------------------------+
(undercloud) [stack@undercloud workbooks]$

If you run the above and keep the following in another window:


tail -f /var/log/mistral/ceph-install-workflow.log
Then it's just like running `ansible-playbook site-docker.yml.sample ...`, but you don't need to pass all of the --extra-vars because the same Mistral environment built by Heat is available.

by John (noreply@blogger.com) at September 08, 2017 03:38 PM

OpenStack Superuser

How edge computing can take the pain out of daily life

SAN FRANCISCO — Some day soon, when you assemble a piece of Ikea furniture without tears you’ll have edge computing to thank.

That’s the vision of Mahadev Satyanarayanan, known as Satya, a computer science faculty member at Carnegie Mellon University. While the word “disruptive” gets passed around more than bitcoin in the Bay Area, he means it, expecting that edge computing will touch everything from troubleshooting equipment repair to helping people with Alzheimer’s live at home for longer.

With a halo of greying curls and a wry sense of humor, Satya is considered the godfather of edge computing for his 2009 paper “The Case for VM-based Cloudlets in Mobile Computing.” He showed a demo to the 200 or so technical folks assembled for OpenDev where a researcher put together an Ikea lamp with computer-assisted technology in the form of Google Glass. The headset showed the user a video of the assembly as he was putting it together, promptly rewarded him with “good job” when he got it right and caught him when he forgot a piece.

Satya outlines the perfect storm for edge computing at OpenDev.

“We have some distance to go before these are things we can count on, but edge is coming and it will disrupt,” he concluded.

It was an inspiring kick-off to the event held at DogPatch Studios. Sponsored by Ericsson, Intel and the OpenStack Foundation, OpenDev was devised as more of a workshop than a traditional conference. The roughly two-hour keynotes featured talks from AT&T, eBay, Verizon and Intel. (Superuser will have more on those with the videos, stat.)

The first day featured sessions more like working groups based on key topics including “Thin Control Plane,” “Deployment considerations” and “Zero-Touch Provisioning.” On day two, the afternoon slots are set aside for unworking sessions, where participants will gather insights from previous work and discuss how to move forward. (Stay tuned to Superuser for those takeaways.)

OpenStack Foundation executive director Jonathan Bryce likes to back up his predictions with numbers. And while the numbers surrounding edge computing can be a little fuzzy (will there be 10 or 50 billion devices connected to the internet in 2020?), the crystal ball gazers have one thing right: “estimates may be off, but the point is that all those devices will be generating data and generating work that needs to be analyzed and put into a system.”

Bryce outlined some important goals for the event:

  • Build relationships between practitioners working on edge and developers building infrastructure
  • Create and publish documentation
  • Develop use cases, architecture and deployment patterns
  • Build best practices around management, operations, security, etc
  • Devise triage needs for open projects

The outcome will be to “arrive at a vision for what this can be and where this is going to go.” More to come!

Cover Photo // CC BY NC

The post How edge computing can take the pain out of daily life appeared first on OpenStack Superuser.

by Nicole Martinelli at September 08, 2017 02:30 PM

Lee Yarwood

OpenStack - Skip level upgrades - PTG

PTG A short reminder that I’ll be chairing the skip-level upgrades room at next week’s OpenStack PTG in Denver. So far ~15 of you have shown interest in this track on the etherpad so I’m looking forward to some useful discussions over the two days. For now we still have available slots so if you do have suggestions please feel free to add them directly on the pad!

September 08, 2017 10:41 AM

Openstack Security Project

OpenStack Security Notes, and how they help you the Operator

For this post I will explain what OpenStack Security notes are, and how they benefit operators in securing an OpenStack Cloud.

OpenStack Security Notes (OSSNs) exist solely to notify operators of discovered risks, which are often not directly addressed by a code patch.

OSSNs can take the form of a deployment architecture recommendation, a configuration value or a file permission.

Consider the meme ‘If you do this, you’re going to have a bad time’ to get an idea of what OSSN’s are about.

Some examples of recent OSSN’s would be:

The end-to-end process of an OSSN starts when a member of the security project, a project core, or a VMT member marks a Launchpad bug by adding the ‘OpenStack Security Note’ group. An author will then assign themselves to the bug and commit to authoring the OSSN. Public notes may be worked on by anyone, whereas embargoed notes are only handled by the security project core members.

Once the author has a draft in place, they will submit a patch to the security-docs repo, where other members of the security project and cores from the related project of the original launchpad bug, can review the note content.

After the patch has received two +2 reviews from security project core members, and a +1 from a core within the concerned project, the OSSN is merged into the security-docs repository.

Once merged, the reviewed text will be posted to the OpenStack Wiki, and a GPG-signed email will be sent to the openstack & openstack-dev mailing lists.

The OpenStack Security Project welcomes anyone who wants to help author or review OSSNs. Security Notes are often a path to becoming a core member of the OpenStack Security project; OSSN authorship was how I personally found myself elected almost two years back.

Anyone new to the security project who offers to help author a Security Note will be given lots of support on creating their first OSSN from other Security Project members.

We also welcome feedback from operators on how valuable you find OSSN’s, and ways you feel may improve the process. After all, the process is there to benefit you the operator.

For anyone with an interest in OpenStack security, the OpenStack Security Project can be found on the IRC channel #openstack-security, and we meet weekly on #openstack-meeting-alt every Thursday at 17:00 UTC.

You can also email the security project on the OpenStack developer mailing list, by using a [security] tag in the subject line.

Luke Hinds (Security Project PTL)

September 08, 2017 12:00 AM

September 07, 2017

John Likes OpenStack

Make a NUMA-aware VM with virsh

Grégory showed me how he uses `virsh edit` on a VM to add something like the following:


<cpu mode='custom' match='exact' check='partial'>
<model fallback='allow'>SandyBridge</model>
<feature policy='force' name='vmx'/>
<numa>
<cell id='0' cpus='0-1' memory='4096000' unit='KiB'/>
<cell id='1' cpus='2-3' memory='4096000' unit='KiB'/>
</numa>
</cpu>

After that, `lstopo` will show the NUMA nodes you can use, e.g. if you want to start a process on your VM with `numactl` (see the sketch after the topology output below).


# lstopo-no-graphics
Machine (7999MB total)
NUMANode L#0 (P#0 3999MB)
Package L#0 + L3 L#0 (16MB) + L2 L#0 (4096KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
Package L#1 + L3 L#1 (16MB) + L2 L#1 (4096KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
NUMANode L#1 (P#1 4000MB)
Package L#2 + L3 L#2 (16MB) + L2 L#2 (4096KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
Package L#3 + L3 L#3 (16MB) + L2 L#3 (4096KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
Misc(MemoryModule)
HostBridge L#0
PCI 8086:7010
PCI 1013:00b8
GPU L#0 "card0"
GPU L#1 "controlD64"
3 x { PCI 1af4:1000 }
2 x { PCI 1af4:1001 }
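For example, a minimal sketch of binding a process to one of those nodes (the sleep is just a placeholder for a real workload):

# show the NUMA topology the guest sees
numactl --hardware
# run a command with CPUs and memory bound to NUMA node 0
numactl --cpunodebind=0 --membind=0 -- sleep 60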

by John (noreply@blogger.com) at September 07, 2017 03:32 PM

Mirantis

Is Kubernetes Repeating OpenStack’s Mistakes?

Multi-cloud is the new private. Kubernetes is the new OpenStack. But is there an opportunity to learn from the past and do it better this time around?

by Boris Renski at September 07, 2017 07:39 AM

September 06, 2017

RDO

Writing a SELinux policy from the ground up

SELinux is a mechanism that implements mandatory access controls in Linux systems. This article shows how to create a SELinux policy that confines a standard service:

  • Limit its network interfaces,
  • Restrict its system access, and
  • Protect its secrets.

Mandatory access control

By default, unconfined processes use discretionary access controls (DAC). A user has all the permissions over its objects, for example the owner of a log file can modify it or make it world readable.

In contrast, mandatory access control (MAC) enables more fine grained controls, for example it can restrict the owner of a log file to only append operations. Moreover, MAC can also be used to reduce the capability of a regular process, for example by denying debugging or networking capabilities.

This is great for system security, but it is also a powerful tool to control and better understand an application. Security policies reduce a service's attack surface and describe its system operations in depth.

Policy module files

A SELinux policy is composed of:

  • A type enforcement file (.te): describes the policy type and access control,
  • An interface file (.if): defines functions available to other policies,
  • A file context file (.fc): describes the path labels, and
  • A package spec file (.spec): describes how to build and install the policy.

The packaging is optional but highly recommended since it's a standard method to distribute and install new pieces on a system.

Under the hood, these files are written using macros processors:

  • A policy file (.pp) is generated using: make NAME=targeted -f "/usr/share/selinux/devel/Makefile"
  • An intermediary file (.cil) is generated using: /usr/libexec/selinux/hll/pp

Policy development workflow:

The first step is to get the services running in a confined domain. Then we define new labels to better protect the service. Finally the service is run in permissive mode to collect the access it needs.

As an example, we are going to create a security policy for the scheduler service of the Zuul program.

Confining a Service

To get the basic policy definitions, we use the sepolicy generate command to generate a bootstrap zuul-scheduler policy:

sepolicy generate --init /opt/rh/rh-python35/root/bin/zuul-scheduler

The --init argument tells the command to generate a service policy. Other types of policy could be generated, such as user application, inetd daemon or confined administrator.

The .te file contains:

  • A new zuul_scheduler_t domain,
  • A new zuul_scheduler_exec_t file label,
  • A domain transition from systemd to zuul_scheduler_t when the zuul_scheduler_exec_t is executed, and
  • Miscellaneous definitions such as the ability to read localization settings.

The .fc file contains regular expressions to match a file path with a label: /bin/zuul-scheduler is associated with zuul_scheduler_exec_t.

The .if file contains methods (macros) that enable role extension. For example, we could use the zuul_scheduler_admin method to authorize a staff role to administrate the zuul service. We won't use this file because the admin user (root) is unconfined by default and it doesn't need special permission to administrate the service.

To install the zuul-scheduler policy we can run the provided script:

$ sudo ./zuul_scheduler.sh
Building and Loading Policy
+ make -f /usr/share/selinux/devel/Makefile zuul_scheduler.pp
Creating targeted zuul_scheduler.pp policy package
+ /usr/sbin/semodule -i zuul_scheduler.pp

Restarting the service should show (using "ps Zax") that it is now running with the system_u:system_r:zuul_scheduler_t:s0 context instead of the system_u:system_r:unconfined_service_t:s0.

And looking at the audit.log, it should show many "avc: denied" errors because no permissions have yet been defined. Note that the service keeps running fine because this initial policy defines the zuul_scheduler_t domain as permissive.
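To inspect those denials along with an explanation of why each access was blocked, the audit log can be queried directly. A minimal sketch, assuming audit2why (shipped alongside audit2allow) is installed:

$ sudo ausearch -m avc -ts recent | audit2why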

Before authorizing the service's access, let's define the zuul resources.

Define the service resources

The service is trying to access /etc/opt/rh/rh-python35/zuul and /var/opt/rh/rh-python35/lib/zuul, which inherited the etc_t and var_lib_t labels. Instead of giving zuul_scheduler_t access to etc_t and var_lib_t, we will create new types. Moreover, the zuul-scheduler manages secret keys, which we can isolate from its general home directory, and it requires two TCP ports.

In the .fc file, define the new paths:

/var/opt/rh/rh-python35/lib/zuul/keys(/.*)?  gen_context(system_u:object_r:zuul_keys_t,s0)
/etc/opt/rh/rh-python35/zuul(/.*)?           gen_context(system_u:object_r:zuul_conf_t,s0)
/var/opt/rh/rh-python35/lib/zuul(/.*)?       gen_context(system_u:object_r:zuul_var_lib_t,s0)
/var/opt/rh/rh-python35/log/zuul(/.*)?       gen_context(system_u:object_r:zuul_log_t,s0)

In the .te file, declare the new types:

# System files
type zuul_conf_t;
files_type(zuul_conf_t)
type zuul_var_lib_t;
files_type(zuul_var_lib_t)
type zuul_log_t;
logging_log_file(zuul_log_t)

# Secret files
type zuul_keys_t;
files_type(zuul_keys_t)

# Network label
type zuul_gearman_port_t;
corenet_port(zuul_gearman_port_t)
type zuul_webapp_port_t;
corenet_port(zuul_webapp_port_t);

Note that the files_type() macro is important since it provides unconfined access to the new types. Without it, even the admin user could not access the files.

In the .spec file, add the new path and setup the tcp port labels:

%define relabel_files() \
restorecon -R /var/opt/rh/rh-python35/lib/zuul/keys
...

# In the %post section, add
semanage port -a -t zuul_gearman_port_t -p tcp 4730
semanage port -a -t zuul_webapp_port_t -p tcp 8001

# In the %postun section, add
for port in 4730 8001; do semanage port -d -p tcp $port; done

Rebuild and install the package:

sudo ./zuul_scheduler.sh && sudo rpm -ivh ./noarch/*.rpm

Check that the new types are installed using "ls -Z" and "semanage port -l":

$ ls -Zd /var/opt/rh/rh-python35/lib/zuul/keys/
drwx------. zuul zuul system_u:object_r:zuul_keys_t:s0 /var/opt/rh/rh-python35/lib/zuul/keys/
$ sudo semanage port -l | grep zuul
zuul_gearman_port_t            tcp      4730
zuul_webapp_port_t             tcp      8001

Update the policy

With the service resources now declared, let's restart the service and start using it to collect all the access it needs.

After a while, we can update the policy using "./zuul_scheduler.sh --update", which basically does "ausearch -m avc --raw | audit2allow -R". This collects all the denied permissions and generates type enforcement rules.

We can repeat these steps until all the required accesses are collected.

Here's what the resulting zuul-scheduler rules look like:

allow zuul_scheduler_t gerrit_port_t:tcp_socket name_connect;
allow zuul_scheduler_t mysqld_port_t:tcp_socket name_connect;
allow zuul_scheduler_t net_conf_t:file { getattr open read };
allow zuul_scheduler_t proc_t:file { getattr open read };
allow zuul_scheduler_t random_device_t:chr_file { open read };
allow zuul_scheduler_t zookeeper_client_port_t:tcp_socket name_connect;
allow zuul_scheduler_t zuul_conf_t:dir getattr;
allow zuul_scheduler_t zuul_conf_t:file { getattr open read };
allow zuul_scheduler_t zuul_exec_t:file getattr;
allow zuul_scheduler_t zuul_gearman_port_t:tcp_socket { name_bind name_connect };
allow zuul_scheduler_t zuul_keys_t:dir getattr;
allow zuul_scheduler_t zuul_keys_t:file { create getattr open read write };
allow zuul_scheduler_t zuul_log_t:file { append open };
allow zuul_scheduler_t zuul_var_lib_t:dir { add_name create remove_name write };
allow zuul_scheduler_t zuul_var_lib_t:file { create getattr open rename write };
allow zuul_scheduler_t zuul_webapp_port_t:tcp_socket name_bind;

Once the service is no longer being denied permissions, we can remove the "permissive zuul_scheduler_t;" declaration and deploy it in production. To avoid issues, the domain can be set to permissive at first using:

$ sudo semanage permissive -a zuul_scheduler_t

Too long, didn't read

In short, to confine a service:

  • Use sepolicy generate
  • Declare the service's resources
  • Install the policy and restart the service
  • Use audit2allow

Here are some useful documents:

by tristanC at September 06, 2017 05:37 PM

OpenStack Superuser

Not just a road: The computing history behind “Pike”

The OpenStack community recently released Pike, the 16th release of OpenStack. As always, the OpenStack Foundation team has been busy fielding questions about the release from press, analysts and OpenStack supporters. Most are about the latest features, but on occasion a brave soul has asked, “So…uh…what’s a ‘pike’?”

“The Massachusetts Turnpike. It’s a road near Boston.”

“So this release is named after a road?”

Stop the presses! This isn’t just any road. It’s the road, and it has a greater significance to Boston’s technology history than you might think.

To get the story from a local’s point of view, I called up my father, who we’ll call “Billy from Boston.” Billy was born and raised in the Boston suburbs. He was in his teens when they began building the Massachusetts Turnpike in 1955. At this time Wang Laboratories, headquartered in nearby Cambridge, was taking off. “I had a Wang Calculator while Bill Gates was still wearing diapers!” claims my father. While this claim can be refuted by some timeline checking and basic math, the point is well taken: Boston had a high-tech industry well before much of the US.

Industries take people and there were two major roads that made “the people” factor possible in Boston’s tech story: The Massachusetts Turnpike and Route 128. Route 128 runs north-south in a semicircle, connecting the western Boston suburbs. Route 128 was called “America’s Technology Highway” because of all the labs and computing companies along the route. The Turnpike, which, if you want to sound like a local should only be called the “Mass Pike,” stretched the length of the state and brought all east-west bound traffic into Route 128 and Boston.

For the next two decades, science, research and computing in Boston grew––wave to college freshman Richard Stallman at Harvard in 1970 as we zoom by on our history adventure––laying many of the foundations for modern computing. “The Mass Pike made access to all the high tech jobs on Route 128 way easier. You drove in during the day, out at night,” says Billy, “and it worked. It resulted in the great economic success and high tech industry of Massachusetts.”

The economic success Billy from Boston is referring to is called the “Massachusetts Miracle”: starting in 1975, the unemployment rate in the state fell from 12 percent to 3 percent, and the economy kept thriving through the ’80s, thanks to Boston’s tech industry.

So Pike isn’t exactly “just a road.” It’s a transportation milestone for a state, a puzzle piece in computing history and a key to unlocking the economic prowess of technology.

Each OpenStack development cycle has a code name proposed and chosen by the community; this one is an homage to the long and winding road of tech.

 

Images // CC BY NC

The post Not just a road: The computing history behind “Pike” appeared first on OpenStack Superuser.

by Anne Bertucio at September 06, 2017 04:01 PM

OpenStack in Production

Scheduled snapshots

While most of the machines on the CERN cloud are configured using Puppet with state stored in external databases or file stores, there are a few machines where this has been difficult, especially for legacy applications.

Doing a regular snapshot of these machines would be a way of protecting against failure scenarios such as hypervisor failure or disk corruptions.

This could always be scripted by the project administrator using the standard functions in the openstack client, but that would involve setting up the schedules and the credentials outside the cloud, and it would require the appropriate skills from the project administrators. Since it is a common request, the CERN cloud investigated how this could be done as part of the standard cloud offering.

The approach that we have taken uses the Mistral project to execute the appropriate workflows at a scheduled time. The CERN cloud is running a mixture of OpenStack Newton and Ocata but we used the Mistral Pike release in order to have the latest set of fixes such as in the cron triggers. With the RDO packages coming out in the same week as the upstream release, this avoided doing an upgrade later.

Mistral has a set of terms which explain the different parts of a workflow (https://docs.openstack.org/mistral/latest/terminology).

The approach needed several steps:
  • Mistral tasks to define the steps
  • Mistral workflows to provide the order to perform the steps in
  • Mistral cron triggers to execute the steps on schedule

Mistral Workflows

The Mistral workflows consist of a set of tasks and a process which decides which task to execute next based on different branch criteria such as success of a previous task or the value of some cloud properties.

Workflows can be private to the project, shared or public. By making these scheduled snapshot workflows public, the cloud administrators can improve the tasks incrementally and the cloud projects will receive the latest version of the workflow next time they execute them. With the CERN gitlab based continuous integration environment, the workflows are centrally maintained and then pushed to the cloud when the test suites have completed successfully.
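For example, publishing a workflow definition so that every project can execute it might look like the following, a hedged sketch assuming the python-mistralclient CLI and an illustrative file name:

$ mistral workflow-create --public instance_snapshot.yaml
$ mistral workflow-update instance_snapshot.yaml   # later CI pushes update it in place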

The following Mistral workflows were defined

instance_snapshot

Virtual machines can be snapshotted so that a copy of the virtual machine is saved and can be used for recovery or cloning in the future. The instance_snapshot workflow performs this operation both for virtual machines booted from volume and for those booted locally.

The workflow takes the following parameters:

  • instance: The name of the instance to be snapshotted. (Mandatory)
  • pattern: The name of the snapshot to store. The text ={0}= is replaced by the instance name and the text ={1}= is replaced by the date in the format YYYYMMDDHHMM. (Default: {0}_snapshot_{1})
  • max_snapshots: The number of snapshots to keep. Older snapshots are cleaned from the store when new ones are created. (Default: 0, i.e. keep all)
  • wait: Only complete the workflow when the steps have been completed and the snapshot is stored in the image storage. (Default: false)
  • instance_stop: Shut the instance down before snapshotting and boot it up afterwards. (Default: false, i.e. do not stop the instance)
  • to_addr_success: E-mail address to send the report to if the workflow is successful. (Default: null, i.e. no mail sent)
  • to_addr_error: E-mail address to send the report to if the workflow failed. (Default: null, i.e. no mail sent)

The steps for this workflow are described in detail in the YAML/YAQL files at https://gitlab.cern.ch/cloud-infrastructure/mistral-workflows.

The operation is very fast with Ceph based boot-from-volumes since the snapshot is done within Ceph. It can however take up to a minute for locally booted VMs while the hypervisor is ensuring the complete disk contents are available. The VM is resumed and the locally booted snapshot is then sent to Glance in the background.

The high level steps are

  • Identify the server
  • Stop the instance if requested by instance_stop
  • If the VM is locally booted:
    • Snapshot the instance
    • Clean up the oldest image snapshot if over max_snapshots
  • If the VM is booted from volume:
    • Snapshot the volume
    • Clean up the oldest volume snapshot if over max_snapshots
  • Start the instance if requested by instance_stop
  • If there is an error and to_addr_error is set:
    • Send an e-mail to to_addr_error
  • If there is no error and to_addr_success is set:
    • Send an e-mail to to_addr_success
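The workflow can also be run on demand, passing its parameters as a JSON dictionary. A minimal sketch, assuming the python-mistralclient CLI and the instance name used later in this post:

$ mistral execution-create instance_snapshot '{"instance": "timbfvlinux143", "max_snapshots": 5}'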

restore_clone_snapshot
For applications which are not highly available, a common configuration is using a LanDB alias to a particular VM. In the event of a failure, the VM can be cloned from a snapshot and the LanDB alias updated to reflect the new endpoint location for the service. This workflow will create a volume if the source instance is booted from volume. The workflow is called restore_clone_snapshot.

The source instance still needs to be defined, since information such as the properties, flavor and availability zone is not included in the snapshot; these are propagated from the source by default.

The workflow takes the following parameters:

  • instance: The name of the instance from which the snapshot will be cloned. (Mandatory)
  • date: The date of the snapshot to clone (either YYYYMMDD or YYYYMMDDHHMM). (Mandatory)
  • pattern: The name of the snapshot to clone. The text ={0}= is replaced by the instance name and the text ={1}= is replaced by the date. (Default: {0}_snapshot_{1})
  • clone_name: The name of the new instance to be created. (Mandatory)
  • avz_name: The availability zone to create the clone in. (Default: same as the source instance)
  • flavor: The flavor for the cloned instance. (Default: same as the source instance)
  • meta: The properties to copy to the new instance. (Default: all properties are copied from the source[1])
  • wait: Only complete the workflow when the steps have been completed and the cloned VM is running. (Default: false)
  • to_addr_success: E-mail address to send the report to if the workflow is successful. (Default: null, i.e. no mail sent)
  • to_addr_error: E-mail address to send the report to if the workflow failed. (Default: null, i.e. no mail sent)

Thus, cloning the machine timbfvlinux143 to timbfvclone143 requires running the workflow with the parameters

{"instance": "timbfvlinux143", "clone_name": "timbfvclone143", "date": "20170830"}

This results in

  • A new volume is created from the snapshot timbfvlinux143_snapshot_20170830
  • A new VM called timbfvclone143 is created, booted from the new volume

An instance clone can be run for VMs which are booted from volume even when the hypervisor is not running. A machine can then be recovered from its current state using the following procedure:

  • Take an instance snapshot of the original machine
  • Clone the instance from that snapshot (using today's date)
  • If DNS aliases are used, update the alias to point to the new instance name

For Linux guests, the rename of the hostname to the clone name occurs as the machine is booted. In the CERN environment, this took a few minutes to create the new virtual machine and then up to 10 minutes to wait for the DNS refresh.

For Windows guests, it may be necessary to refresh the Active Directory information given the change of hostname.
restore_inplace_snapshot

In the event of an issue such as a bad upgrade, the administrator may wish to roll back to the last snapshot. This can be done using the restore_inplace_snapshot workflow.

This operation works for locally booted machines, maintains the IP and MAC address but cannot be used if the hypervisor is down. It does not currently work for boot from volume until the revert to snapshot (available in Pike from https://specs.openstack.org/openstack/cinder-specs/specs/pike/cinder-volume-revert-by-snapshot.html) is in production.

The workflow takes the following parameters:

  • instance: The name of the instance to be restored from the snapshot. (Mandatory)
  • date: The date of the snapshot to restore from (either YYYYMMDD or YYYYMMDDHHMM). (Mandatory)
  • pattern: The name of the snapshot to restore from. The text ={0}= is replaced by the instance name and the text ={1}= is replaced by the date. (Default: {0}_snapshot_{1})
  • wait: Only complete the workflow when the steps have been completed and the restored VM is running. (Default: false)
  • to_addr_success: E-mail address to send the report to if the workflow is successful. (Default: null, i.e. no mail sent)
  • to_addr_error: E-mail address to send the report to if the workflow failed. (Default: null, i.e. no mail sent)





Mistral Cron Triggers
Mistral has another nice feature: it is able to run a workflow at regular intervals. Compared to standard Unix cron, Mistral cron triggers use Keystone trusts to save the user token when the trigger is enabled. Thus, the execution is able to run without needing credentials such as a password or a valid Kerberos token.
A cron trigger can be created via Horizon or the CLI with the following fields:
  • Name: The name of the cron trigger. (Example: Nightly Snapshot)
  • Workflow ID: The name or UUID of the workflow. (Example: instance_snapshot)
  • Params: A JSON dictionary of the parameters. (Example: {"instance": "timbfvlinux143", "max_snapshots": 5, "to_addr_error": "theadmin@cern.ch"})
  • Pattern: A cron schedule pattern according to http://en.wikipedia.org/wiki/Cron. (Example: 0 5 * * *, i.e. run daily at 5 a.m.)

This will then execute the instance snapshot daily at 5 a.m., sending a mail to theadmin@cern.ch in the event of a snapshot failure. The five most recent snapshots will be kept.
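The same trigger can be created from the command line. A hedged sketch, assuming the python-mistralclient CLI (the trigger name is illustrative):

$ mistral cron-trigger-create nightly_snapshot instance_snapshot \
    '{"instance": "timbfvlinux143", "max_snapshots": 5, "to_addr_error": "theadmin@cern.ch"}' \
    --pattern "0 5 * * *"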

Mistral Executions
When Mistral runs a workflow, it provides details of the steps executed and the timestamps for start and end, along with the results. Each step can be inspected individually as part of debugging and root-cause analysis in the event of failures.
The Horizon interface gives an easy way to select the failing tasks. Some tasks may be reported as ‘error’ but then have subsequent actions which succeed, so an error step may be a normal part of a successful task execution, such as falling back to a default when no value can be found.


References

Credits
  • Jose Castro Leon from the CERN IT cloud team did the implementation of the Mistral project and the workflows described.




[1] Except for a CERN specific one called landb-alias for a DNS alias

by Tim Bell (noreply@blogger.com) at September 06, 2017 07:59 AM

September 05, 2017

OpenStack Superuser

OpenStack operator spotlight: Adobe Advertising Cloud

We’re spotlighting users and operators who are on the front lines deploying OpenStack in their organizations to drive business success. These users are taking risks, contributing back to the community and working to secure the success of their organization in today’s software-defined economy. We want to hear from you, too: get in touch with editorATopenstack.org to share your story.

Here we catch up with Adobe’s cloud platform manager Joseph Sandoval. If you’re interested in hearing more, Adobe will be keynoting at the upcoming Sydney Summit.

Describe how are you using OpenStack. What kinds of applications or workloads are you currently running on OpenStack?

Adobe Advertising Cloud is running OpenStack in production across six data centers in the US, Europe and Asia. We’re running a high-volume, real-time bidding application that requires low latency and high throughput to meet our growing customers’ demands.

What business results, at a high level, have you seen from using OpenStack? What have been the biggest benefits to your organization as a result of using OpenStack? How are you measuring the impact?

We’re seeing higher performance with a purpose-built architecture that meets our application needs. As our business continues to grow, there are demands for increased compute. OpenStack allows us to scale in repeatable modules which scale up without scaling the team out. We tracked our application performance metrics running on OpenStack Icehouse and on public cloud, and both performed equally. With our recent upgrade to Mitaka, we have measured a massive performance gain (three times faster) compared to our public-cloud instances.

What’s a challenge that you’ve faced within your organization regarding OpenStack and how did you overcome it?

Our engineering community understands public cloud and the benefits it brings, but had concerns about going to private cloud. By evangelizing the platform and providing tooling that helped them get productive quickly on a purpose-built private cloud, engineers experienced the benefits and agility, and now build features that are optimized for a consistent compute platform.

 

Superuser wants to hear more from operators like you, get in touch at editorATopenstack.org

Cover Photo // CC BY NC

The post OpenStack operator spotlight: Adobe Advertising Cloud appeared first on OpenStack Superuser.

by Nicole Martinelli at September 05, 2017 03:07 PM

John Likes OpenStack

Trick to test external ceph clusters using only tripleo-quickstart

TripleO can stand up a Ceph cluster as part of an overcloud. However, if all you have is a tripleo-quickstart env and you want to test an overcloud feature which uses an external Ceph cluster, then you can have quickstart stand up two Heat stacks: one to make a separate Ceph cluster and the other to stand up an overcloud which uses that Ceph cluster.

Deploy stand alone ceph cluster

I use deploy-ceph-only.sh with ceph-only.yaml, based on Giulio's example. I add `--stack ceph` to `openstack overcloud deploy ...` so that the Heat stack is not called "overcloud". You cannot rename a Heat stack.

After deploying the ceph cluster, get the monitor node's IP (CephExternalMonHost), use `ceph auth list` to get the secret key for the client.openstack keyring (CephClientKey), and look at ceph.conf to get the FSID (CephClusterFSID), so that overcloud-ceph-ansible-external.yaml may be updated accordingly (see the sketch below).
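A minimal sketch of the corresponding section of that environment file, with illustrative placeholder values (the key is fake):

parameter_defaults:
  CephExternalMonHost: 192.168.24.6
  CephClientKey: 'AQExampleKeyOnlyNotReal=='
  CephClusterFSID: 2d87a5e8-8e72-11e7-a223-003da9b9b610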

Deploy an overcloud to use external ceph

I use deploy-ext-ceph.sh with overcloud-ceph-ansible-external.yaml. This uses changes in tripleo and ceph-ansible which are unmerged (at this time of writing).

Results


(undercloud) [stack@undercloud ceph-ansible]$ openstack server list
+--------------------------------------+-------------------------+--------+------------------------+----------------+--------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------------------------+--------+------------------------+----------------+--------------+
| 28d57de8-8354-43e0-8d4e-46de33ea4672 | overcloud-controller-0 | BUILD | ctlplane=192.168.24.8 | overcloud-full | control |
| 298943dd-b3d2-4302-93fd-c45d8375ff16 | overcloud-novacompute-0 | BUILD | ctlplane=192.168.24.21 | overcloud-full | compute |
| f4d15186-775c-4cab-ae5d-c3fd48ecfccf | ceph-cephstorage-2 | ACTIVE | ctlplane=192.168.24.18 | overcloud-full | ceph-storage |
| 24da4c0f-f945-4489-bdeb-eb9b2cf70bc0 | ceph-cephstorage-0 | ACTIVE | ctlplane=192.168.24.9 | overcloud-full | ceph-storage |
| 248eacd5-e0ae-47b2-a3a9-2b4f3d0dfa6c | ceph-cephstorage-1 | ACTIVE | ctlplane=192.168.24.15 | overcloud-full | ceph-storage |
| 5af9a2ae-3492-4874-b8ab-2de2f8530b60 | ceph-controller-0 | ACTIVE | ctlplane=192.168.24.6 | overcloud-full | control |
+--------------------------------------+-------------------------+--------+------------------------+----------------+--------------+
(undercloud) [stack@undercloud ceph-ansible]$ openstack stack list
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+--------------+
| ID | Stack Name | Project | Stack Status | Creation Time | Updated Time |
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+--------------+
| c016b71d-0c73-468d-bed5-baf26d88ea23 | overcloud | d8e1f76b116f467cbe9e60b6c91c80b3 | CREATE_IN_PROGRESS | 2017-09-05T14:30:02Z | None |
| 91370b74-41bd-4923-bacb-c24d98ca148f | ceph | d8e1f76b116f467cbe9e60b6c91c80b3 | CREATE_COMPLETE | 2017-09-05T14:11:04Z | None |
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+--------------+
(undercloud) [stack@undercloud ceph-ansible]$

I had set up my virtual hardware by running `quickstart.sh -e @myconfigfile.yml` with myconfigfile.yml.

In this scenario I used puppet-ceph to deploy the ceph cluster and ceph-ansible to deploy the ceph-client, which is the reverse of a more popular scenario. All four combinations are possible, though the puppet-ceph method will be deprecated.

by John (noreply@blogger.com) at September 05, 2017 02:44 PM

September 01, 2017

OpenStack Blog

OpenStack Developer Mailing List Digest August 26th – September 1st

Successbot Says!

  • ttx: Pike is released! [21]
  • ttx: Release automation made Pike release process the smoothest ever [22]

 

PTG Planning

  • Monasca (virtual)[1]
  • Vitrage(virtual)[2]
  • General Info & Price Hike[3]
    • Price Goes up to $150 USD
    • Happy Hour from 5-6pm on Tuesday
    • Event Feedback during Lunch on Thursday
  • Queens Goal Tempest Plugin Split [4]
  • Denver City Guide [5]
    • Please add to and peruse
  • ETSI NFV workshop[6][7]

 

Summaries

  • TC Update [8]
  • Placement/Resource Providers Update 34 [9]

 

Updates

  • Pike is official! [10]
  • Outreachy Call for Mentors and Sponsors! [11]
    • More info here[12]
    • Next round runs December 5th to March 5th
  • Libraries published to pypi with YYYY.X.Z versions[13]
    • During Kilo when the neutron vendor decomposition happened, the release version was set to 2015.1.0 for basically all of the networking projects
    • Main issue is that networking-hyperv == 2015.1.0 is currently on Pypi and whenever someone upgrades through pip, it ‘upgrades’ to 2015.1.0 because it’s considered the latest version
    • Should that version be unpublished?
    • Three options[14]
      • Unpublish- simplest, but goes against policy of pypi never unpublishing
        • +1 from tonyb, made a rough list of others to unpublish that need to be confirmed with PTL’s before passing to infra to unpublish[15]
      • Rename- a bunch of work for downstreams, but cleaner than unpublishing
      • Reversion- Start new versions at 3000 or something, but very hacky and ugly
    • dhellmann, ttx, and fungi think that deleting it from pypi is the simplest route though not the typically recommended way of handling things
  • Removing Screen from devstack-RSN[16]
  • Work to make devstack only have a single execution mode (the same between automated QA and local) is almost done!
    • Want to merge before PTG
    • Test your devstack plugins against this patch before it gets merged
    • Patch [17]
  • Release Countdown for week R+1 and R+2[18]
    • Still have release trailing deliverables to take care of
    • Need to post their Pike final release before the cycle-trailing release deadline (September 14th)
    • Join #openstack-release if you have questions
    • ttx passes the RelMgmt mantle to smcginnis

 

Pike Retrospectives

  • Nova [19]
  • QA [20]

 

[1] https://etherpad.openstack.org/p/monasca_queens_midcycle

[2] https://etherpad.openstack.org/p/vitrage-ptg-queens

[3] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121637.html

[4] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121715.html

[5] https://wiki.openstack.org/wiki/PTG/Queens/CityGuide

[6] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121494.html

[7] https://etherpad.openstack.org/p/etsi-nfv-openstack-gathering-denver

[8] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121711.html

[9] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121734.html

[10]  http://lists.openstack.org/pipermail/openstack-dev/2017-August/121647.html

[11] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121656.html

[12] https://wiki.openstack.org/wiki/Outreachy/Mentors

[13] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121598.html

[14] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121602.html

[15] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121623.html

[16] http://lists.openstack.org/pipermail/openstack-dev/2017-August/121681.html

[17] https://review.openstack.org/#/c/499186/

[18] http://lists.openstack.org/pipermail/openstack-dev/2017-September/121706.html

[19] https://etherpad.openstack.org/p/nova-pike-retrospective

[20] https://etherpad.openstack.org/p/qa-pike-retrospective

[21] http://eavesdrop.openstack.org/irclogs/%23openstack-dev/%23openstack-dev.2017-08-30.log.html#t2017-08-30T16:07:24

[22] http://eavesdrop.openstack.org/irclogs/%23openstack-dev/%23openstack-dev.2017-08-30.log.html#t2017-08-30T16:08:07

by Kendall Nelson at September 01, 2017 09:53 PM

OpenStack Superuser

Getting started with OpenStack? Check out this cheat sheet

Following the long tradition of one-page cheat sheets for common UNIX applications (remember EMACS, Bash, regular expressions?), Aviv Lichtigstein, head of product evangelism at Loom Systems, recently made a nice-looking one for OpenStack, covering common commands for the Keystone, Glance, Nova, Neutron and Cinder services.
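
To give a sense of the territory such a sheet covers, here are a few representative commands of that kind (illustrative examples only, not taken from the PDF itself):

openstack token issue     # Keystone: request an authentication token
openstack image list      # Glance: list available images
openstack server list     # Nova: list running instances
openstack network list    # Neutron: list networks
openstack volume list     # Cinder: list volumes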

“From my own experience, having a proper cheat sheet will make your life so much easier,” he says over on the Loom Systems blog. “When I just started out with software engineering (back in 2004), I had my own cheat sheet, too.”

You can download the PDF here.

 

Hat tip/Reddit

Superuser is always interested in community content; get in touch: editorATopenstack.org

The post Getting started with OpenStack? Check out this cheat sheet appeared first on OpenStack Superuser.

by Superuser at September 01, 2017 03:33 PM

RDO

RDO Pike released

The RDO community is pleased to announce the general availability of the RDO build for OpenStack Pike for RPM-based distributions, CentOS Linux 7 and Red Hat Enterprise Linux. RDO is suitable for building private, public, and hybrid clouds. Pike is the 16th release from the OpenStack project, which is the work of more than 2300 contributors from around the world (source).

The release is making its way out to the CentOS mirror network, and should be on your favorite mirror site momentarily.

The RDO community project curates, packages, builds, tests and maintains a complete OpenStack component set for RHEL and CentOS Linux and is a member of the CentOS Cloud Infrastructure SIG. The Cloud Infrastructure SIG focuses on delivering a great user experience for CentOS Linux users looking to build and maintain their own on-premise, public or hybrid clouds.

All work on RDO, and on the downstream release, Red Hat OpenStack Platform, is 100% open source, with all code changes going upstream first.

New and Improved

Interesting things in the Pike release include:

Added/Updated packages

The following packages and services were added or updated in this release:

  • Kuryr and Kuryr-kubernetes: an integration between OpenStack and Kubernetes networking.
  • Senlin: a clustering service for OpenStack clouds.
  • Shade: a simple client library for interacting with OpenStack clouds, used by Ansible among others.
  • python-pankoclient: a client library for the event storage and REST API for Ceilometer.
  • python-scciclient: a ServerView Common Command Interface client library for the FUJITSU iRMC S4 integrated Remote Management Controller.

Other additions include:

Python Libraries

  • os-xenapi
  • ovsdbapp (deps)
  • python-daiquiri (deps)
  • python-deprecation (deps)
  • python-exabgp
  • python-json-logger (deps)
  • python-netmiko (deps)
  • python-os-traits
  • python-paunch
  • python-scciclient
  • python-scrypt (deps)
  • python-sphinxcontrib-actdiag (deps) (pending)
  • python-sphinxcontrib-websupport (deps)
  • python-stestr (deps)
  • python-subunit2sql (deps)
  • python-sushy
  • shade (SDK)
  • update XStatic packages (update)
  • update crudini to 0.9 (deps) (update)
  • upgrade liberasurecode and pyeclib libraries to 1.5.0 (update) (deps)

Tempest Plugins

  • python-barbican-tests-tempest
  • python-keystone-tests-tempest
  • python-kuryr-tests-tempest
  • python-patrole-tests-tempest
  • python-vmware-nsx-tests-tempest
  • python-watcher-tests-tempest

Puppet-Modules

  • puppet-murano
  • puppet-veritas_hyperscale
  • puppet-vitrage

OpenStack Projects

  • kuryr
  • kuryr-kubernetes
  • openstack-glare
  • openstack-panko
  • openstack-senlin

OpenStack Clients

  • mistral-lib
  • python-glareclient
  • python-pankoclient
  • python-senlinclient

Contributors

During the Pike cycle, we started the EasyFix initiative, which has resulted in several new people joining our ranks. These include:

  • Christopher Brown
  • Anthony Chow
  • T. Nicole Williams
  • Ricardo Arguello

But we wouldn't want to overlook anyone. Thank you to all 172 contributors who participated in producing this release:

Aditya Prakash Vaja, Alan Bishop, Alan Pevec, Alex Schultz, Alexander Stafeyev, Alfredo Moralejo, Andrii Kroshchenko, Anil, Antoni Segura Puimedon, Arie Bregman, Assaf Muller, Ben Nemec, Bernard Cafarelli, Bogdan Dobrelya, Brent Eagles, Brian Haley, Carlos Gonçalves, Chandan Kumar, Christian Schwede, Christopher Brown, Damien Ciabrini, Dan Radez, Daniel Alvarez, Daniel Farrell, Daniel Mellado, David Moreau Simard, Derek Higgins, Doug Hellmann, Dougal Matthews, Edu Alcañiz, Eduardo Gonzalez, Elise Gafford, Emilien Macchi, Eric Harney, Eyal, Feng Pan, Frederic Lepied, Frederic Lepied, Garth Mollett, Gaël Chamoulaud, Giulio Fidente, Gorka Eguileor, Hanxi Liu, Harry Rybacki, Honza Pokorny, Ian Main, Igor Yozhikov, Ihar Hrachyshka, Jakub Libosvar, Jakub Ruzicka, Janki, Jason E. Rist, Jason Joyce, Javier Peña, Jeffrey Zhang, Jeremy Liu, Jiří Stránský, Johan Guldmyr, John Eckersberg, John Fulton, John R. Dennis, Jon Schlueter, Juan Antonio Osorio, Juan Badia Payno, Julie Pichon, Julien Danjou, Karim Boumedhel, Koki Sanagi, Lars Kellogg-Stedman, Lee Yarwood, Leif Madsen, Lon Hohberger, Lucas Alvares Gomes, Luigi Toscano, Luis Tomás, Luke Hinds, Martin André, Martin Kopec, Martin Mágr, Matt Young, Matthias Runge, Michal Pryc, Michele Baldessari, Mike Burns, Mike Fedosin, Mohammed Naser, Oliver Walsh, Parag Nemade, Paul Belanger, Petr Kovar, Pradeep Kilambi, Rabi Mishra, Radomir Dopieralski, Raoul Scarazzini, Ricardo Arguello, Ricardo Noriega, Rob Crittenden, Russell Bryant, Ryan Brady, Ryan Hallisey, Sarath Kumar, Spyros Trigazis, Stephen Finucane, Steve Baker, Steve Gordon, Steven Hardy, Suraj Narwade, Sven Anderson, T. Nichole Williams, Telles Nóbrega, Terry Wilson, Thierry Vignaud, Thomas Hervé, Thomas Morin, Tim Rozet, Tom Barron, Tony Breeds, Tristan Cacqueray, afazekas, danpawlik, dnyanmpawar, hamzy, inarotzk, j-zimnowoda, kamleshp, marios, mdbooth, michaelhenkel, mkolesni, numansiddique, pawarsandeepu, prateek1192, ratailor, shreshtha90, vakwetu, vtas-hyperscale-ci, yrobla, zhangguoqing, Vladislav Odintsov, Xin Wu, XueFengLiu, Yatin Karel, Yedidyah Bar David, adriano petrich, bcrochet, changzhi, diana, djipko, dprince, dtantsur, eggmaster, eglynn, elmiko, flaper87, gpocentek, gregswift, hguemar, jason guiditta, jprovaznik, mangelajo, marcosflobo, morsik, nmagnezi, sahid, sileht, slagle, trown, vkmc, wes hayutin, xbezdick, zaitcev, and zaneb.

Getting Started

There are three ways to get started with RDO.

  • To spin up a proof-of-concept cloud quickly and on limited hardware, try an All-In-One Packstack installation. You can run RDO on a single node to get a feel for how it works (see the sketch after this list).
  • For a production deployment of RDO, use the TripleO Quickstart and you'll be running a production cloud in short order.
  • Finally, if you want to try out OpenStack, but don't have the time or hardware to run it yourself, visit TryStack, where you can use a free public OpenStack instance, running RDO packages, to experiment with the OpenStack management interface and API, launch instances, configure networks, and generally familiarize yourself with OpenStack. (TryStack is not, at this time, running Pike, although it is running RDO.)
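
For orientation, an all-in-one Packstack install on CentOS 7 looks roughly like this; the steps follow the RDO quickstart, but verify the exact package names against the current documentation:

sudo yum install -y centos-release-openstack-pike   # enable the RDO Pike repository
sudo yum update -y
sudo yum install -y openstack-packstack
sudo packstack --allinone                           # deploy a single-node proof-of-concept cloud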

Getting Help

The RDO Project participates in a Q&A service at ask.openstack.org. For more developer-oriented content, we recommend joining the rdo-list mailing list. Remember to post a brief introduction about yourself and your RDO story. You can also find extensive documentation on the RDO docs site.

The #rdo channel on Freenode IRC is also an excellent place to find help and give help.

We also welcome comments and requests on the CentOS mailing lists and the CentOS and TripleO IRC channels (#centos, #centos-devel, and #tripleo on irc.freenode.net); however, we have a more focused audience in the RDO venues.

Getting Involved

To get involved in the OpenStack RPM packaging effort, see the RDO community pages and the CentOS Cloud SIG page. See also the RDO packaging documentation.

Join us in #rdo on the Freenode IRC network, and follow us at @RDOCommunity on Twitter. If you prefer Facebook, we're there too, and also Google+.

by Rich Bowen at September 01, 2017 03:21 PM

James Page

OpenStack Pike for Ubuntu 16.04 LTS

Hi All,

The Ubuntu OpenStack team at Canonical is pleased to announce the general availability of OpenStack Pike for Ubuntu 16.04 LTS via the Ubuntu Cloud Archive. Details of the Pike release can be found in the OpenStack release notes for Pike.

Ubuntu 16.04 LTS

You can enable the Ubuntu Cloud Archive pocket for OpenStack Pike on Ubuntu 16.04 LTS installations by running the following commands:

sudo add-apt-repository cloud-archive:pike
sudo apt update
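
Once the archive is enabled, Pike packages install in the usual way; for example (nova-compute chosen purely as an illustration):

sudo apt install nova-compute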

The Ubuntu Cloud Archive for Pike includes updates for:

aodh, barbican, ceilometer, ceph (12.2.0 Luminous), cinder, congress, designate, designate-dashboard, dpdk (17.05.1), glance, gnocchi, heat, horizon, ironic, libvirt (3.6.0), keystone, magnum, manila, manila-ui, mistral, murano, murano-dashboard, networking-ovn, networking-sfc, neutron, neutron-dynamic-routing, neutron-fwaas, neutron-lbaas, neutron-lbaas-dashboard, nova, nova-lxd, openstack-trove, openvswitch (2.8.0 pre-release), panko, qemu (2.10), sahara, sahara-dashboard, senlin, swift, trove-dashboard, watcher and zaqar

Open vSwitch will be updated to the 2.8.0 release as soon as it’s available.

For a full list of packages and versions, please refer to the Pike UCA version tracker.

Branch Package Builds

If you would like to try out the latest updates to the git branches, we deliver continuously integrated packages on each upstream commit via the following PPAs:

sudo add-apt-repository ppa:openstack-ubuntu-testing/newton
sudo add-apt-repository ppa:openstack-ubuntu-testing/ocata
sudo add-apt-repository ppa:openstack-ubuntu-testing/pike

Reporting Bugs

If you have any issues, please report bugs using the ‘ubuntu-bug’ tool to ensure that bugs get logged in the right place in Launchpad:

sudo ubuntu-bug nova-conductor

Thanks to everyone who has contributed to OpenStack Pike, both upstream and downstream!

Have fun and see you all for Queens!

Regards,

James

(on behalf of the Ubuntu OpenStack team)


by JavaCruft at September 01, 2017 03:11 PM

Openstack Security Project

OpenStack Security Notes, and how they help you the Operator

In this post I will explain what OpenStack Security Notes are and how they benefit operators in securing an OpenStack cloud.

OpenStack Security Notes (OSSNs) exist solely to notify operators of discovered risks that are often not directly addressed by a code patch.

An OSSN can take the form of a deployment architecture recommendation, a configuration value, or a file permission.

Consider the meme ‘If you do this, you’re going to have a bad time’ to get an idea of what OSSNs are about.

Some examples of recent OSSNs would be:

The end-to-end process of an OSSN starts when a member of the security project, a project core, or a VMT member marks a Launchpad bug by adding the ‘OpenStack Security Note’ group. An author then assigns themselves to the bug and commits to authoring the OSSN. Public notes may be worked on by anyone, whereas embargoed notes are handled only by security project core members.

Once the author has a draft in place, they submit a patch to the security-docs repo, where other members of the security project, along with cores from the project the original Launchpad bug was filed against, can review the note’s content.
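
For contributors new to the workflow, submitting a draft looks roughly like the following sketch; the repository URL, branch name, and note number are placeholders, so check the security project documentation for the real values:

git clone https://git.openstack.org/openstack/security-docs   # placeholder URL
cd security-docs
# write the note, e.g. security-notes/OSSN-0000 (placeholder number)
git checkout -b add-ossn-0000
git add security-notes/OSSN-0000
git commit -m "Add OSSN-0000"
git review   # submit the patch to Gerrit for review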

After the patch has received two +2 reviews from security project core members, and a +1 from a core within the concerned project, the OSSN is merged into the security-docs repository.

Once merged, the reviewed text is posted to the OpenStack wiki, and a GPG-signed email is sent to the openstack and openstack-dev mailing lists.

The OpenStack Security Project welcomes anyone who wants to help author or review OSSNs. Security Notes are often a path to election as a core member of the OpenStack Security Project; OSSN authorship was how I personally found myself elected almost two years ago.

Anyone new to the security project who offers to help author a Security Note will be given lots of support from other Security Project members on creating their first OSSN.

We also welcome feedback from operators on how valuable you find OSSNs and on ways you feel the process could be improved. After all, the process is there to benefit you, the operator.

For anyone with an interest in OpenStack security, the OpenStack Security Project can be found in the IRC channel #openstack-security, and we meet weekly on #openstack-meeting-alt every Thursday at 17:00 UTC.

You can also email the security project via the OpenStack developer mailing list by using a [security] tag in the subject line.

Luke Hinds (Security Project PTL)

September 01, 2017 12:00 AM

August 31, 2017

RDO

Video interviews at the Denver PTG (Sign up now!)

TL;DR: Sign up here for the video interviews at the PTG in Denver next month.

Earlier this year, at the PTG in Atlanta, I did video interviews with some of the Red Hat engineers who were there.

You can see these videos on the RDO YouTube channel.

Or you can see the teaser video here:

This year, I'll be expanding that to everyone - not just Red Hat - to emphasize the awesome cooperation and collaboration that happens across projects, and across companies.

If you'll be at the PTG, please consider signing up to talk to me about your project. I'll be conducting interviews starting on Tuesday morning, and you can sign up here.

Please see the "planning for your interview" tab of that spreadsheet for the answers to all of your questions about the interviews. Or contact me directly at rbowen AT red hat DOT com if you have more questions.

by Rich Bowen at August 31, 2017 05:29 PM

About

Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.
