August 15, 2018

Mirantis

How to deploy Spinnaker on Kubernetes: a quicker and dirtier guide

Earlier this year we gave you an easy way to deploy Spinnaker to Kubernetes. Now we're giving you two ways that are even easier.

by Guest Post at August 15, 2018 11:55 PM

OpenStack Superuser

Buffer your cloud knowledge: Upcoming training and webinars

Lifelong learning is a must for tech professionals. While we hope you’re getting in some well-deserved rest and relaxation, here are our top picks for free or low-cost upcoming back-to-school opportunities.

SUSE Expert Days 2018

Held in more than 80 cities worldwide, the SUSE Expert Days tour offers a free day of technical discussions, presentations and demos featuring SUSE engineers and experts, including Brent Schroeder, CTO of SUSE, and Alan Clark, chairman of the board of the OpenStack Foundation. This year’s theme: Open. Redefined.
Events kick off in late August for Europe and Latin America and September 12 in North America. Full list of events here.

Accelerating digital transformation with containers, fast data and hybrid cloud

In this joint webinar, HCL and Mesosphere will present strategies for successful digital transformation by automating operations for data services, microservices and applications consistently and securely across hybrid cloud infrastructures, including data center, edge and multi-cloud environments.
September 5, 12:30 p.m. ET. Register here.

Navigating the challenges and benefits of the cloud

Organized by the Security Industry Association (SIA), this webinar will show the benefits of cloud-based access control systems designed to allow users to control and administer access control capabilities from anywhere, at any time and on any device. Cloud-based systems are a unique niche fit for small- to medium-sized customers that have little to no network infrastructure or staff to incorporate a “use anywhere” access control system. Led by Mitchell Kane, the president of Vanderbilt Industries, the webinar is especially recommended for end users, integrators and consultants.
September 6, 7:00 PM – 8:00 PM CEST. Register here.

Insights from the 451 Alliance: Who Controls the Cloud at Your Organization?

As businesses progress through the stages of cloud maturity, they adapt to utility pricing models, implement governance and procurement processes, adopt strategies for controlling cloud spending, and seek out specific areas of platform expertise and certification. At the same time, decisions around cloud spending have gone from developer experiments to major organizational choices made at the top levels of IT management.

This webinar is led by Liam Eagle, research manager at 451 Research, who’ll examine these and other insights from the 451 Alliance’s cloud, hosting and managed services study, a comprehensive quarterly survey of more than a thousand tech professionals worldwide that uncovers organizational objectives and strategic trends in the cloud.
August 22, 11:30 a.m. – 12:30 p.m. Register here.

RabbitMQ vs Kafka

Messaging is at the core of many architectures and two giants in the messaging space are RabbitMQ and Apache Kafka. This webinar takes a look at RabbitMQ and Kafka within the context of real-time event-driven architectures. Each technology has made very different decisions regarding every aspect of its design, each with strengths and weaknesses, enabling different architectural patterns. It’ll be led by guest speaker Jack Vanlightly, a software architect based in Barcelona specializing in event-driven architectures, data processing pipelines and data stores, both relational and non-relational. This is part two in the series; you can check out the first part here.
August 30, 5:30 p.m. CEST. Register here.

Building sustainable, open communities

Learn how The Linux Foundation focuses on building sustainable, open collaboration communities designed for longevity. While open source communities can start and grow with very distinct identities, there are certain governance and design principles that will set up a community for long-term success. Hosted by Michael Dolan, VP of strategic programs supporting open source projects and legal programs at The Linux Foundation. He’s set up and launched dozens of open source and open standards projects covering technology segments including networking, virtualization, cloud, blockchain, internet of things, big data and analytics, security, containers, storage and embedded devices.
September 6, 10:00 a.m. PST. Register here.

OpenStack and OpenInfra Days

From Rome to Hanoi, there are more than 13 OpenStack or OpenInfra Days still on the calendar for 2018. OpenStack Days are a great way to get plugged into the community near you. You’ll find a list of upcoming OpenStack events — including meetup groups, hackathons and OpenStack Days — at https://www.openstack.org/community/events/. And if you’re interested in launching an event in your corner of the world, here’s where to find more information.

 

Photo // CC BY NC

The post Buffer your cloud knowledge: Upcoming training and webinars appeared first on Superuser.

by Nicole Martinelli at August 15, 2018 02:04 PM

Jiří Stránský

Upgrading Ceph and OKD (OpenShift Origin) with TripleO

In OpenStack’s Rocky release, TripleO is transitioning towards a method of deployment we call config-download. Basically, instead of using Heat to deploy the overcloud end-to-end, we’ll be using Heat only to manage the hardware resources and Ansible tasks for individual composable services. Execution of software configuration management (which is Ansible on the top level) will no longer go through Heat; it will be done directly. If you want to know the details, I recommend watching James Slagle’s TripleO Deep Dive about config-download.

The transition towards config-download also affects services/components which we deploy by embedding external installers, like Ceph or OKD (aka OpenShift Origin). For example, previously we deployed Ceph via a Heat resource, which created a Mistral workflow, which in turn executed ceph-ansible. This is no longer possible with config-download, so we had to adapt the solution for external installers.

Deployment architecture

Before talking about upgrades, it is important to understand how we deploy services with external installers when using config-download.

Deployment using external installers with config-download was developed during OpenStack’s Queens release cycle for the purpose of installing Kubernetes and OpenShift Origin. In the Rocky release, installation of the Ceph and Skydive services transitioned to using the same method (shout out to Giulio Fidente and Sylvain Afchain who ported those services to the new method).

The general solution is described in my earlier Kubernetes in TripleO blog post. I recommend being somewhat familiar with that before reading on.

Upgrades architecture

In OpenStack, and by extension in TripleO, we distinguish between minor updates and major upgrades, but with external installers the distinction is sometimes blurred. The solution described here was applied to both updates and upgrades. We still make a distinction between updates and upgrades with external installers in TripleO (e.g. by having two different CLI commands), but the architecture is the same for both. I will only mention upgrades in the text below for the sake of brevity, but everything described applies for updates too.

It was more or less given that we would use Ansible tasks for upgrades with external installers, same as we already use Ansible tasks for their deployment. However, two possible approaches suggested themselves. Option A was to execute a service’s upgrade tasks and then immediately its deploy tasks, favoring a service upgrade procedure which reuses a significant part of that service’s deployment procedure. Option B was to execute only upgrade tasks, giving more separation between the deployment and upgrade procedures, at the risk of producing repetitive code in the service templates.

We went with option A (upgrade procedure includes re-execution of deploy tasks). The upgrade tasks in this architecture are mainly meant to set variables which then affect what the deploy tasks do (e.g. select a different Ansible playbook to run). Note that with this solution, it is still possible to fully skip the deploy tasks if needed (using variables and when conditions), but it optimizes for maximum reuse between upgrade and deployment procedures.

Upgrades with external installers

Implementation for Ceph and OKD

With the focus on reuse of deploy tasks, and both ceph-ansible and openshift-ansible being suitable for such an approach, implementing upgrades via the architecture described above didn’t require much code.

Feel free to skim through the Ceph upgrade and OKD upgrade patches to get an idea of how the upgrades were implemented.

CLI and workflow

In the CLI, the external installer upgrades got a new command: openstack overcloud external-upgrade run. (For minor version updates it is openstack overcloud external-update run; service template authors may decide if they want to distinguish between updates and upgrades, or if they want to run the same code.)

The command is a part of the normal upgrade workflow, and should be run between openstack overcloud upgrade prepare and openstack overcloud upgrade converge. It is recommended to execute it after openstack overcloud upgrade run, which corresponds to the place within the upgrade workflow where we have been upgrading Ceph.

After introducing the new external-upgrade run command, we removed the ceph-upgrade run command. This means that Ceph is no longer a special citizen in the TripleO upgrade procedure, and uses generic commands and hooks available to any other service.

Separate execution of external installers

There might be multiple services utilizing external installers within a single TripleO-managed environment, and the operator might wish to upgrade them separately. openstack overcloud external-upgrade run would upgrade all of them at the same time.

We started adding Ansible tags to the external upgrade and deploy tasks, allowing us to select which installers we want to run. This way openstack overcloud external-upgrade run --tags ceph would only run ceph-ansible, and similarly openstack overcloud external-upgrade run --tags openshift would only run openshift-ansible. This also allows fine-tuning the point in the upgrade workflow where the operator wants to run a particular external installer upgrade (e.g. before or after the upgrade of natively managed TripleO services).

A full upgrade workflow making use of these possibilities could then perhaps look like this:

openstack overcloud upgrade prepare <args>
openstack overcloud external-upgrade run --tags openshift
openstack overcloud upgrade run --roles Controller
openstack overcloud upgrade run --roles Compute
openstack overcloud external-upgrade run --tags ceph
openstack overcloud upgrade converge <args>

by Jiří Stránský at August 15, 2018 12:00 AM

August 14, 2018

OpenStack Superuser

How to navigate the OpenStack Summit Berlin agenda

I love data. With the OpenStack Summit Berlin agenda going live this morning, I decided to take a look at some of the math behind November’s event. More than 100 sessions and workshops covering 35 open source projects over nine tracks—that’s a lot to cover in three days. It makes it even more challenging to build an onsite schedule while still providing yourself a chance to navigate the hallway track and collaborative Forum sessions—which will be added to the schedule in the upcoming weeks.

So who exactly can you collaborate with in Berlin? Represented by Summit speakers alone—there are 256 individuals from 193 companies and 45 countries that you may run into during the hallway track.

Before I start, I want to say a big thank you to the programming committee members who worked very hard creating the Summit schedule. It’s not an easy task—taking over 750 submissions from over 500 companies and turning them into content that fits within 100 speaking slots.

Now, to take full advantage of the incredible talks that are planned for November, I wanted to share a few tips that I find helpful when putting my schedule together.

Start with the 101

Whether it’s your first Summit or you’re new to a project and want to get involved, there are a lot of sessions and workshops for you. You can either search for sessions that are tagged as 101 or you can filter the schedule for sessions marked as beginner. If there’s a particular project where you want to begin contributing, project on-boarding sessions will be added soon.

If this is your first Summit, I would recommend planning to attend some of the networking opportunities that are planned, including the opening night Open Infrastructure Marketplace Mixer.

Find the users

If there is anything I love more than data, it’s meeting new users and catching up with those I know. This makes the case study tag one of my most frequently used filters. If you are like me and enjoy learning how open infrastructure is being used in production, the Berlin Summit will not disappoint. From BMW sharing its CI/CD strategy with Zuul to Adobe Advertising Cloud sharing its OpenStack upgrade strategy, there are a lot of users sharing their open infrastructure use cases and strategies.

There are a few new case studies that have really caught my eye and have already landed on my personal schedule:

Filter by use case

Whether you’re interested in edge computing, CI/CD or artificial intelligence (AI), the Summit tracks provide a way to filter the sessions to find operators, developers and ecosystem companies pursuing that use case.

Sessions are counted by track based on the number of submissions that are received during the call for presentations (CFP) process. For the Berlin Summit, here is the track breakdown by number of sessions:

Search the relevant open source projects

It was not a typo earlier when I mentioned that there are over 35 open source projects covered by the sessions at the Summit. Whether you’re trying to find one of the 45 Kubernetes sessions or a TensorFlow session on AI, the project-specific tags enable you to meet the developers and operators behind these projects.

Here are the top 10 open source projects and the number of sessions you can explore for each project:

Now, it’s time to start building your schedule. The official Summit mobile app will be available in the upcoming weeks, but you can still build your personal schedule in the web browser. Stay tuned on Superuser as we will feature top sessions by use case in the upcoming weeks and a few content themes spread across all nine tracks.

Photo // CC BY NC

The post How to navigate the OpenStack Summit Berlin agenda appeared first on Superuser.

by Allison Price at August 14, 2018 03:46 PM

Chris Dent

TC Report 18-33

Dead, Gone or Stable

Last week saw plenty of discussion about how to deal with projects for which no PTL was found by election or acclaim. That discussion continued this week, but also stimulated discussion on the differences between a project being dead, gone from OpenStack, or stable.

Mixed in with that are some dribbles of a theme which has become increasingly common of late:

As a group, the TC has very mixed feelings on these issues. On the one hand everyone would like to keep projects healthy and within OpenStack, where possible. On the other hand, it is important that people who are upstream contributors stop over-committing to compensate for lack of commitment from a downstream that benefits hugely from their labor. Letting projects "die" or become unofficial is one way to clearly signal that there are resourcing problems.

In fact, doing so with both Searchlight and Freezer raised some volunteers to help out as PTLs. However, both of those projects have been languishing for many months. How many second chances do projects get?

IRC for PTLs

Within all the discussion about the health of projects, there was some discussion of whether it was appropriate to expect PTLs to have and use IRC nicks. As the character of the pool of potential PTLs evolves, it might not fit. See the log for a bit more nuance on the issues.

TC Elections Soon

Six seats on the TC will be up for election. The nomination period will start at the end of this month. If you're considering running and have any questions, please feel free to ask me.

by Chris Dent at August 14, 2018 12:33 PM

August 13, 2018

OpenStack Superuser

Building and recognizing a more active user community

Upstream code is the building block for open source projects, but it requires more contributions and collaboration to truly make a community successful. Whether you’re a Nova developer submitting patches for the upcoming release or an operator running OpenStack in production who takes the time to share your feedback to improve the project, your contributions to the community are vital for community growth and health.

For several years, the OpenStack Technical Committee (TC) has been recognizing the developers who contributed upstream with the Active Technical Contributor (ATC) criteria that guaranteed discounted Summit registration as a way to recognize those contributions to the community. Two years ago, the Active User Contributor recognition process was introduced by the OpenStack User Committee (UC). In December 2016, criteria were finalized to acknowledge the contributions that operators and users make to the OpenStack project.


What are the AUC Criteria?

At the Dublin PTG in February, the UC met to discuss the AUC program and determined that the existing list needed to be updated in order to more accurately capture the ways that operators were contributing. Recently, the UC voted to expand the list of contributions that qualify someone as an AUC, up from the previous eight criteria:

  • Organizers of Official OpenStack User Groups
  • Active members and contributors to functional teams and/or working groups (currently also manually calculated for WGs not using IRC)
  • Moderators of any of the operators official meet-up sessions
  • Contributors to any repository under the UC governance
  • Track chairs for OpenStack Summits
  • Contributors to Superuser (articles, interviews, user stories, etc.)
  • Active moderators on ask.openstack.org
  • User survey participants who completed a deployment survey for the most recent cycle
  • OpenStack Days organizers
  • Special Interest Group (SIG) Members nominated by SIG leaders
  • Active Women of OpenStack participants
  • Active Diversity Working Group participants

What does being an AUC mean?

As an AUC, you can run for open UC positions and can vote in the elections. Nominations are currently open and there are two available seats, so if you qualify as an AUC and are interested in representing fellow operators, nominate yourself on the UC mailing list. Elections will be held August 20 – August 24 and AUCs will be eligible to vote for their elected officials. 

AUCs also receive a discounted $300 USD ticket for the OpenStack Summit as well as the coveted AUC insignia on your badge.

If you’re unsure if you qualify as an AUC, reach out on the UC mailing list or join one of the weekly meetings on IRC. Check out the upcoming meeting schedule on the wiki.

The post Building and recognizing a more active user community appeared first on Superuser.

by Amy Marrich at August 13, 2018 02:23 PM

August 10, 2018

Jay Pipes

Project Mulligan’s Architecture

This is the second and final part of a series describing Project Mulligan, the OpenStack Redo. If you missed the first part, go read it. It is about the changes I’d make to the community and mission of OpenStack had I my way with a theoretical reboot of the project.

In this part, I’ll be discussing what I’d change about the architecture, APIs and technology choices of OpenStack in the new world of Project Mulligan.

A word on programming language

For those of you frothing like hyenas waiting to get into a religious battle over programming languages, you will need to put your tongues back into your skulls. I’m afraid you’re going to be supremely disappointed by this post, since I quite deliberately did not want to get into a Golang vs. Python vs. Rust vs. Brainfuck debate.

Might as well get used to that feeling of disappointment now before going further. Or, whatever, just stop reading if that’s what you were hoping for.

For the record, shitty software can be written in any programming language. Likewise, excellent software can be written in most programming languages. I personally have a love/hate relationship with all three programming languages that I code with on a weekly basis (Python, Golang, C++).

Regardless of which programming language might be chosen for Project Mulligan, I doubt this love/hate relationship would change. At the end of the day, what is most important is the communication mechanisms between components in a distributed system and how various types of data are persisted. Programming language matters for neither of those things. There are bindings in every programming language for communicating over HTTP and for persisting and retrieving data from various purpose-built data stores.

Redoing the architecture

There are four areas of system design that I’d like to discuss in relation to OpenStack and Project Mulligan:

  • component layout: the topology of the system’s components and how components communicate with each other
  • dependencies: the technology choices a project makes regarding dependencies and implementation
  • pluggability: the degree to which a project tolerates flexibility of underlying implementation
  • extensibility: the degree to which a project enables being used in ways that the project was not originally intended

Before getting to Project Mulligan’s architecture, let’s first discuss the architecture of OpenStack v1.

OpenStack’s architecture

OpenStack has no coherent architecture. Period.

Within OpenStack, different projects are designed in different ways, with the individual contributors on a project team making decisions about how that project is structured, what technologies should be dependencies, and how opinionated the implementation should be (or conversely, how pluggable it should be made).

Some projects, like Swift, are highly opinionated in their implementation and design. Other projects, like Neutron, go out of their way to enable extensibility and avoid making any sort of choice when it comes to underlying implementation or hardware support.

Swift

Taking a further look at Swift, we see it is designed using a router-like topology with a top-level Proxy server routing incoming client requests along to various stand-alone daemon components fulfilling the needs of different parts of the request (object storage/retrieval, container and account metadata lookup, reaper and auditor workers, etc).

From a technology dependency point of view, Swift has very few.

There’s no message queue. Instead, the (minimal) communication between certain Swift internal service daemons is done via HTTP calls, and most Swift service daemons push incoming work requests on to internal simple in-memory queues for processing.

There is no centralized database either. Instead, Swift replicates SQLite database files between container/account servers, with the files copied between nodes in the Swift system via shell-out calls to the rsync command-line tool.

Finally, the object servers require filesystems with xattr support in order to do their work. While Swift can work with the OpenStack Identity service (Keystone), it has no interdependency with any OpenStack service, nor does it utilize any shared OpenStack library code (the OpenStack Oslo project).

Swift’s authors built some minor pluggability into how some of their Python backend classes were written; however, pluggability is mostly not a priority in Swift. I’m not really aware of anyone implementing out-of-tree implementations for any of the Swift server code. Perhaps the Swift authors might comment on this blog entry and let me know if that is incorrect or outdated information.

Swift is not extensible in the sense that the core Swift software does not enable the scope of Swift’s API to extend beyond its core mission of being a highly available distributed object storage system.

Swift’s API is a pure data plane API. It is not a control plane API, meaning its API is not intended to perform execution of actions against some controllable resources. Instead, Swift’s API is all about writing and reading data from one or more objects and defining/managing the containers/accounts associated with those objects.

Neutron

Looking at Neutron, we see the opposite of Swift. From a component layout perspective, we see a top-level API server that issues RPC calls to a set of agent workers via a traditional brokered message bus.

There is a centralized database that stores object model information; however, much of Neutron’s design is predicated on a plugin system that does the actual work of wiring up an L2 network. Layer 3 networking in Neutron always felt like it was bolted on and not really part of Neutron’s natural worldview. 1

Neutron’s list of dependencies is broad and is influenced by the hardware and vendor technology a deployer chooses for actually configuring networks and ports. Its use of the common OpenStack Oslo Python libraries is extensive, as is its dependency on a raft of other Python libraries. It communicates directly (and therefore has a direct relationship) with the OpenStack Nova and Designate projects, and it depends on the OpenStack Keystone project for identity, authentication and authorization information.

Nearly everything about Neutron is both pluggable and extensible. Everything seems to be a driver or plugin or API extension 2. While there is a top-level API server, in many deployments it does little more than forward requests on to a proprietary vendor driver that does “the real work”, with the driver or plugin (hopefully 3) saving some information about the work it did in Neutron’s database.

The “modular L2 plugin” (ML2) system is a framework for allowing mechanism drivers that live outside of Neutron’s source tree to perform the work of creating and plugging layer-2 ports, networks, and subnets. Within the Neutron source tree, there are some base mechanism drivers that enable software-defined networking using various technologies like Linux bridges or OpenVSwitch.

This means that every vendor offering software-defined networking functionality has its own ML2 plugin (or more than one plugin) along with separate drivers for its own proprietary technology that essentially translate the Neutron worldview into the vendor’s proprietary system’s worldview (and back again). An example of this is the Cisco Neutron ML2 plugin, which has mechanism drivers that speak the various Cisco-flavored networking protocols.

One nice relatively recent development in Neutron is the separation of many “common” API constructs and machinery into the neutron-lib repository. This at least goes part way towards reducing some duplicative code and allowing out of tree source repositories to import a much smaller footprint than the entire Neutron source tree.

On the topic of extensibility in Neutron’s API, I’d like to point to Neutron’s own documentation on its API extensions, which states the following:

The purpose of Networking API v2.0 extensions is to:
– Introduce new features in the API without requiring a version change.
– Introduce vendor-specific niche functionality.
– Act as a proving ground for experimental functionalities that might be included in a future version of the API.

I’ll discuss a bit more in the section below on Project Mulligan’s API, but I’m not a fan of API extensibility as seen in Neutron. It essentially encourages a Wild West mentality where there is no consistency between API resources, no coherent connection between various resources exposed by the API, and a proliferation of vendor-centric implementation details leaking out of the API itself. Neutron’s API is not the only OpenStack project API to succumb to these problems, though. Not by a long shot.

Nova

Still other projects, like the OpenStack Nova project, have dependencies on traditional databases and brokered message queues, a component layout that was designed to address a specific scale problem but causes many other problems, and a confusing blend of highly extensible, sometimes extensible, and extensible-in-name-only approaches to underlying technology choices.

Nova’s component layout features a top-level API server, similar to Neutron and Swift. From there, however, there’s virtually nothing the same about Nova. Nova uses a system of “cells” which are designed as scaling domains for the technology underpinning Nova’s component communications: a brokered message queue for communicating between various system services and a traditional relational database system for storing and retrieving state.

Take old skool RPC with all the operational pitfalls and headaches of using RabbitMQ and AMQP. Then tack on eight years of abusing the database for more than relational data and horrible database schema inefficiencies. Finally, overcomplicate both database connectivity and data migration because of a couple operators’ poor choices and inexperienced developers early in the development of Nova. And you’ve got the ball of spaghetti that Nova’s component layout and technology choice currently entails.

As for Nova’s extensibility, it varies. There is a virt driver interface that allows different hypervisors (and Ironic for baremetal) to perform the on-compute-node actions needed to start, stop, pause and terminate a VM instance. There are some out-of-tree virt drivers that ostensibly try to keep up with the virt driver interface, but it isn’t technically public so it’s mostly a “use with caution and good luck with that” affair. The scheduler component in Nova used to be ludicrously extensible, with support for all sorts of out-of-tree filters, weighers, even whole replacement scheduler drivers.

That sucked, since there was no way to change anything without breaking the world. So now, we’ve removed a good deal of the extensibility in the scheduler in order to return some level of sanity there.

Similarly, we used to give operators the ability to whole-hog replace entire subsystems like the networking driver with an out-of-tree driver of their own making. We no longer allow this kind of madness. Neutron is the only supported networking driver at this time. Same for volume management. Cinder is the one and only supported volume manager.

OK, so what does Project Mulligan look like?

For Project Mulligan, we’ll be throwing out pretty much everything and starting over. So, out with the chaos and inconsistency. In with sensibility, simplicity and far fewer plug and extension points.

Now that Project Mulligan’s scope has been healthily trimmed, we can focus on only the components and requirements for a simple machine provisioning system.

Project Mulligan will be composed of a number of components with well-defined scopes:

  • A smart client (mulligan-client) that can itself communicate with all Mulligan service daemons without needing to go through any central API or proxy service
  • An account service (mulligan-iam) that provides identity and account management functionality
  • A metadata service (mulligan-meta) that provides lookup and translation services for common object names, tags and key/value items
  • An inventory management service (mulligan-resource) that is responsible for tracking the resources discovered in an environment and providing a simple placement and reservation engine
  • A control service (mulligan-control) that can take an incoming request to provision or decommission a machine and send a work request to a task queue
  • Executors (mulligan-executor) that read task requests from a queue and perform those tasks, such as booting a machine or configuring a machine’s networking

What about the technologies that Project Mulligan will be dependent upon?

I’m envisioning the following dependencies:

  • etcd for storing various pieces of information and being the notification/watch engine for all services in Project Mulligan.
  • A relational database server for storing highly relational data that needs to be efficiently queried in aggregate
  • NATS for a simple task-based queue service that the executors will pop tasks from. Or maybe Gearman instead of NATS, since it’s been around longer and Monty won’t blow his shit about yet another not-invented-here simple queueing application being hosted by the CNCF…

From a bird’s-eye view, the topology of Project Mulligan and its components and dependent low-level services looks like this:

Project Mulligan component layout

All Project Mulligan services and workers will be entirely stateless. All state will be persisted to either etcd or a RDBMS.

The account service

In order to provide multi-tenancy from the start, Project Mulligan clearly needs a service that will provide account and identity management support. If OpenStack Keystone were a separate, stand-alone service that had a gRPC API interface, we’d use that. But since it’s not, we’ll have to write our own from scratch, potentially with an adapter that understands the useful part of the OpenStack Keystone REST API — you know, users, projects and roles (assignments). The rest of the Keystone API, including the hand-rolled token API, would be cut out. Application credentials would stay but would just be a type of account instead of a separate thing that lived under the user resource.

We’ll use JSON Web Tokens (JWT) as the payload format for any authorization tokens that need to be passed between Project Mulligan service endpoints (or outside to external endpoints).
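
As a rough illustration of that token flow, here is a minimal sketch using the PyJWT library; the claim names and the shared signing key are my own assumptions, not part of any settled Project Mulligan design:

# Minimal sketch of issuing and verifying a service-to-service token with PyJWT.
# The claim names ("sub", "project", "roles") and the HS256 shared key are
# illustrative assumptions only.
import datetime

import jwt  # pip install PyJWT

SIGNING_KEY = "replace-with-a-real-secret-or-an-asymmetric-key"

def issue_token(user_id, project_id, roles):
    payload = {
        "sub": user_id,
        "project": project_id,
        "roles": roles,
        "exp": datetime.datetime.utcnow() + datetime.timedelta(minutes=15),
    }
    return jwt.encode(payload, SIGNING_KEY, algorithm="HS256")

def verify_token(token):
    # Raises jwt.InvalidTokenError on a bad signature or an expired token.
    return jwt.decode(token, SIGNING_KEY, algorithms=["HS256"])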

The metadata service

All objects in the Project Mulligan system will be identified by a UUID. However, as we all know, remembering UUIDs and passing them around manually is a pain in the ass. So, we need a way of associating a unique name and URI-clean slug with an object in the system. The metadata service will provide a simple UUID -> name or slug lookup (and reverse lookup) service. Data stores for individual services, whether those data stores be SQL databases or not, will not store any name or slug information. Only UUID keys for objects that the service controls. The metadata service will act as a cache for this kind of name/slug identifying information. It will be backed by an etcd data store.
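
To make the idea concrete, here is a small sketch of such a lookup, assuming the python-etcd3 client and a key layout that I made up for illustration:

# Illustrative UUID <-> slug lookup backed by etcd, using the python-etcd3 client.
# The key prefixes ("/meta/by-uuid/", "/meta/by-slug/") are assumptions.
import etcd3

class MetadataStore:
    def __init__(self, host="127.0.0.1", port=2379):
        self.client = etcd3.client(host=host, port=port)

    def register(self, uuid, slug):
        # Write both directions so either identifier can be used for lookup.
        self.client.put(f"/meta/by-uuid/{uuid}", slug)
        self.client.put(f"/meta/by-slug/{slug}", uuid)

    def slug_for(self, uuid):
        value, _meta = self.client.get(f"/meta/by-uuid/{uuid}")
        return value.decode() if value else None

    def uuid_for(self, slug):
        value, _meta = self.client.get(f"/meta/by-slug/{slug}")
        return value.decode() if value else None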

The second purpose of the metadata service will be to store and retrieve two other kinds of information decorating objects in the system:

  • tags: these are simple strings that are settable on any object in the system and are not validated or protected in any way
  • metadefs: these are specific named attributes that may be defined by the system or an end user, protected against changes and given a validation schema

If you’re familiar with the OpenStack Glance Metadefs concept, that’s pretty much what the second item is.

The resource service

How resources are tracked, claimed and consumed in a software system is a critical concept to get right, and get right from the start. If there’s one thing I’ve learned working on the scheduler and placement services in OpenStack, it’s that the consistency and accuracy of the data that goes into an inventory management system dictates the quality of all systems built on top of that data, including capacity management, billing, reservations, scheduling/placement and quota management.

You need to accurately and efficiently represent both the structure of the resources and providers of resources within the system as well as the process by which those resources are consumed by users of the system.

You cannot build a quality resource management system on top of a free-for-all land-grab where each vendor essentially redefines its notion of what a resource is. It was tried in OpenStack Nova. It failed and is still causing headaches today. Vendors have shifted their focus from the scorched earth they’ve left behind in OpenStack to the new fertile hype-hunting grounds of Kubernetes, with Intel and NVIDIA pushing for more and more extensibility in how they track resource inventory and consumption. With that extensibility comes a complete lack of interoperability and, unless the Kubernetes community is very careful, a quagmire of unmaintainable code.

Project Mulligan’s resource management service will be responsible for storing inventory information for resource providers in the system. The structure and relationship of providers to each other will also be the purview of the resource management service. In other words, the resource management service will understand groups of providers, trees of providers, relative distances between providers, etc.

The process by which consumers of resources request those resources is called a resource claim. I’ve come to realize over the last four years or so that the system that handles the consumption and claiming of resources must be able to understand the temporal aspects of the resource request as well as the quantitative aspects of the request. What this means is that I believe the system that doles out resources to consumers of those resources needs to implement a resource reservation system itself. After all, a resource reservation system is simply a resource allocation system with an extra temporal dimension. Storing this temporal data along with inventory and usage information makes the most natural sense from a systems design perspective.
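
As a toy sketch of that idea (a reservation is just an allocation with a time window attached), consider the following; all names and the in-memory representation are purely illustrative:

# Every claim carries a [start, end) window; capacity checks only sum the claims
# that overlap the requested window, so an "immediate" allocation is simply a
# reservation whose window starts now.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Claim:
    consumer: str
    resource_class: str  # e.g. "VCPU", "MEMORY_MB"
    amount: int
    start: datetime
    end: datetime

def used_in_window(claims, resource_class, start, end):
    """Sum claimed amounts that overlap the [start, end) window."""
    return sum(
        c.amount
        for c in claims
        if c.resource_class == resource_class and c.start < end and start < c.end
    )

def can_claim(capacity, claims, new_claim):
    used = used_in_window(claims, new_claim.resource_class,
                          new_claim.start, new_claim.end)
    return used + new_claim.amount <= capacity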

So, good news for the OpenStack Blazar project: your functionality is going to be subsumed by the Project Mulligan resource management service.

The control service and executors

The control service will be nothing more than a place to send command requests. These command requests will be validated by the control service and then packaged up into tasks that get pushed onto a simple queue managed by NATS.

The executors will pull a task off the task queue and execute it, sending the results of the task back on to a separate results queue. The results queue will be processed by an executor whose job will be to simply save the results state to some data store.
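
A minimal sketch of that executor loop, assuming the nats-py client and made-up subject names (mulligan.tasks, mulligan.results), could look something like this:

# Illustrative executor: pull task messages from a NATS subject, run them, and
# publish the result to a separate results subject. The subject names, queue
# group, task payload shape and run_task() are assumptions for the sketch.
import asyncio
import json

import nats  # pip install nats-py

async def run_task(task):
    # Placeholder for the real work (boot a machine, configure networking, ...).
    return {"task_id": task.get("id"), "status": "done"}

async def main():
    nc = await nats.connect("nats://127.0.0.1:4222")

    async def handle(msg):
        task = json.loads(msg.data)
        result = await run_task(task)
        await nc.publish("mulligan.results", json.dumps(result).encode())

    # The queue group ensures each task is delivered to only one executor.
    await nc.subscribe("mulligan.tasks", queue="executors", cb=handle)
    await asyncio.Event().wait()  # keep the executor running

if __name__ == "__main__":
    asyncio.run(main())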

Redoing the API

A project’s API is its primary user interface. As such, its API is critically important to both the success of the project and the project’s perceived quality and ease of use.

There is no REST API in Project Mulligan. Only gRPC APIs are supported.

The reason for this choice is that gRPC is versioned from the get-go with a sane set of clear rules for describing the evolution of the request and response payloads.

No more inane and endless debates about “proper” REST-ness or HATEOAS or which HTTP code thought up in the 1990s is more appropriate for describing a particular application failure.

No more trying to shoehorn a control plane API into a data plane API or vice versa.

Astute readers will note that there is no top-level API or proxy server in Project Mulligan.

However, this doesn’t mean that there isn’t a Project Mulligan public API. The Project Mulligan API is simply the set of gRPC API methods exposed by Project Mulligan’s gRPC service components. What is not part of Project Mulligan’s public API are the internal task payload formats that get sent from the mulligan-control service to the NATS queue(s) for eventual processing by a mulligan-executor worker.

The key to making this work is developing a smart client program that contains much of the catalog, service mesh and route-dispatching functionality that a traditional top-level API or proxy server would contain.

The problem with embedding a lot of logic into client programs, though, is that you need to duplicate that logic for each programming language binding you create. I recognize this is an issue. To address it, Project Mulligan will automatically generate this core routing logic code and generate smart clients in a variety of programming languages as part of its automated build process.

Call me crazy, I know… but gRPC, in combination with Google Protocol Buffers’ protoc compiler, does this exact thing: it generates client and server bindings for whatever programming language you want after looking at the .proto files that describe an interface.
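
To make that concrete, here is a hypothetical end-to-end snippet: generate Python bindings from an imagined mulligan_resource.proto with grpc_tools, then call the resource service through the generated stub. Every Mulligan-specific name here (the proto file, service, method and messages) is invented for illustration:

# Hypothetical example of the protoc-generated smart-client idea.
# First, generate Python bindings from an imagined mulligan_resource.proto:
#
#   python -m grpc_tools.protoc -I proto/ \
#       --python_out=. --grpc_python_out=. proto/mulligan_resource.proto
#
# That produces mulligan_resource_pb2.py and mulligan_resource_pb2_grpc.py,
# which a client can use directly -- no REST proxy or top-level API server.
import grpc

import mulligan_resource_pb2 as pb2            # generated (hypothetical)
import mulligan_resource_pb2_grpc as pb2_grpc  # generated (hypothetical)

def list_providers(endpoint="localhost:50051"):
    with grpc.insecure_channel(endpoint) as channel:
        stub = pb2_grpc.ResourceStub(channel)
        # ListProviders and its request message are assumed for this sketch.
        response = stub.ListProviders(pb2.ListProvidersRequest())
        return list(response.providers)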

What about extensibility?

I’m not interested in having Project Mulligan become a generic framework for constructing cloud applications. I’m not interested in allowing Project Mulligan’s scope and purpose to be extended or redefined by adding API abstraction machinery ala Kubernetes custom resource definitions (CRDs) to Project Mulligan.

Project Mulligan is what it is: a simple machine provisioning system. It’s not designed to be extensible. It should do one thing and do that thing cleanly, simply and efficiently.

If you want a framework for creating something that isn’t machine provisioning, feel free to go use Kubernetes’ CRDs. Note that you’ll still need Kubernetes to be installed on some machines somewhere. After all, code needs to run on a machine. And Project Mulligan is all about demystifying that process of provisioning machines.

Important things to get right, from the beginning

Here are a few things that I think need to be done correctly, right from the start of Project Mulligan.

  • Installation and configuration
  • In-place updates and re-configuration
  • Multi-tenancy and isolation
  • Partitioning and failure domains

Installation and configuration

Project Mulligan will be able to be installed via traditional software package managers like apt or yum on Linux.

In addition to traditional software packages, we’ll build Docker images for the various service components.

If an organization has already deployed a container orchestration system like Kubernetes and wants to deploy Project Mulligan in a highly-available manner, they can consume these Docker images and build their own Helm charts for deploying Project Mulligan.

Configuration of Project Mulligan should be as easy as possible. There should be no need for hundreds of configuration options and a simple bootstrapping command from the mulligan-client should be enough to get a working system up and running.

In-place updates and re-configuration

Project Mulligan’s components should be able to be updated in-place and separately from each other, with no ordering dependencies.

Each component should respond to a SIGTERM with a graceful shutdown of the process, draining any queued work properly.

When the component’s package is upgraded and the component restarted, the gRPC server contained within the component will likely understand a newer version of the gRPC API request payloads. Because of the way gRPC method versioning works, other components (or the smart clients) that communicate with the component will continue to send request payloads formatted for an older version of the component’s API method. The upgraded server component will respond to the older clients with the message format the older client expects. This will facilitate Project Mulligan’s components being upgraded without any ordering dependency.

I really like OpenStack Swift’s approach to upgrades. There are no separate administrative commands for performing database schema migrations. No separate utility for performing data migrations. Everything is simply handled within the controllers themselves, with the controllers executing SQL commands to migrate the schema forward as needed.

Much more operator-friendly than having a separate nova-manage db sync && nova-manage api_db sync && nova-manage db online_data_migrations --max-count=X way of performing simple forward-only schema and data migrations.
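
For illustration, a Mulligan service could apply forward-only migrations itself at startup along these lines; this is a toy sketch using only SQLite and the standard library, and the table names and DDL are made up:

# The service checks a schema_version marker when it starts and applies any
# missing forward-only steps itself, so no separate "db sync" command is needed.
import sqlite3

MIGRATIONS = [
    # (version, SQL that brings the schema up to that version)
    (1, "CREATE TABLE providers (uuid TEXT PRIMARY KEY)"),
    (2, "ALTER TABLE providers ADD COLUMN region TEXT"),
]

def migrate(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    for version, ddl in MIGRATIONS:
        if version > current:
            conn.execute(ddl)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)",
                         (version,))
    conn.commit()

# Called once at service startup, before the process begins serving requests.
migrate(sqlite3.connect("mulligan.db"))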

Re-configuring a running Project Mulligan component should be possible in order to tinker with runtime settings like log output verbosity. In order to facilitate this runtime re-configuration, each Project Mulligan service component will have a gRPC API method called $ServiceConfigure() that accepts a set of configuration options that are tunable at runtime.
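
A hypothetical server-side sketch of that method, with invented message fields and only log verbosity as the tunable, might look like this:

# The generated base class and the configure request/response messages are
# invented names; only options that are safe to change at runtime get applied.
import logging

import mulligan_control_pb2 as pb2            # generated (hypothetical)
import mulligan_control_pb2_grpc as pb2_grpc  # generated (hypothetical)

class ControlServicer(pb2_grpc.ControlServicer):
    def ControlConfigure(self, request, context):
        if request.log_level:
            # The logging module accepts level names such as "DEBUG" or "WARNING".
            logging.getLogger().setLevel(request.log_level.upper())
        return pb2.ControlConfigureResponse(applied=True)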

Multi-tenancy and isolation

From the beginning, nearly all 4 OpenStack components were designed for multi-tenancy. This means the systems are designed for multiple groups of one or more users to simultaneously utilize the service without impact to each other.

The actions those users take should not impact other users of the system, nor should the resources owned by one user have any access to resources owned by another user. In other words, by default, strong isolation exists between tenants of the system.

Project Mulligan will likewise be built from the ground up with multi-tenancy as a design tenet. Resources should be wholly owned by a tenant (group of users) and isolated from each other.

Partitioning and failure domains

Project Mulligan should have clear and concise definitions of the ways in which the system itself may be divided — both for reasons of scale and for limiting the effects of a failure.

This implies that all objects in the system should have an attribute that describes the shard or partition that the object “lives in”.

Kubernetes refers to a collection of controllers and nodes as a “cluster”, which is the closest concept it has 5 to a system partition or shard. There is no geographic nor failure domain connotation to a Kubernetes cluster.

OpenStack unfortunately has no consistent terminology or model of partitioning. Some services have the concept of an “availability zone”. However, as I’ve noted on the openstack-dev mailing list before, the OpenStack term “availability zone” is a complete pile of vestigial poo that should be flushed down the sewer pipes.

In Project Mulligan, we’ll call this partitioning concept a “region”. Regions may be hierarchical in nature, and there need not be a single root region. Regions may indicate via attributes what users should expect with regards to failure of various systems — power, network, etc — and their impact on other regions. Regions will have attributes that indicate their visibility to users, allowing both internal and external partitions of the system to be managed jointly but advertised separately.

Conclusion

In case it hasn’t become obvious by now, Project Mulligan isn’t OpenStack v2. It’s a complete change of direction, an entirely trimmed and reworked vision of a very small chunk of what today’s OpenStack services are.

Is this better than slow, incremental backwards-compatible changes to OpenStack v1, as some have suggested? I have no idea. But I’m damn sure it will be more fun to work on. And isn’t that what life is really all about?

Thanks for reading.

Peace out.

-jay

Footnotes


  1. This makes perfect sense considering the origins of the Neutron project, which was founded by folks from Nicira, whose technology ended up as VMware’s NSX — an L2-centric software-defined networking technology. FYI, some of those same people are the driving force behind the Cilium project that is now showing promise in the container world. Isn’t it great to be able to walk away from your original ideas (granted, after a nice hiatus) and totally re-start without any of the baggage of the original stuff you wrote? I agree, which is why Project Mulligan will be a resounding success.
  2. Even the comically-named “core extensions”. Sigh.
  3. The base Neutron plugin actually allows the plugin to not use Neutron’s database for state persistence, which is basically the Neutron authors relenting to vendor pressure to just have Neutron be a very thin shim over some proprietary network administration technology — like Juniper Contrail.
  4. One of the largest, most glaring exceptions to this multi-tenancy rule, unfortunately, is OpenStack Ironic, the baremetal provisioning system.
  5. Kubernetes actually doesn’t really define a “cluster” to be something specific, unlike many other terms and concepts that Kubernetes does define — like Pod, Deployment, etc. The term “cluster” seems to be just something the Kubernetes community landed on to refer to a collection of nodes comprising a single control plane… 

by jaypipes at August 10, 2018 04:56 PM

Project Mulligan – OpenStack Redo

Jess Frazelle’s tweet recently got me thinking. 1

Jess Frazelle's original tweet

What if I could go back and undo basically the last eight years and remake OpenStack (whatever “OpenStack” has come to entail)? What if we could have a big do-over?

In this two-part blog post, I will be describing “Project Mulligan”, the OpenStack Redo.

This first post is about mission, scope, community, and governance.

The second part describes the architecture and technology choices that Project Mulligan will employ.

This is obviously a highly opinionated reflection on what I personally would change about the world I’ve lived in for nearly a decade.

I’m bound to offend lots of people along the way in both the OpenStack and Kubernetes communities. Sorry in advance. Try not to take things personally — in many cases I’m referring as much to myself as anyone else and so feel free to join me on this self-deprecating journey.

Background

I’ve been involved in the OpenStack community for more than eight years now. I’ve worked for five different companies on OpenStack and cloud-related projects, with a focus on compute infrastructure (as opposed to network or storage infrastructure). I’ve been on the OpenStack Technical Committee, served as a Project Team Lead (PTL) and am on a number of core reviewer teams.

When it comes to technical knowledge, I consider myself a journeyman. I’m never the smartest person in the room, but I’m not afraid to share an opinion that comes from a couple decades of programming experience.

I’ve also managed community relations for an open source company, given and listened to lots of talks at conferences, and met a whole bunch of really smart and talented individuals in my time in the community.

All this to say that I feel I do have the required background and knowledge to at least put forth a coherent vision for Project Mulligan and that I am as much responsible as anyone else for the mess that OpenStack has become.

Redoing the mission

When OpenStack began, we dreamt big. The mission of OpenStack was big, bold and screamed of self-confidence. We wanted to create an open source cloud operating system.

The #1 goal in those days was expansion. Specifically, expansion of user footprint and industry mindshare. It was all about quantity versus quality. Get as much of the pie as possible.

As time rolled on, the mission got wordier, but remained as massive and vague as “cloud operating system” ever was. In 2013, the mission looked like this:

to produce the ubiquitous Open Source Cloud Computing platform that will meet the needs of public and private clouds regardless of size, by being simple to implement and massively scalable.

See the word “ubiquitous” in there? That pretty much sums up what OpenStack’s mission has been since the beginning: get installed in as many places as possible.

While “simple to implement” and “massively scalable” were aspirational, neither were realistic and both were subject to interpretation (though I think it is safe to say OpenStack has never been “simple to implement”).

Today, the mission continues to be ludicrously broad, vague, and open-ended, to the point that it’s impossible to tell what OpenStack is by reading the mission:

to produce a ubiquitous Open Source Cloud Computing platform that is easy to use, simple to implement, interoperable between deployments, works well at all scales, and meets the needs of users and operators of both public and private clouds.

“Meets the needs of users and operators of both public and private clouds” is about as immeasurable of a thing as I can think of. Again, it’s aspirational, but so broad as to be meaningless outside anything but the most abstract discussions.

Project Mulligan is getting a new mission in life:

demystify the process of provisioning compute infrastructure

It’s aspirational but not open-ended; singularly focused on the compute provisioning process.

Why “demystify”?

Despite “easy to use” and “simple to implement” being in OpenStack’s current mission, I believe OpenStack v1 has utterly failed to simplify a complex and often burdensome process. In contrast, OpenStack v1 has made a complex process (of provisioning infrastructure pieces) even more convoluted and error-prone.

If you ask me why I think OpenStack v1 has failed to deliver on these aspects of its mission, my response is that OpenStack v1 doesn’t know what it wants to be.

It has no identity other than being open and welcoming to anyone and everyone that wants to jump on the Great Cloud Bandwagon in the Sky. 2

And because of this identity crisis, there is zero focus on any one particular thing.

Well, that ends with Project Mulligan.

Project Mulligan isn’t trying to be a “cloud operating system”. Heck, it doesn’t even care what “cloud” is. Or isn’t. Or might be in the future for a DevOpsSysAdminUserator.

“OK, Jay, but what really IS ‘compute infrastructure’?”

I’m glad you asked, because that’s a perfect segue into a discussion about the scope of Project Mulligan.

Redoing the scope

Defining the scope of OpenStack is like attempting to bathe a mud-soaked cat in a bubble bath — a slippery affair that only ends up getting the bather muddy and angering the cat.

The scope of OpenStack escapes definition due to the sheer expanse of OpenStack’s mission.

Now that we’ve slashed Project Mulligan’s mission like Freddy Krueger on holiday in a paper factory, defining the scope of Project Mulligan is a much easier task.

We’re going to start with a relatively tiny scope (compared to OpenStack v1’s), and if the demand is there, we’ll expand it later. Maybe. If I’m offered enough chocolate chip cookies.

The scope of Project Mulligan is:

singular baremetal and virtual machine resource provisioning

I’ve chosen each word in the above scope carefully.

singular

“singular” was chosen to make it clear that Project Mulligan doesn’t attempt to provision multiple identical things in the same operation.

baremetal and virtual machine

“baremetal and virtual machine” was selected to disambiguate Project Mulligan’s target deployment unit. It’s not containers. It’s not applications. It’s not lambda functions. It’s not unikernels or ACIs or OCIs or OVFs or debs or RPMs or Helm Charts or any other type of package.

Project Mulligan’s target deployment unit is a machine — either baremetal or virtual.

A machine is what is required to run some code on. Containers, cgroups, namespaces, applications, packages, and yes, serverless/lambda functions require a machine to run on. That’s what Project Mulligan targets: the machine.

resource

The word “resource” was used for good reason: a resource is something that is used or consumed by some other system. How those systems describe, request, claim and ultimately consume resources is such a core concept in any software system that extreme care must be taken to ensure that the mechanics of resource management are done right, and done in a way that doesn’t hinder the creation of higher-level systems and platforms that utilize resource and usage information (like quota management and reservation systems, for example).

I go into a lot of detail below in the second part of this blog post on “Redoing the architecture” about resource management and why it’s important to be part of Project Mulligan.

provisioning

At its core, the purpose of Project Mulligan is to demystify the arcane and hideously complex process inherent in provisioning machines. Provisioning involves the setup and activation of the machine. It does not involve operational support of the machine, nor does it involve moving the machine, restarting it, pausing it, or throwing it a birthday party.

The only things that are important to be in Project Mulligan’s scope are the items that enable its mission and that cannot be fulfilled by other existing libraries or systems in a coherent way.

From an interface perspective, this means the scope of Project Mulligan is the following:

  • Discovering and managing hardware inventory (these are the resources that will be provided to consumers of Project Mulligan’s APIs)
  • Requesting, claiming and ultimately consuming those machine resources

That’s pretty much it.

Who is Project Mulligan’s target audience?

I got some early feedback on some of these ideas from OpenStack’s bull in a china shop, Monty Taylor.

Suffice to say, Monty didn’t like Project Mulligan, and for very good reason.

Users of “cloud APIs” are not Project Mulligan’s audience.

Cloud APIs are those things that allow a user to manage their applications that run on cloud (i.e. someone else’s) infrastructure. Cloud APIs allow lifecycle management of virtual or baremetal machines and containers. Cloud APIs allow a user to create some storage buckets and place objects in those buckets. Cloud APIs enable the construction of private networks that applications running in that cloud can use to communicate with each other and with the outside world.

Cloud APIs encourage users to think in terms of automated use and re-use of some technology infrastructure.

Cloud APIs are awesome!

But that’s not what Project Mulligan is for. Project Mulligan’s target audience is the providers of machines, not the users of those machines.

I see a clear line between these audiences. And I’d like Project Mulligan to focus on the operator/provider audience, not the cloud API user audience.

Does the cloud ecosystem need One True Cloud API 3? Perhaps it does. But Project Mulligan won’t be focused on making that. Besides, Monty will need something to occupy his time when not tweeting.

But what about everything else?!

I imagine at this point, I’ve offended more than three quarters of the universe by not including in Project Mulligan’s scope any of the following:

  • Object storage
  • Network provisioning and management
  • Containers
  • Security
  • Monitoring and metrics
  • Orchestration
  • Filesystems
  • Deployment automation
  • Configuration management
  • AmigaOS

Are these things important? Yep. Well, OK, maybe not AmigaOS. Do I want them in Project Mulligan’s mission statement or scope? No. No, I don’t.

Sorry, not sorry

For the record, I believe there are certain projects inside the OpenStack ecosystem that can and should live entirely outside of OpenStack.

The Gnocchi project has already led the way in this regard by decoupling itself from OpenStack’s community and tooling and standing on its own two feet out there in the big beautiful world. Good for Gnocchi. Seriously, good for Gnocchi.

I believe Swift, Keystone, Designate and Cinder should do the same. Yes, that’s right folks. I’m formally proposing that projects that can stand alone and provide a well-defined, small scope of service live outside of the OpenStack community. I believe that by doing so, these projects will encourage re-use outside of the OpenStack echo-chamber and be a greater benefit to the broad cloud software ecosystem.

Some projects, like Swift, are already ready for the umbilical cord to be cut. I’d actually go so far as to say Swift was born into the OpenStack world as a full-fledged adult to begin with. It uses no shared OpenStack Oslo Python libraries, has no inter-service dependencies to speak of, does not rely on a coordinated release cycle, and its API is rock-solid and (yay!) does one thing and one thing well.

Cinder, Designate and Keystone are more traditional OpenStack projects with the requisite heavy dependency on a number of common OpenStack Oslo Python libraries. And their APIs are not nearly as clear-cut as Swift’s. But, that said, the services themselves are well-scoped and would benefit the larger cloud software ecosystem by living outside the OpenStack bubble. Cinder already has made inroads into the Kubernetes community by being a provider of persistent volumes there. Keystone and Designate, if run as separate projects outside the OpenStack community, would likely be forced to maintain a small scope of service.

What benefit does the OpenStack echo-chamber give to these projects, anyway? Well, let’s take a look at that topic in the next section about redoing the community.

Redoing the community

I have a number of thoughts about matters of community. My views on the subject have changed — and continue to change. These views vary depending on the day of the week, whether I’m currently taking anti-depressants, and whether I’ve happened to upset more than three people in the last 24 hours while reviewing their code.

After witnessing and participating in eight years of OpenStack’s governance style, I’m eager for a change. Well, actually, a number of changes, presented here in no particular order.

Less talk, more do

I know that some colleagues in the community adore lengthy conversations on IRC and mailing list posts that meander around like a bored teenager on Quaaludes.

However, I’m tired of talking. I want to see some action taken that really kicks Project Mulligan into high gear.

Code talks volumes compared to endless specifications. I’d much rather see a pile of proof-of-concept code than spend six months endlessly debating the minutiae of an upgrade path on a specification.

But unless we have leaders with a “walk, don’t talk” mentality — people who aren’t afraid to break things — I’m afraid nothing will ever change.

We need no-bullshit, this is how it’s gonna be, types of people willing to say “oh fuck no” when needed 4. But also people who say “alright, show me in code” and “I accept the fact that we won’t be able to see the future and accurately predict every tiny detail of an upgrade path or side-effect of this feature, but let’s just do it and if we need to fix things later, we will.”

These leaders will always need to bear the brunt of inevitable criticism that will pour forth from those who seek to commandeer Project Mulligan for their own devices.

Speaking of that, let’s talk a bit about vendors.

Vendors should wear helmets

Vendors should be required to wear helmets at all times when visiting Project Mulligan. And not some silly London bobby helmet. I’m talking about those bright yellow construction helmets that both announce intentions as well as protect brain matter.

Nothing gets my goat more than a vendor that isn’t up front about their vendory-ness.

Is it too much for me to want to hear, just once, “hey, I’m aware this feature is a pet project of mine and really only helps me out”?

Or better yet, “yeah, I understand that this project is made up entirely of my own internal engineers and really it’s just a way for us to protect our intellectual property by being first to market and (ab)using the open source ecosystem for marketing and hiring advantages”.

Unfortunately, all too often, vendors pretend like there’s nothing to see here; move along…

I am sure that as people read this, some folks are thinking, “Hmm, am I a vendor?”

Well, you might be a vendor if…

  • You only ask for features that further your own company’s objectives
  • You cannot articulate why anyone other than your company would want a feature
  • You don’t review code outside of your own company
  • You only work on code for a driver to enable your own company’s technology
  • If you provide any documentation links at all, you can only provide links to internal documentation that first needs to go through legal review in order for that link to be made public

“But Jay, 95% of contributors to OpenStack work at some company that pays them to write code or work on deployment stuff for OpenStack. Don’t shit where you eat, my friend.”

Yes, you’re absolutely correct about that, my friend. And thank you for being concerned about my alimentary canal.

That said, being “a vendor” is exhibiting a certain set of behaviours when participating (or not) in an open source community. Being “a vendor” doesn’t mean “you work for a company”.

I’m afraid to be the bearer of bad news, but Project Mulligan, unlike OpenStack, isn’t going to cater to the vendors.

Project Mulligan won’t kick vendors out, but at the same time, we’re not going to go out of our way to deify “Platinum Members” or any such silliness. Vendors are free to take Project Mulligan’s source code and use it for their own products, but the Project Mulligan community won’t be spending its time promoting those products, developing “certification” policies for vendors to work towards, or attempting to cozy up to vendor-centric trade organizations that aren’t particularly related to the core mission of Project Mulligan.

Which brings me to something that bothers me about Kubernetes, from a community perspective.

Many questions on the Kubernetes user mailing list seem to be for specific vendor products — i.e. Google Cloud Platform, Google Kubernetes Engine, Red Hat OpenShift 5, etc — instead of being about Kubernetes itself. This is indicative of the tight coupling between the Kubernetes project and the vendors that host a Kubernetes SaaS offering.

While there are occasionally questions on the OpenStack mailing lists about a particular distribution of OpenStack — Mirantis OpenStack/Fuel, Red Hat OpenStack Platform or RDO, etc — not only are these questions rare, but they are often answered with pointers to the vendor’s support organization or bug tracker. In addition, you don’t see questions about getting support for one of the public clouds that run OpenStack.

This doesn’t seem to be the case for Kubernetes, where the vendored SaaS offerings don’t seem to be distinguishable from the open source project itself. Or at least, they don’t seem to be distinguishable for a great number of Kubernetes users. And the engineers working at the primary Kubernetes vendors don’t seem to have much of a problem with this equating of a product with the open source project that underpins those products.

Releases and time-based release cycles

It seems to me that coordinated releases are merely designed to coincide with the OpenStack Summit marketing events. I see no benefit whatsoever to time-based release cycles other than for marketing coordination purposes.

The time-based release cycles set artificial deadlines and freeze dates for no good reason; they just wrap bureaucratic red tape around, and slow to a crawl, a process that frankly should be a simple thing that can be done (in a fully automated fashion) whenever necessary.

OpenStack Swift has been doing non-time-based releases this way for more than 8 years, eschewing the time-based release mechanics that most of the rest of OpenStack projects follow in favor of just tagging their source repository with a version tag when they feel that a certain fix or feature warrants a new release tag. And it works perfectly fine for Swift. The Swift project’s sensible decoupling and reliance on far fewer inter-project dependencies makes this possible.

I don’t see operators clamouring for a six-month release cycle — jeez, most operators have so much internal integration and legacy junk that they move at the speed of molasses with regards to migrating to more modern OpenStack releases. All operators care about is can they get from their old-ass version to some less old-ass version of their deployed software in a way that doesn’t result in too much downtime for their users.

I also don’t care about long-term support (LTS) releases. If distributions want to support those, they should feel free to do so. Many distributions make lots of money selling FUD about software that changes too quickly for their legacy customers to “handle in the way they’ve become accustomed”. So, it seems fair to me to have those vendors fully own their own LTS release processes and spend the money to maintain that code. I personally don’t feel it should be something an upstream development community should be involved in.

Therefore, Project Mulligan won’t be following a time-based release cycle. There will be no “spec freeze dates”, no “code freeze dates”, no “dependent library freezes”, no “milestone releases”. None of that.

When a feature or set of bug fixes warrants a “release”, then the source repository will get a new tag which will trigger the automated build of release artifacts. No red tape, no freeze dates.

Who gets to decide what “warrants a release”? Anyone should be able to propose that the source repository be tagged for release. If there are no objections from anyone in the contributor community, one of the core committers can trigger the release by pushing a tag. Simple as that.
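To illustrate the mechanics, cutting a release under this model could be as small as pushing a signed tag (the version number below is hypothetical; the artifact build itself would be handled by whatever CI is in place):

$ git tag -s 1.4.0 -m "Release 1.4.0"
$ git push origin 1.4.0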

Conferences and foundations

There are plenty of people who like the large OpenStack Summit conferences.

I am not one of those people.

I have not attended the last two OpenStack summits; instead I’ve chosen to stick to the OpenStack Project Team Gathering events that are strictly for developers to discuss implementation proposals and brainstorm system design.

Perhaps this is because I remember the very first OpenStack Design Summit in Austin, Texas. There were about 150 engineers there, if I recall correctly. We discussed implementation possibilities, design choices, how to test Python software properly, how to set up an automated testing infrastructure system, and how to organize developers who were just beginning to come together around infrastructure software.

That first get-together, and the developer events that followed for the next couple years or so (until the OpenStack Foundation became a thing), were modeled after the successful Ubuntu Design Summit (UDS) concept where contributors to the Ubuntu Linux distribution would get together every six months to decide what would be the focus of the next release, what were the implementation proposals for various system pieces, and what decisions needed to be made to further the distribution.

The entire purpose of the original design summits was to be a working event. There were no presentations, no vendor displays, no sales people mixing with engineers. It was an event designed to get actual work done.

The need for OpenStack to become ubiquitous in the cloud world, however, meant that those halcyon days of productive, technical meetings slowly evolved into a more traditional IT conference, with keynotes, big-money sponsors, lots of marketing and sales people, along with the inevitable buzzword-ification of the whole affair.

Now, if you look at the session schedules for the OpenStack summits, all you see is buzzword soup, with never-ending mentions of “hyperscale”, “hyper-converged infrastructure”, “edge”, “NFV”, “carrier-grade”, “containers”, “MEC”, etc.

This obsession with following and promoting the hype of the day has led to a ludicrous lineup of vendor-driven sessions that tend to drive people who are looking for solid technical content screaming for the hills.

Here’s a smattering of sessions from the most recent OpenStack Summit in Vancouver that are perfect examples of the type of artisanal, farm-to-table buzzword vendor bullshit that has pervaded these events.

That last one is basically just an advertisement for Intel storage technology.

And then there are the sessions (keynotes even!) that are devoted to completely non-existent vaporware — such as Akraino — because the companies behind the vaporware idea (in the case of Akraino, that would be AT&T and, yes again, Intel) are a powerful vendor lobby inside the OpenStack Foundation since they are platinum foundation members.

These companies have successfully played the OpenStack Foundation and the Linux Foundation against each other to see which foundation will buy into their particular flavor of vaporware du jour. Intel and subsidiary Wind River Systems did this with the StarlingX project as well.

This kind of vendor-driven political bullshit is the reason that Project Mulligan will be quite different when it comes to community management.

Project Mulligan won’t have a foundation, sorry. I personally like quite a few of the employees at the OpenStack Foundation. I even (mostly) support the recent transition to becoming the Open Infrastructure Foundation. That said, I can’t stand the hype machine that the foundation encourages along with the Linux Foundation 6 and its Cloud Native Computing Foundation (CNCF) offshoot and OPNFV project. 7

Nor will Project Mulligan have marketing conferences. For get-togethers, there will be a return to the original design summit idea, crossed with a “hack days” type of flavor. We’ll call them “Design Days”.

All developers as well as operators of Project Mulligan will be welcome at these design days. “Developers of Project Mulligan” means anyone who has committed code or anyone who wants to contribute code to Project Mulligan and learn about it.

The events will be self-funded and bootstrap-organized, likely at university venues that donate space over a long weekend. I used to organize these types of events in the MySQL community and they were very successful in bringing community contributors in the ecosystem together for a fun and productive few days.

On to the architecture and technology fun

Are you now craving more fast and loose opinions on how OpenStack lost its way with its mission? Well, then, proceed apace to part two for some tasty blather about blowing up the OpenStack technology and system architecture and starting over in Project Mulligan.

Footnotes


  1. Yes, I’m fully aware Jess Frazelle wasn’t actually asking the OpenStack community what the next version of OpenStack would look like. Rather, she was opining that some negative aspects of the OpenStack ecosystem and approach have snuck into Kubernetes. Still, I think it’s an interesting question to ponder, regardless. 
  2. Numerous folks have pointed to the “Big Tent” initiative from 2014-2015 as being the reason that OpenStack “lost its focus”. I’ve repeatedly called bullshit on this assertion, and will again do so here. The Big Tent initiative did not redefine OpenStack’s mission. It was a restructuring of how the OpenStack Technical Committee evaluated new project applicants to the OpenStack ecosystem. This is why the Big Tent was officially called “Project Structure Reform”. It changed governance procedures so that there was no more “Supreme Court of OpenStack” that had to be groveled to each time a new project came around. It absolutely did not broaden what OpenStack’s scope or mission was. Despite this, many people, even people knowledgeable of OpenStack governance internals, continue to equate the overly broad mission of OpenStack (which, again, has barely changed since 2012) with the “Big Tent” initiative.
  3. As opposed to the existing environment that cloud users find themselves in, which is that every cloud vendor (AWS, GCP, Azure, DigitalOcean, etc) has its own APIs for performing actions against the resources exposed by their particular underlying infrastructure. You buy into that vendor’s API when you buy into the service they are providing. This is called lock-in.
  4. OpenStack (as a whole as well as individual OpenStack projects) has suffered greatly from the inability of the maintainer community to say “no” (or even “fuck no”). Some people think it’s hard to say “no” to a feature request. Personally, I have no problem whatsoever saying “no” to virtually everything (saying “no” is basically my default answer to everyone other than my wife). Jess Frazelle’s article entitled “The Art of Closing” should be required reading for any contributor submitting a feature request and any maintainer looking for ways to not crush tender contributor feelings on a feature request (if that’s the sort of thing that keeps you up at night). 
  5. The current incarnation of OpenShift as of July 2018. They keep changing the damn thing’s purpose, rewriting it in different languages, and gradually updating the website so that you never quite know what http://openshift.com will lead to in any given month. 
  6. Tell me again why an open source foundation needs a Chief Revenue Officer.
  7. My prediction is that eventually the Linux Foundation will end up subsuming the OpenStack (née Open Infrastructure) Foundation and turning it into one of its projects like OPNFV. At which point, there will be a huge backlash from the CNCF folks who despise all things OpenStack that they consider to be vendor-driven, legacy (in CNCF world, legacy == The World Before Docker, so anything before 2013) and over-architected while forgetting that many of the same people and companies that initially developed those over-architected OpenStack solutions are now, gasp!, working on CNCF projects. At which point the Apache Foundation will put out an announcement saying how they could see all of this coming years ago. And we’ll all have a big come to Jesus (come to Jess?) moment and realize that, holy shit, we’re all actually working on the same kinds of problems and mostly we’ve just been letting our biases about programming languages, SQL vs. NoSQL, and Slack vs. IRC drive wedges in between what should be fairly rock-solid relationships.

by jaypipes at August 10, 2018 04:56 PM

OpenStack Superuser

How to integrate Qinling with OpenStack Swift

This tutorial was written by Neerja Narayanappa

So far my journey with Outreachy has been incredible! My learning graph has skyrocketed by contributing to the Qinling project with the help of my mentor, Lingxian Kong. I truly appreciate and value everything I’m learning from my mentor and this project. It will forever remain a major contribution to my success and achievements.

In this post, you’ll create a function using the object store service of OpenStack, Swift. Qinling is not only growing tremendously but is now integrated with Swift, where you can create functions and invoke them with little effort. I believe it’s going to pave new ways of helping users and developers achieve their tasks.

So let’s get started!

 

OpenStack object storage service Swift can be integrated with Qinling to create functions. You can upload your function package to Swift and create the function by specifying the container name and object name in Swift.

In this example, the function returns “Hello, Neerja”; you can replace the string with the function input. The steps assume there’s already a Python 2.7 runtime available in your deployment.

Step 1: Create a function deployment package

$ mkdir ~/qinling_swift_test
$ cd ~/qinling_swift_test
$ cat <<EOF > hello_world.py

def main(name='World',**kwargs):
    ret = 'Hello, %s' % name
    return ret
EOF

$ cd ~/qinling_swift_test && zip -r ~/qinling_swift_test/hello_world.zip ./*
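(Optional, and not part of the original tutorial.) You can sanity-check the function locally before uploading it, assuming Python 2.7 is on your path:

$ cd ~/qinling_swift_test
$ python -c "import hello_world; print(hello_world.main(name='Neerja'))"
Hello, Neerja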

Step 2: Upload the file to Swift

1. Create a container named "functions"

$ openstack container create functions

+---------------------------------------+------------------+------------------------------------+
| account                               | container        | x-trans-id                         |
+---------------------------------------+------------------+------------------------------------+
| AUTH_6ae7142bff0542d8a8f3859ffa184236 | functions        | 9b45bef5ab2658acb9b72ee32f39dbc8   |
+---------------------------------------+------------------+------------------------------------+

2. Add the function deployment package (.zip) to the container

$ openstack object create functions hello_world.zip

+-----------------+-----------+----------------------------------+
| object          | container | etag                             |
+-----------------+-----------+----------------------------------+
| hello_world.zip | functions | 9b45bef5ab2658acb9b72ee32f39dbc8 |
+-----------------+-----------+----------------------------------+

3. Display the container and its object 

$ openstack object show functions hello_world.zip

+----------------+---------------------------------------+
| Field          | Value                                 |
+----------------+---------------------------------------+
| account        | AUTH_6ae7142bff0542d8a8f3859ffa184236 |
| container      | functions                             |
| content-length | 246                                   |
| content-type   | application/zip                       |
| etag           | 9b45bef5ab2658acb9b72ee32f39dbc8      |
| last-modified  | Wed, 18 Jul 2018 17:45:23 GMT         |
| object         | hello_world.zip                       |
+----------------+---------------------------------------+

Step 3: Create a function and get the function ID. Replace the runtime_id with the one in your deployment, and specify the Swift container and object name.

$ openstack function create --name hello_world \
--runtime $runtime_id \
--entry hello_world.main \
--container functions \
--object hello_world.zip

+-------------+----------------------------------------------------------------------------------------------+
| Field       | Value                                                                                        |
+-------------+----------------------------------------------------------------------------------------------+
| id          | f1102bca-fbb4-4baf-874d-ed33bf8251f7                                                         |
| name        | hello_world                                                                                  |
| description | None                                                                                         |
| count       | 0                                                                                            |
| code        | {u'source': u'swift', u'swift': {u'object': u'hello_world.zip', u'container': u'functions'}} |
| runtime_id  | 0d8bcf73-910b-4fec-86b1-38ace8bd0766                                                         |
| entry       | hello_world.main                                                                             |
| project_id  | 6ae7142bff0542d8a8f3859ffa184236                                                             |
| created_at  | 2018-07-18 17:46:29.974506                                                                   |
| updated_at  | None                                                                                         |
| cpu         | 100                                                                                          |
| memory_size | 33554432                                                                                     |
+-------------+----------------------------------------------------------------------------------------------+

Step 4: Invoke the function by specifying function_id

$ function_id=f1102bca-fbb4-4baf-874d-ed33bf8251f7
$ openstack function execution create $function_id --input Neerja

+------------------+-----------------------------------------------+
| Field            | Value                                         |
+------------------+-----------------------------------------------+
| id               | 3451393d-60c6-4172-bbdf-c681929fae07          |
| function_id      | f1102bca-fbb4-4baf-874d-ed33bf8251f7          |
| function_version | 0                                             |
| description      | None                                          |
| input            | None                                          |
| result           | {"duration": 0.031,"output": "Hello, Neerja"} |
| status           | success                                       |
| sync             | True                                          |
| project_id       | 6ae7142bff0542d8a8f3859ffa184236              |
| created_at       | 2018-07-18 17:49:46                           |
| updated_at       | 2018-07-18 17:49:48                           |
+------------------+-----------------------------------------------+

Please try out this tutorial; it really makes integrating Qinling and Swift very simple!

 

This post first appeared on Medium. Superuser is always interested in tutorials; get in touch at editorATopenstack.org.

The post How to integrate Qinling with OpenStack Swift appeared first on Superuser.

by Superuser at August 10, 2018 02:23 PM

August 09, 2018

OpenStack Superuser

Getting started with cloud provider OpenStack development

This guide will help you get started with building a development environment for you to build and run a single node Kubernetes cluster with the OpenStack Cloud Provider enabled.

Contents

Prerequisites

To get started, you will need to set up your development environment.

OpenStack Cloud

You will need access to an OpenStack cloud, either public or private. You can sign up for a public OpenStack Cloud through the OpenStack Passport program, or you can install a small private development cloud with DevStack or Getting Started With OpenStack.

Once you have obtained access to an OpenStack cloud, you will need to start a development VM. The rest of this guide assumes a CentOS 7 cloud image, but should be easily transferable to whatever development environment you prefer. You will need to have your cloud credentials loaded into your environment. For example, I use this openrc file:

export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_DOMAIN_ID=<domain_id_that_matches_name>
export OS_PROJECT_NAME=<project_name>
export OS_TENANT_NAME=<project_name>
export OS_TENANT_ID=<project_id_that_matches_name>
export OS_USERNAME=<username>
export OS_PASSWORD=<password>
export OS_AUTH_URL=http://<openstack_keystone_endpoint>/v3
export OS_INTERFACE=public
export OS_IDENTITY_API_VERSION=3
export OS_REGION_NAME=<region_name>

The specific values you use will vary based on your particular environment. You may notice that several values are aliases of one another. This is in part because the values expected by the OpenStack client and Gophercloud are slightly different, especially with respect to the change from using tenant to project. One of our development goals is to make this setup easier and more consistent.
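Before going further, a quick way to confirm the credentials work is to source the file and request a token (this assumes you saved the file as openrc and have python-openstackclient installed):

source openrc
openstack token issue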

Docker

Your cloud instance will need to have Docker installed. If you’re OK with working from the latest release, it’s simple enough to call the Get Docker script:

curl -sSL https://get.docker.io | bash

If you don’t want to pipe a random script from the internet into your environment, you can install the latest version of Docker with this script.

sudo yum update -y
sudo yum install -y -q epel-release yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum-config-manager --enable docker-ce-edge
sudo yum install -y -q docker-ce

However, the Kubernetes community still recommends that you run Docker v1.12. To install that version by hand you can use the following script.

sudo yum -y update
sudo yum -y -q install yum-utils
sudo yum-config-manager --add-repo https://yum.dockerproject.org/repo/main/centos/7
sudo yum -y --nogpgcheck install docker-engine-1.12.6-1.el7.centos.x86_64

You’ll want to set up Docker to use the same cgroup driver as Kubernetes:

sed -i '/^ExecStart=\/usr\/bin\/dockerd$/ s/$/ --exec-opt native.cgroupdriver=systemd/' \
       /usr/lib/systemd/system/docker.service

You may want to configure your environment to allow you to control Docker without sudo:

user="$(id -un 2\>/dev/null || true)"
sudo usermod -aG docker centos

Regardless of how you install, enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable docker
sudo systemctl start docker

Development tools

You’re going to need a few basic development tools and applications to get, build, and run the source code. With your package manager you can install git, gcc, glide, and etcd.

sudo yum install -y -q git gcc glide etcd

You will also need a recent version of Go, with your Go environment variables set.

GO_VERSION=1.10
GO_ARCH=linux-amd64
curl -o go.tgz https://dl.google.com/go/go${GO_VERSION}.${GO_ARCH}.tar.gz
sudo tar -C /usr/local/ -xvzf go.tgz
export GOROOT=/usr/local/go
export GOPATH=$HOME/go

Finally, set up your Git identity and GitHub integrations.
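For example, setting the Git identity looks like this (substitute your own name and email):

git config --global user.name "Your Name"
git config --global user.email "you@example.com"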

More comprehensive setup instructions are available in the Development Guide in the Kubernetes repository. When in doubt, check there for additional setup and versioning information.

Development

Getting and Building Cloud Provider OpenStack

Following the GitHub Workflow guidelines for Kubernetes development, set up your environment and get the latest development repositories. Begin by forking both the Kubernetes and Cloud-Provider-OpenStack projects into your GitHub account and cloning them into your local workspace (or bringing your current forks up to date with the current state of both repositories).

Set up some environment variables to help download the repositories

export GOPATH=$HOME/go
export GOROOT=/usr/local/go
export PATH=$PATH:$GOROOT/bin
export user={your github profile name}
export working_dir=$GOPATH/src/k8s.io

With your environment variables set up, clone the forks into your go environment.

mkdir -p $working_dir
cd $working_dir
git clone https://github.com/$user/cloud-provider-openstack
cd cloud-provider-openstack

If you want to build the provider:

make

If you want to run unit tests:

make test

Getting and Building Kubernetes

To get and build Kubernetes:

cd $working_dir
export KUBE_FASTBUILD=true
git clone https://github.com/$user/kubernetes
cd kubernetes
make cross

Running the Cloud Provider in MiniKube

To run the OpenStack provider integrated with your cloud, be sure you have sourced your OpenStack environment variables. You will also need to create an /etc/kubernetes/cloud-config file with these minimum options:

[Global]
username=<username>
password=<password>
auth-url=http://<auth_endpoint>/v3
tenant-id=<project_id>
domain-id=<domain_id>

Start your cluster with the hack/local-up-cluster.sh script, with the proper environment variables set to enable the external cloud provider:

export EXTERNAL_CLOUD_PROVIDER_BINARY=$GOPATH/src/k8s.io/cloud-provider-openstack/openstack-cloud-controller-manager
export EXTERNAL_CLOUD_PROVIDER=true
export CLOUD_PROVIDER=openstack
export CLOUD_CONFIG=/etc/kubernetes/cloud-config
./hack/local-up-cluster.sh

After giving the cluster time to build and start, you can access it through the directions provided by the script:

export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
./cluster/kubectl.sh
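As a quick sanity check that the cluster came up with the external cloud provider enabled, you might run something like:

./cluster/kubectl.sh get nodes
./cluster/kubectl.sh get pods --all-namespaces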

Have a good time with OpenStack and Kubernetes!

 

Photo // CC BY NC

The post Getting started with cloud provider OpenStack development appeared first on Superuser.

by Chris Hoge at August 09, 2018 02:08 PM

August 08, 2018

OpenStack Superuser

OpenStack User Committee elections: Time to nominate and vote!

OpenStack has been a vast success and continues to grow. Additional ecosystem partners are enhancing support for OpenStack, and it has become more and more vital that the communities developing services around OpenStack lead and influence the direction of those products.

The OpenStack User Committee helps increase operator involvement, collects feedback from the community, works with user groups around the globe and parses through user survey data, to name a few key tasks. Users are critical and the User Committee aims to represent them. There are two UC seats available for this election; seats are valid for a one-year term. It’s also worth keeping in mind that self-nomination is common and no third-party nomination is required.

Nominees must be individual members of the OpenStack Foundation who are also Active User Contributors (AUCs). There are a few things that will make your candidacy stand out:

You’re

  • An OpenStack end-user and/or operator
  • An OpenStack contributor from the User Committee working groups
  • Actively engaged in the OpenStack community
  • An organizer of an OpenStack local User Group meetup or event

Beyond the kinds of community activities you’re already engaged in, the User Committee role does add some additional work. The User Committee usually interacts over email to discuss any pending topics.

Other anticipated time requirements include:

  • Two meetings per month
  • Attending two OpenStack Summits per year, including Board Meetings (eligible for travel support if your company will not sponsor)
  • Participating in ad hoc calls with OpenStack users to gather feedback and provide guidance
  • Responding to emails, reading user survey report
  • Working with User Committee election officials to hold elections twice a year

You can nominate yourself or someone else by sending an email to the user-committee@lists.openstack.org mailing-list with the subject: “UC Candidacy” by August 17, 05:59 UTC. The email should include a description of the candidate and what the candidate hopes to accomplish. More information can be found here.

Voting for the User Committee (UC) members opens on August 20 and remains open until August 24, 11:59 UTC.

We look forward to your applications!

Photo // CC BY NC

The post OpenStack User Committee elections: Time to nominate and vote! appeared first on Superuser.

by Superuser at August 08, 2018 02:03 PM

August 07, 2018

OpenStack Superuser

What’s next in Ironic 11.1

Ironic is an integrated OpenStack program, forked from the Nova bare metal driver, that aims to provision bare metal machines instead of virtual machines. Think of it as a bare metal hypervisor API and a set of plugins that interact with the bare metal hypervisors. Oh, and the mascot is rockin’ Pixie Boots, a heavy metal bear.

On August 6, project team lead Julia Kreger offered this update on the state of the Ironic universe:

In the past month we released Ironic 11.0 and now this week we expect to release Ironic 11.1.

Here’s what’s coming your way:

  • The deploy_steps framework, to give better control over what a deployment consists of.
  • BIOS settings management interfaces for the ilo and irmc hardware types.
  • The ramdisk deploy interface has merged. We await your bug reports!
  • Conductors can now be grouped into specific failure domains, with specific nodes assigned to those domains. This allows an operator to configure a conductor in data center A to manage only the hardware in data center A, and not data center B.
  • A capability has been added to the API to allow driver interface values to be reset to the conductor default values when the driver name is being changed.
  • Support for partition images with ppc64le hardware has merged; previously, operators could only use whole disk images on that architecture.
  • Out-of-band RAID configuration is now available with the irmc hardware type.
  • Several bug fixes related to cleaning, PXE and UEFI booting.

Get involved

The Ironic team will be in full force at the upcoming PTG  from September 10-14 in Colorado. Focusing on the Stein release, there’s an Etherpad chock-a-block with topics to discuss including community goals, the API, improving performance and docs.

If you can’t make the PTG, there are other ways to interact with the team.

Discussion of the project also takes place in #openstack-ironic on irc.freenode.net. This is a great place to jump in and start your ironic adventure. The channel is very welcoming to new users – no question is a wrong question!

The team also holds one-hour weekly meetings at 1500 UTC on Mondays in the #openstack-ironic room on irc.freenode.net, chaired by Julia Kreger (TheJulia) or Dmitry Tantsur (dtantsur).

The post What’s next in Ironic 11.1 appeared first on Superuser.

by Superuser at August 07, 2018 02:07 PM

Chris Dent

TC Report 18-32

The TC discussions of interest in the past week have been related to the recent PTL elections and planning for the forthcoming PTG.

PTL Election Gaps

A few official projects had no nominee for the PTL position. An etherpad was created to track this, and most of the situations have been resolved. Pointers to some of the discussion:

Where we (the TC) seem to have some minor disagreement is the extent to which we should be extending a lifeline to official projects which are (for whatever reason) struggling to keep up with responsibilities or we should be using the power to remove official status as a way to highlight need.

PTG Planning

The PTG is a month away, so the TC is doing a bit of planning to prepare. There will be two different days during which the TC will meet: Sunday afternoon before the PTG, and all day Friday. Most planning is happening on this etherpad. There is also a specific etherpad about the relationship between the TC and the Foundation and Foundation corporate members. And one for post-lunch topics.

IRC links:

If there's any disagreement in this planning process, it is over whether we should focus our time on topics we have some chance of resolving, or at least making concrete progress on, or whether we should spend the time having open-ended discussions.

Ideally there would be time for both, as the latter is required to develop the shared language that is needed to take real action. But as is rampant in the community we are constrained by time and other responsibilities.

by Chris Dent at August 07, 2018 11:26 AM

August 06, 2018

OpenStack Superuser

From cryptocurrency mining to public GPU cloud: a transformation story

A lot of startups begin life with one idea then pivot to another, but not many have had as interesting a journey as Genesis Mining. The company was founded in late 2013; co-founders Dutch Althoff and Julian Sprung told the story of how they decided to take the success of the crypto business and “funnel it over into something that’s more public cloud consumable, or Genesis Cloud,” says Althoff.

Let’s start from the top: Genesis Mining is an Iceland-based company dedicated to cryptocurrency mining, especially cloud crypto mining. That means that private customers sign up, pay with “normal” money, then operate mining hardware at the company’s data centers. Customers reap the returns of this mining in crypto, hauling out gems from the mines without having to install a noisy, hot machine in their homes. Genesis Cloud is a cloud company from the founders of Genesis Mining.

The first generation miner was, well, a little artisanal, Sprung admits. “From a classical data center point of view, you might find it unprofessional, but this off-the-shelf hardware it allowed us to quickly get it started and open up those first small data centers.” The team quickly realized the potential and was soon building custom-designed mining hardware with their own server nodes designed for that purpose. Today, Genesis Mining has a completely vertically integrated supply chain, manufacturing their gear in China. They get their own GPUs with custom tweaks and modifications as needed, making them one of the top customers of both AMD and NVIDIA.

They operate on a huge scale. Genesis Mining started out in 2014 with 25,000 customers; Sprung says those numbers “have really exploded” to roughly two million customers in 2018. Genesis Mining operates data centers in more than 15 countries around the world, totaling more than a million GPUs for mining.

What are these data centers like? Two main ones serve customers now, in Iceland and Sweden. These hot spots were picked because in Sweden electricity is extremely cheap and in Iceland much of the energy is geothermal “so it’s actually nice green energy (and) cooling pretty much comes for free,” Sprung says. Up next are centers in the United States, Kazakhstan and China.

Sprung offered a look inside the Iceland data center, dubbed the Enigma, where 5,000-6,000 GPUs operate. “The style of deployment and the setup are a bit unconventional from a data center point of view but it’s very functional,” he says, adding that an extremely high energy density allows it to operate this way. “It has come a long way from the garage with the GPUs blackened to a motherboard,” he notes. The facility does not have storage nodes or management nodes but they, too, are in the works.

Turning to the hardware in the Swedish data center (“this is where we get our hands dirty”), the mining hardware is a custom-built server made by Genesis Mining, dubbed the Archer at Genesis Cloud. The Archer node has eight GPUs, a dual-core processor, 16 gigabytes of memory and 100-200 gigabytes of storage “depending on how we roll it out,” Althoff says.

The set-up comes with challenges. “It’s kind of like deploying OpenStack back in 2010-11 when dual-core was still a thing, so we struggle a little bit with how to add these to our OpenStack cluster,” Althoff admits. The preferred approach for deploying on an Archer node is container orchestration, which keeps deployments lightweight and lean and focuses very specific workloads on those nodes. For the more general-purpose GPU computing node, they use Cirrus, essentially an Open Compute Project GPU node.

For more on the GPU workloads they run (Kubernetes on OpenStack, machine learning and more), catch the entire 36-minute talk from OpenStack Days Budapest on YouTube.

Photo // CC BY NC

The post From cryptocurrency mining to public GPU cloud: a transformation story appeared first on Superuser.

by Superuser at August 06, 2018 02:05 PM

August 04, 2018

Aija Jauntēva

Outreachy: Redfish Message registry II and code reviews

Message Registry

Since the last blog post, I have started implementing Message Registry. I am splitting this feature into several patches, starting with the easy parts: mapping the Message Registry File and Message Registry resources to sushy fields. In order to implement this, I encountered the need for a new data structure, a dictionary, that was not present in sushy base fields. I implemented it before everything else, and currently these 3 patches are in code review.

Having taken a closer look at the Message Registry File, I find there are 3 ways to serve the Registry file itself:

  • locally as a JSON file,
  • locally in an archive as one of the JSON files,
  • publicly on the Internet as a JSON file.

There is also a 4th use case for sushy: to access standard message registry files when a Redfish service has not included them using one of the options above, so sushy has to get the standard registries elsewhere. This use case was mentioned in the previous blog post when we encountered a licensing issue, as the standard message files provided by DMTF are only copyrighted, without any license. Last time it looked like they would get a 3-clause BSD license, but as DMTF did not see them as code, they rejected this idea. Going back to the OpenStack legal mailing list, the suggestion was to use a CC license, which is currently being reviewed by DMTF.

Thus, the next thing I'm working on: implementing registry file loading to support all 4 use cases. It is not a difficult task as such, but I need to implement it nicely within the sushy design. I have implemented yet another change to the sushy base classes, this time to allow processing archives, and I'm thinking about how I should split these changes for code review: should changes to base classes be separate from implementing Message Registry loading, or should they go together so that there is context and an actual use case for the changes? I don't want to split too much, and I don't want to create a patch that is too big. But I'm not thinking about this too much, or at least I try not to, because it can take forever, so I just pick one approach and stick with it. If a reason why the chosen strategy does not work comes up during coding, it can be changed later.

To locally test my changes, I need a somewhat working mock Redfish server that has Registry Files supported. None of the mockups provided by DMTF have these Registries included, so I added my own mock files for Registries based on the JSON schemas. The next thing: how to serve them. This looks like a use case for the sushy-static emulator, where it just needs a bunch of JSON files, but in this particular case I also want to test working with a ZIP archive. However, it looks like the sushy-static emulator is written in a way that it only serves JSON files and understands URLs that correspond to the folder structure, returning the index.json files in those folders, definitely not supporting a URL or file ending in .zip. Something to think about, if needed for sushy-static, but I'm on a mission to quickly test my code. For sushy so far it was sufficient to test the code in unit tests, but for unit tests I need to know what to mock; this time I really don't know the details of requests returning a file in a response, and to be 100% sure I need to test that it works with some web server. nginx comes to mind, and I set it up to serve static files on my computer. This is the first time I'm doing this on Fedora (I usually use Ubuntu), so it turns out that SELinux was blocking my choice of port, 2000, and I had some issues with permissions, but I did set it up and was able to test my changes and decide how to mock the HTTP response.
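(For reference, the usual fix for this kind of SELinux port block on Fedora is to label the non-standard port for the web server; the exact commands aren't in the original post, but something along these lines generally works:)

sudo semanage port -a -t http_port_t -p tcp 2000
sudo systemctl restart nginx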

Code reviews

Apart from working on these changes, I'm doing code reviews, in both directions: responding to comments on my patches and leaving feedback on other patches. Since the previous blog post, one of my patches - emulating BIOS in sushy-emulator - got merged \o/, but code review for Ethernet Interface emulation is still in progress. Usually it just sits and waits for attention from reviewers, as I try to respond as soon as I can to any comments while avoiding context switches when I'm working on new patches. Usually this means that I start or end my days with code reviews. Responding to code reviews can take a lot of time; I think there have been days where that's all I do. Sometimes there are questions in code reviews to which I don't know the best answer, and I have to do my own research or try things out. Another thing that re-occurs in the code reviews I get are comments about code which I have written in the same style or pattern as already existing code. If I address such a comment in my patch, then the overall project style will vary and become messy. I don't see that as a good thing: the project style should be the same even if it is 'wrong' or 'bad' (especially as in some cases this is subjective). Probably 'consistent' is one of the most used words in my code review comments. If it is decided that these changes are necessary, then they are done in a follow-up patch and addressed in all parts of the project to preserve consistency. This also applies when I'm reviewing other patches: trying not to let them diverge from the consistency there is in the project. The way I see it, in addition to making the project easier to maintain and the code easier to read, consistency makes contribution easier for newcomers, as the existing code serves as a self-referencing sample of how to implement any new features. E.g. if there were different approaches to the same problem, how would one know which to pick and why?

by ajya at August 04, 2018 07:00 PM

August 03, 2018

SUSE Conversations

SUSE OpenStack Cloud with Nutanix Hyperconverged Deployment

This past May, I had the privilege of attending Nutanix .NEXT in New Orleans and watching the late Anthony Bourdain deliver one of his last speeches. There was a palpable energy at the conference. SUSE met with Nutanix developers, customers, and partners where a simple message was overwhelmingly repeated – “I love Nutanix”. I was […]

The post SUSE OpenStack Cloud with Nutanix Hyperconverged Deployment appeared first on SUSE Communities.

by Masood at August 03, 2018 10:48 PM

OpenStack Superuser

How to deploy Kubernetes on your laptop

Minikube is the quickest way to test Helm charts and other Kubernetes deployment methods locally. It’s described in the official docs as:

[…] a tool that makes it easy to run Kubernetes locally. Minikube runs a single-node Kubernetes cluster inside a VM on your laptop for users looking to try out Kubernetes or develop with it day-to-day.

This tutorial will show you how to get this running on your laptop.

Getting started

First things first: to install Minikube, you’ll need VirtualBox (or another virtualization option) and then follow the instructions for your operating system. It’s a simple Go executable and it can easily be downloaded and run.

For example, on Linux:

curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/

To make use of the mini Kubernetes cluster you’ll also need kubectl, the command-line tool used to deploy and manage applications on Kubernetes. With kubectl, you can inspect cluster resources; create, delete and update components; and look at your new cluster and bring up example apps. The installation instructions are available for most operating systems.
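For example, on Linux the latest stable kubectl binary can be fetched much like Minikube above (this mirrors the upstream install instructions; check the docs for your platform):

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/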

One last thing to install is Helm, the Kubernetes package manager. The instructions for installing Helm are easy to follow. Once Minikube, kubectl, and Helm are installed, start Minikube: you can specify a K8s version and how much RAM to dedicate to the cluster. Then enable the Minikube ingress addon for communication.

$ minikube start --kubernetes-version=v1.9.0 --memory 4096
$ minikube addons enable ingress

Once you’ve got Minikube started, you can run the Helm initialization.

$ helm init --wait

Your tiny cluster is now ready to run.
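As a quick smoke test of the new cluster, you can deploy the standard echoserver sample (a minimal sketch; any small image will do):

$ kubectl run hello-minikube --image=k8s.gcr.io/echoserver:1.10 --port=8080
$ kubectl expose deployment hello-minikube --type=NodePort
$ minikube service hello-minikube --url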

About the author

Stefano Maffulli is currently director of community marketing at Scality where he is leading the efforts to bring Zenko, the open source multi-cloud controller, to developers around the world. Long an open-source advocate, he has previously worked at the OpenStack Foundation and the Free Software Foundation Europe.

The post How to deploy Kubernetes on your laptop appeared first on Superuser.

by Stefano Maffulli at August 03, 2018 02:08 PM

Chris Dent

Placement Update 18-31

This is placement update 18-31, a weekly update of ongoing development related to the OpenStack placement service.

Most Important

We are a week past feature freeze for the Rocky cycle, so finding and fixing bugs through testing and watching launchpad remains the big deal. Progress is also being made on making sure the Reshaper stack (see below) and using consumer generations in the report client are ready as soon as Stein opens.

What's Changed

A fair few bug fixes and refactorings have merged in the past week, thanks to everyone chipping in. The functional differences you might see include:

  • Writing allocations is retried server side up to ten times.
  • Placement functional tests are using some of their own fixtures for output, log, and warning capture. This may lead to different output when tests fail. We should fix issues as they come up.
  • Stats handling in the resource tracker is now per-node, meaning it is both more correct and more efficient.
  • Resource provider generation conflict handling in the report client is much improved.
  • When using force_hosts or force_nodes, limit is not used when doing GET /allocation_candidates.
  • You can no longer use unexpected fields when writing allocations.
  • The install guide has been updated to include instructions about the placement database.

Bugs

Main Themes

Documentation

Now that we are feature frozen we better document all the stuff. And more than likely we'll find some bugs while doing that documenting.

Matt pointed out in response to last week's pupdate that the two bullets that had been listed here are no longer valid because we punted on most of the functionality (fully working shared and nested providers) that needed the docs.

However, that doesn't mean we're in the clear. A good review of existing docs is warranted.

Consumer Generations

These are in place on the placement side. There's pending work on the client side, and a semantic fix on the server side, but neither are going to merge this cycle.

Reshape Provider Trees

Work has restarted on framing in the use of the reshaper from the compute manager. It won't merge for Rocky but we want it ready as soon as Stein opens.

It's all at: https://review.openstack.org/#/q/topic:bp/reshape-provider-tree

Extraction

A lot of test changes were made to prepare for the extraction of placement. Most of the remaining "uses of nova" in placement are things that will need to wait until post-extraction, but it is useful and informative to look at imports as there are some things remaining.

On the PTG etherpad I've proposed that we consider stopping forward feature progress on Placement in Stein so that:

  • We can give nova some time to catch up and find bugs in existing placement features.
  • We can do the extraction and large backlog of refactoring work that we'd like to do.

That is under the list item 'What does it take to declare placement "done"?'

Other

Going to start this list with the 5 that remain from the 11 (nice work!) that were listed last week. After that will be anything else I can find.

End

This is the last one of these I'm going to do for a while. It's less useful at the end and beginning of the cycle when there are often plenty of other resources shaping our attention. Also, I pretty badly need a break and an opportunity to more narrowly focus on fewer things for a while (you can translate that as "get things done rather than tracking things"). Unless someone else would like to pick up the mantle, I expect to pick it back up sometime in September. Ideally someone else would do it. It's been a very useful tool for me, and I hope for others, so it's not my wish that it go away.

by Chris Dent at August 03, 2018 01:40 PM

August 02, 2018

OpenStack Superuser

A look inside the OpenStack Upstream Institute

Calling all beginners and the curious: if you’re considering taking the free training offered by the OpenStack Upstream Institute, a recent packed class offered at OpenStack Days Brazil offers a sneak preview.

Led by OSF staffers Kendall Nelson, upstream developer advocate, and Ildiko Vancsa, ecosystem technical lead, this session was one of a number of summer classes on the road. Community member Claudio Miceli recorded the trainings and put them on YouTube; while nothing beats the training IRL, the videos give you an idea of what to expect. The OUI will be offered next at OpenStack Days Nordics and the Berlin Summit.

The training program shares knowledge about the different ways of contributing to OpenStack, such as providing new features, writing documentation and participating in working groups. Aimed at beginners, the training is led by all-star volunteers from the community. It's broken down into modules, so if you're a developer, a project manager or interested in working groups, you can follow what most interests you.

The educational program is built on the principle of open collaboration and teaches students how to find information and navigate the intricacies of the project's technical tools and social interactions in order to get their contributions accepted. The live one-and-a-half-day class focuses on hands-on practice: students use a development environment to work on real-life bug fixes or new features and learn how to test, prepare and upload them for review.

Upstream Institute attendees are also given the opportunity to join a mentoring program to get further help and guidance on their journey to become an active and successful member of the OpenStack community.

Get involved

If you’re interested in getting involved with Upstream Institute outside of these training events, please check out the weekly meetings.

They happen in #openstack-meeting-3 (IRC webclient)

  • Every two weeks (on odd weeks) on Monday at 20:00 UTC
  • Every two weeks (on even weeks) on Tuesday at 09:00 UTC

You’re also welcome to hang out in the #openstack-upstream-institute channel in between meetings.

Photo // CC BY NC

The post A look inside the OpenStack Upstream Institute appeared first on Superuser.

by Superuser at August 02, 2018 02:07 PM

August 01, 2018

OpenStack Superuser

Why small, diverse teams are key to successful dev-ops

Bringing together software development and operation in the form of dev-ops is often considered the path to innovation. However, if everyone on the team looks at the task of software building from the same perspective, the results can be more cookie-cutter than avant-garde.

“Dev-ops is based on an experimental approach, this implies coming up with new ideas or experimenting with new ideas,” says Alberta Bosco, senior product marketing manager for Puppet. “Creating new solutions to old problems is really hard, especially when you have a team composed of people coming from the same background and same life experience.” Bosco was interviewed in the second episode of Puppet’s “Agility through Diversity,” which explores why this matters and how to change it.

The panel also included Marianne Calder, vice president and managing director for Puppet EMEA, and Kate Self, an apprentice at British Telecom. The most recent edition of Puppet's massive global dev-ops survey has brought other insights.  “What we’ve found is that transformational leadership — the ability to bring people together to drive a vision and a common goal — is absolutely the key to success,” says Calder. “Diversity is really a key cornerstone to that.”

So is dev-ops a good place for women to work? Take the example of apprentice Self, who bypassed university and jumped straight into the work world. She came into the apprenticeship without prior knowledge of coding and has learned everything, including a few languages, on the job, she says.

“This is one area where it changes so much and so fast that you need to just constantly be looking out for opportunities to learn more and what you’re learning wouldn’t necessarily be in a book that you get at university, it would be from your colleagues who are right there,” Calder adds.

If dev-ops is a good fit for diverse talent, there are also a few hurdles to overcome. Calder suggests that people interested in getting started take advantage of the numerous meetups and informal gatherings to connect and share learnings across communities.

Many companies are also resistant to change of any kind — the DNA of dev-ops. Bosco says that small teams can often make more headway than larger ones. “Give them a project and then share the success of this project so people can see the real potential; they have evidence that the hard work they put into this change actually pays off.”

Check out the whole episode on YouTube.

The post Why small, diverse teams are key to successful dev-ops appeared first on Superuser.

by Superuser at August 01, 2018 02:24 PM

SUSE Conversations

SUSE Expert Days: An Opportunity to Learn

In September 2018, SUSE is kicking off another round of SUSE Expert Days!  That means it’s time for the Open Source specialists to go on tour to cities in every continent to share their knowledge and expertise with the community. This is a unique chance to hear from one of SUSE’s special keynote speakers, such as Brent Schroeder—CTO of […]

The post SUSE Expert Days: An Opportunity to Learn appeared first on SUSE Communities.

by bfine at August 01, 2018 08:00 AM

July 31, 2018

OpenStack Superuser

Inside the latest edition of “Learning OpenStack Networking”

One of the core services of OpenStack, networking project Neutron is often cited in user surveys as difficult to detangle.

Author James Denton, principal architect at Rackspace with over 15 years in systems administration, has written four books in the last six years dedicated to OpenStack networking. Denton says that in part the versatility of Neutron — it can support many network technologies and topologies simultaneously — increases the complexity.

While it may never get as easy as properly twirling spaghetti, the latest edition of “Learning OpenStack Networking” provides the fundamentals. He talks to Superuser about what’s new in this edition and what he’s working on next.

Who will this help most?

This book is geared towards OpenStack operators as well as users, and breaks down many of the fundamental concepts presented in OpenStack Networking. For the user, it offers examples of using the command line interface and/or dashboard to accomplish networking-related tasks such as building networks, subnets, routers, floating IPs, load balancers and more. For the operator, the book goes a step further and demonstrates how those objects are implemented behind the scenes.
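To give a flavor of the command-line tasks the book covers, here is a minimal sketch using the standard OpenStack client; the names (demo-net, demo-router, the subnet range and the "public" external network) are illustrative assumptions rather than examples taken from the book.

# Create a network and a subnet, then route it to an external network
openstack network create demo-net
openstack subnet create demo-subnet --network demo-net --subnet-range 192.168.10.0/24
openstack router create demo-router
openstack router set demo-router --external-gateway public
openstack router add subnet demo-router demo-subnet

# Allocate a floating IP from the external network
openstack floating ip create public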

What are the main updates to this edition?

The latest edition has been updated to correspond to the Pike/Queens releases. The installation process walks the reader through a Pike install on Ubuntu 16.04 LTS. However, many of the concepts and examples translate directly to a Queens-based install (and beyond). The upstream install docs are always helpful to install the latest release, and then the book can take over from there. In this edition, I’ve removed VPNaaS and FWaaS content, but have updated the load-balancing-as-a-service chapter to support LBaaS v2 and have added additional content related to RBAC, VLAN-aware VMs, network availability zones and BGP speakers.

What’s the steepest learning curve for people learning about OpenStack networking?

An OpenStack cloud can support so many different network technologies and topologies simultaneously, which in turn increases the complexity of a given environment. The fundamental components of OpenStack Networking are built on concepts that most system administrators and users are familiar with: networks, subnets, routers, etc.

Traditional network administrators understand VLANs, NAT, and routing, but it’s the logical representation of those objects, and how they’re implemented in the virtual and physical network layers, that’s difficult for many people to understand at first.

How have you seen best practices shift over the time you’ve been working with OpenStack?

Over the last few years, most of the popular and/or relevant deployment tools have shifted to container technologies for hosting OpenStack services, which makes deployments, upgrades, and maintenance easier to perform. Ansible has also become the tool of choice for configuration management. The consolidation of tool sets and deployment methods will only help strengthen adoption and advancement, in my opinion.

Why is a book helpful now — there’s IRC, mailing lists, documentation, your video tutorials etc.?

The internet is a wonderful source of information on OpenStack and Neutron but is also a difficult place to navigate. I often turn to IRC and mailing lists for my own issues. For people new to the community or the project, though, those mediums may seem unapproachable. The book uses a single voice to provide the reader with a solid foundation in OpenStack networking concepts, and builds upon that foundation with every chapter. For quick reference, it’s a great start. When people feel comfortable with the basics, reaching out to the community is less intimidating.

What’s next for you?

In the future, I hope to become more involved with upstream development on OpenStack-Ansible in order to incorporate support for projects such as Tungsten Fabric, OVN, Cisco ACI, and more. As I gain experience with each of those projects and discover how Neutron can be extended to support those, I hope to blog about it!

The post Inside the latest edition of “Learning OpenStack Networking” appeared first on Superuser.

by Nicole Martinelli at July 31, 2018 02:29 PM

Chris Dent

TC Report 18-31

Welcome to this week's TC Report. Again a slow week. A small number of highlights to report.

Last Thursday there was some discussion of the health of the Trove project and how one of the issues that may have limited its success was the struggle to achieve a sane security model. That and other struggles led to lots of downstream forking and variance, which complicates presenting a useful tool.

On Monday there was talk about the nature of the PTL role and whether it needs to change somewhat to help break down the silos between projects and curtail burnout. This was initially prompted by some concern that PTL nominations were lagging. As usual, there were many last minute nominations.

The volume of work that continues to consolidate on individuals is concerning. We must figure out how to let some things drop. This is an area where the TC must demonstrate some leadership, but it's very unclear at this point how to change things.

Based on this message from Thierry on a slightly longer Stein cycle, the idea that the first PTG in 2019 is going to be co-located with the Summit is, if not definite, nearly so. There's more on that in the second paragraph of the Vancouver Summit Joint Leadership Meeting Update.

If you have issues that you would like the TC to discuss—or to discuss with the TC—at the PTG coming in September, please add to the planning etherpad.

by Chris Dent at July 31, 2018 11:07 AM

RDO

Community Blog Round Up: 31 July

One last happy birthday to OpenStack before we get ready to wrap up Rocky and prep for OpenStack PTG in Denver Colorado. Mary makes us drool over cupcakes, Carlos asks for our vote for his TripleO presentations, and Assaf dives into tenant, provider, and external neutron networks!

Happy Birthday OpenStack from SF Bay Area Open Infra by Mary Thengvall

I love birthday celebrations! They’re so full of joy and reminiscing of years gone by. Discussions of “I knew her when” or “Remember when he… ?” They have a tendency to bring separate communities of people together in unique and fun ways. And we all know how passionate I am about communities…

Read more at https://blogs.rdoproject.org/2018/07/happy-birthday-openstack-from-sf-bay-area-open-infra/

Vote for the OpenStack Berlin Summit presentations! by Carlos Camacho

I pushed some presentations for this year's OpenStack Summit in Berlin; the presentations are related to updates, upgrades, backups, failures and restores.

Read more at https://www.anstack.com/blog/2018/07/24/openstack-berlin-summit-vote-for-presentations.html

Tenant, Provider and External Neutron Networks by assafmuller

To this day I see confusion surrounding the terms: Tenant, provider and external networks. No doubt countless words have been spent trying to tease apart these concepts, so I thought that it’d be a good use of my time to write 470 more.

Read more at https://assafmuller.com/2018/07/23/tenant-provider-and-external-neutron-networks/

by Rain Leander at July 31, 2018 09:57 AM

July 30, 2018

Andy Smith

Configuring Hybrid Messaging for TripleO

TripleO Messaging

The OpenStack oslo.messaging library provides RPC and Notification messaging communication patterns for control plane services. The RPC interface is used for interactive invocation and control of services while the Notification interface provides a pub-sub pattern that can be used for event generation and updates.

During the tripleo rocky release, the configuration of the oslo.messaging services has been updated. The update enables the separation of the messaging backends used for RPC and Notifications and also enables the use of different messaging backend server types (e.g. in addition to rabbitmq). The deployment configurations that can be defined include the following:

  • The standard deployment of a shared rabbitmq server messaging backend (e.g. cluster) that is used for both RPC and Notification communications
  • A hybrid deployment of an apache dispatch router messaging backend (qdrouterd) for RPC communications (e.g. via the oslo.messaging AMQP 1.0 driver) and a rabbitmq server messaging backend for Notification communications

Oslo.Messaging Services

Since the mitaka release, oslo.messaging has supported the deployment of separate messaging backends for RPC and Notification communications. To support separate messaging backends in tripleo, oslo.messaging services for RPC and Notify were introduced in place of the standalone rabbitmq service. This layering allows the RPC and Notification messaging services to be separately controlled and configured.

Standard Deployment of Single RabbitMQ Server Backend

The standard deployment of a single rabbitmq server backend (e.g. cluster) is retained by mapping both oslo.messaging RPC and Notify services to the same messaging transport.

OS::TripleO::Services::OsloMessagingRpc: docker/services/messaging/rpc-rabbitmq.yaml
OS::TripleO::Services::OsloMessagingNotify: docker/services/messaging/notify-rabbitmq-shared.yaml

An examination of these services reveals that rpc-rabbitmq.yaml instantiates the rabbitmq server role while notify-rabbitmq-shared.yaml maps the oslo.messaging global configuration setting for Notify to the same rabbitmq server backend that is used for RPC communications.

During the deployment, the parameters used for the configuration of the oslo.messaging RPC and Notification services are used to generate the messaging transport configuration defined for each service. An example of this can be seen in the puppet-tripleo base profile for nova.

class { '::nova':
  default_transport_url      => os_transport_url({
    'transport' => $oslomsg_rpc_proto,
    'hosts'     => $oslomsg_rpc_hosts,
    'port'      => $oslomsg_rpc_port,
    'username'  => $oslomsg_rpc_username,
    'password'  => $oslomsg_rpc_password,
    'ssl'       => $oslomsg_rpc_use_ssl_real,
  }),
  notification_transport_url => os_transport_url({
    'transport' => $oslomsg_notify_proto,
    'hosts'     => $oslomsg_notify_hosts,
    'port'      => $oslomsg_notify_port,
    'username'  => $oslomsg_notify_username,
    'password'  => $oslomsg_notify_password,
    'ssl'       => $oslomsg_notify_use_ssl_real,
  }),
}

Following deployment, an example configuration of the oslo.messaging services can be checked in the /etc/nova/nova.conf configuration file. In it, you will see that the same transport is used for RPC and Notifications to the shared rabbitmq server backend.

[DEFAULT]
transport_url=rabbit://guest:secrete@host1.internalapi.localdomain:5672/?ssl=0
.
.
. 
[oslo_messaging_notifications]
transport_url=rabbit://guest:secrete@host1.internalapi.localdomain:5672/?ssl=0

Hybrid Messaging Deployment

The deployment of two independent messaging backends is realized by mapping the oslo.messaging RPC and Notification transports to different messaging systems. For this deployment, the apache dispatch router (qdrouterd) is used as the RPC messaging backend in place of the rabbitmq server. The qdrouterd server is integrated with the oslo.messaging AMQP 1.0 driver and provides direct messaging capabilities for the RPC communications. The rabbitmq server is used as the messaging backend for the Notification communications where the broker storage functionality is necessary.

This hybrid messaging example environment in tripleo enables the dual messaging backend deployment for RPC and Notifications:

# *******************************************************************
# This file was created automatically by the sample environment
# generator. Developers should use `tox -e genconfig` to update it.
# Users are recommended to make changes to a copy of the file instead
# of the original, if any customizations are needed.
# *******************************************************************
# title: Hybrid qdrouterd for rpc and rabbitmq for notify messaging backend
# description: |
#   Include this environment to enable hybrid messaging backends for
#   oslo.messaging rpc and notification services
parameter_defaults:
  # The network port for messaging Notify backend
  # Type: number
  NotifyPort: 5672

  # The network port for messaging backend
  # Type: number
  RpcPort: 31459

resource_registry:
  OS::TripleO::Services::OsloMessagingNotify: ../../docker/services/messaging/notify-rabbitmq.yaml
  OS::TripleO::Services::OsloMessagingRpc: ../../docker/services/messaging/rpc-qdrouterd.yaml

An examination of these services reveals that the rpc-qdrouterd.yaml instantiates the qdrouterd server role to provide RPC communications via direct messaging while notify-rabbitmq.yaml instantiates the rabbitmq server role to provide Notify communications via the broker’s queues.

Following deployment, the resulting configuration of the oslo.messaging services can be checked in the /etc/nova/nova.conf configuration file. For the deployment, the RPC transport is defined to use the AMQP 1.0 driver with the qdrouterd messaging backend and the Notification transport is defined to use the rabbit driver with the corresponding rabbitmq messaging backend.

[DEFAULT]
transport_url=amqp://guest:secrete@cphost.internalapi.localdomain:31459/?ssl=0
.
.
.
[oslo_messaging_notifications]
transport_url=rabbit://guest:secrete@cphost.internalapi.localdomain:5672/?ssl=0

Add the following arguments to your openstack overcloud deploy command to deploy with separate messaging backends:

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/messaging/rpc-qdrouterd-notify-rabbitmq-hybrid.yaml

Summary

The configuration of oslo.messaging services has been updated in the rocky release. These updates simplify the deployment of the messaging backend systems so that operators can begin to understand hybrid messaging and evaluate its performance and scalability benefits.

by anyqp at July 30, 2018 08:37 PM

OpenStack Superuser

The Certified OpenStack Administrator exam: Why it matters for your career and how to pass it

Cloud skills continue to be top-of-mind for recruiters.

If you’ve ever asked yourself how you can prove your skills to potential employers, a simple way is to take a certification course such as the Certified OpenStack Administrator (COA) exam, offered by the OpenStack Foundation.

Now, why would this certification help you prove your skills? Because it’s a hands-on exam, test takers are exposed to an actual OpenStack environment where they demonstrate skills and knowledge to perform tasks that an administrator might need to accomplish on a daily basis.

The exam is intended for OpenStack Administrators with six months of hands-on experience and, as mentioned, the exam objectives cover tasks that you might perform on a regular basis. So, for example, under the Identity Management objective for the exam you would find ‘Manage/Create domains, groups, projects, users, and roles’. That means this exam objective might test your ability to create a user and assign a role to that user or perhaps manage an existing user depending on what the related exam question asks you to do.
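As a hedged illustration of what such a task could look like from the command line (the user, project and role names here are placeholders, not actual exam content, and the default role name can vary by deployment and release):

# Create a user in the default domain and grant it a role on a project
openstack user create --domain default --password-prompt jdoe
openstack role add --project demo --user jdoe _member_

# Verify the assignment
openstack role assignment list --user jdoe --project demo --names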

Once you’ve decided taking the COA is right for you, what do you need to know? Well, the exam is currently offered on the Newton release of OpenStack and can be taken on either Ubuntu or SUSE. The exam lasts 2.5 hours and can be taken from anywhere on a laptop or desktop with a microphone, speakers and a webcam. In addition, you’ll need to use either the Chrome or Chromium browser to take the exam and, of course, have sufficient internet access. You’ll also want to make sure you know how to navigate documentation and consider whether you’re generally more proficient using the dashboard or command-line interface.

For complete information on the COA including its objectives and requirements, see http://www.openstack.org/coa.

The COA currently has a pass rate of 60 percent. If you’re worried about not knowing enough about OpenStack, you can visit the Training section of the OpenStack Marketplace and find partners that provide online or in-person training at http://www.openstack.org/marketplace/training.

If you’d like more tips, check out the talk I gave with Gianpietro Lavado at the Summit Vancouver. Given in English and Spanish, you can also download the slide deck.

Photo // CC BY NC

The post The Certified OpenStack Administrator exam: Why it matters for your career and how to pass it appeared first on Superuser.

by Amy Marrich at July 30, 2018 04:02 PM

RDO

Happy Birthday OpenStack from SF Bay Area Open Infra

I love birthday celebrations! They’re so full of joy and reminiscing of years gone by. Discussions of “I knew her when” or “Remember when he… ?” They have a tendency to bring separate communities of people together in unique and fun ways. And we all know how passionate I am about communities…

So when Rain Leander suggested that I attend the SF Bay Area celebration of OpenStack’s 8th birthday as one of my last tasks as interim community lead for RDO, I jumped at the chance! Celebrating a birthday AND getting to know this community better, as well as reuniting with friends I already knew? Sign me up!

I arrived at the event in time to listen to a thought-provoking panel led by Lisa-Marie Namphy, Developer Advocate, community architect, and open source software specialist at Portworx. She spoke with Michael Richmond (NIO), Tony Campbell (Red Hat) and Robert Starmer (Kumulus Tech) about Kubernetes in the Real World.

Lew Tucker, CTO of Cisco, spoke next, and said one of my favorite quotes of the night:

Cloud computing has won… and it’s multiple clouds.

My brain instantly jumped to wondering about the impact that community has had on the fact that it’s not a particular company that has won in this new stage of technology, but a concept.

Dinner and mingling with OpenStack community friends new and old was up next, followed by an awesome recap of how the OpenStack community has grown over the last 8 years.

While 8 years in the grand scheme of history doesn’t seem like much, in the Open Source world, it’s a big accomplishment. The fact that OpenStack is up to 89,000+ community members in 182 countries and supported by 672 organizations is a huge feat and one that deserves to be celebrated!

Speaking of celebrating… we at RDO express our appreciation and love for community through sharing food (Rocky Road ice cream, anyone?) and this celebration was no exception. We provided the best (and cutest) mini cupcakes that I’ve ever had. The Oreo cupcake with cookie frosting gets two thumbs up in my book!

The night ended with smiles and promises of another great year to come.

Here’s to the next 8 years of fostering and building communities, moving the industry forward, and enjoying the general awesomeness that is OpenStack.

by Mary Thengvall at July 30, 2018 12:14 PM

July 28, 2018

Benjamin Kerensa

Remembering Gerv Markham

Gervase Markham (cc by sa didytile)

Gerv Markham, a friend and mentor to many in the Mozilla community, passed away last night surrounded by his family.

 

Gerv worked at Mozilla for many years working in a variety of capacities including being a lead developer of Bugzilla and most recently working on special projects under the Mozilla Chairwoman.

 

I had the pleasure of working with Gerv in the Thunderbird community and most recently on the MOSS Grants Committee as one of the inaugural members. Between these two areas, I often sought Gerv’s mentoring and advice, as he always had wisdom to share.

 

Anyone who has been intimately involved with the Mozilla project likely engaged Gerv from time to time; although much of his work was behind the scenes, it was nonetheless important work.

 

I think it goes without saying Gerv had a significant impact on the open web through his contributions to Bugzilla and various projects that moved the open web forward and he championed the values of the Mozilla manifesto. All of us who knew him and got the opportunity to collaborate were rewarded with a good friend and valuable wisdom that will be missed.

 

Thanks, Gerv, for being a friend of Mozilla and the open web. You will surely be missed.

by Benjamin Kerensa at July 28, 2018 10:44 PM

July 27, 2018

OpenStack Superuser

How to deploy Windows on OpenStack

In this tutorial, I’ll show you how to run Windows Server 2016 on OpenStack.

Preparation

Download a Windows 2016 Server ISO image from Microsoft (requires registration).
Download the Fedora VirtIO drivers. You can find more options here.
In this example we will use VirtualBox, although you can also use KVM on Linux for this.

Create a VM inside VirtualBox

  • Choose a name; as a type, choose Microsoft Windows, and as a version choose Other Windows (64-bit).
  • Assign the VM a minimum of 2GB memory.
  • Create a virtual disk with a minimum disk space of 15GB (the standard 20GB is fine); as a type, choose QCOW.

When the VM is created, do not start it yet. We first have to fine-tune certain things in the settings screen:

  • System > processor: Add a second CPU, this saves us time
  • Ports: Enable the first serial port (COM1), this is for logging and debugging purposes
  • Storage: Change the first CD-ROM to primary slave
  • Storage: Connect the first CD-ROM to the Windows ISO you’ve downloaded
  • Storage: Add a second CD-ROM (secondary master) to the existing IDE controller and connect the virtio driver disc to this CD-ROM
  • Audio: Disable audio (unless you like to use it)
  • Network: Change the network to bridge mode, this way you can access it with RDP
  • Network: Change the network type to virtio-net

Now you have a VM you can boot.

The two ISOs now have the correct start-up sequence; after booting, your VM will start to install from the ISO.

Installing Windows

After booting the VM, the installation of Windows will start automatically.

Windows wants to be aware of some regional settings, after which it will ask which version of Windows to install.

  • In this example we will choose Windows Server 2016 Datacenter Evaluation (Desktop Experience)
  • Read the license terms, if you want to continue, you will have to accept them
  • After this, choose Custom installation

Installing Drivers

Now you have the option to install the viostor SCSI drivers. These are necessary, even though we are using the IDE controller of VirtualBox at this point.

To see the drivers, navigate to e:\viostor\2k16\amd64\, and remove the checkbox so you can see everything.

Choose the disc and let the installer do its job.

Finally, Windows will ask for a passphrase, after that the installation will be finished. Because you want to create an image, you now have to edit your own settings. There are many manuals available on rules about what you can and cannot edit, and why. We will limit ourselves to the necessary settings.

  • Log in (and leave Windows for now; Server Manager will start alongside a few background processes)

What stands out is that the network isn't working; this is because the drivers have not been installed yet.

  • Navigate to E:\NetKVM\2k16\amd64 and install netkvm.inf (select the file and press right mouse button)
  • Next, install the IO drivers E:\viostor\2k16\amd64\viostor.inf (select the file and press right mouse button)

After this, the NIC will work. We will need it in a while, but for now we know for sure that the necessary IO drivers are installed.

Remote Desktop

To be able to use RDP later on, which makes maintenance easier, we have to change two things:

  • Start Powershell to open the firewall for RDP:
    powershell
    
    Enable-NetFirewallRule -name RemoteDesktop-UserMode-In-TCP
    
  • Click on Start > Settings, search for remote desktop, and choose allow remote desktop access to your computer.
  • In the popup, choose Allow remote connections to this computer and click OK to close the popup.
  • If you are using an unofficial or older RDP client, remove the checkbox below.
  • Click on Apply and press OK.

You can now test the RDP connection, which should work by now.

Cloud-Init

To be able to use Cloud-Init (we need this to, for example, be able to set an admin passphrase when deploying) we have to configure this:

powershell
Set-ExecutionPolicy Unrestricted

After this, download and install Cloud-Init:

Invoke-WebRequest -UseBasicParsing https://cloudbase.it/downloads/CloudbaseInitSetup_Stable_x64.msi -OutFile cloudbaseinit.msi

The download is approximately 40MB

./cloudbaseinit.msi

  • Click Next
  • To continue, you will have to read and accept the license terms
  • Click Next
  • The default settings will be fine, again click Next
  • Leave everything as it is except for the serial port for logging; set that to COM1
  • Next…
  • Install

When the installation is finished, select both run Sysprep and Shutdown when Sysprep terminates.

If you click Finish the Windows installation will be prepared for use as an image and the VM will be closed.

Now you have a Windows image that is switched off, based upon a qcow file which we can upload to OpenStack.

Uploading the Image

When the upload speed of your internet connection isn’t that high (because Windows images are mostly around 10GB) I suggest you use the OpenStack CLI tools for this (see below).

Option 1: Upload with an OpenStack Dashboard

  • Go to an OpenStack Dashboard
  • Login
  • On the left, choose ‘Images’
  • Create Image
  • Enter a name and a description
  • Image Source > Image File and select the qcow image you have created. The location can be found in VirtualBox. In my case it’s a qcow of nearly 13GB.
  • Change the format to Qcow2
  • Architecture: x86_64
  • Minimum Disk: 20
  • Minimum RAM: 2048
  • Create Image

Now, the web-interface will upload the entire image, after which it will be processed and put inside the list of available images.

Option 2: Upload with OpenStack CLI tools

Make sure that the OpenStack CLI tools are installed on your system. If not, you can follow our CLI tutorials.

To be able to create a new image you have to enter the following command in the terminal:

openstack image create "imagename" --disk-format qcow2 --min-ram 2048 --min-disk 20 --file /path/to/image/imagename.qcow

If uploading takes longer than an hour you will get a 401 error; this happens because your token has expired in the meantime. You can ignore this error. Enter the following command to make sure your image has been added to the list of available images in OpenStack:

openstack image list

If you don’t see the new image in the list, try to create it once more.

Starting Windows

To launch a Windows instance now, you have to follow the normal procedure except for two differences:

  • Go to the Fuga Dashboard
  • Login
  • On the left, choose Access & Security. Click on Security Groups and add a security group with RDP access for the IP you’re connecting from (Go to this website to retrieve your IP). If you do not have this yet, you can also add it later and connect the security group to your instance afterwards.
  • Create a new instance in the Instances panel. To be able to log in to Windows later, you need a passphrase. The passphrase we set earlier does not work anymore. Click on the Metadata tab to add a new custom key admin_pass, after which you can assign a value, which is shown in plain text. This is a bit annoying, but you have to change the passphrase again after the initial login anyway. (A CLI sketch of both steps follows this list.)
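For those who prefer the command line, a rough equivalent of the two steps above might look like this; the security group, network, flavor and image names, the example IP and the passphrase are all illustrative assumptions:

# Security group that only allows RDP (TCP 3389) from your own IP
openstack security group create rdp-access
openstack security group rule create rdp-access --protocol tcp --dst-port 3389 --remote-ip 203.0.113.10/32

# Launch the instance with the admin_pass metadata key set
openstack server create --image "imagename" --flavor m1.medium --network my-network --security-group rdp-access --property admin_pass='TempPassphrase123!' windows-2016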

Sources

  1. https://docs.openstack.org/image-guide/windows-image.html

This post, written by Yuri Sijtema, was first published on Fuga Academy, the learning branch of Fuga Cloud, a self-service, on-demand, cloud service based in the Netherlands.

Photo // CC BY NC

The post How to deploy Windows on OpenStack appeared first on Superuser.

by Superuser at July 27, 2018 04:06 PM

ICCLab

Experience using Kolla Ansible to upgrade Openstack from Ocata to Queens

We made a decision to use Kolla-Ansible for Openstack management approximately a year ago and we’ve just gone through the process of upgrading from Ocata to Pike to Queens. Here we provide a few notes on the experience.

By way of some context: our system is a moderate sized system with 3 storage nodes, 7 compute nodes and 3 controllers configured in HA. Our systems were running CentOS 7.5 with a 17.05.0-ce docker engine and we were using the centos-binary Kolla containers. Being an academic institution, usage of our system peaks during term time – performing the upgrade during the summer meant that system utilization was modest. As we are lucky enough to have tolerant users, we were not excessively concerned with ensuring minimal system downtime.

We had done some homework on some test systems in different configurations and had obtained some confidence with the Kolla-Ansible Ocata-Pike-Queens upgrade – we even managed to ‘upgrade’ from a set of centos containers to ubuntu containers without problem. We had also done an upgrade on a smaller, newer system which is in use and it went smoothly. However, we still had a little apprehension when performing the upgrade on the larger system.

In general, we found Kolla Ansible good and we were able to perform the upgrade without too much difficulty. However, it is not an entirely hands-off operation and it did require some intervention for which good knowledge of both Openstack and Kolla was necessary.

Our workflow was straightforward, comprising the following three stages:

  • generate the three configuration files passwords.yml, globals.yml and multinode.ha,
  • pull down all containers to the nodes using kolla-ansible pull
  • perform the upgrade using kolla-ansible upgrade.

We generated the globals.yml and passwords.yml config files by copying the empty config files from the appropriate kolla-ansible git branch to our /etc/kolla directory, comparing them with the files used in the previous deploy and copying changes from the previous versions into the new config file. We used the approach described here to generate the correct passwords.yml file.
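For reference, the documented approach boils down to something like the following sketch; the path of the kolla-ansible checkout and the temporary file name are assumptions for illustration:

# Copy the new (empty) passwords template from the target release branch
cp /opt/kolla-ansible/etc/kolla/passwords.yml /etc/kolla/passwords.yml.new

# Generate fresh values for any newly added keys
kolla-genpwd -p /etc/kolla/passwords.yml.new

# Merge the existing passwords with the newly generated file
kolla-mergepwd --old /etc/kolla/passwords.yml --new /etc/kolla/passwords.yml.new --final /etc/kolla/passwords.yml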

Pulling appropriate containers to all nodes was straightforward:

/opt/kolla-ansible/tools/kolla-ansible \
    -i /etc/kolla/multinode.ha pull

It can take a bit of time, but it’s sensible as it does not have any impact on the operational system and reduces the amount of downtime when upgrading.

We were then ready to perform the deployment. Rather than run the system through the entire upgrade process, we chose a more conservative approach in which we upgraded a single service at a time: this was to maintain a little more control over the process and to enable us to check that each service was operating correctly after upgrade. We performed this using commands such as:

/opt/kolla-ansible/tools/kolla-ansible \
    -i /etc/kolla/multinode.ha --tags "haproxy" upgrade

We stepped through the services in the same order as listed in the main Kolla-Ansible playbook, deploying the services one by one.

The two services that we were most concerned about were those pertaining to data storage, naturally: mariadb and ceph. We were quite confident that the other processes should not cause significant problems as they do not retain much important state.

Before we started…

We had some initial problems with docker python libraries installed on all of our nodes. The variant of the docker python library available via standard CentOS repos is too old. We had to resort to pip to install a new docker python library which worked with newer versions of Kolla-Ansible.
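A rough sketch of the workaround we used (the package name is for CentOS 7; whether you need to pin a specific version depends on the Kolla-Ansible release, so check its requirements first):

# Remove the outdated distro package and install the python docker library from pip
yum remove -y python-docker-py
pip install -U docker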

Ocata-Pike Upgrade

Deploying all the services for the Ocata-Pike upgrade was straightforward: we just ran through each of the services in turn and there were no specific issues. When performing some final testing, however, the compute nodes were unable to schedule new VMs as neutron was unable to attach a VIF to the OVS bridge. We had seen this issue before and we knew that putting the compute nodes through a boot cycle solves it – not a very clean approach, but it worked.

Pike-Queens Upgrade

The Pike-Queens upgrade was more complex and we encountered issues that we had not specifically seen documented anywhere. The issues were the following:

    • the mariadb upgrade failed – when the slave instances were restarted, they did not join the mariadb cluster and we ended up with a cluster with 0 nodes in the ‘JOINED’ state. The master node also ended up in an inoperable state.
      • We solved this using the well documented approach to bootstrapping a mariadb cluster – we have our own variant of it for the kolla mariadb containers, which is essentially a replica of the mariadb_recovery functionality provided by kolla
      • This did involve a syncing process of replicating all data from the bootstrap node to each of the slave nodes; in our case, this took 10 minutes
    • when the mariadb database synced and reached quorum, we noticed many errors associated with record field types in the logs – for this upgrade, it was necessary to perform a mysql_upgrade, which we had not seen documented anywhere (sketched after this list)
    • the ceph upgrade process was remarkably painless, especially given that this involved a transition from Ceph Jewel to Ceph Luminous. We did have the following small issues to deal with
      • We had to modify the configuration of the ceph cluster using ceph osd require-osd-release luminous
      • We had one small issue that the cluster was in the HEALTH_WARN status as one application did not have an appropriate tag – this was easily fixed using ceph osd pool application enable {pool-name} {application-name}
      • for reasons that are not clear to us, Luminous considered the status of the cluster to be somewhat suboptimal and moved over 50% of the objects in the cluster; Jewel had given no indication that a large amount of the cluster data needed to be moved
    • Upgrading the object store rendered it unusable: in this upgrade, the user which authenticates against keystone with privilege to manage user data for the object store changed from admin to ceph_rgw. However, this user was not added to keystone and all requests to the object store failed. Adding this user to keystone and giving it appropriate access to the service project fixed the issue (also sketched after this list).
      • This was due to a change that was introduced in the Ocata release after we had performed our deployment and it only became visible to us after we performed the upgrade.
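For reference, a rough sketch of the two manual fixes mentioned above. The container name and the role follow common Kolla defaults, but the password handling, project and role are assumptions; adapt them to your deployment:

# Run mysql_upgrade inside the kolla mariadb container on the bootstrap/master node
docker exec -it mariadb mysql_upgrade -u root -p'<database_password from /etc/kolla/passwords.yml>'

# Recreate the object store's keystone user expected after the upgrade
openstack user create --project service --password '<ceph_rgw password>' ceph_rgw
openstack role add --user ceph_rgw --project service admin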

Apart from those issues, everything worked fine; we did note that the nova database upgrade/migration in the Pike-Queens cycle did take quite a long time (about 10 minutes) for our small cluster – for a very large configuration, it may be necessary to monitor this more closely.

Final remarks…

The Kolla-Ansible upgrade process worked well for our modest deployment and we are happy to recommend it as an Openstack management tool for environments of such scale with quite standard configurations, although even with an advanced tool such as Kolla-Ansible, it is essential to have a good understanding of both Openstack and Kolla before depending on it in a production system.

by murp at July 27, 2018 01:54 PM

Chris Dent

Placement Update 18-30

This is placement update 18-30, a weekly update of ongoing development related to the OpenStack placement service.

Most Important

This week is feature freeze for the Rocky cycle, so the important stuff is watching already approved code to make sure it actually merges, bug fixes and testing.

What's Changed

At yesterday's meeting it was decided the pending work on the /reshaper will be punted to early Stein. Though the API level is nearly ready, the code that exercises it from the nova side is very new and the calculus of confidence, review bandwidth and gate slowness works against doing an FFE. Some references:

Meanwhile, pending work to get the report client using consumer generations is also on hold:

As far as I understand it no progress has been made on "Effectively managing nested and shared resource providers when managing allocations (such as in migrations)."

Some functionality has merged recently:

  • Several changes to make the placement functional tests more placement oriented (use placement context, not be based on nova.test.TestCase).
  • Add 'nova-manage placement sync_aggregates'
  • Consumer generation is being used in the heal allocations CLI (both new nova-manage commands are sketched after this list)
  • Allocations schema no longer allows extra fields
  • The report client is more robust about checking and retrying provider generations.
  • If force_hosts or force_nodes is being used, don't set a limit when requesting allocation candidates.
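For operators who want to try the two new nova-manage commands mentioned above, a minimal sketch; both read credentials from the local nova.conf, so run them from a node where nova-manage is configured (and check --help on your release for supported options):

# Mirror nova host aggregates to placement aggregates
nova-manage placement sync_aggregates

# Create allocations in placement for instances that are missing them
nova-manage placement heal_allocations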

Questions

I wrote up some analysis of the way the resource tracker talks to placement. It identifies some redundancies. Actually it reinforces that some redundancies we've known about are still there. Fixing some of these things might count as bug fixes. What do you think?

Bugs

Main Themes

Documentation

Now that we are feature frozen we better document all the stuff. And more than likely we'll find some bugs while doing that documenting.

This is a section for reminding us to document all the fun stuff we are enabling. Open areas include:

  • "How to deploy / model shared disk. Seems fairly straight-forward, and we could even maybe create a multi-node ceph job that does this - wouldn't that be awesome?!?!", says an enthusiastic Matt Riedemann.

  • The whens and wheres of re-shaping and VGPUs.

  • Please add more here by responding to this email.

Consumer Generations

These are in place on the placement side. There's pending work on the client side, and a semantic fix on the server side, but neither are going to merge this cycle.

Reshape Provider Trees

On hold, but still in progress as we hope to get it merged as soon as there is an opportunity to do so:

It's all at: https://review.openstack.org/#/q/topic:bp/reshape-provider-tree

Mirror Host Aggregates

The command line tool merged, so this is done. It allows aggregate-based limitation of allocation candidates, a nice little feature that will speed things up for people.

Extraction

I wrote up a second blog post on some of the issues associated with placement extraction. There are several topics on the PTG etherpad related to extraction.

Other

Since we're at feature freeze I'm going to only include things in the list that were already there and that might count as bug fixes or potentially relevant for near term review.

So: 11, down from 29.

End

Lots to review, test, and document.

by Chris Dent at July 27, 2018 01:00 PM

Aptira

The Network is the Computer. Part 2 – Creating the Enablers of Cloud Computing

Aptira: The Network is the Computer

We saw in our last post a massive explosion of computing capabilities during the 1980’s that was not matched by networking capabilities, leading to sub-optimal, fragmented, and support intensive solutions. 

The grand vision of highly integrated distributed computing remained but seemed further from the promise than ever. 

Around 1990, we saw three major innovations emerge that began to solve these problems: new networking capabilities to match compute-node innovation, Server Virtualisation, and the beginnings of “open hardware”. By the end of the decade, we will be on the doorstep of the Cloud. 

Firstly, network innovation, specifically the emergence of the Internet and the deregulation of the Telecommunications marketplace. 

Over more than two decades a series of large scale research projects emerged amongst government agencies and academic centres, e.g. ARPANET, with the US Defense Department the centre of funding and governance. These projects were instrumental in the emergence of TCP/IP as the de facto protocols for commodity networking services. Limited commercial use of networks began in 1989. 

In 1991, the US Government removed itself from the governance role and contracted out to private industry the responsibilities of managing the two main address spaces (IP Addresses and Domain Names). This enabled rapid evolution of the commercial uses for these networks, and the build-out of private infrastructure. Internet governance evolved rapidly during the mid-90’s which saw the rapid emergence of Internet Service Providers (ISP’s) across the world, giving the market a simple model for flexible and ubiquitous connectivity. 

Then came communications marketplace deregulation, starting with the US ‘Telecommunications Act’ of 1996, which enabled the accelerated build-out of internet infrastructure, culminating in the explosive growth of the dot-com boom. Internet infrastructure and private networks exploded in size, geography and capacity, and after decades of plunging compute costs, network costs began to drive downwards as well. 

The second major innovation was commodity Server Virtualisation. Virtualised computing environments were invented in the 1960’s, particularly by IBM, but it wasn’t until 1999 that VMWare launched the first commercially available hypervisor for “commodity” hardware, initially for client workstations, and then in 2001 for servers. 

This ability to create multiple computer environments (full Operating System and application stacks running in their own virtual machine) enabled significant physical server consolidation and data centre resource efficiencies. 

The servers required for a solution could be much more powerful in terms of CPU, disk and memory, and a single physical server could support multiple “functional” server requirements. 

The third major innovation was the birth of a new hardware model.  Blade servers were computing resources stripped out from cabinetry and engineered around a standardised bus in a standard 19” rack mount chassis. This was not “open hardware” in the sense that we know it today, but convergence on common component standards made it significantly easier to build high-density computing infrastructure from multiple vendors. 

Server virtualisation and blade computing together dramatically imploded networking and compute infrastructure. 

With this massive density of relatively cheap computing resource and high-speed connectivity, applications emerged that had truly global reach, for example the major search engines, and Salesforce.com. And with the increased virtualisation of resources, it was possible to conceive of solutions that treated commodity resources in a highly modular way. 

These trends put us at the doorstep of the 21st Century and almost at the birth of Cloud Computing. 

The first known use of the term “Cloud Computing” was in 1997 at Emory University in a talk entitled “Intermediaries in Cloud-Computing”, in which Cloud Computing was described as: 

The new computing paradigm, where the boundaries of computing will be determined by economic rationale, rather than technical limits alone

Professor Ramnath Chellapa, http://www.bus.emory.edu/ram/

How was this vision of Cloud Computing turned into reality? 

We will cover that in the next post. 

The post The Network is the Computer. Part 2 – Creating the Enablers of Cloud Computing appeared first on Aptira.

by Adam Russell at July 27, 2018 02:39 AM

July 26, 2018

Chris Dent

Nova's use of Placement

A year and a half ago I did some analysis on how nova uses placement.

I've repeated some of that analysis today and here's a brief summary of the results. Note that I don't present this because I'm concerned about load on placement; we've demonstrated that placement scales pretty well. Rather, this analysis indicates that the compute node is doing redundant work which we'd prefer not to do. The compute node can't scale horizontally in the same way placement does. If offloading the work to placement and being redundant is the easiest way to avoid work on the compute node, let's do that, but that doesn't seem to be quite what's happening here.

Nova uses placement mainly from two places:

  • The nova-compute nodes report resource provider and inventory to placement and make sure that the placement view of what hardware is present is accurate.

  • The nova-scheduler processes request allocation candidates from placement, and claim resources by writing allocations to placement.

There are some additional interactions, mostly associated with migrations or fixing up unusual edge cases. Since those things are rare they are sort of noise in this discussion, so left out.

When a basic (where basic means no nested resource providers) compute node starts up it POSTs to create a resource provider and then PUTs to set the inventory. After that a periodic job runs, usually every 60 seconds. In that job we see the following 11 requests:

GET /placement/resource_providers?in_tree=82fffbc6-572b-4db0-b044-c47e34b27ec6
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/aggregates
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/traits
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/allocations
GET /placement/resource_providers?in_tree=82fffbc6-572b-4db0-b044-c47e34b27ec6
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/aggregates
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/traits
GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories

A year and a half ago it was 5 requests per-cycle, but they were different requests:

GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/aggregates
GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/inventories
GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/allocations
GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/aggregates
GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/inventories

The difference comes from two changes:

  • We no longer confirm allocations on the compute node.
  • We now have things called ProviderTrees which are responsible for managing nested providers, aggregates and traits in a unified fashion.

It appears, however, that we have some redundancies. We get inventories 4 times; aggregates, providers and traits 2 times, and allocations once.

The in_tree calls happen from the report client method _get_providers_in_tree which is called by _ensure_resource_provider which can be called from multiple places, but in this case is being called both times from get_provider_tree_and_ensure_root, which is also responsible for two of the inventory requests.

get_provider_tree_and_ensure_root is called by _update in the resource tracker.

_update is called by both _init_compute_node and _update_available_resource, every single periodic job iteration. _init_compute_node is called from _update_available_resource itself.

That accounts for the overall doubling.

The two inventory calls per group come from the following, in get_provider_tree_and_ensure_root:

  1. _ensure_resource_provider in the report client calls _refresh_and_get_inventory for every provider in the tree (the result of the in_tree query)

  2. Immediately after the call to _ensure_resource_provider, every provider in the provider tree (from self._provider_tree.get_provider_uuids()) then has a _refresh_and_get_inventory call made.

In a non-sharing, non-nested scenario (such as a single node devstack, which is where I'm running this analysis) these are exactly the same single resource provider. I'm insufficiently aware of what might be in the provider tree in more complex situations to be clear on what could be done to limit redundancy here, but it's a place worth looking.

The requests for aggregates and traits happen via _refresh_associations in _ensure_resource_provider.

The single allocation request is from the resource tracker calling _remove_deleted_instances_allocations checking to see if it is possible to clean up any allocations left over from migrations.

Summary/Actions

So what now? There are two avenues for potential investigation:

  1. Each time _update is called it calls get_provider_tree_and_ensure_root. Can one of those be skipped while keeping the rest of _update? Or perhaps it is possible to avoid one of the calls to _update entirely?
  2. Can the way get_provider_tree_and_ensure_root tries to manage inventory twice be rationalized for simple cases?

I've run out of time for now, so this doesn't address the requests that happen once an instance exists. I'll get to that another time.

by Chris Dent at July 26, 2018 04:00 PM

OpenStack Superuser

Why Python is more popular than Kim Kardashian

Programming language Python is about 10 years younger than Kim Kardashian. In this case, age really does come before beauty. One is versatile, predictable and has earned growing popularity. The other, well, not consistently.

Google searches for Python outstripped those for Kim Kardashian in the United States last year. (Though it’s worth noting perhaps that she rallied in popularity with those topless Instagram pics in January.)

Queries for the language, long a favorite with open-source folks, have tripled since 2010 while searches for other programming languages have been flat or declining, according to The Economist. In our own quick search of Google Trends, Python also appears to be more sought after in liberal states than conservative ones.

Python provides the backbone for YouTube, DropBox, Instagram, Reddit, plus organizations like CERN and NASA. Some 286,701 users have contributed 147,123 projects to the “cheese shop,” the nickname for the Python Package Index, a nod to Monty Python that runs through so much of the language. The APIs have been dubbed the best-kept secret of OpenStack, too.

Currently Codecademy’s most requested language, it’s considered a great starter language for kids, marketers and otherwise command-line shy journalists — including this one. Python has also been the most popular beginner programming course at U.S. universities since 2014. You can “Learn it the Hard Way,” “Learn it in One Day” or try out the crowdfunded New York Times bestseller and “Learn Python Visually.” Along the way, Python has also influenced other languages such as Go, Swift and Ruby.

Python recently celebrated almost 30 years of all-purpose programming — and weathered the decision of creator and benevolent dictator Guido van Rossum to step down.

Citing the demise of once-popular programming languages such as Fortran and Basic, The Economist doesn’t go so far as to predict an always shining future for Python.  Then again, as Kim Kardashian once said, “I love when people underestimate me and then become pleasantly surprised.”

Full story over at The Economist

The post Why Python is more popular than Kim Kardashian appeared first on Superuser.

by Nicole Martinelli at July 26, 2018 02:04 PM

Osones

Without DHCP, config-drive to the rescue

Context

For one of our customers, we had to work on an OpenStack cloud infrastructure with a certain number of specificities. We are talking here about the Queens version, the latest stable version to date.

The goal of this platform is to provide Compute instances with direct access to specific hardware (using PCI Passthrough). On the network side, creation and management of networks inside the cloud is not an identified need, therefore we made architecture choices aiming for the simplest setup, for example not deploying the Neutron L3 agent and only using an external network directly connected to the compute node. All this with a number of hosts reduced to the minimum at the beginning: 1 controller and 1 compute node.

This apparent simplicity needs to be put in perspective with a not so minor constraint: the two hosts (controller and compute) are not connected on the same L2 network. Traffic is routed (and is actually transported through VPN).

The deployment tool we use here, OpenStack-Ansible, allows us to quite easily address these constraints: not deploying some components and connecting the two hosts through a routed network. There is still one issue to address though.

Providing IP addresses to instances

The usual mechanism used to provide IP addresses to instances is DHCP. The Neutron DHCP agent is in charge of spawning and configuring a DHCP server (dnsmasq by default) to do the job. Given the network constraints discussed previously, this solution is not easily usable as DHCP works inside an L2 network and cannot cross a router.

Thus we decided not to use DHCP and didn't deploy the Neutron DHCP agent. However, without connectivity, instances have no way to communicate with the metadata API (http://169.254.169.254), which raises a new question. That's why we also decided not to use this service: we disabled the metadata API exposed by Nova and didn't deploy the Neutron metadata agent which is usually in charge of proxying traffic between instances and this API.

So we had two issues to address and find solutions for:

  • Provide IP addresses to instances
  • Expose metadata to instances

Config-drive solution

The config-drive feature provided by Nova, the Compute service, addresses these two issues.

Config-drive replaces the metadata API in the sense that it embeds the information usually exposed by this API in a disk (the "config drive") that is attached to the instance. The cloud-init tool, found in most cloud images, can access this disk and use the data it contains.

Moreover, by enabling the appropriate option, Nova can provide the instance's network information through this metadata. Cloud-init will also use this information to configure the instance.

The force_config_drive option in nova.conf enables the config-drive feature by default without requiring the user to explicitly request it when creating an instance through the Compute API.
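Without that option, a user would have to request the drive explicitly at boot time; a rough example with the OpenStack CLI (image, flavor and network names are illustrative) would be:

openstack server create --config-drive True \
    --image bionic --flavor m1.small --network external myinstance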

Then it's just a matter of enabling the flat_injected option so that Nova generates and includes the network information.
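As a sketch, the relevant excerpt of nova.conf would then look like this (in a stock configuration both options live in the [DEFAULT] section):

    [DEFAULT]
    force_config_drive = True
    flat_injected = True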

Result

Inside an instance connected to a network without DHCP, we mount the config drive, find the network configuration there, and confirm that cloud-init has indeed applied this configuration to provide network connectivity:

root@myinstance:~# mkdir /mnt/config-drive
root@myinstance:~# mount -oro /dev/sr0 /mnt/config-drive/
root@myinstance:~# cd /mnt/config-drive/openstack/latest/
root@myinstance:/mnt/config-drive/openstack/latest# cat network_data.json
{"services": [], "networks": [{"network_id": "affc4c87-f1d1-4605-afda-3829c4a4cbc9", "type": "ipv4", "services": [], "netmask": "255.255.255.0", "link": "tap2498eebd-35", "routes": [{"netmask": "0.0.0.0", "network": "0.0.0.0", "gateway": "10.42.42.1"}], "ip_address": "10.42.42.6", "id": "network0"}], "links": [{"ethernet_mac_address": "fa:16:3e:62:7c:09", "mtu": 1500, "type": "bridge", "id": "tap2498eebd-35", "vif_id": "2498eebd-3593-49b9-938f-b0380930586b"}]}
root@myinstance:/mnt/config-drive/openstack/latest# cat /etc/network/interfaces.d/50-cloud-init.cfg
# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
auto lo
iface lo inet loopback

auto ens3
iface ens3 inet static
    address 10.42.42.6/24
    mtu 1500
    post-up route add default gw 10.42.42.1 || true
    pre-down route del default gw 10.42.42.1 || true

Conclusion

Doing without DHCP in an OpenStack deployment is actually quite easy thanks to the config-drive feature of the Compute service.

This is only one of the solutions implemented to address the specific constraints of this OpenStack deployment, and it's worth noting again that all of it was technically feasible directly with OpenStack-Ansible, without requiring additional external tools.

by Adrien Cunin at July 26, 2018 02:00 PM

July 25, 2018

OpenStack Superuser

How Red Hat and OpenShift navigate a hybrid cloud world

The sky may be the limit for cloud computing, but it still helps to have an idea of how to get there.

“The vision is simple,” says Red Hat’s Mark McLoughlin. “It’s about ensuring that the majority of new software will be built on platforms which empower application developers and operators.”

In other words, McLoughlin and Red Hat want to ensure that there are solid alternatives to the dominant infrastructure providers we see now, including Google Cloud, Amazon AWS and Microsoft Azure. A healthy competitive market is key to future innovation. The senior director of engineering provided three ways to make the market thrive: "increase the threat" of new entrants by providing software that empowers them to utilize the public cloud market; increase the threat of substitutes by providing an alternative source of infrastructure for the public cloud; and make it commonplace for buyers to switch between providers to give them strong negotiating power.

McLoughlin talked about Red Hat OpenStack Platform 13, which comes with some breakthrough integration and user experience improvements for users deploying OpenShift with OpenStack. In addition, Red Hat now offers a containers-on-cloud services solution to help customers address the complexity of these deployments.

“We need to remember that open infrastructure is a non-zero sum game,” said McLoughlin. “I believe that one of the greatest opportunities for collaboration is in the problem space of deploying and operating these technologies on physical infrastructure at scale.”

Banking on it

BBVA, a financial services company based in Spain, has 70 million customers in more than 30 different countries.  They partnered with Red Hat to build a global platform based on OpenStack and OpenShift, Red Hat’s application platform built for containers with Kubernetes. “BBVA explicitly chose OpenShift and Kubernetes to build their paths because it helped them be free from being locked into any one infrastructure supplier and they chose OpenStack because it was the obvious choice for building on-premise scalable automate-able infrastructure,” McLoughlin said.

While OpenStack and Kubernetes have completely separate missions and identities, they come together in complementary and overlapping ways to empower companies to turn a data center into an application platform, itself based on the best of open source technology.

In 2013 at the Hong Kong OpenStack Summit, McLoughlin talked about building TripleO, a project aimed at installing, upgrading and operating OpenStack clouds using OpenStack's own cloud facilities as the foundation. Red Hat has since shipped more than seven major version updates of the tool and it's been used by hundreds of customers to deploy and operate their environments. It's evolved into a powerful, flexible framework based on OpenStack, Ansible and containers, said McLoughlin.

At the Vancouver Summit, Angus Thomas took the audience through a demo of how TripleO can be used to deploy Kubernetes on bare metal alongside an OpenStack cloud. The multi-vendor rack consisted of IBM, HP, Dell and Super Micro devices, as well as an IBM Power 9 machine, making it a multi-architecture rack as well. The resulting setup boasted 1,056 physical cores, 5.5 terabytes of RAM and 50 terabytes of storage. Thomas used Red Hat OpenStack Platform Director, based on the upstream TripleO project, to deploy OpenShift.

“Deploying OpenShift in this way gives us the best of both worlds,” Thomas says. “You get bare metal performance but with an underlying infrastructure-as-a-service that can take care of deploying new instances, scaling out applications, and a lot of things that you come to expect from a cloud provider.”

Yes, it is Ironic

Thomas then used Ironic Inspector to boot the Ironic Python Agent ramdisk on the machines, generate hardware profiles, register the nodes with Director, add them to the list and bring them under management right away. Everything about Director is based on OpenStack. Upstream in the TripleO project, Director is actually a single-node OpenStack instance. It uses Keystone for authentication, Neutron for network management, Nova for scheduling and Heat for orchestration. It also uses OpenStack-Ansible, but mostly it uses Ironic, OpenStack's bare metal provisioning project, to manage all the hardware.

Deploying the software relies on a set of validations Red Hat has created, based on the company's experience deploying multi-machine applications like OpenStack (and now OpenShift). Some of the validations run even before the deployment starts, checking for potential problems like the wrong VLAN tags on a switch port or DHCP not running where it should be.

Deploying OpenShift then becomes a series of steps in Director. Thomas clicked through the options and deployed OpenShift onto the multi-hardware rack onstage. He showed options for enabling TLS for network encryption, IPv6, and network isolation. “You can change this as much as you need to to make it right for your environment,” said Thomas, “commit those changes so that you can do iterative redeployments. You don’t need to keep changing it again and again.”

Finally, after setting pre-configured roles for the software that Red Hat supports, checking the validations and using Ironic to power cycle the target machines, the whole setup was ready to go. Thomas had deployed an instance of OpenShift running on bare metal, deployed by Director on the rack, which was then ready to light up with containerized application workloads on the bare metal private cloud.

You can check out the entire presentation below.

Photo // CC BY NC

The post How Red Hat and OpenShift navigate a hybrid cloud world appeared first on Superuser.

by Rob LeFebvre at July 25, 2018 04:00 PM

NFVPE @ Red Hat

How to deploy TripleO Queens without external network

TripleO Queens has an interesting feature called 'composable networks'. It allows you to deploy OpenStack with the choice of networks that you want, depending on your environment. Please see: https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/custom_networks.html By default, the following networks are defined: Storage, Storage Management, Internal Api, Tenant, Management, External. The external network allows the endpoints to be reached externally, and also lets you define networks so that VMs can be reached externally as well. But to have that, you need a routable network with external access in your lab. Not all labs have one, especially in CI environments, so it may be useful to…

by Yolanda Robla Mota at July 25, 2018 09:23 AM

July 24, 2018

Mirantis

How to deploy Spinnaker on Kubernetes: a quick and dirty guide

A reliable guide to deploying Spinnaker, including the magic steps it seems nobody ever talks about.

by Nick Chase at July 24, 2018 08:30 PM

Chris Dent

Placement Extraction 2

Back in February I wrote up a review of the issues involved with extracting placement from nova. Many of the tasks associated with that have been addressed, but there's definitely plenty left to do, so I'm writing up a second version.

Most code which is within nova.api.openstack.placement now imports only from third parties or from itself, and not from other parts of nova. Remaining issues in non-test code include:

  • Translation infrastructure via from nova.i18n import _. This is straightforward to address but only worth doing once a new repository exists.

  • Database model classes and constants via from nova.db.sqlalchemy import api_models as models and from nova.db import constants as db_const.

    It's already possible to configure a database specific for placement, using the [placement_database] configuration group, but doing so makes use of the nova-api database models and migrations. In an extracted placement this will require some significant adjustments, especially since not all the tables are used.

    More discussion and planning is certainly required here.

  • Service/application configuration is managed by from nova import conf and from nova.common import config. Placement will need its own configuration file and option setup. The placement playground series made it clear there's a relatively small set of configuration options that are required and a slightly larger set that is available.

Similar issues are present in unit and functional tests:

  • UUID management with from nova.tests import uuidsentinel. We can copy that small file into a placement repo or maybe it deserves to be in oslo somewhere.

  • Unit tests for policy files make use of conf_fixture and policy_fixture and nova.utils for temp directory handling.

  • Some functional tests are based on nova.test and use fixtures defined by the nova testing system, notably the Database, StandardLogging and OutputStreamCapture fixtures. Dealing with this will require duplicating some of the fixtures, but hopefully the duplicates can be more lightweight.

  • Some tests import nova.context. This can be fixed now; placement already has its own context.

There's some work in progress to use an external library for resource classes. This addresses the from nova import rc_fields and from nova.db.sqlalchemy import resource_class_cache as rc_cache imports.

And finally: documentation. Placement documentation is in three different places in the nova repo:

  • placement-api-ref
  • releasenotes
  • doc/source

Within doc/source there are some files dedicated to placement, but there are also many files where placement information is integrated.

There will of course be plenty of things that are forgotten. This is okay. When problems are found, we'll fix them. Bugs are the grist of a healthy contribution ecosystem.

by Chris Dent at July 24, 2018 07:16 PM

OpenStack Superuser

Why dev-ops is still a hands-on job

The next time you wake up in a cold sweat from a nightmare where robot overlords have taken over your life, remember that dev-ops is still mostly manual labor.

While it’s definitely better to automate all the things, it’s still not common practice. According to the recently released “State of Dev-Ops: Market Segmentation Report” from Puppet, some two thirds of the 27,000 global participants surveyed are still relying on people power for things like change approval processes, a situation researchers call “the last horizon for the ‘human-knows-best’ function.”

Just how hands-on you are at work depends in part on what industry you’re in and the function, whether it’s configuration management, testing, deployment or change approval. What’s surprising is that even in fields like media, tech, telcos and retail none are under 50 percent for relying on humans to oversee change approval processes. That figure creeps up to 66 percent and 64 percent for education and manufacturing, two fields ripe for disruption but clearly not there yet.

Chart courtesy Puppet.

The media, tech, telco, retail, insurance, healthcare and financial fields have numbers in the 30-40 percent range when it comes to human intervention for configuration, testing or deployment. Those numbers hover in the 40-50 percent range if the field is education, energy, government or industry.

Report authors say they were surprised by the fact that companies with 10-19 employees automate more than their super-sized counterparts and that the more servers companies have, the less they automate. (Is it simply a case of having to do more with less?) Companies with 100-499 servers had the lowest percentage of manual configuration management (36 percent), but respondents in charge of 100,000+ servers had the lowest levels of manual testing (38 percent) and change approval (52 percent). “Perhaps more interesting is the fact that no matter a company’s size or its number of servers, testing and change approval processes account for most manual work.”

The 21-page report also covers where the dev-ops journey starts in companies and the impact on IT performance. You can check out the entire report — free with email registration — here.

And let us know in the comments (or over email: editorATopenstack.org) if this squares with what’s going on at your company and whether you think it will change any time soon.

The post Why dev-ops is still a hands-on job appeared first on Superuser.

by Superuser at July 24, 2018 03:59 PM

Chris Dent

TC Report 18-30

Yet another slow week at TC office hours. This is part of the normal ebb and flow of work, especially with feature freeze looming, but for some reason it bothers me. It reinforces my fears that the TC is either not particularly relevant or looking at the wrong things.

Help make sure we are looking at the right things by:

  • coming to office hours and telling us what matters
  • responding to these reports and the ones that Doug produces
  • adding something to the PTG planning etherpad.

Last Thursday there was some discussion about forthcoming elections. First up are PTL elections for Stein. Note that it is quite likely that the Denver PTG will be the last standalone PTG (as far as I can tell there's not much "if" about it; it is going to happen, though sadly there's not much transparency on these decisions and discussions, and I wish there were). If so, the Stein cycle may be longer than normal to sync up with summit schedules.

On Friday there was a bit of discussion on progress towards upgrading to Mailman 3 and using that as an opportunity to shrink the number of mailing lists. The hope is that with fewer lists, some of the boundaries between groups within the community will become more permeable, helping email remain a reliable information-sharing mechanism.

This morning there was yet more discussion about differences of opinion and approach when it comes to accepting projects to be official OpenStack projects. This is something that will be discussed at the PTG. It would be helpful if people who care about this could make their positions known.

by Chris Dent at July 24, 2018 01:43 PM

Carlos Camacho

Vote for the OpenStack Berlin Summit presentations!

¡¡¡Please vote!!!

I submitted some presentations for this year's OpenStack Summit in Berlin; the presentations are related to updates, upgrades, backups, failures and restores.

Happy TripleOing!

by Carlos Camacho at July 24, 2018 12:00 AM

July 23, 2018

Assaf Muller

Tenant, Provider and External Neutron Networks

To this day I see confusion surrounding the terms: Tenant, provider and external networks. No doubt countless words have been spent trying to tease apart these concepts, so I thought that it’d be a good use of my time to write 470 more.

At a Glance

            Creator         Model                  Segmentation              External router interfaces
Tenant      User            Self service           Selected by Neutron
Provider    Administrator   Pre created & shared   Selected by the creator
External    Administrator   Pre created & shared   Selected by the creator   Yes

A Closer Look

Tenant networks are created by users, and Neutron is configured to automatically select a network segmentation type like VXLAN or VLAN. The user cannot select the segmentation type.

Provider networks are created by administrators, who can set one or more of the following attributes:

  1. Segmentation type (flat, VLAN, Geneve, VXLAN, GRE)
  2. Segmentation ID (VLAN ID, tunnel ID)
  3. Physical network tag

Any attributes not specified will be filled in by Neutron.

OpenStack Neutron supports self service networking – the notion that a user in a project can articulate their own networking topology, completely isolated from other projects in the same cloud, via the support of overlapping IPs and other technologies. A user can create their own network and subnets without the need to open a support ticket or the involvement of an administrator. The user creates a Neutron router, connects it to the internal and external networks (defined below) and off they go. Using the built-in ML2/OVS solution, this implies using the L3 agent, tunnel networks, floating IPs and liberal use of NAT techniques.
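As a rough illustration of this self-service workflow (network, subnet and router names, the subnet range and the external network "public" are all made up here), the CLI steps look something like:

openstack network create private-net
openstack subnet create --network private-net --subnet-range 192.168.10.0/24 private-subnet
openstack router create my-router
openstack router set --external-gateway public my-router
openstack router add subnet my-router private-subnet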

Provider networks (read: pre-created networks) represent an entirely different networking architecture for your cloud. You’d forgo the L3 agent, tunneling, floating IPs and NAT. Instead, the administrator creates one or more provider networks, typically using VLANs, shares them with users of the cloud, and disables the ability of users to create networks, routers and floating IPs. When a new user signs up for the cloud, the pre-created networks are already there for them to use. In this model, the provider networks are typically routable – they are advertised to the public internet via physical routers via BGP. Therefore, provider networks are often said to be mapped to pre-existing data center networks, both in terms of VLAN IDs and subnet properties.

External networks are a subset of provider networks with an extra flag enabled (aptly named ‘external’). The ‘external’ attribute of a network signals that virtual routers can connect their external facing interface to the network. When you use the UI to give your router external connectivity, only external networks will show up on the list.
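For completeness, the admin side of pre-creating a shared provider network and an external network looks roughly like this (segmentation ID, physical network tag and names are illustrative):

openstack network create --share --provider-network-type vlan \
    --provider-physical-network physnet1 --provider-segment 100 datacenter-net
openstack network create --external --share --provider-network-type flat \
    --provider-physical-network physnet1 public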

To summarize, I think that the confusion is due to a naming issue. Had the network types been called: self-service networks, data center networks and external networks, this blog post would not have been necessary and the world would have been even more exquisite.

by assafmuller at July 23, 2018 05:56 PM

OpenStack Superuser

How I design custom icons

I often get asked about my process both by clients and designers. Like a magician, people want to know how the illusion is performed. Unlike a magician, I have no problem revealing how my magic is done. In this article I’m going to detail the process I use to create icons for the OpenStack Foundation — a free and open-source software platform for cloud computing.

Each OpenStack project needs its own animal icon, and with over 70 projects this is a dream project 😊. This is going to be less of a tutorial and more of a ‘behind the scenes’ kind of article. Also, I just want to note that this project is more of an identity system for OpenStack’s many projects rather than a traditional icon set, so you’ll notice that I don’t set up a grid when I start.

Final fox

For the sake of this post, I will show my process for one of the projects where the team chose a fox as their icon. Other than a brief description of what the product is used for, it was up to me to figure out how best to illustrate this particular animal for the project. Here’s how I went about doing just that.

Step 1: Visual Research

There’s a fine line between drawing what you know and what you see.

On one hand, drawing from knowledge allows you to pull from a sort of collective memory. If you remember it a specific way, chances are others do, too. It’s why icons are so useful — they rely on an idea or metaphor that exists in many people’s minds. The downside is that drawing from memory makes it easy to mess something up.

On the other hand, if you actually look at the thing you are trying to depict, it’s much easier to get proportions right and its legibility is increased; it becomes easier for people to see what it is supposed to be.

A quick google search reveals a plethora of foxes.

First thing I do is fire up google and search for foxes. Photos of real ones are best, I don’t want to be influenced by any other fox logos or illustrations. I want to see how they look, how they’re built, and even how they move. There are two things that I’m looking for here. First, I want to make sure that a fox looks like what I think a fox looks like. And second, I’m basically looking for enough of a resource to be able to find the geometry in the animal.

Step 2: Sketching

I use sketching to figure out how to best pose the animal and how to break it up into simple shapes. Typically, I start off by straight up drawing it as I see it in the photo. This helps me get a sense of what kind of shape to use for the body, leg, head, tail, etc. You know how in those “Learn to draw” books they start you off with simple shapes and you’re supposed to build up from there? This is the reverse of that. Just start drawing the thing and work backwards.

My fox sketches. At the top you can see I start out just drawing foxes, and as I work down the page, I start to simplify and break the fox down into basic shapes.

So I’ve sketched the fox in a few poses and I have a good idea of where I want to take it. The dot grid helps me with proportions, but I usually don’t spend time making sure the sketch is perfect before jumping on the computer. Because I tend to use a lot of math in my work, it doesn’t make sense to draw something out with rulers and shit when the computer does the computing for me. I know a lot of people will make sure they’ve figured out exactly how they want their logo/icon to look before jumping on the computer, but I have neither the patience nor attention span for it, especially when I know I can get there faster if I’m plugged into the matrix.

Step 3: Vectorizing

Our work is by no means done. Sketching has gotten us half way there, but as I mentioned, we still need to get the math right.

In Illustrator, I make sure my rulers are set to pixels, and start off by tracing the fox using rectangles. I recommend building your shapes out using whole numbers rather than arbitrary sizes, i.e. a rectangle is set to 60px by 30px or whatever. You can scale your sketch to fit your geometry if need be. This just makes it easier down the road when you want all your proportions right.

Using Illustrator’s corner radius tool, I add some curves to the body and tail, and clean up the legs a bit.

The rear leg needs some special love, so I pull it out from the rest of the body. Using a quarter of a circle as the shank, I make sure it matches the flow of the back of the body. I freehand the bezier curve for the front of the leg.

I rotate the tail, and add some fills to hide the back legs. You could use the pathfinder tool to join the head to the body, but in this case I just hid it with a white square since I didn’t want to lose the separate shapes just yet. I make sure the width of each leg is exactly the same. The fox is starting to look pretty good!

Step 4: Iterate!

Okay, so the fox is at the point where what’s in my head matches what I put down on paper matches what’s on the computer. But I still want to make sure I’ve explored all variations possible so that I know I’ve arrived at the best solution. This is where the computer trumps the sketchbook simply because of copy/paste.

Exploring the shape of the head and ear.

I add some more curves to the neck, but I’m not quite happy with the head and ear so I duplicate the fox a few times and explore some different shapes.

Working on the front leg/paw to make the pose more dynamic.

A secondary curve is added to the neck to help separate the white fur on a fox’s chest. I also play around with the front foot. There was something about it that just wasn’t sitting right with me. The front foot turns into an arrow shape that helps give the fox some movement and dimension.

Adding some flow to the tail. Not digging that top left fart cloud. 🐕💨

Getting close, but the tail felt too rigid and stiff, so I tried out a few variations with more flow.

We’re in a good spot!

I outline all the strokes and merge everything using the pathfinder tool. I grab all the corners and add a 1px radius for a softer look. This is something I typically do when working with thick strokes to make them less harsh. I add some shadows — black set at 50% opacity so they work for any colour.

Journey from sketch to final fox.

Lastly, I explore some colour options. I start with OpenStack’s brand colours and figure out an orange that will mesh well with the palette as a whole. I don’t want to use more than 2–3 colours, as any more than that would muddy the icon, and it would lose its punch. In this case, I stick to orange and white.

A good icon set takes time to develop. As a whole, the set should align to the brand vision, be visually cohesive, and feel like part of the same family. Individually however, each icon needs to be unique and effectively communicate what it’s representing. That’s what makes designing icon sets challenging, but it is worthwhile for companies looking to take their branding to the next level. A custom icon set creates a unique language tailored specifically to enhance a brand’s visual identity.

You can also view more of the OpenStack icon project here.

The post How I design custom icons appeared first on Superuser.

by Peter Komierowski at July 23, 2018 04:23 PM

Mirantis

How to Increase the Probability of a VNF Working with Your Cloud – Q&A

There's a lot involved in making sure a VNF will work in your cloud, and we discussed a lot of it, including the obscurities.

by Nick Chase at July 23, 2018 07:32 AM

July 22, 2018

Michael Still

The last week for linux.conf.au 2019 proposals!

Dear humans of the Internet — there is ONE WEEK LEFT to propose talks for linux.conf.au 2019. LCA is one of the world’s best open source conferences, and we’d love to hear you speak!

Unsure what to propose? Not sure if your talk is what the conference would normally take? Just want a chat? You’re welcome to reach out to papers-chair@linux.org.au to talk things through.

The post The last week for linux.conf.au 2019 proposals! appeared first on Made by Mikal.

by mikal at July 22, 2018 10:33 PM

July 20, 2018

OpenStack Superuser

How to set up container-based OpenStack with Open Virtual Network

Open Virtual Network (OVN) is a relatively new networking technology that provides a powerful and flexible software implementation of standard networking functionalities such as switches, routers, firewalls, etc.

Importantly, OVN is distributed in the sense that the aforementioned network entities can be realized over a distributed set of compute/networking resources. OVN is tightly coupled with OVS, essentially being a layer of abstraction which sits above a set of OVS switches and realizes the above networking components across these switches in a distributed manner.

A number of cloud computing platforms and more general compute resource management frameworks are working on OVN support, including oVirt, OpenStack, Kubernetes and OpenShift – progress on this front is quite advanced. Interestingly and importantly, one dimension of the OVN vision is that it can act as a common networking substrate which could facilitate integration of more than one of the above systems, although the realization of that vision remains future work.

In the context of our work on developing an edge computing testbed, we set up a modest OpenStack cluster to emulate functionality deployed within an enterprise data center with OVN providing network capabilities to the cluster. This blog post provides a brief overview of the system architecture and notes some issues we had getting it up and running.

As our system is not a production system, providing high availability (HA) support was not one of the requirements; consequently, it was not necessary to consider HA OVN mode. As such, it was natural to host the OVN control services, including the Northbound and Southbound DBs and the Northbound daemon (ovn-northd), on the OpenStack controller node. Because this is the node through which external traffic goes, we also needed to run an external-facing OVS on this node, which required its own OVN controller and local OVS database. Further, as this OVS chassis is intended for external traffic, it needed to be configured with ‘enable-chassis-as-gw‘.

We configured our system to use DHCP provided by OVN; consequently, the Neutron DHCP agent was no longer necessary, so we removed this process from our controller node. Similarly, L3 routing was done within OVN, meaning that the Neutron L3 agent was no longer necessary. OpenStack metadata support is implemented differently when OVN is used: instead of having a single metadata process running on a controller serving all metadata requests, the metadata service is deployed on each node and the OVS switch on each node routes requests to 169.254.169.254 to the local metadata agent; this then queries the nova metadata service to obtain the metadata for the specific VM.
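On a running system this path can be sanity-checked from inside a VM with a quick metadata request (just a verification step, not part of the deployment itself):

curl http://169.254.169.254/openstack/latest/meta_data.json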

The services deployed on the controller and compute nodes are shown in Figure 1 below.


Figure 1: Neutron containers with and without OVN

We used Kolla to deploy the system. Kolla does not currently have full support for OVN; however, specific Kolla containers for OVN have been created (e.g. kolla/ubuntu-binary-ovn-controller:queens, kolla/ubuntu-binary-neutron-server-ovn:queens). Hence, we used an approach which augments the standard Kolla-ansible deployment with manual configuration of the extra containers necessary to get the system running on OVN.

As always, many smaller issues were encountered while getting the system working – we won’t detail all these issues here, but rather focus on the more substantive issues. We divide these into three specific categories: OVN parameters which need to be configured, configuration specifics for the Kolla OVN containers and finally a point which arose due to assumptions made within Kolla that do not necessarily hold for OVN.

To enable OVN, it was necessary to modify the configuration of the OVS switches operating on all the nodes; the existing OVS containers and OVSDB could be used for this – the OVS version shipped with Kolla/Queens is v2.9.0 – but it was necessary to modify some settings. First, it was necessary to configure system-ids for all of the OVS chassis – we chose to select fixed UUIDs a priori and use these for each deployment so that we had a more systematic process for setting up the system, but it’s possible to use a randomly generated UUID.
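
For example, the $SYSTEM_ID used in the command below can be generated once per chassis and recorded for reuse (a trivial sketch):

SYSTEM_ID=$(uuidgen)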

docker exec -ti openvswitch_vswitchd ovs-vsctl set open_vswitch . external-ids:system-id="$SYSTEM_ID"

On the controller node, it was also necessary to set the following parameters:

docker exec -ti openvswitch_vswitchd ovs-vsctl set Open_vSwitch . \
external_ids:ovn-remote="tcp:$HOST_IP:6642" \
external_ids:ovn-nb="tcp:$HOST_IP:6641" \
external_ids:ovn-encap-ip=$HOST_IP \
external_ids:ovn-encap-type="geneve" \
external-ids:ovn-cms-options="enable-chassis-as-gw"

docker exec openvswitch_vswitchd ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-ex

On the compute nodes this was necessary:

docker exec -ti openvswitch_vswitchd ovs-vsctl set Open_vSwitch . \
external_ids:ovn-remote="tcp:$OVN_SB_HOST_IP:6642" \
external_ids:ovn-nb="tcp:$OVN_NB_HOST_IP:6641" \
external_ids:ovn-encap-ip=$HOST_IP \
external_ids:ovn-encap-type="geneve"

Having changed the OVS configuration on all the nodes, it was then necessary to get the services operational on the nodes. There are two specific aspects to this: modifying the service configuration files as necessary and starting the new services in the correct way.

Not many changes to the service configurations were required. The primary changes related to ensuring that the OVN mechanism driver was used and letting neutron know how to communicate with OVN. We also used the geneve tunnelling protocol in our deployment and this required the following configuration settings. For the neutron server OVN container:

  • ml2_conf.ini

        [ml2]
        mechanism_drivers = ovn
        type_drivers = local,flat,vlan,geneve
        tenant_network_types = geneve

        [ml2_type_geneve]
        vni_ranges = 1:65536
        max_header_size = 38

        [ovn]
        ovn_nb_connection = tcp:172.30.0.101:6641
        ovn_sb_connection = tcp:172.30.0.101:6642
        ovn_l3_scheduler = leastloaded
        ovn_metadata_enabled = true

  • neutron.conf

        core_plugin = neutron.plugins.ml2.plugin.Ml2Plugin
        service_plugins = networking_ovn.l3.l3_ovn.OVNL3RouterPlugin

    For the metadata agent container (running on the compute nodes) it was necessary to configure it to point at the nova metadata service with the appropriate shared secret, as well as how to communicate with the OVS instance running on each of the compute nodes:

        nova_metadata_host = 172.30.0.101
        metadata_proxy_shared_secret = <SECRET>
        bridge_mappings = physnet1:br-ex
        datapath_type = system
        ovsdb_connection = tcp:127.0.0.1:6640
        local_ip = 172.30.0.101

For the OVN-specific containers – ovn-northd and the ovn-nb and ovn-sb databases – it was necessary to ensure that they had the correct configuration at startup; specifically, that they knew how to communicate with the relevant DBs. Hence, start commands such as

/usr/sbin/ovsdb-server /var/lib/openvswitch/ovnnb.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/run/openvswitch/ovnnb_db.sock --remote=ptcp:$ovnnb_port:$ovsdb_ip --unixctl=/run/openvswitch/ovnnb_db.ctl --log-file=/var/log/kolla/openvswitch/ovsdb-server-nb.log

were necessary (for the ovn northbound database) and we had to modify the container start process accordingly.

It was also necessary to update the neutron database to support OVN specific versioning information: this was straightforward using the following command:

docker exec -ti neutron-server-ovn_neutron_server_ovn_1 neutron-db-manage upgrade heads

The last issue which we had to overcome was that Kolla and neutron OVN had slightly different views regarding the naming of the external bridges. Kolla-ansible configured a connection between the br-ex and br-int OVS bridges on the controller node, with port names phy-br-ex and int-br-ex respectively. OVN also created ports for the same purpose but with different names: patch-provnet-<UUID>-to-br-int and patch-br-int-to-provnet-<UUID>. As these ports served the same purpose, our somewhat hacky solution was to manually remove the ports created in the first instance by Kolla-ansible.

Having overcome all these steps, it was possible to launch a VM which had external network connectivity and to which a floating IP address could be assigned.

Clearly, this approach is not realistic for supporting a production environment, but it’s an appropriate level of hackery for a testbed.

Other noteworthy issues which arose during this work include the following:

  • Standard Docker apparmor configuration in Ubuntu is such that mount cannot be run inside containers, even if they have the appropriate privileges. This has to be disabled or else it is necessary to ensure that the containers do not use the default docker apparmor profile.
  • A specific issue with mounts inside a container which resulted in the mount table filling up with 65,536 mounts and rendering the host quite unusable (thanks to Stefan for providing a bit more detail on this) – the workaround was to ensure that /run/netns was bind mounted into the container.
  • As we used geneve encapsulation, the geneve kernel modules had to be loaded.
  • Full datapath NAT support is only available for Linux kernel 4.6 and up. We had to upgrade the 4.4 kernel which came with our standard Ubuntu 16.04 environment.

This is certainly not a complete guide to how to get OpenStack up and running with OVN, but may be useful to some folks who are toying with this. In future, we’re going to experiment with extending OVN to an edge networking context and will provide more details as this work evolves.

This post first appeared on the blog for the ICCLab (Cloud Computing Lab) and the SPLab (Service Prototyping Lab) of the ZHAW Zurich University of Applied Sciences department.

Superuser is always interested in tutorials about open infrastructure, get in touch at editorATopenstack.org.

The post How to set up container-based OpenStack with Open Virtual Network appeared first on Superuser.

by Superuser at July 20, 2018 03:41 PM

Chris Dent

Placement Update 18-29

This is placement update 18-29, a weekly update of ongoing development related to the OpenStack placement service.

Thanks to Jay for providing one of these last week when I was away: http://lists.openstack.org/pipermail/openstack-dev/2018-July/132252.html

Most Important

Feature freeze is next week. We're racing now to get as much of three styles of work done as possible:

  • Effectively managing nested and shared resource providers when managing allocations (such as in migrations).
  • Correctly handling resource provider and consumer generations in the nova-side report client.
  • Supporting reshaping provider trees.

The latter two are actively in progress. Not sure about the first. Anyone?

As ever, we continue to find bugs with existing features that existing tests are not catching. These are being found by people experimenting. So: experiment please.

What's Changed

Most of the functionality and fixes related to consumer generations is in place on the placement side.

We now enforce that consumer identifiers are uuids.

Bugs

Main Themes

Documentation

This is a section for reminding us to document all the fun stuff we are enabling. Open areas include:

  • "How to deploy / model shared disk. Seems fairly straight-forward, and we could even maybe create a multi-node ceph job that does this - wouldn't that be awesome?!?!", says an enthusiastic Matt Riedemann.

  • The whens and wheres of re-shaping and VGPUs.

Consumer Generations

These are in place on the placement side. There's some pending work on using them properly and addressing some nits:

Reshape Provider Trees

The work to support a /reshaper URI that allows moving inventory and allocations between resource providers is in progress. The database handling (at the bottom of the stack) is pretty much ready, the HTTP API is close except for a small issue with allocation schema, and the nova side is in active progress.

That's all at: https://review.openstack.org/#/q/topic:bp/reshape-provider-tree

Mirror Host Aggregates

This needs a command line tool:

Extraction

I took some time yesterday to experiment with an alternative to the os-resource-classes library that Jay created. My version is, thus far, just a simple spike that makes symbols pointing to strings, and that's it. I've made a proof of concept of integrating it with placement.

Other extraction things that continue to need some thought are:

  • infra and co-gating issues that are going to come up
  • copying whatever nova-based test fixture we might like

Other

20 entries two weeks ago. 29 now.

End

Thanks to everyone for all their hard work making this happen.

by Chris Dent at July 20, 2018 12:30 PM

Aija Jauntēva

Outreachy: Redfish Message registry and other

This time I will not act surprised that 2 more weeks have passed because I paid attention to time passing by.

In my previous blog post I mentioned that my last patch was failing CI. It turned out that the function assert_called_once is missing in Python 3.5 (it has assert_called_once_with, though, but I can't use it this time). Locally I run Python 3.6, where this function is back, and there were no issues in Python 2.7. I replaced this with asserting call_count for now, but this patch still has to pass code reviews.

With that patch in code review and all green, I returned to @Redfish.Settings, which had parts left out previously because too many things still required clarification. As it stands now, sushy users can update BIOS attributes, but sushy does not yet expose the status of this update. To get the ball rolling I started to write some code and encountered another dependency - Message Registries. In a Redfish response there would be IDs of messages, e.g., Base.1.2.Success, Base.1.2.PropertyValueTypeError, and in the registry file Base.1.2.0.json they would correspond to a section like this[1]:

"PropertyValueTypeError": {
    "Description": "Indicates that a property was given the wrong value type, such as when a number is supplied for a property that requires a string.",
    "Message": "The value %1 for the property %2 is of a different type than the property can accept.",
    "Severity": "Warning",
    "NumberOfArgs": 2,
    "ParamTypes": [
        "string",
        "string"
    ],
    "Resolution": "Correct the value for the property in the request body and resubmit the request if the operation failed."
}

In order to determine whether the update was successful, one needs to consult the registry and give the user a friendly message. In the sample above, the message is a template with placeholders for parameters; sushy would have to build the error message by passing in the parameters from @Redfish.Settings for the specific case. This approach also supports translating and localizing the messages. But for all this to work I need the registries. None of the provided mockup files have samples of these registries included. According to the schema they can be provided via the ServiceRoot.Registry property. I remember reading somewhere that they are optional, but then how should sushy handle the case where a Redfish service does not provide them? There could be 2 options: download the files programmatically from [2] as necessary, or include them in the sushy package as a fallback. Downloading the files wouldn't be a reliable option because sushy might not have access to the external Internet, or the site could simply be down. Bundling the files together is the direction to go, but then the mentor queried about the license of these files. These standard registry files provided by DMTF have only a copyright statement, but no license. That would mean they are proprietary and cannot be included in OpenStack projects, which require an OSI-approved license. No-one was sure and I'm not a lawyer either, so it was time to contact the OpenStack legal mailing list to clarify this[3]. Before that I talked with the mentors about what the other options could be if the files couldn't be included - e.g., parse the files manually or with a script, generate a Python dict, and store this derived dictionary instead of the original file. In the question to the legal mailing list I also included this approach as a possible option. Pretty quickly an answer came back which said: NO, the files cannot be included without a license, and the same goes for derived code. As of this writing this is still ongoing and DMTF might apply a 3-clause BSD license, which would be OK for an OpenStack project[4].

On other tasks, I did some cleanup patches that emerged from previous code reviews - what usually happens in code reviews is that reviewers notice other things that need improvement but are not related to the patch in review. Or the changes aren't significant enough to block the patch but can be done as a follow-up patch. One of those patches was to clean up the sushy-tools documentation to consistently use the same term. Somehow the docs had started to use 'simulator' to describe sushy-emulator and sushy-static. It might have been me, because I thought of 'simulator' as the more general term. I went through some discussions [5][6] to understand which is the right term to use. Turns out it is 'emulator'. Which also means that the title of my previous blog post is incorrect.

Another thing: I took over a patch that emulated Ethernet Interfaces in sushy-emulator. It was a rather old patch from January this year, and since it was created sushy-tools had introduced support for the openstacksdk driver and thus changed some of the structure in the Flask app too. I rebased and updated it for the new structure and added support for the openstacksdk driver. Which led me to setting up an OpenStack cloud locally. A bit funny, but I hadn't had a need to access an OpenStack cloud before. This time I needed a sample to see how openstacksdk returns data for network interfaces, which was not entirely clear from the docs. I used devstack[7] on a VM and it worked without any problems. This patch, too, is in code review.

by ajya at July 20, 2018 09:00 AM

July 19, 2018

OpenStack Superuser

Zuul case study: BMW

Zuul drives continuous integration, delivery and deployment systems with a focus on project gating and interrelated projects. In a series of interviews, Superuser asks users about why they chose it and how they’re using it.

Here Superuser talks to Tobias Henkel, software engineer at BMW, about benefits, challenges and hopes for future development. Henkel will be sharing more details about the case study in a breakout session at the OpenStack Summit Berlin.

The days of annual, monthly or even weekly releases are long gone. How is CI/CD defining new ways to develop and manage software at BMW?

Software has been an integral part of cars for several decades and has become one of the key enablers for many modern safety and comfort features. The amount of software required to implement all these features, as well as the complexity induced by the many configuration options of current cars, is constantly rising.

The software architecture in vehicles has evolved from more-or-less independent electronic control units (ECUs) to a set of highly connected functions spread over many ECUs. Without the right strategies, it’s not possible to manage all the required software projects that must converge on a strict schedule to deliver the BMW experience to customers.

The wide adoption of CI/CD in our internal and external development teams is one of the essential tools to deliver and integrate all software components on time with the required quality, despite the complexity of current and future software projects. Today, most BMW software projects rely on CI/CD for automating use cases of their daily work.

Are there specific features that drew you to Zuul?

After using CI/CD systems for many years for an ever-increasing amount of projects, the limitations of the existing CI solutions were starting to impact our software development efforts. With the increasing size and complexity of today’s software projects such as autonomous driving, the scaling capabilities of our CI/CD solution have become a crucial prerequisite of future development.

While scalability is an absolute must-have for our developers, testers and integrators, there are other important requirements for CI/CD:

1. Support for a centrally hosted instance for many projects

2. Support for complex cross-project CI configurations

3. Compatibility with our existing infrastructure

4. An active open-source community

The Zuul solution, especially after release of version 3.0, fully supports all our requirements to provide a centrally hosted solution that can be shared by many internal software projects. This dramatically reduces operations overhead and frees up valuable developer time to continuously improve all aspects of our CI system setup.

Zuul integrates seamlessly with our in-house OpenStack cloud and our repository systems Gerrit and GitHub. It also has an active community and provides the flexibility that our projects need.

How are you currently using Zuul at BMW?

For several years we’ve been operating several Zuul V2 instances for big software projects that need high CI/CD performance. This, of course, came at the cost of operating many instances with similar configurations.

In recent months, we’ve been preparing a centrally hosted CI/CD instance based on Zuul V3, and many projects using previous CI/CD solutions are already in the process of migrating to the new Zuul V3 instance.

Hosting many projects on one central platform has many advantages for operation overhead and resource sharing in the cloud, but hosting many projects on one CI/CD instance also translates directly into high requirements for stability and availability.

To maximize the availability of our central CI/CD service, we’re running Zuul, Nodepool and Zookeeper services in an OpenShift cluster, hosted on OpenStack. In addition to improved availability, we’ve seen several development and operation benefits for our internal CI/CD development team.

What other benefits have you seen so far?

The wide adoption of CI/CD in our software projects is the foundation to deliver high-quality software in time by automating every integral part of the development cycle from simple commit checks to full release processes.

With the introduction of Zuul-based CI/CD instances, projects were able to use cloud resources for their automation in a very dynamic way. As a result, projects can do much more extensive testing in less time, which directly results in higher quality software, while being faster in development and integration.

With the introduction of Zuul V3 we also see a lot of benefits from the operators perspective by providing a centrally hosted CI/CD instance, as opposed to many small ones that have to be managed individually.

What challenges have you overcome?

A centrally hosted CI/CD platform for many projects faces the challenge of supporting the many different use cases and restrictions that projects inherently have. A common restriction for our projects is the need for non-Linux build nodes, because some required applications or toolchains are only supported on certain operating systems, such as specific versions of Microsoft Windows.

From the operator’s perspective, we don’t want special solutions for updating the node images, we want Nodepool to automatically do that for us, just like it does for Linux-based images. This required and still requires some extra effort.

Another interesting challenge is the management of the Zuul and Nodepool configuration, or, to be more precise, the responsibilities for managing the configuration. On one hand, we want to give the projects as much configuration flexibility as possible, but on the other there are still centralized configuration files that we need to manage centrally. One example is the registration of static nodes with the CI system. We’re still working out how to manage these centralized configuration files effectively.

What can you tell us about future plans?

We’re currently migrating many projects to the centrally provided CI/CD instance based on Zuul V3. This instance will be the go-to solution for many existing and new software projects of BMW. We anticipate a continuous growth of project count and sizes, as well as a massive increase of our user base, which includes internal and external project members.

Given the strategic importance of Zuul and Nodepool for our development infrastructure, our main focus will be stability, availability and scalability. While Zuul is already well prepared for most stability and scalability needs, there are still availability improvements required.

The main issue to solve is the removal of all single points of failure by making all services of the Zuul CI system highly available (HA). The CI/CD service should stay fully operational at all times, even if there are issues in single (virtual) machines or even a whole OpenStack availability zone.

What are you hoping the Zuul community continues to focus on or delivers?

While Zuul V3 provides a solution for most of our software projects out of the box, we still see room for improvement.

Our users of the Zuul CI/CD system would appreciate improvements to the Zuul-Web component, e.g. to provide more information on current and past jobs, cancelling a running job or configuring the status page layout per tenant.

However, the highest priority from our perspective is the removal of any single points of failure to support a highly available configuration.

Anything else you’d like to add?

The obvious point for using the Zuul CI solution at BMW is the comprehensive feature set of Zuul that supports all major use cases for us.

An equally important part of our decision for Zuul is the active and helpful community that drives the development of Zuul and Nodepool.

Our CI/CD development team at BMW is proud to be part of the Zuul community and will continue to be active contributors of the Zuul OSS project.

We would like to thank all Zuul developers and maintainers for their great work.

Superuser wants to hear your open infrastructure story, get in touch: editorATopenstack.org

Cover photo: BMW M5 Competition, courtesy BMW press.

The post Zuul case study: BMW appeared first on Superuser.

by Nicole Martinelli at July 19, 2018 03:41 PM

July 18, 2018

Aptira

The Network is the Computer. Part 1 – Foundations: The Unfulfilled Promise

Aptira: What is Open Networking? The Computer is the Network.

As we saw in parts 1 & 2 of this series, Open Networking has a large footprint covering different components of technologies, skills and practices. To help you understand that footprint we’re starting a series of posts that will look at the evolution of these components and show you how they became Open Networking as we define it today.  We start with the Infrastructure domain.

In 1984, a legendary company made a bold but, at that time, unfulfillable promise:

The network is the computer

John Gage, Sun Microsystems

Sun was born from the explosion of microelectronics advances in the mid 1970’s that initiated the long growth curve described by Moore’s Law. Sun pioneered a range of computing servers and workstations that brought computing power directly to workgroups and smaller enterprises. 

Previously, computing services were provided by large centralised mainframes, where all the computing work was done. Networks merely provided a pipe to get access to that central resource, using largely passive devices such as terminals and printers. 

This architecture produced a centralised and rigid management structure with control over computing. Centralised control had some benefits, e.g. security, resource optimisation and budget control, but it also resulted in inflexibility, huge backlogs, lack of engagement with end users and many well-publicised disaster projects.

The emergence of workgroup servers and workstations delivered computing power independently of the centralised technical and management architecture. Personal Computers (PCs) opened access to computing power to even larger numbers of people when they became commodity products in the early 1980’s. 

At the time, networking and computing were very distinct technology ecosystems which combined in very limited and fixed ways. These two ecosystems were almost completely separated across the supply chain, organisational structure, architectural principles, implementation, and operation. 

The introduction of departmental servers and PCs helped to circumvent and undermine this centralised control model but didn’t change the basic paradigm of computers and networks as entrenched and disparate ecosystems. In part this was due to the limited and expensive options available from the telecommunications carriers, who operated in highly regulated telecommunications marketplaces that suppressed competition and innovation.

A great gap was opening in the industry: between the late 1980’s and the early 1990’s the cost of computing fell rapidly but the cost of connectivity remained high. At the same time the number of physical devices in distributed computing systems exploded, as did end-user demand. Innovation created products that solved problems at the margin but did not resolve the fundamental gap long term. For example: PC terminal cards, black box protocol emulators, and products that overlaid departmental networks on top of the mainframe terminal networks. Solutions were possible but, in many cases, very messy.

Sun had wanted to promote a paradigm of broad computing availability enabled by network integration, but the promise seemed less deliverable than ever. 

The problems were not only in network-land. The sheer number of compute devices was causing its own problems, driven by two aspects of the IT marketplace at that time:

  • Servers were stand-alone devices with their own cabinetry, power supplies and so forth, designed in part to operate in uncontrolled environments; even so, in many cases they were still housed in controlled data centre environments for security and ease of operation.
  • Cheap commodity hardware generated a simplistic architecture that instantiated server components physically dedicated to a solution function. For example, for a cluster of 10 web servers, you spun up 10 physical devices; likewise with database servers, email servers, application servers and so forth.

These two factors produced huge numbers of boxes in “server farms”, and data centres struggled to cope with the exploding demand for power, cooling and physical space as more and more organisations sought to implement connected computing solutions, and more and more applications were found for this computing power within organisations.

Solutions were found that provided the necessary functionality, but compatibility and interoperability were problematic.  The cost of supporting solutions that integrated disparate components grew rapidly. 

How did these problems get solved?  We will cover that in the next post.  Stay tuned. 

The post The Network is the Computer. Part 1 – Foundations: The Unfulfilled Promise appeared first on Aptira.

by Adam Russell at July 18, 2018 10:28 PM

RDO

Community Blog Round-Up: 18 July

We’ve got three posts this week related to OpenStack – Adam Young’s insight on how, as a reviewer, to verify that a patch has test coverage; Zane Bitter’s look at OpenStack’s multiple layers of services; and Nir Yechiel’s introduction to the five things we need to know about networking on Red Hat OpenStack Platform 13. As always, if you know of an article not included in this round up, please comment below or track down leanderthal (that’s me! Rain Leander!) on Freenode irc #rdo.

Testing if a patch has test coverage by Adam Young

When a user requests a code review, the reviewer is responsible for making sure that the code is tested. While the quality of the tests is a subjective matter, their presence is not; either they are there or they are not there. If they are not there, it is on the developer to explain why not.

Read more at https://adam.younglogic.com/2018/07/testing-patch-has-test/

Limitations of the Layered Model of OpenStack by Zane Bitter

One model that many people have used for making sense of the multiple services in OpenStack is that of a series of layers, with the ‘compute starter kit’ projects forming the base. Jay Pipes recently wrote what may prove to be the canonical distillation (this post is an edited version of my response):

Read more at https://www.zerobanana.com/archive/2018/07/17#openstack-layer-model-limitations

Red Hat OpenStack Platform 13: five things you need to know about networking by Nir Yechiel, Principal Product Manager, Red Hat

Red Hat OpenStack Platform 13, based on the upstream Queens release, is now Generally Available. Of course this version brings in many improvements and enhancements across the stack, but in this blog post I’m going to focus on the five biggest and most exciting networking features found in this latest release.

Read more at https://redhatstackblog.redhat.com/2018/07/12/red-hat-openstack-platform-13-five-things-you-need-to-know-about-networking/

by Rain Leander at July 18, 2018 09:05 AM

July 17, 2018

OpenStack Superuser

Building new foundations: OpenInfra Days Korea

South Korea, already the world’s top producer of mobile phones, displays and semiconductors, is also helping build key conversations around open infrastructure.

That was in evidence at OpenInfra Days Korea 2018,  a two-day conference with technical sessions, deep-dive sessions and hands-on-labs. About 700 people attended the first day; there were 500 attendees on day two.

Bustling crowd at OpenInfra Days Korea 2018. Photo: Seungjin Han.

Seongsoo Cho, Korea User Group co-leader and Seungkyu Ahn, leader of the Korea Kubernetes User Group, kicked off this year’s event emphasizing collaboration between technologies and user groups. The event featured sessions on various open infrastructure topics as well as many Kubernetes-related contributions from the OpenStack Korea User Group, the AWS Korea User Group, the Korea Azure User Group, the Google Cloud Platform Korea User Group and the Korea Developerworks User Group.  You can check out some of these sessions on YouTube.

Highlights from the event include:

  • SK Telecom talked about its participation in the Airship project with AT&T and introduced TACO (SKT All Container OpenStack), an OpenStack solution offering containers, orchestration and automation technologies that features self-healing and continuous upgrades.
  • ManTech, a Korean company focused on high availability and disaster recovery, shared their experience of digital transformation with open infrastructure technologies such as Docker and Kubernetes.
  • Samsung Electronics Korea talked about the evolution to cloud native driven by the requirements of 5G telco networks, and how OpenStack has evolved to meet them.
  • Open Source Consulting, a Korean company that offers OpenStack deployments, emphasized stability and agility for businesses and shared its deployment work for a space information promotion division in Korea and for a Korean cryptocurrency exchange.


In comparison to the previous four OpenStack Days Korea events, there was a larger presence of sponsors and sessions from local companies. More than two-thirds of featured sponsor companies (15 out of 22) were Korean compared to 37.5 percent in 2017 and 23 percent in 2016.

The OSF’s Mark Collier, Lauren Sell and Chris Hoge with OpenInfra Days organizers. Photo: Sung Ki Park.

 

This time around, session topics were both broad and deep, ranging from OpenStack to AI infrastructure, GPUs, multi-cloud, SDN/NFV, blockchain infrastructure and open hardware using ARM-based servers.

For the first time at the event, the Upstream Institute was offered by the OSF’s Ildiko Vancsa and Kendall Nelson for about 15 participants.

The event was organized by the Korean OpenStack community with the help of many dedicated volunteers. It would have been impossible without the sharing and transfer of know-how and wisdom across three generations of the Korea user group. Many thanks to Ian Choi, Nalee Jang, Jaesuk Ahn, Seongsoo Cho, Taehee Jang, Hochul Shin, Jungwon Ku and all the other organizers and volunteers.

The volunteer crew from day two.

Find out more about the OpenStack User Group and how to get involved here.

Photo // CC BY NC

The post Building new foundations: OpenInfra Days Korea appeared first on Superuser.

by Superuser at July 17, 2018 11:44 PM

Adam Young

Testing if a patch has test coverage

When a user requests a code review, the reviewer is responsible for making sure that the code is tested. While the quality of the tests is a subjective matter, their presence is not; either they are there or they are not there. If they are not there, it is on the developer to explain why not.

Not every line of code is testable.  Not every test is intelligent.  But, at a minimum, a test should ensure that the code in a patch is run at least once, without an unexpected exception.

For Keystone and related projects, we have a tox job called cover that we can run on a git repo at a given revision.  For example, I can code review (even without git review) by pulling down a revision using the checkout link in  gerrit, and then running tox:

 

git fetch git://git.openstack.org/openstack/keystoneauth refs/changes/15/583215/2 && git checkout FETCH_HEAD
git checkout -b netloc-and-version
tox -e cover

I can look at the patch using show --stat to see what files were changed:

$ git show --stat
commit 2ac26b5e1ccdb155a4828e3e2d030b55fb8863b2
Author: wangxiyuan 
Date:   Tue Jul 17 19:43:21 2018 +0800

    Add netloc and version check for version discovery
    
    If the url netloc in the catalog and service's response
    are not the same, we should choose the catalog's and
    add the version info to it if needed.
    
    Change-Id: If78d368bd505156a5416bb9cbfaf988204925c79
    Closes-bug: #1733052

 keystoneauth1/discover.py                                 | 16 +++++++++++++++-
 keystoneauth1/tests/unit/identity/test_identity_common.py |  2 +-

and I want to skip looking at any files in keystoneauth1/tests as those are not production code. So we have 16 lines of new code. What are they?

Modifying someone else’s code, I got to:

 git show | gawk 'match($0,"^@@ -([0-9]+),[0-9]+ [+]([0-9]+),[0-9]+ @@",a){left=a[1];right=a[2];next};\
   /^\+\+\+/{print;next};\
   {line=substr($0,2)};\
   /^-/{left++; next};\
   /^[+]/{print right++;next};\
   {left++; right++}'

Which gives me:

+++ b/keystoneauth1/discover.py
420
421
422
423
424
425
426
427
428
429
430
431
432
433
437
+++ b/keystoneauth1/tests/unit/identity/test_identity_common.py
332

Looking in the cover directory, I can see whether a line is uncovered by its class:

class="stm mis"

For example:

$ grep n432\" cover/keystoneauth1_discover_py.html | grep "class=\"stm mis\""

432

For the lines above, I can use a seq to check them, since they are in order (with none missing)

for LN in `seq 420 437` ; do grep n$LN\" cover/keystoneauth1_discover_py.html ; done

Which produces:

420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437

I drop the grep "class=\"stm mis\"" to make sure I get something, then add it back in, and get no output, which means none of the new lines are reported as uncovered.
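
For anyone who wants to automate the whole check, here is a rough Python sketch of the same workflow. This is my own illustration rather than part of the original tooling: it assumes tox -e cover has already produced HTML reports under cover/, that report filenames follow the keystoneauth1_discover_py.html pattern shown above, and that missed statements carry class="stm mis"; the function names and the zero-context diff trick are just one way to stitch the steps together.

#!/usr/bin/env python3
# Rough sketch: flag added production lines that the coverage report marks as
# missed. Report naming and CSS classes are assumed to match the walkthrough.
import re
import subprocess


def added_lines(rev="HEAD"):
    # Map each changed file to the line numbers added by the commit, using a
    # zero-context diff so only added and removed lines appear in the hunks.
    diff = subprocess.check_output(["git", "show", "-U0", rev], text=True)
    files, current, right = {}, None, 0
    for line in diff.splitlines():
        if line.startswith("+++ b/"):
            current = line[len("+++ b/"):]
            files[current] = []
        elif line.startswith("@@"):
            right = int(re.match(r"@@ -\d+(?:,\d+)? \+(\d+)", line).group(1))
        elif line.startswith("+") and not line.startswith("+++"):
            files[current].append(right)
            right += 1
    return files


def is_missed(path, lineno):
    # True if the cover/ report marks this line as a missed statement.
    report = "cover/%s.html" % path.replace("/", "_").replace(".", "_")
    try:
        with open(report) as handle:
            for html_line in handle:
                if 'n%d"' % lineno in html_line:
                    return 'class="stm mis"' in html_line
    except FileNotFoundError:
        pass
    return False


if __name__ == "__main__":
    for path, lines in added_lines().items():
        if "/tests/" in path:
            continue  # skip test code, as in the manual walkthrough
        for lineno in lines:
            if is_missed(path, lineno):
                print("uncovered: %s:%d" % (path, lineno))

Run from the repository root after the cover job, it prints nothing when every added production line is covered, which corresponds to the empty output of the final grep above.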

by Adam Young at July 17, 2018 05:35 PM

Chris Dent

TC Report 18-29

Again a relatively slow week for TC discussion. Several members were travelling for one reason or another.

A theme from the past week is a recurring one: How can OpenStack, the community, highlight gaps where additional contribution may be needed, and what can the TC, specifically, do to help?

Julia relayed that question on Wednesday and it meandered a bit from there. Are the mechanics of open source a bit strange in OpenStack because of continuing boundaries between the people who sell it, package it, build it, deploy it, operate it, and use it? If so, how do we accelerate blurring those boundaries? The combined PTG will help, some.

At Thursday's office hours Alan Clark listened in. He's a welcome presence from the Foundation Board. At the last summit in Vancouver members of the TC and the Board made a commitment to improve communication. Meanwhile, back on Wednesday I expressed a weird sense of jealousy of all the nice visible things one sees the foundation doing for the newer strategic areas in the foundation. The issue here is not that the foundation doesn't do stuff for OpenStack-classic, but that the new stuff is visible and over there.

That office hour included more talk about project-need visibility.

Lately, I've been feeling that it is more important to make the gaps in contribution visible than it is to fill them. If we continue to perform above and beyond, there is no incentive for our corporate value extractors to supplement their investment. That way lies burnout. The health tracker is part of making things more visible. So are OpenStack wide goals. But there is more we can do as a community and as individuals. Don't be a hero: If you're overwhelmed or overworked tell your peers and your management.

In other news: Zane summarized some of his thoughts about Limitations of the Layered Model of OpenStack. This is a continuation of the technical vision discussions that have been happening on an etherpad and email thread.

by Chris Dent at July 17, 2018 04:17 PM

OpenStack Superuser

Airship: Making life cycle management repeatable and predictable

Although the name brings to mind a dirigible, Airship is actually a collection of loosely coupled, interoperable open-source tools that are nautically themed. It’s got a stellar crew working together — AT&T, South Korea Telecom, Intel and the OpenStack Foundation — with the aim of making life cycle management for open infrastructure repeatable and predictable.

Here’s a stem-to-stern rundown of what it is and what it can do for you.

What is Airship?

Airship provides for automated cloud provisioning and life cycle management in a declarative, predictable way, says AT&T’s lead cloud architect Alan Meadows. The telecom giant has been building on the foundation laid by the OpenStack-Helm project since 2017. Airship components help create sites, perform minor updates, make configuration changes and manage major software uplifts like OpenStack upgrades. The focus of the project has been on the implementation of a declarative platform to introduce OpenStack on Kubernetes and the life cycle management of the resulting cloud. “Simply put,” says Meadows, “the goal is to take bare metal infrastructure from the data center loading dock to a functioning OpenStack cloud and then care for it beyond.”

Why use Airship?

Airship comes from the perspective that all site definitions should be completely declarative. Teams can manage YAML documents and submit them to the site where they are fully realized by software. End users don’t necessarily have to know what has changed in a YAML document. The full document bundle is submitted, then Airship figures out what needs to be done, whether that’s an initial site deployment or a simple update to an already existing site. Airship uses containers to deliver software, providing the required level of simultaneous coexistence and separation and letting teams handle any number of isolated dependency requirements. It also lets them progress the containers they release through development, testing and finally production without changing them, so they know exactly what is being deployed every time.

Airship allows for a single workflow for all life cycle management. That lets teams interact with the system in the exact same way whether it’s during the initial site deployment or future upgrades to live sites. Finally, Airship is a flexible solution. It’s a single platform that can support large cloud installations, small edge sites and different software-defined networks. It can also manage more than just OpenStack installations.

What tools make up the Airship project?

It must have been fun bestowing such swashbuckling names on all of these projects: Treasure Map is a documentation project. It outlines the various sub-projects and provides a reference architecture for the platform. Drydock is the bare-metal provisioning engine that works with Promenade to label and join the host to a Kubernetes cluster.

Divingbell is a lightweight solution for bare metal configuration, for when elements of an operating system need some extra non-containerized configuration. Deckhand acts as a centralized site design document storage back-end, keeping revision history of all site designs and changes. That allows teams to reuse common site design settings across different types of sites (and keeps the YAML documents compact with reusable elements). Armada is the enterprise solution, letting teams articulate every piece of software to run in a site, where to pull those charts from, in what order they will be deployed, how to group them, how to test them, and how to roll them back if there’s a failure — all in a set of documents that teams can life cycle. Airship also uses Berth as a Kubernetes virtual machine runner, Promenade as a resilient, production-ready Kubernetes cluster generator and Pegleg, a tool to manage, build, aggregate, validate and test YAML documents for Airship document sets. You can check out the entire repository here.

What does an Airship-built cloud look like?

The image above, shown during the presentation, visually explains how Airship manages the entire life cycle of a project. Starting at the bottom and working up, there’s a bare metal host and a host operating system and network and storage configuration that is declared and laid down by Drydock. Promenade effectively handles two aspects of Kubernetes management: the manufacturing of the cluster and its life cycle; it also handles the docker and kubelet configurations on new hosts, as well as joining them to the cluster and labeling them appropriately so they receive the right workloads.

Airship then instantiates Ceph and Calico using Helm charts orchestrated by Armada to provide storage and network capabilities to Kubernetes and the rest of the Airship components. Finally, OpenStack-Helm is used to provide a functioning OpenStack installation. It also logs, monitors, and handles alerts. “The entire stack is installed and life cycled in the exact same way,” says Meadows, “including Airship itself.”

You can see the entire presentation from the OpenStack Summit Vancouver including Rodolfo Pacheco’s explanatory animations and Seungkyu Ahn’s case study from South Korea, below.

Photo // CC BY NC

The post Airship: Making life cycle management repeatable and predictable appeared first on Superuser.

by Rob LeFebvre at July 17, 2018 03:20 PM

Zane Bitter

Limitations of the Layered Model of OpenStack

One model that many people have used for making sense of the multiple services in OpenStack is that of a series of layers, with the ‘compute starter kit’ projects forming the base. Jay Pipes recently wrote what may prove to be the canonical distillation (this post is an edited version of my response):

Nova, Neutron, Cinder, Keystone and Glance are a definitive lower level of an OpenStack deployment. They represent a set of required integrated services that supply the most basic infrastructure for datacenter resource management when deploying OpenStack. Depending on the particular use cases and workloads the OpenStack deployer wishes to promote, an additional layer of services provides workload orchestration and workflow management capabilities.

I am going to explain why this viewpoint is wrong, but first I want to acknowledge what is attractive about it (even to me). It contains a genuinely useful observation that leads to a real insight.

The insight is that whereas the installation instructions for something like Kubernetes usually contain an implicit assumption that you start with a working datacenter, the same is not true for OpenStack. OpenStack is the only open source project concentrating on the gap between a rack full of unconfigured equipment and somewhere that you could run a higher-level service like Kubernetes. We write the bit where the rubber meets the road, and if we do not there is nobody else to do it! There is an almost infinite variety of different applications and they will all need different parts of the higher layers, but ultimately they must be reified in a physical data center and when they are OpenStack will be there: that is the core of what we are building.

It is only the tiniest of leaps from seeing that idea as attractive, useful, and genuinely insightful to believing it is correct. I cannot really blame anybody who made that leap. But an abyss awaits them nonetheless.

Back in the 1960s and early 1970s there was this idea about Artificial Intelligence: even a 2 year old human can (for example) recognise images with a high degree of accuracy, but doing (say) calculus is extremely hard in comparison and takes years of training. But computers can already do calculus! Ergo, we have solved the hardest part already and building the rest out of that will be trivial, AGI is just around the corner, and so on. The popularity of this idea arguably helped create the AI bubble, and the inevitable collision with the reality of its fundamental wrongness led to the AI Winter. Because, in fact, though you can build logic out of many layers of heuristics (as human brains do), it absolutely does not follow that it is trivial to build other things that also require layers of heuristics out of some basic logic building blocks. (In contrast, the AI technology of the present, which is showing more promise, is called Deep Learning because it consists literally of multiple layers of heuristics. It is also still considerably worse at it than any 2 year old human.)

I see the problem with the OpenStack-as-layers model as being analogous. (I am not suggesting there will be a full-on OpenStack Winter, but we are well past the Peak of Inflated Expectations.) With Nova, Keystone, Glance, Neutron, and Cinder you can build a pretty good Virtual Private Server hosting service. But it is a mistake to think that cloud is something you get by layering stuff on top of VPS hosting. It is relatively easy to build a VPS host on top of a cloud, just like teaching someone calculus. But it is enormously difficult to build a cloud on top of a VPS host (it would involve a lot of expensive layers of abstraction, comparable to building artificial neurons in software).

That is all very abstract, so let me bring in a concrete example. Kubernetes is event-driven at a very fundamental level: when a pod or a whole kubelet dies, Kubernetes gets a notification immediately and that prompts it to reschedule the workload. In contrast, Nova/Cinder/&c. are a black hole. You cannot even build a sane dashboard for your VPS—let alone cloud-style orchestration—over them, because it will have to spend all of its time polling the APIs to find out if anything happened. There is an entire separate project, that almost no deployments include, basically dedicated to spelunking in the compute node without Nova’s knowledge to try to surface this information. It is no criticism of the team in question, who are doing something that desperately needs doing in the only way that is really open to them, but the result is an embarrassingly bad architecture for OpenStack as a whole.
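
To make that contrast concrete, here is a minimal sketch (my illustration, not Zane’s; the Keystone endpoint, credentials and poll interval are placeholders) using the kubernetes and python-novaclient libraries. The Kubernetes side registers a watch and is told about changes as they happen; the Nova side has little choice but to re-list servers on a timer and work out the deltas itself.

import time

from kubernetes import client, config, watch
from keystoneauth1 import session
from keystoneauth1.identity import v3
from novaclient import client as nova_client


def watch_pods():
    # Kubernetes: changes are pushed to us as they happen.
    config.load_kube_config()
    core = client.CoreV1Api()
    for event in watch.Watch().stream(core.list_pod_for_all_namespaces):
        print("pod event:", event["type"], event["object"].metadata.name)


def poll_servers(poll_interval=30):
    # Nova: no notification stream is exposed through the end-user API, so a
    # dashboard has to poll the full server list and diff it against what it
    # saw last time.
    auth = v3.Password(auth_url="https://keystone.example.com/v3",  # placeholder
                       username="demo", password="secret", project_name="demo",
                       user_domain_id="default", project_domain_id="default")
    nova = nova_client.Client("2.1", session=session.Session(auth=auth))
    seen = {}
    while True:
        for server in nova.servers.list():
            if seen.get(server.id) != server.status:
                print("server changed:", server.name, server.status)
                seen[server.id] = server.status
        time.sleep(poll_interval)  # latency and API load both scale with this

Both loops run forever and are meant to be read side by side: the first blocks until something actually happens, while the second burns API calls just to find out whether anything did.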

So yes, it is sometimes helpful to think about the fact that there is a group of components that own the low level interaction with outside systems (hardware, or IdM in the case of Keystone), and that almost every application will end up touching those directly or indirectly, while each using different subsets of the other functionality… but only in the awareness that those things also need to be built from the ground up as interlocking pieces in a larger puzzle.

Saying that the compute starter kit projects represent a ‘definitive lower level of an OpenStack deployment’ invites the listener to ignore the bigger picture; to imagine that if those lower level services just take care of their own needs then everything else can just build on top. That is a mistake, unless you believe that OpenStack needs only to provide enough building blocks to build VPS hosting out of, because support for all of those higher-level things does not just fall out for free. You have to consciously work at it.

Imagine for a moment that, knowing everything we know now, we had designed OpenStack around a system of event sources and sinks that are reliable in the face of hardware failures and network partitions, with components connecting into it to provide services to the user and to each other. That is what Kubernetes did. That is the key to its success. We need to enable something similar, because OpenStack is still necessary even in a world where Kubernetes exists.

One reason OpenStack is still necessary is the one we started with above: something needs to own the interaction with the underlying physical infrastructure, and the alternatives are all proprietary. Another place where OpenStack can provide value is by being less opinionated and allowing application developers to choose how the event sources and sinks are connected together. That means that users should, for example, be able to customise their own failover behaviour in ‘userspace’ rather than rely on the one-size-fits-all approach of handling everything automatically inside Kubernetes. This is theoretically an advantage of having separate projects instead of a monolithic design—though the fact that the various agents running on a compute node are more tightly bound to their corresponding services than to each other has the potential to offer the worst of both worlds.

All of these thoughts will be used as fodder for writing a technical vision statement for OpenStack. My hope is that will help align our focus as a community so that we can work together in the same direction instead of at cross-purposes. Along the way, we will need many discussions like this one to get to the root of what can be some quite subtle differences in interpretation that nevertheless lead to divergent assumptions. Please join in if you see one happening!

by Zane Bitter at July 17, 2018 03:17 PM

Stackmasters team

Join us for OpenStack’s 8th Birthday Celebrations!

It’s celebration time! OpenStack is turning 8 years old and if that doesn’t call for a grand get together of “Stackers” then we don’t know what does.

OpenStack: 8th Birthday Celebrations!

Here at Stackmasters, as always, we will be inviting the OpenStack community in Greece to our shared office space at Starttech Ventures HQ in Athens to join in the celebrations.

Here are the juicy details: all comers are welcome to join us this Wednesday, July 18, from 19:00 – 21:00 at Starttech Ventures. There are still a number of open spots for the meetup, but they are going like hotcakes, so RSVP here if you want to join the fun.

OpenStack: onwards and upwards

The Birthday meetup is an event we love to host each July. And it’s not just about celebrating the progress of the OpenStack project with those working with the technology, but also about discussing best practices, tricks of the trade and a few of the zany stories many of you have to tell.

Each year we share the latest news, trends and speculation about the OpenStack project with our local community members, as well as all its users. Additionally, we always welcome with open arms anyone interested in technology and the development of Open Infrastructure projects.

There are similar events organized by local OpenStack communities around the world in coordination with the OpenStack Foundation, so wherever you are you are sure to find one near you.

Athens OpenStack MeetUp Agenda

As is usually the case, it will be a laid-back affair. Here is the agenda of what we have in store for you:

18:45: Welcome and Introduction

19:00: 8 Years of OpenStack – Review by Thanassis Parathyras

19:45: Open discussion and call for future meetings

See you there!

As always, it will be our absolute pleasure to see those of you who have worked or want to get to know OpenStack better, as well as those who want to share the experiences you have with OpenStack and its greater community. A big shout out to the sponsors of the OpenStack Foundation, Stackmasters and Starttech Ventures.

RSVP today. And be there or be square!


The post Join us for OpenStack’s 8th Birthday Celebrations! appeared first on Stackmasters.

by Stackmasters at July 17, 2018 08:10 AM

July 16, 2018

OpenStack Superuser

The power of a moment: Making a difference through mentoring

I’ve been one of the lucky few who has never known a time without a mentor.

I entered the tech community through the Open Cloud Academy, a program in San Antonio run by Rackspace that teaches system administration to anyone who wants to learn regardless of technical background. During my time at the OCA, a handful of volunteers came by to help us learn how to troubleshoot and teach us what it means to be a sys admin. I didn’t know it at the time, but they were my first mentors.

Merriam-Webster defines a mentor as a trusted counselor or guide, but if I’m going to be honest (and I always try to be), for anyone who has had a mentor, the word means so much more. However, I also believe that we’ve been looking at mentorship in the wrong light for far too long. We view mentorships with these formal dynamics and put as much pressure on them to succeed as we do on business deals. That’s why so many mentorships never go beyond the planning stages. I once heard a saying that people come into your life for a reason, a season or a lifetime and that’s the perspective we need to adopt for mentorship.

When Rackers came by the OCA to help, they did so for a reason; they were there to offer guidance to what everyone hoped would be the next crop of Rackspace rookies. There was no formal mentorship program, no official ask of these employees. Some came because they have a volunteer’s heart. Others because they had once been students at the OCA and wanted to repay the favor.

As time passed, it became evident which pairing of students and volunteers worked best. We began to see groups working closer together and exchanging information so they could continue to collaborate outside the OCA. Though some would argue that’s not mentorship, I would argue that it’s time to expand the definition of mentor.

I believe a mentor is someone who can not only help you recognize your strengths, but also help you see the areas where you need to develop to achieve your goals. It’s challenging to create this type of relationship instantly, and because of this, we need to be open to allowing that dynamic to grow organically. Sometimes the first steps to a long-term mentorship can be a few moments where someone gives their time to help foster the growth of someone else. When this happens, the next step requires a moment of vulnerability from the mentee. They have to admit that they could benefit from additional help and then take the risk of asking for it. It can be as simple as saying “I learned so much in our time together. Would you mind if I followed up with you if I have more questions?” This simple comment can help establish the reason.


This moment of vulnerability lets the mentor know that their time was well spent and that their effort has value. However, as mentees, our jobs are not done. We need to make an effort to continue our journey on our own, to put what we have learned to work.

I recently had a chance to test my own advice when I decided to learn to program by writing a small Python script to keep track of my grocery list. When I ran the idea by a colleague, I could instantly see the look of panic, accompanied by the comment “I wish I could help but I already have so much on my plate.” I’ve seen this look before; he thought I was asking to take on the responsibility of teaching me. So a few weeks later, when I ran into a few issues where the code didn’t behave how I thought it should, I took the time to prepare my questions before approaching him for help.

Instead of saying, “Can you help me with my code?” or “Ugh, this is broken, and I don’t know why. Could you help?” I narrowed down the area in which the issue was occurring and asked, “The logic in this if-then statement does not seem to be occurring the way I thought it would. If I send it to you, could you look it over?”   The phrasing of the question allowed him to understand that I wasn’t asking for a significant amount of time debugging my entire program or even asking to teach me how to code. I was merely asking for a few minutes of troubleshooting. The result was better than expected. My colleague was enthusiastic about how much I’d managed to accomplish on my own and he explained how my indentations caused a break in the logic of the code. He also offered a few other suggestions on how to improve the script.
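
For anyone curious what that kind of slip looks like, here is a tiny illustration (not my actual grocery script; the names are made up): moving one line’s indentation changes which part of the if statement it belongs to, and therefore how often it runs.

grocery_list = ["milk", "eggs"]


def add_item(item):
    if item not in grocery_list:
        grocery_list.append(item)
        print("added", item)        # intended to run only for new items
    print("already on the list")    # meant for the other case, but the lost
                                    # indentation means it now runs every time


add_item("bread")  # prints "added bread" *and* "already on the list"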

A week later, after running into another issue implementing one of his recommendations, I asked if he’d look again. Because I’d previously shown that I was not only willing to put the work in myself but also valued input, his willingness to help was genuine. The mentorship relationship had been seeded without any formal commitment. In the two months since I decided to learn to program, this colleague has found a renewed interest in strengthening his own Python skills along with becoming one of my most prominent advocates and mentors.

The point of sharing my story is to challenge you to not look at mentorship as a formal agreement between two people, but to look for it in the small moments where someone offers to help. Be willing to open up and perhaps feel a moment of vulnerability by admitting how much you could benefit from continued support. The OpenStack and Open source community is full of talented, caring individuals who want to help the community grow; we have to be willing to invest the time to help these relationships evolve.

How to get involved

The OpenStack Mentoring program was recently relaunched. The new program will focus on providing mentorship through goal-focused cohorts of mentors. This change will allow mentoring responsibilities to be shared among each group’s mentors. Mentoring cohorts allow mentees to access a broader knowledge base while enabling them to form connections and help other members of the group. Sign up as a mentor or mentee here.

About the author

Ell Marquez has been part of the open-source family for a few years now. In this time, she has found the support needed from her mentorship relationships to grow from a Linux Administrator to an OpenStack technical trainer at Rackspace. Recently, she took the leap to join Linux Academy as a technical evangelist.

Have you been a mentor or mentee? Superuser wants to hear your story, get in touch: editorATsuperuser.org

Photo // CC BY NC

The post The power of a moment: Making a difference through mentoring appeared first on Superuser.

by Ell Marquez at July 16, 2018 03:13 PM

Cisco Cloud Blog

Building a User Group Everyone Wants to Join

I’m fairly in awe of Lisa-Marie Namphy. She runs the world’s largest OpenStack user group and has had a hand in starting and/or running at least a dozen other tech meetups—all while holding down a very real full-time job...

by Ali Amagasu at July 16, 2018 01:11 PM

About

Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.

Last updated:
August 16, 2018 04:22 AM
All times are UTC.
