October 23, 2018

OpenStack Superuser

Rocking the innovation revolution with open source

PARIS — Most tech conferences rely on coffee to keep attendees alert, but OVH chairman and founder Octave Klaba picked up a guitar and hammered out a Metallica cover. Klaba then set the guitar down and returned to the stage to kick off the event, outlining the revolution his company is leading to a standing-room-only audience of 7,000 people, including livestream viewers.

Innovation for Freedom

Today’s explosion of data is fueling a revolution, Klaba says. OVH is riding this wave by changing its motto from “Innovation is Freedom” to “Innovation for Freedom.” Although only the shortest word in the slogan was tweaked, the shift is significant: it represents a focus on what their customers and their customers’ users want to achieve, leveraging open source and the OVH product portfolio to deliver innovation that opens doors and keeps them open.

As a first step, OVH has been working closely with their four million clients to learn about their use cases, gather feedback and develop four “universes” that encompass their use cases and product portfolio.

OVH Market

  • Digital toolbox for companies with 20-30 employees. These organizations need to be able to work better together and work better with their clients. They will have access to IP, email services and customer relationship management. The toolbox improves productivity and provides a digital workflow.

OVHSpirit

  • Core of legacy activity. This is infrastructure for people who are into hardware and networks so they have the tools they need to build their private cloud.

OVHStack

  • This is the OpenStack-powered public cloud intended for dev-ops—it’s an API-driven world. It’s growing every day and, while OVH may be behind the big three in the United States, Klaba says they are catching up.

OVHEnterprise

  • OVHEnterprise is for big companies who need OVHSpirit or OVHStack, but require a much larger scale.

“You can grow your experience depending on where you are in your digital transformation journey,” Klaba said. “With the universes, we have the foundation to address the specificities of each partner so that you are successful too.” While this strategy is a worldwide initiative, Klaba assured the audience that the execution would remain localized to account for regional legal restrictions.

Klaba turned to the audience and asked if anyone did not see themselves represented by the new strategy. Although zero hands were raised, he went the extra step of putting his email address on the keynote screen to welcome any feedback or concerns.

Retaining the DNA

To connect Klaba’s vision to day-to-day activities, OVH CEO Michel Paulin underscored how the personality of OVH remains constant despite the shift in strategy. He assured attendees that OVH’s rapid growth will continue, delivering the best cloud at the best prices.

“Thanks to this DNA, OVH’s cloud is different, it’s smart,” Paulin said, explaining that “smart” is an acronym representing the company’s innovation-driven approach to a cloud distributed across 28 data centers worldwide.

The OVH cloud strategy is:

  • Simple, easy-to-deploy. The four universes allow users to have all of the tools they need to deploy and move applications easily.
  • Multi-local: OVH has support across four continents, plus partnerships.
  • Accessible.
  • Reversible – your cloud should be flexible and liberated. OVH works with an open cloud foundation to ensure that the cloud is open so that customers have choice.
  • Transparent: OVH shoulders the responsibility for data – their customers’ data and their customers’ clients’ data.

Turning to the audience, Paulin reminded them that this sense of innovation should empower them. “Migrating your apps should not imprison you in an irreversible model,” he said.

Turning to OVH’s own innovation, he discussed large-scale water cooling as a strategic initiative the company is investing in, along with robots in its server-building factories and ongoing research and development. OVH has invested 300 million euros (roughly $344 million USD) to continue this growth. He assured the audience that the company will keep growing without losing its DNA, saying that the investment of today is the growth of tomorrow.

“The OVH team is mobilized,” he said. “With you, thanks to you, we’re going to be disruptive. It’s going to be seriously, seriously disruptive.”

Cool, clear water

Most people know OVH as a leader in cloud services, with a substantial OpenStack public cloud footprint. What they don’t know is that OVH also makes their own physical servers – a significant feat that only a handful of companies (tech titans like Google, Microsoft and Facebook) are bold enough to attempt.

François Sterin, OVH EVP and chief industrial officer, took the stage to discuss the innovation behind the company’s hardware, announcing that OVH recently produced its one millionth server. The company also created a new robotized factory to sustain future server production needs, giving it the velocity required to bring features to users faster.

“Our teams are motivated and we are playing on a worldwide stage,” Sterin said.

His proof lies in the water cooling. To create a more efficient energy consumption model, OVH cools its processors with water to overcome heat density, a need that grows with use cases like artificial intelligence (AI) and big data, which consume more and more CPU.

“We just invented autonomous bays, taking the water cooling behind the rack and added cooling doors so the servers are completely independent from the outside environment,” he said. “Agility, simplicity – this is all in our model. This is why we are called industrial and not just infrastructure.”

“Forever Trust in Who We Are, Nothing Else Matters”

To close out the keynotes, Klaba circled back to the inspiration behind OVH’s innovative approach to technology.

When he tells people about his company and his goals, they generally sympathize, but peg him as another hopeless dreamer. Competing with giants while being based in Europe is seen as impossible.

“Everyday there’s a challenge and yes, it’s work and it’s not always easy,” he said. “Europeans need to change the paradigm and change the way we work, not just replicating the US or Asia. Attempting the impossible is not crazy.”

Connecting to open source, Klaba discussed how open source is an ecosystem that is based on a sense of trust that cannot be achieved with proprietary solutions.

“A standard belonging to the community will generate more trust than what a proprietary solution can get on the stock exchange,” he said. “Here there’s momentum and we need to create something and we need to create it together.” He went on to encourage the creation of a virtual European giant of the Internet, based on a network of smaller European players linked by trust more than capitalistic ties. People may think this is a crazy objective, but it was definitely crazy to start OVH from nothing in 1999 and grow it into the giant it is today.

To drive the point home, Klaba launched into a guitar solo and was then joined by the other band members for the only appropriate keynote closing: a hard-charging rendition of Metallica’s “Nothing Else Matters.”

#impossible is not crazy! –#OVHsummit 2018 Tks Octave!! pic.twitter.com/HozIMortfL

— Vincent Carre (@vincarre) October 18, 2018

The post Rocking the innovation revolution with open source appeared first on Superuser.

by Allison Price at October 23, 2018 02:00 PM

Trinh Nguyen

At the OpenStack Korea User Group last Friday (19 Oct 2018)

Last Friday, I had an opportunity to tell the story of Searchlight to the OpenStack Korea User Group in Seoul. My ultimate goal is to attract new contributors and to revive Searchlight. In just 30 minutes I walked people through the history of Searchlight, its architecture and the current situation. Everybody seemed to get the idea of why Searchlight needs their help. Even though not all of the attendees could understand my English, with the help of Ian Y. Choi, the organizer and a core member of the OpenStack Docs and I18N teams, the communication was great. Hopefully, I will have another chance to discuss Searchlight further with everybody.

by Trinh Nguyen (noreply@blogger.com) at October 23, 2018 09:00 AM

October 22, 2018

OpenStack Superuser

Inspiring the next generation of contributors to open infrastructure

Open source is fueled by the ongoing arrival of new contributors who offer fresh talent and diverse perspectives. Mentorship programs are critical in inspiring this next generation of contributors, and they enable those more experienced within a community to give back. The number and variety of mentorship programs that serve the OpenStack community is impressive, designed to fit a vast range of time and resource commitments and address the needs of newcomers, regardless of their entry point into the community.

Participants at the mentoring session of the Vancouver Summit.

One of these programs—the Speed Mentoring Workshop—was kicked off by the Women of OpenStack at the Austin Summit, and has since become a mainstay at the summits. Usually held towards the beginning of each conference, it’s a great way for newcomers to kick off the week, and it gives mentors a way to ‘pay it forward’ without an extensive time commitment. Featuring multiple 15-minute rotational rounds across career, community and technical tracks, these workshops are designed to address a wide range of needs and interests among those new to the community, or perhaps new to different teams and groups within the larger OpenStack community.

How to get involved

As we look forward to the OpenStack Summit in Berlin, we’re excited about hosting another workshop there to bring mentors and mentees together. These speed mentoring workshops would not be possible without a wealth of top-notch mentors who are generous with their time, knowledge and expertise, nor without eager mentees willing to dive in, roll up their sleeves and contribute to the vibrant OpenStack community. Please join us and participate—we look forward to seeing you there!

OpenStack Summit Berlin

Speed Mentoring Lunch

Tuesday, November 13, 12:30-1:40 pm

Hall 7, Level 1, Room 7.1b / London 1

Click here for more details!

 

What people are saying about these programs

“These speed mentoring sessions allow attendees to make contact with individuals willing to answer their questions long past the end of the session,” Ell Marquez, mentee.

“A healthy community survives through its members, which is why the speed mentoring sessions actively prepare go-getting team players and future leaders. It also reminds me of my responsibilities towards others; how to build a healthy community,” Armstrong Foundjem, open source advocate.

“Passing on the wisdom learned from years of experience is an important element of this speed mentoring event. And both the mentor and mentee benefit from continuing and sustaining open source knowledge,” Nithya Ruff, senior director, Comcast Open Source Practice & Board Director, Linux Foundation.

“The breadth of people participating as mentors reflects the interest of our fellow stackers to help the next generation of stackers feel part of the community. The mentoring process offered during the OpenStack Summit is a user-friendly means to present fellow stackers with the tools, technologies and human connections to ease their growth in the community, find the right project, SIG or community to best use the talents they are offering,” Martial Michel, Ph.D., Chief Scientific Officer at Data Machines Corp., OpenStack Scientific Special Interest Group co-chair.

“Last year, Nalee Jang, a previous Korea user group leader guided me to attend a Women-of-OpenStack event during Boston Summit 2017. Thanks to her, I am now a successful leader in Korea user group. It has been great to share how diversity has affected my community career, and how to get more involved in OpenStack projects like Internationalization team (another diversity part),” Ian Y. Choi, mentor.

About the author
Nicole Huesman is a community & developer advocate at Intel. In this role, she works to increase awareness and strengthen the impact of Intel’s role in open source across cloud, containers, IoT, robotics and the web, through solid marketing strategies and cohesive storytelling.

The post Inspiring the next generation of contributors to open infrastructure appeared first on Superuser.

by Nicole Huesman at October 22, 2018 02:08 PM

October 19, 2018

OpenStack Superuser

Pairing OpenStack and open source MANO for NFV deployments

OpenStack for NFV

OpenStack is best known as the largest pool of open-source projects that collectively form a software platform for cloud computing infrastructure, used widely in private cloud deployments by many enterprises. Since the introduction of NFV by ETSI, OpenStack has also emerged as a key infrastructure platform for NFV. In most NFV deployments, OpenStack is used at the VIM (virtual infrastructure manager) layer to provide a standardized interface for managing, monitoring and assessing all resources within the NFV infrastructure.

Various OpenStack projects (like Tacker, Neutron, Nova, Astara, Congress, Mistral, Senlin, etc.) are capable of managing the virtualized infrastructure components of an NFV environment. As an example, Tacker is used to build a generic VNF manager (VNFM) and NFV orchestrator (NFVO), which help in the deployment and operation of VNFs within the NFV infrastructure. Additionally, integrating OpenStack projects brings various features to the NFV infrastructure, including performance features like huge pages, CPU pinning, NUMA topology and SR-IOV, as well as service function chaining, network slicing, scalability, high availability, resiliency and multi-site enablement.
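To make those performance features concrete, here is a minimal sketch of how huge pages and CPU pinning are typically exposed to VNFs through Nova flavor extra specs. The flavor name and sizing below are illustrative assumptions; the hw:* properties are standard Nova extra specs, though the exact values depend on your deployment.

# create a flavor sized for a VNF workload (name and sizing are examples)
openstack flavor create --ram 8192 --vcpus 8 --disk 40 vnf.large

# pin vCPUs to dedicated host cores and back the guest with 1 GB huge pages
openstack flavor set vnf.large \
    --property hw:cpu_policy=dedicated \
    --property hw:mem_page_size=1GB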

Telecom service providers and enterprises including AT&T, China Mobile, SK Telecom, Ericsson, Deutsche Telekom, Comcast and Bloomberg have implemented their NFV environments with OpenStack.

Open Source MANO (OSM) for NFV

The MANO layer is responsible for orchestration and complete lifecycle management of hardware resources and virtual network functions (VNFs). In other words, the MANO layer coordinates NFV infrastructure (NFVI) resources and maps them efficiently to the various VNFs. There are various options available as the software stack for MANO, but ETSI-hosted OSM is widely preferred due to its active community, mature framework, production readiness, ease of getting started and the constant stream of use cases contributed by members.

The virtual network functions (VNFs) that form a network service may need updates to add features or patch functionality. OSM provides a way to invoke the VNF upgrade operation with minimal impact on the running network service.

With the continuous community support and involvement for feature innovation, OSM has now evolved to bring CI/CD (continuous integration and continuous delivery) framework at MANO layer.

The latest release of OSM (Release FOUR) brought a large set of features and enhancements to the OSM framework, improving functionality, user experience and maturity, and enabling various NFV MANO enhancements from a usability and interoperability perspective.

OSM has steadily adopted cloud-native principles and can easily be deployed in the cloud, as the installation is container-based and runs with the help of a container orchestration engine. A new northbound interface, aligned with the ETSI NFV specification SOL005, provides dedicated control of the OSM system. Monitoring and closed-loop capabilities have also been enhanced.

The next version, OSM Release FIVE, is expected to launch in November 2018 and arrive bundled with more 5G-related features, like network slicing and container-based VNFs.

Why OpenStack + open source MANO for the MANO layer in NFV?

Both OpenStack and OSM have large communities that innovate rapidly for NFV, with strong contributions from companies to enhance current features and develop new capabilities for the core projects under them.

In the case of NFV, OpenStack standardizes the interfaces between NFV elements and the infrastructure. OpenStack is used in commercial solution offerings by companies like Canonical/Ubuntu, Cisco, Ericsson, Huawei, IBM, Juniper, Mirantis, Red Hat, SUSE, VMware and Wind River. A large percentage of VIM deployments are based on OpenStack due to the simplicity of handling and operating the various projects that provide storage, compute and networking for the NFVI.

With its last two releases (THREE and FOUR), OSM has evolved substantially to support a cloud-native approach by bringing CI/CD frameworks into the orchestration layer. OSM’s cloud readiness is a key benefit, alongside OpenStack’s proven architecture for private as well as public clouds. Deploying OSM into an NFV infrastructure has become very lean: one can start by importing Docker containers into production. OpenStack, in turn, is known for making it simple to manage virtualized and containerized infrastructure. Organizations can realize the full benefits of an NFV MANO built on OSM and OpenStack thanks to this lean, simple management and deployment.
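As a rough illustration of how the two fit together, the OSM client lets you register an OpenStack cloud as a VIM account that network services are then deployed onto. The commands below are a sketch based on the OSM Release FOUR client; the names, credentials and exact flags are assumptions and may differ in your environment.

# register an OpenStack cloud as a VIM in OSM (values are placeholders)
osm vim-create --name my-openstack-vim \
    --account_type openstack \
    --auth_url http://controller:5000/v3 \
    --user osm_user --password secret --tenant osm_project

# verify the VIM account was registered
osm vim-list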

References

https://www.openstack.org/assets/presentation-media/Achieving-end-to-end-NFV-with-OpenStack-and-Open-Source-MANO.pdf

https://osm.etsi.org/images/OSM-Whitepaper-TechContent-ReleaseFOUR-FINAL.pdf 

https://www.openstack.org/assets/telecoms-and-nfv/OpenStack-Foundation-NFV-Report.pdf


This article is based on a session by Gianpietro Lavado (solution architect, Whitestack) at the OpenStack Summit 2018 in Vancouver. He is also a leading contributor to ETSI Open Source MANO.

For more on NFV and OpenStack, check out the dedicated track at the upcoming Summit Berlin.

About the author

Sagar Nangare, a digital strategist at Calsoft Inc., is a marketing professional with over seven years of experience in strategic consulting, content marketing and digital marketing. He’s an expert in technology domains like security, networking, cloud, virtualization, storage and IoT.

This post first appeared on the Calsoft blog.

The post Pairing OpenStack and open source MANO for NFV deployments appeared first on Superuser.

by Superuser at October 19, 2018 02:06 PM

Chris Dent

Placement Update 18-42

After a gap from when I was away last week, here's this week's placement update. The situation this week remains much the same as last week: focus on specs and the bigger issues associated with extraction.

Most Important

The major factors that need attention are managing database migrations and associated tooling and getting the ball rolling on properly producing documentation. More on both of these things in the extraction section below.

What's Changed

mnaser found an issue with the migrations associated with consumer ids. A fix was created in nova and ported into placement, but it raised some questions about what to do with those migrations in the extracted placement. Some work also needs to be done to make sure the solutions will work in PostgreSQL, as it might tickle the way PostgreSQL is stricter about GROUP BY clauses.

Bugs

Specs

There's a spec review sprint this coming Tuesday. This may be missing some newer specs because I got exhausted keeping tabs on the ones that already exist.

Main Themes

Making Nested Useful

Work on getting nova's use of nested resource providers happy and fixing bugs discovered in placement in the process. This is creeping ahead, but feels somewhat stalled out, presumably because people are busy with other things.

I feel like I'm missing some things in this area. Please let me know if there are others. This is related:

Extraction

There continue to be three main tasks in regard to placement extraction:

  1. upgrade and integration testing
  2. database schema migration and management
  3. documentation publishing

The upgrade aspect of (1) is in progress with a patch to grenade and a patch to devstack. This is very close to working. A main blocker is needing a proper tool for managing the creation and migration of database tables (more below).

My experiments with using gabbi-tempest are getting a bit closer.

Successful devstack is dependent on us having a reasonable solution to (2). For the moment a hacked up script is being used to create tables. Ed has started some work on moving to alembic.
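For context, here is a sketch of how alembic-driven migrations are usually operated once that work lands; the revision message is illustrative and this is not placement's actual tooling, which was still being decided at the time.

# generate a new (empty) revision describing a schema change
alembic revision -m "add consumer generation column"

# apply all pending migrations to the configured database
alembic upgrade head

# roll back one revision if something goes wrong
alembic downgrade -1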

We have work in progress to tune up the documentation but we are not yet publishing documentation (3). We need to work out a plan for this. Presumably we don't want to be publishing docs until we are publishing code, but the interdependencies need to be teased out.

Other

Various placement changes out in the world.

End

Hi!

by Chris Dent at October 19, 2018 12:00 PM

October 18, 2018

OpenStack Superuser

How open source communities are coming together to build open infrastructure

At the upcoming Summit Berlin, you’ll find a large contingent of open-source projects meeting up to work together. They include Ansible, Ceph, Docker, Kata Containers, Kubernetes, ONAP, OpenStack, Open vSwitch, OPNFV and Zuul. Also check out the open source community track which covers community management, diversity and inclusion, mentoring, open source governance, ambassadors and roadmap development.

Here are a few picks from the packed schedule of 200 sessions plus workshops:

Kubernetes Administration 101: From zero to hero

Do you already know OpenStack and some Docker but are new to Kubernetes? This daylong hands-on training will teach you the main concepts and daily administration tasks of Kubernetes and boost your career.

Taught by Laszlo Budai of Component Soft Ltd., it will cover Linux containers and Kubernetes, accessing Kubernetes and access control, workloads, accessing applications and persistent storage. Space is limited and RSVP is required. Details here.
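For a taste of the daily administration tasks such a training covers, a minimal kubectl session might look like the sketch below; the deployment name and image are arbitrary examples.

# deploy a sample application and expose it inside the cluster
kubectl create deployment hello --image=nginx
kubectl expose deployment hello --port=80 --type=NodePort

# everyday inspection commands
kubectl get pods,svc
kubectl describe deployment hello
kubectl logs deployment/hello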

The evolution of Open vSwitch integration for OpenStack

Open vSwitch (OVS) has been an important component of the most commonly used networking backend for OpenStack Neutron for several years. Both the OpenStack and Open vSwitch projects have evolved quite a bit. OVN (Open Virtual Network) is a new implementation of virtual networking from Open vSwitch that can be used by OpenStack, but also by other projects such as Kubernetes.

This session with Daniel Alvarez Sanchez and Numan Siddique of Red Hat walks through the journey of Open vSwitch in OpenStack and covers the latest state of the OVN integration with OpenStack. They will discuss how OVN differs from the original OVS and OpenStack integration, as well as how to migrate an existing deployment to OVN. Details here.
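For readers new to OVN, the logical network state that Neutron programs lives in OVN's northbound database and can be inspected with ovn-nbctl; a quick sketch, assuming you run it on a node with access to the northbound database:

ovn-nbctl show       # summary of logical switches, ports and routers
ovn-nbctl ls-list    # logical switches, roughly one per Neutron network
ovn-nbctl lr-list    # logical routers created for Neutron routers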

Zuul at BMW: Large scale automotive software development

Since the introduction of software in cars, the complexity of automotive software has been constantly rising. BMW manages part of that complexity with continuous integration (CI) systems by automating all stages of the software lifecycle, but some huge software projects, like autonomous driving, are starting to be limited by the performance of available CI solutions. Tobias Henkel will give an overview of the CI requirements for software projects at BMW and how the automaker uses Zuul to develop software at large scale. Details here.

Open source orchestrators for NFV – What’s going on?

There are so many orchestrators that operators wonder whether it makes sense to build one or choose from competing models including ONAP, OSM, OPNFV, Tacker, TOSCA, YAML and NETCONF. In the rapidly changing landscape of open-source orchestration for NFV, it’s easy to get confused about which project is focused on what, and what each community’s strengths, weaknesses and key areas of focus are.

Join Vanessa Little, Layer123 NFV Advisory Panel committee member, for an interactive session that offers a clear-eyed view of what’s out there and where these projects are headed. Details here.

Artificial intelligence-based container monitoring, healing and optimizing

Monitoring the container ecosystem becomes critical for large business applications with complex use cases, making it challenging for humans to troubleshoot problems. Though there are traditional monitoring tools available for containers, recurring problems caused by business use-case flows cannot be monitored or healed by traditional Docker monitoring systems.

With the help of AI, containers can be monitored in a way where operators can inject rules depending on their use case, saving humans from troubleshooting. Moreover, AI gives use-case architects the flexibility to define their commonly known problems in a Docker environment and define rules to mitigate them dynamically through better prediction algorithms, gradually optimizing the containers over the long run.

This presentation from Cisco’s Sreekrishna Senthilkumar, Aman Sinha and Sachin Joshi shows how to use AI in the container ecosystem to address the business aspects of container monitoring and healing. Details here.

Spectre/Meltdown at eBay Classifieds Group: Rebooting 80,000 cores

eBay Classifieds Group has a private cloud distributed across two geographical regions (with plans for a third), around 1,000 hypervisors and a capacity of 80,000 cores.

After Spectre and Meltdown, the team needed to patch the hypervisors in four availability zones per region with the latest kernel, KVM version and BIOS updates. During these updates the zones were unavailable and all the instances restarted automatically. The entire process was automated using Ansible playbooks created internally, using the OpenStack API to drive the operations.
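The talk covers eBay's own Ansible tooling, but as a rough sketch of the kind of OpenStack calls such a maintenance involves, one might disable scheduling on a hypervisor and check its instances before rebooting it; the host and service names here are illustrative assumptions.

# take a hypervisor out of scheduling before patching
openstack compute service set --disable \
    --disable-reason "Spectre/Meltdown patching" compute-01 nova-compute

# list the instances running on that host before the reboot
openstack server list --all-projects --host compute-01

# re-enable the host once kernel, KVM and BIOS updates are applied
openstack compute service set --enable compute-01 nova-compute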

Bruno Bompastor and Adrian Joian will walk attendees through all the work done to shut down, update and successfully boot a fully patched infrastructure without data loss. They’ll also go into the OpenStack challenges, missing features and workarounds, and will discuss the management of their SDN (Juniper Contrail) and LBaaS (Avi Networks) when restarting this massive number of cores. Details here.


The post How open source communities are coming together to build open infrastructure appeared first on Superuser.

by Superuser at October 18, 2018 03:48 PM

Fleio Blog

Fleio billing 1.1 adds OpenStack Rocky support and domain name registration

We’ve just released Fleio billing 1.1 introducing support for OpenStack Rocky, domain name registration and many more features. This is the second Fleio stable release and it marks our change in direction to a more general billing solution for cloud computing and classic web hosting solutions. Main new features include: We now officially support Rocky, […]

by adrian at October 18, 2018 06:10 AM

Adam Young

Creating a Self Trust In Keystone

Let’s say you are an administrator of an OpenStack cloud. This means you are pretty much all-powerful in the deployment. Now you need to perform some operation, but you don’t want to do it with full admin privileges. Why? Well, do you work as root on your Linux box? I hope not. Here’s how to set up a self trust for a reduced set of roles on your token.

First, get a regular token, but use the --debug option to see what the project ID, role ID and your user ID actually are:

In my case, they are … long uuids.
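As a sketch, that debug run and, alternatively, direct lookups of the same IDs with the client (the project, role and user names below are placeholders):

$ openstack token issue --debug
$ openstack project show my-project -f value -c id
$ openstack role show _member_ -f value -c id
$ openstack user show my-user -f value -c id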

I’ll trim them down both for obscurity as well as to make it more legible. Here is the command to create the trust.

openstack trust create --project 9417f7 --role 9fe2ff 154741 154741

Mine returned:

+--------------------+----------------------------------+
| Field              | Value                            |
+--------------------+----------------------------------+
| deleted_at         | None                             |
| expires_at         | None                             |
| id                 | 26f8d2                           |
| impersonation      | False                            |
| project_id         | 9417f7                           |
| redelegation_count | 0                                |
| remaining_uses     | None                             |
| roles              | _member_                         |
| trustee_user_id    | 154741                           |
| trustor_user_id    | 154741                           |
+--------------------+----------------------------------+

On my system, role ID 9fe2ff is the _member_ role.

Note that, if you are Admin, you need to explicitly grant yourself the _member_ role, or use an implied role rule that says admin implies member.
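A sketch of both options, reusing the trimmed IDs from above; the implied-role command assumes a reasonably recent client and that you are allowed to manage implied roles.

# option 1: explicitly grant yourself _member_ on the project
$ openstack role add --user 154741 --project 9417f7 _member_

# option 2: declare that admin implies _member_
$ openstack implied role create admin --implied-role _member_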

Now, you can get a reduced scope token. Unset the variables that are used to scope the token, since you want to scope to the trust now.

$ unset OS_PROJECT_DOMAIN_NAME 
$ unset OS_PROJECT_NAME 
$ openstack token issue --os-trust-id  26f8d2eaf1404489ab8e8e5822a0195d
+------------+----------------------------------+
| Field      | Value                            |
+------------+----------------------------------+
| expires    | 2018-10-18T10:31:57+0000         |
| id         | f16189                           |
| project_id | 9417f7                           |
| user_id    | 154741                           |
+------------+----------------------------------+

This still requires you to authenticate with your user ID and password. An even better mechanism is the new Application Credentials API. It works much the same way, but you use a separate, explicitly created secret instead of your password. More about that next time.
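As a small preview, a sketch of creating an application credential limited to the _member_ role; this assumes a Queens-or-newer cloud and a recent client, and the credential name is arbitrary. The command returns an ID and secret that can be used to authenticate without exposing your user password.

$ openstack application credential create monitoring-cred --role _member_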

by Adam Young at October 18, 2018 02:44 AM

October 17, 2018

OpenStack Superuser

What’s the value of contributing to open source?

Nordix is a new effort to encourage more organizations in Nordic countries to participate in open source and drive global innovation.

Johan Christenson, CEO of City Network, and Chris Price, president of Ericsson Software Technology, shared their personal passion for open source and introduced the organization they recently founded at the OpenStack Days Nordic event.

According to Christenson, very few enterprises in the region engage with open source. Most are consuming proprietary software and services, which means they are missing out on innovation and opportunities. Without open source, technology choices and even business models are much more limited.

Christenson reminded the audience of the open-source legacy from the Nordics, including both Linux and MySQL. He also pointed to Ericsson as a shining example when it comes to open source contributors in the region.

But that can be a challenge for Price as he tries to spread the open-source message. When he talks to other organizations about getting involved, they often say “sure, but you’re Ericsson. You have the scope and scale. We can’t do that, because we only have one developer.”

“But there are lots of organizations with ‘one developer,’” said Price. “Our mission is to make it easier for them to get involved and drive this innovation.” And contributing doesn’t just mean code. There are plenty of other ways to get involved, including engaging in community mailing lists or events to make sure the right use cases are represented.

So what’s the value of using and contributing to open source software?

By contributing, you help influence the software roadmap and have more control over landing the features you need for your use case.

But it’s more than simply landing code. If you can articulate your use case and requirements, it’s an opportunity to collaboratively solve your problems. As a result, you don’t just get your feature, but can influence the community’s thinking, which will have a broader impact on future development because they will have more knowledge and context for your use case and approach.

There’s also the benefit of learning how to better operate and use the open source technology. Quoting professor Frank Nagle, who has done a significant amount of research about open source and crowdsourcing at Harvard Business School, “companies who contribute and give back learn how to better use the open-source software in their own environment.”

Nordix will focus on education and making it easier for people to get started contributing. One example is Upstream Institute, a hands-on workshop to help new open source contributors get set up with the right tools and land their first patch. The program was run by the OpenStack Foundation and community volunteers and was hosted alongside OpenStack Days Nordic. The next chance to participate (for free!) will be November 11-12 at the Berlin Summit.

If you are in the Nordic region and want to encourage more open source collaboration, check out www.nordix.org.

The post What’s the value of contributing to open source? appeared first on Superuser.

by Lauren Sell at October 17, 2018 02:07 PM

NFVPE @ Red Hat

Setup an NFS client provisioner in Kubernetes

Setup an NFS client provisioner in Kubernetes One of the most common needs when deploying Kubernetes is the ability to use shared storage. While there are several options available, one of the most common and easiest to set up is an NFS server. This post will explain how to set up a dynamic NFS client provisioner on Kubernetes, relying on an existing NFS server on your systems. Step 1. Setup an NFS server (sample for CentOS) First thing you will need, of course, is an NFS server. This can be easily achieved with a few easy steps: Install the nfs package:…
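A rough sketch of that first step on CentOS follows; the package and service names are standard, but the export path and options are illustrative assumptions.

# install and enable the NFS server
sudo yum install -y nfs-utils
sudo systemctl enable nfs-server
sudo systemctl start nfs-server

# export a directory for the provisioner to use
sudo mkdir -p /srv/nfs/kubedata
echo "/srv/nfs/kubedata *(rw,sync,no_root_squash)" | sudo tee -a /etc/exports
sudo exportfs -rav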

by Yolanda Robla Mota at October 17, 2018 01:14 PM

The Official Rackspace Blog

How Private Cloud as a Service Can Help Your Security Posture

Cloud computing has transformed how organizations need to think about securing their data and assets. As businesses continue to adopt cloud-based initiatives to support artificial intelligence, internet of things, big data and more, they remain apprehensive about protecting critical assets. After all, as our Chief Operations and Product Officer David Meredith noted in a post […]

The post How Private Cloud as a Service Can Help Your Security Posture appeared first on The Official Rackspace Blog.

by Dan Houdek at October 17, 2018 12:00 PM

October 16, 2018

OpenStack Superuser

Vote now for the Berlin Summit Superuser Award

Who do you think should win the Superuser Award for the Berlin Summit? Cast your vote before October 21 at 11:59 p.m. Pacific Standard Time.

When evaluating the nominees for the Superuser Award, take into account the unique nature of use case(s), as well as integrations and applications of OpenStack by each particular team.

Check out highlights from the five nominees and click on the links for the full applications:

  • Adform, CloudServices Team
    “We have three OpenStack deployments for different tiers in seven regions all over the world. In total there are over 4,500 VMs on over 200 hosts. It’s used by several hundred company developers to provide service to millions of users.”
  • City Network’s R&D, Professional Services, Education and Engineering teams
    “We run our public OpenStack based cloud in eight regions across three continents. All of our data centers are interconnected via private networks. In addition to our public cloud, we provide a pan-European cloud for verticals where regulatory compliance is paramount (e.g. banking and financial services, government, healthcare) addressing all regulatory challenges. Over 2,000 users of our infrastructure-as-a-service solutions run over 25,000 cores in production.”
  • Cloud&Heat
    “Cloud&Heat developed their own server hardware, water-cooled and operated by OpenStack. The heat of the water is used for energy optimization in households, offices and commercial enterprises, making it a huge use case for how OpenStack can help save the planet.” Cloud&Heat was nominated by a third party and did not submit a complete application. Judges will take their partial application into account when evaluating the nominees. More on how they operate here.
  • Linaro
    “Thanks to OpenStack and Ceph we have been able to share hardware with different open source teams and vendors that are trying to make their software multi-architecture aware…In addition, we are changing the culture of cross compiling for Arm architecture, assisting anyone that has to cross compile their Arm binaries to build natively.”
  • ScaleUp Technologies
    “We currently have three production OpenStack installations in both Hamburg and Berlin, Germany. Together with another hosting partner we are currently building a third OpenStack cloud in Dusseldorf, Germany. These infrastructures will soon be connected via a dedicated 10-gigabit backbone ring. Earlier this year, we started working on some edge-related activities and are now building OpenStack-based edge clouds on hyper-converged hardware for customers.”

Cast your vote here! The deadline is Sunday, October 21 at 11:59 p.m. Pacific Standard Time.

Previous winners include AT&T, CERN, China Mobile, Comcast, NTT Group, the Ontario Institute for Cancer Research and the Tencent TStack Team.

The Berlin Summit Superuser Awards are sponsored by Zenko, the open source multi-cloud data controller.

The post Vote now for the Berlin Summit Superuser Award appeared first on Superuser.

by Superuser at October 16, 2018 05:43 PM

Berlin Superuser Awards Nominee: Adform, CloudServices Team

It’s time for the community to help determine the winner of the OpenStack Berlin Summit Superuser Awards, sponsored by Zenko. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

Adform, CloudServices Team is one of  five nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline October 21 at 11:59 p.m. Pacific Standard Time.

Cast your vote here!

Who is the nominee?

Adform, CloudServices Team

Team members: Edgaras Apšega, Andrius Cibulskis, Donatas Fetingis, Dovilas Kazlauskas, Jonas Nemeikšis, Arvydas Opulskis, Tadas Sutkaitis, Matas Tvarijonas.

How has open infrastructure transformed your business? 

OpenStack, together with other open source solutions, enabled a unified user experience across all tiers (dev, staging, production). It allows users to use the same images and tools for faster and more automated development, deployment and testing. HA was increased significantly (availability zones, regions, storage cluster), and self-service opportunities arose as well. Scaling became fast and easy and, because of projects, resource control is more transparent and security is increased.

At the beginning, we had several hundred manually deployed VMs in different platforms. Now, we have over 4,500 VMs running in seven regions all over the world with HA storage clusters, unified images and flexible self-service for our clients.

How has the organization participated in or contributed to an open infrastructure community? 

We participate in our local IT community events by sharing our discoveries, experience and infrastructure design. We have given a few talks about our infrastructure at conferences, reported some bugs and participated in mailing list discussions as well.

What open-source technologies does the organization use in its IT environment?

Besides OpenStack, our organization uses technologies like CentOS, Kubernetes, Ceph, Prometheus, Zabbix, Grafana, SaltStack, Puppet, Ansible, Nginx, HAProxy, Foreman, Jenkins, ELK and much more.

What’s the scale of the OpenStack deployment? 

We have three OpenStack deployments for different tiers in seven regions all over the world. In total there are over 4,500 VMs on over 200 hosts. It’s used by several hundred company developers to provide service to millions of users.

What kind of operational challenges have you overcome during your experience with open infrastructure? 

Because we started with vanilla OpenStack, we had to build our own configuration management and deployment mechanism on SaltStack. We had to design and implement HA, self-service tools for our users, and metric and alert collection (from infrastructure hosts, compute nodes and the VMs themselves). Old VMs were on different platforms, so we needed a migration tool to move these VMs to OpenStack.

How is this team innovating with open infrastructure? 

The team provides a scalable private cloud as a service, running in seven data centers across three continents. In our infrastructure we are using OpenStack, Kubernetes, Prometheus, Consul, Terraform, Nginx, Jenkins, Ceph and other solutions.

We have mixed some of these technologies: for example, some Kubernetes clusters run on OpenStack, while OpenStack metrics exporters run on a separate bare-metal Kubernetes cluster.

In addition to that, we run a multi-regional load balancer cluster that handles up to 800,000 ops.

How many Certified OpenStack Administrators (COAs) are on your team?

None yet.

Voting is limited to one ballot per person and closes October 21 at 11:59 p.m. Pacific Standard Time.

 

The post Berlin Superuser Awards Nominee: Adform, CloudServices Team appeared first on Superuser.

by Superuser at October 16, 2018 05:30 PM

Berlin Superuser Awards Nominee: City Network

It’s time for the community to help determine the winner of the OpenStack Berlin Summit Superuser Awards, sponsored by Zenko. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

City Network’s research and development, professional services, education and engineering teams are one of five nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline October 21 at 11:59 p.m. Pacific Standard Time.

Cast your vote here!

Who is the nominee?

The team: Marcus Murwall, Tobias Rydberg, Magnus Bergman, Tobias Johansson, Alexander Roos, Johan Hedberg, Joakim Olsson, Emil Sundstedt, Erik Johansson, Joel Svensson, Ioannis Karamperis, Johan Hedberg, Namrata Sitlani, Christoffer Carlberg, Daniel Öhberg, Florian Haas, Adolfo Brandes, Priscila Prado.

How has open infrastructure transformed your business? 

With emphasis on regulatory compliance and data protection, we are a European leader, promoter and enabler of OpenStack-based compliant cloud solutions. These solutions are tailored for regulatory challenged industries such as banks, insurance companies, healthcare and governments. The pace of innovation within these industries has always been dictated by the heavy demand for control, data protection, auditability and other factors specific to the nature of the information they care for. With our OpenStack-based Compliant Cloud solutions, we prove that some of the largest banks, insurance companies and digital identity management companies in the world can increase their pace of innovation while still being regulatory compliant.

How has the organization participated in or contributed to an open infrastructure community? 

  • Our CEO is a member of the OpenStack Foundation Board
  • City Network initiated OpenStack Days Nordic and has organized it three years in a row. We are also involved in OpenStack Days Israel and India and attend multiple OpenStack Days events across the globe.
  • We have participated in every summit for the past six years and in the PTGs, and contribute to the working groups: the public cloud working group and the security project.
  • We provide OpenStack training and strive to bridge the OpenStack and Open edX communities through mutual collaboration with the common ambition of providing quality education to everybody with access to a browser and an internet connection.
  • Members of our team have contributed code since 2012, focusing on training, documentation, code contributions and bug fixes to various official projects.

What open-source technologies does the organization use in its IT environment?

We are very pro open source and use it wherever it’s a viable option.

A selection of the open-source technologies we are currently using: CentOS, OpenBSD, Ubuntu, Nginx, Apache, PHP, Python, Ansible, MySQL, MariaDB, MongoDB, Ceph and Open edX.

What’s the scale of the OpenStack deployment? 

We run our public OpenStack based cloud in eight regions across three continents. All of our data centers are interconnected via private networks. In addition to our public cloud, we provide a pan-European cloud for verticals where regulatory compliance is paramount (e.g. banking and financial services, government, healthcare) addressing all regulatory challenges. Over 2,000 users of our infrastructure-as-a-service solutions run over 25,000 cores in production.

What kind of operational challenges have you overcome during your experience with open infrastructure? 

Since we have been running OpenStack as a public IaaS, there have been a lot of hurdles to overcome, as OpenStack is not yet fully adapted for public clouds. We had to build our own APIs in order to get network connectivity across several sites to work, and we also had to add features such as volume copy and the ability to move volumes between sites. We have also had our fair share of issues with upgrading to new OpenStack versions; however, we do feel this process has been getting better with each upgrade.

We’re also embarking on a huge OSA (OpenStack-Ansible) conversion this fall, which we believe will make things much easier moving forward.

How is this team innovating with open infrastructure? 

Our true innovation lies in the fact that we have managed to build a global open-source based cloud solution fit for regulatory challenged enterprises. These are enterprises who haven’t really been able to utilize the true potential of cloud computing until they met us.

An enterprise building their own private cloud because they don’t have a choice is one thing. But unless running a cloud is part of their core business, they are not fully focused on their mission.

A regulatory challenged enterprise who is able to utilize cloud computing on a pay-as-you-go model through a vendor, just like any other organization, is a whole other ball game. That organization can focus 100 percent on their core business and stand a fighting chance in our era of digitization.

How many Certified OpenStack Administrators (COAs) are on your team?

None.

 Voting is limited to one ballot per person and closes October 21 at 11:59 p.m. Pacific Standard Time.

 

 

The post Berlin Superuser Awards Nominee: City Network appeared first on Superuser.

by Superuser at October 16, 2018 05:30 PM

Berlin Superuser Awards Nominee: Linaro Datacenter and Cloud Group (LDCG)

It’s time for the community to help determine the winner of the OpenStack Berlin Summit Superuser Awards, sponsored by Zenko. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

Linaro Datacenter and Cloud Group (LDCG) is one of five nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline October 21 at 11:59 p.m. Pacific Standard Time.

Cast your vote here!

Who is the nominee?

The Linaro Datacenter and Cloud group (LDCG) team involved in cloud enablement:

  • Software-defined infrastructure (userspace layers and OpenStack): Marcin Juszkiewicz (RedHat), Xinliang Liu (HiSilicon), Tone Zhang (Arm), Kevin Zhao (Arm), Eugene Xie (Arm), Herbert Nan (HXT), Masahisa Kojima (Socionext), Gema Gomez (Linaro)
  • Linaro developer cloud (DevOps, production cloud): Jorge Niedbalski (Linaro)
  • Server architecture (firmware, kernel, HW enablement): Graeme Gregory (Linaro), Radoslaw Biernacki (Cavium), Ard Biesheuvel (Linaro), Leif Lindholm (Linaro)
  • LDCG director: Martin Stadler (Linaro)

How has open infrastructure transformed your business? 

Open infrastructure is helping enable new architectures and entire ecosystems in our case. Thanks to OpenStack and Ceph, we’ve been able to share hardware with different open-source teams and vendors who are trying to make their software multi-architecture aware. OpenStack is also helping with CI/CD across the industry. In addition, we’re changing the culture of cross compiling for the Arm architecture, assisting anyone that has to cross compile their Arm binaries to build natively instead. Metrics about our patches are available here: http://patches.linaro.org/team/team-leg/?months=12

Without the Linaro Developer Cloud, many open-source projects that are using the cloud for their build and test capabilities wouldn’t be producing AArch64 binaries.

How has the organization participated in or contributed to an open infrastructure community? 

Linaro has participated in OpenStack events and has been contributing changes to make OpenStack architecture-aware, ensuring equivalent behavior on different architectures. As a public cloud operator, we contribute cloud resources to many open source projects, including openstack-infra.

Over the past several years we have had a presence at PTG events, been part of upstream projects and contributed CI/gate testing when none was available for Arm64. We’ve also been approaching and working with different upstream projects to help them build their own Arm64 binaries without needing our help going forward. For us, enablement means giving projects the tools and access to the hardware they need to be truly multi-architecture aware without relying on us going forward.

What open-source technologies does the organization use in its IT environment?

 Linaro uses and contributes to the Linux Kernel, Debian and Debian derivatives, libvirt, QEMU, Ceph, OpenStack, OpenBMC, CCIX, TianoCore, OpenHPC, Big Data projects, container technologies, Kubernetes and any other project that is required in between to enable all of these technologies.

The big data and data science team has donated two Arm-based developer cloud nodes to the Apache Bigtop project for their CI/CD to produce Arm-based deb and rpm packages and Docker images. Apache Bigtop is a project for the development of packaging and tests for the Apache Hadoop ecosystem. OpenStack instances are also used to test the portability of big data and data science projects (Apache Hadoop, Spark, HBase, Hive, ZooKeeper, Cassandra, Elasticsearch, Arrow, etc.) on Arm, and to benchmark and optimize them. Big data projects are tested on top of Docker containers, all running on OpenStack.

What’s the scale of the OpenStack deployment? 

We operate three OpenStack clouds in three different geographic locations (UK, US, China) with ~100 Arm64 hosts with CPUs from a variety of Linaro members (Cavium, HiSilicon, Qualcomm and others), adding up to close to 1,600 CPU cores and five terabytes of memory in total. This service is used by around 70 developers working for a range of external organizations on a variety of work across regions, with a varied set of CI/CD pipelines. They are mostly working on open-source projects.

What kind of operational challenges have you overcome during your experience with open infrastructure? 

Our main challenge with OpenStack was that we needed OpenStack working in order to be able to test it. We have learned about and fixed issues in libvirt, kernel drivers and firmware. We have been observing and fixing problems related to the VM life cycle on Arm64; no amount of CI/CD is equivalent to running VMs in production with users who rely on them for their operations and CI, so operating a cloud is also giving us great insight into the problems lurking around long-standing VMs and systems.

We have learned about upgrades with Kolla-Ansible and are able to upgrade from a previously existing set up (fast forward upgrade) to the latest release without issues.

Rocky is our first truly interoperable release (2018.02 guidelines).

How is this team innovating with open infrastructure? 

We are enabling as many data center related technologies as we can to work smoothly on Arm64 and be architecture aware.

How many Certified OpenStack Administrators (COAs) are on your team?

None.

 Voting is limited to one ballot per person and closes October 21 at 11:59 p.m. Pacific Standard Time.

 

The post Berlin Superuser Awards Nominee: Linaro Datacenter and Cloud Group (LDCG) appeared first on Superuser.

by Superuser at October 16, 2018 05:29 PM

Berlin Superuser Awards Nominee: ScaleUp Technologies

It’s time for the community to help determine the winner of the OpenStack Berlin Summit Superuser Awards, sponsored by Zenko. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

ScaleUp Technologies is one of  five nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline October 21 at 11:59 p.m. Pacific Standard Time.

Cast your vote here!

Who is the nominee?

ScaleUp Technologies

Team members: Frank Gemein, Christoph Streit, Oliver Klippel, Gihan Behrmann, Julia Streit.

How has open infrastructure transformed your business? 

As ScaleUp is a hosting provider, we started offering a cloud hosting solution back in 2009. After issues with the cloud platform technology that we used back then (regarding licensing, etc.), we began using OpenStack for our cloud services in 2014.

This experience has shown us that it’s better for us to rely on an open-source project such as OpenStack, with a very vibrant community, than on a proprietary solution.

How has the organization participated in or contributed to an open infrastructure community? 

We have also been very interested in giving back to the community, pretty much from the beginning, as we have ourselves received a lot of help and support from the OpenStack community. As we’re a rather small company and team, we don’t have enough resources to contribute code to OpenStack, but we have talked about our learnings and experiences running OpenStack on several occasions (OpenStack Summits in Austin and Boston, local OpenStack conferences, etc.).

We have been running the OpenStack meetup group in Hamburg since 2017, and since 2018 we’ve also been running the meetup group in Berlin. In addition, we have started offering OpenStack workshops teaching the fundamentals of OpenStack (free of charge to local meetup groups). We held several of these workshops in multiple cities in 2017.

What open-source technologies does the organization use in its IT environment?

We use a lot of open source tools: Linux (mostly Ubuntu and Debian); for monitoring and analysis we rely on Check_MK (Nagios) and Elasticsearch with Kibana; and for communications we rely on tools like Postfix and Mattermost.

What’s the scale of the OpenStack deployment? 

We currently have three production OpenStack installations in both Hamburg and Berlin, Germany. Together with another hosting partner we’re currently building a third OpenStack cloud in Dusseldorf, Germany. These infrastructures will soon be connected via a dedicated 10 gigabit backbone ring.

Earlier this year we started working on some edge related activities and are now building OpenStack based edge clouds on hyper-converged hardware for customers.

What kind of operational challenges have you overcome during your experience with open infrastructure? 

We mostly struggle with having only a small team. Therefore we have worked on ways to involve our other system administrators who are not yet working on OpenStack, by breaking down tasks. As many well-known open source/Linux tools are used in OpenStack, many issues can be fixed by tackling those underlying problems (such as database problems or problems with libvirt/KVM). We gave a presentation about this way of working at one of the recent OpenStack Summits.

How is this team innovating with open infrastructure? 

Since getting involved with OpenStack, we’ve also started working on other new technologies. For example, we are currently working on a managed Kubernetes platform running on top of OpenStack.

How many Certified OpenStack Administrators (COAs) are on your team?

None.

Voting is limited to one ballot per person and closes October 21 at 11:59 p.m. Pacific Standard Time.

 

The post Berlin Superuser Awards Nominee: ScaleUp Technologies appeared first on Superuser.

by Superuser at October 16, 2018 05:29 PM

Pablo Iranzo Gómez

Contributing to OSP upstream a.k.a. Peer Review

Table of contents

  1. Introduction
  2. Upstream workflow
    1. Peer review
    2. CI tests (Verified +1)
    3. Code Review+2
    4. Workflow+1
    5. Cannot merge, please rebase
  3. How do we do it with Citellus?

Introduction

In the article "Contributing to OpenStack" we covered how to prepare accounts and how to prepare your changes for submission upstream (and even how to find low-hanging fruit to start contributing).

Here, we'll cover what happens behind the scenes to get a change published.

Upstream workflow

Peer review

Upstream contributions to OSP and other projects are based on peer review: once new code has been submitted, several validation steps are required before it can be merged.

The last command executed (git-review) in the submit sequence (in the prior article) will effectively submit the patch to the defined Git review service (git-review -s does the required setup process) and will print a URL that can be used to access the review.

Each project might have a different review platform, but for OSP it's usually https://review.openstack.org, while for other projects it can be https://gerrit.ovirt.org, https://gerrithub.io, etc. (this is defined in the .gitreview file in the repository).

A sample .gitreview file looks like:

[gerrit]
host=review.gerrithub.io
port=29418
project=citellusorg/citellus.git

For a review example, we'll use one from GerritHub, from the Citellus project:

https://review.gerrithub.io/#/c/380646/

Here, we can see that we're on review 380646 and that's the link that allows us to check the changes submitted (the one printed when executing git-review).
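As a quick recap of the submit sequence covered in the prior article, the flow from a local commit to an open review looks roughly like this (the branch name and commit message are just illustrative):

git-review -s                  # one-time setup of the gerrit remote defined in .gitreview
git checkout -b fix-typo       # work on a topic branch
git commit -a -m "Fix typo"    # commit the change (a Change-Id line gets appended)
git-review                     # push it for review and print the review URL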

CI tests (Verified +1)

Once a review has been submitted, the bots are usually the first ones to pick it up and run the defined unit tests on the new changes, to ensure that nothing breaks (based on what is defined to be tested).

This is a critical point as:

  • Tests need to be defined when new code is added or modified, to ensure that later updates don't break this new code without someone being aware.
  • The infrastructure should be able to test it (for example, you might need some specific hardware to test a card or network configuration).
  • The environment should be sane so that prior runs don't affect the validation.

OSP CI can be checked at Zuul (http://zuul.openstack.org/), where you can enter the number of your review and see how the different bots are running CI tests on it, or whether it's still queued.

If everything is OK, the bot will 'vote' your change as Verified +1, allowing others to see that it should not break anything based on the tests performed.

In the case of OSP, there are also third-party CIs that can validate the change against third-party systems. For some of them, the votes count towards or against the proposed change; for others, it's just a comment to take into account.

Sometimes you know that your code is right but there's a failure because of the infrastructure; in those cases, writing a new comment saying recheck will schedule a new CI test run.

This usually happens during busy periods, when it's harder for the scheduler to get available resources for the review validation. Also, sometimes there are errors in the CI configuration that must be fixed in order to validate those changes.

Note: you can run some of the tests on your own system to validate faster if you have issues, by running tox. It sets up virtual environments for the tests to run in, so it's easier to catch issues before upstream CI does (it's always a good idea to run tox even before submitting the review with git-review, to detect errors early).

This is, however, not always possible, as some changes include requirements like testing upgrades, full environment deployments, etc. that cannot be done without the required preparation steps or even the infrastructure.
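For example, a quick local run before pushing a new patchset could be as simple as the following (the exact environment names depend on what the project defines in its tox.ini):

pip install --user tox     # if tox isn't already available
tox                        # run the default environments defined in tox.ini
tox -e pep8                # or run a single environment, e.g. only the style checks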

Code Review+2

This is probably the 'longest' step. It requires peers to be added as 'reviewers' (you can get an idea of the names based on other reviews submitted for the same component), or they will pick up new reviews as they pop up on notification channels or in pending queues.

Here, you must be mentally prepared for everything: developers could suggest using a different approach, highlight other problems, or just leave small nit comments about fixes like formatting, spacing, variable naming, etc.

After each comment or suggested change, repeat the workflow for submitting a new patchset, but make sure you're using the same review ID (by keeping the Change-Id line that is appended to the commit message): this allows the code review platform to identify the change as an update to a prior one, lets you compare changes across versions, and notifies the prior reviewers of the new changes.
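In practice, addressing review comments usually boils down to amending the same commit (so the Change-Id line is preserved) and pushing again:

git add -u                 # stage the files you changed after the review comments
git commit --amend         # amend the existing commit, keeping its Change-Id
git-review                 # upload the result as a new patchset on the same review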

Once reviewers are OK with your code, and with some 'core' developers also agreeing, you'll see some voting happening (-2..+2), indicating whether or not they like the change in its current form.

Once you get Code Review +2 and with the prior Verified +1 you're almost ready to get the change merged.

Workflow+1

OK, the last step is to have someone with Workflow permissions give a +1. This 'seals' the change, saying that everything is OK (as it had CR +2 and Verified +1) and the change is valid...

This vote will trigger another build by CI, and when finished, the change will be merged into the code upstream, congratulations!

Cannot merge, please rebase

Sometimes your change touches the same files that other contributors have changed in the meantime, so there's no way to automatically rebase the change. In this case, the bad news is that you need to:

git checkout master # to change to master branch
git pull # to pull latest upstream changes
git checkout yourbranch # to get back to your change branch
git rebase master # to apply your changes on top of current master

After this step, it might be required to manually fix the code to resolve the conflicts and follow the instructions given by git to mark them as resolved.

Once that's done, remember to submit a new patchset as usual:

git commit --amend # to commit the new changes, keeping the same Change-Id you used
git-review # to upload a new version of the patchset

This will start the process over, but once completed, it will get the change merged.

How do we do it with Citellus?

In Citellus we've replicated more or less what we have upstream... even the use of tox.

Citellus uses https://gerrithub.io (a free service that hooks into GitHub and allows doing peer review).

We've set up a machine that runs Jenkins to do 'CI' on the tests we've defined (mostly for the Python wrapper and some tests); what it effectively does is run tox. We also use the https://travis-ci.org free tier to repeat the same on another platform.

Tox is a tool that allows defining several commands that are executed inside Python virtual environments, so without touching your system libraries, new ones can be installed or removed just within the boundaries of that test. It helps in running:

  • pep8 (python formatting compliance)
  • py27 (python 2.7 environment test)
  • py35 (python 3.5 environment test)

The py tests just validate that the code can run on both base Python versions; what they do is run the defined unit test scripts under each interpreter.

For local testing, you can run tox and it will go through the different tests defined and report their status... if everything is OK, your new code review should also pass CI.
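Assuming those environments are defined in Citellus' tox.ini, they can also be listed and run individually:

tox -l                     # list the environments defined in tox.ini
tox -e pep8                # style checks only
tox -e py27,py35           # unit tests under both interpreters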

Jenkins will do the +1 on Verified, and 'core' reviewers will give +2 and merge the change once validated.

Hope you enjoy!

Pablo

by Pablo Iranzo Gómez at October 16, 2018 05:32 AM

October 15, 2018

OpenStack Superuser

Meet the newest members of the Superuser Editorial Advisory Board

We’re excited to announce three new members of our Board. They’ll be joining the other five members to weigh in on this edition of the Superuser Awards as well as contributing ideas and shaping editorial content.

Here’s a bit more about the new members, in alphabetical order:

Mark Korondi

“When I’m consulting with clients I recommend FOSS solutions for all their needs when possible. And building datacenter infrastructure is possible with OpenStack…I am keen to work with them on these kind of projects and integrations into their current system. I especially like the concept of the open, standardized platform APIs which makes their architecture future-proof and removes a lot of vendor-lock-in. As an OpenStack enthusiast, I attend the local meetup groups and even organize quite a large event, the OpenStack CEE days in Budapest. I take the opportunity to explain and help to understand why the emphasis is on the _open_ infrastructure.” He’s also been involved with OpenStack Swift and taught at the  OpenStack Upstream Institute
Linkedin profile
Twitter: kmarc

Trinh Nguyen

“For a very long time in Vietnam, the big companies and government control the Internet infrastructure because of their resources and capital. That stops many potential individuals and small companies from growing and offering innovative services that can make it a better world. Open infrastructure will open a lot of opportunities for the under privileges people in Vietnam and other countries. As a technologist, I see myself have the responsibility to help to push the open infrastructure movement forward and change the world.” He’s worked on Tacker, Freezer, Fenix, Searchlight, Kolla and is currently the project team lead for Searchlight.
His website: http://me.dangtrinh.com

Ben Silverman

“My interests vary from open hardware architecture to open infrastructure platforms (OpenStack mostly) to open NFV and telco specific operating environments. I am hands-on with many of these platforms since I lead a solution and sales engineering group inside of my company as well as by night I’m frequently busy in my own labs…I’ve been involved with OpenStack for the past 5 years and have contributed heavily to the OpenStack Foundation’s documentation. Most recently I was tasked with re-writing most of the content in the OpenStack Architecture Guide… I am also active with OpenStack Edge Telco and Use Case special interest groups and involved directly and indirectly with the architecture discussions that bridge OpenStack Edge concepts with CORD and OPNFV.
LinkedIn profile
Twitter: @bensilverm
Books “OpenStack for Architects,” “OpenStack: Design and Implement Cloud Infrastructure”

 

 

We are always interested in content about open infrastructure – get in touch: editorATopenstack.org

The post Meet the newest members of the Superuser Editorial Advisory Board appeared first on Superuser.

by Nicole Martinelli at October 15, 2018 02:03 PM

Trinh Nguyen

Searchlight weekly report - Stein R-26



It's been a busy week, so not much work has been done on Searchlight this week (Stein R-26, Oct 08-12). Here is some news from the community this week that somewhat affects the Searchlight project:
  • Assigning new liaisons to projects: basically, the TC will assign who takes care of which project. That person will update the project's statuses, important activities, etc. on this page.
  • Proposed changes for library releases: in short, for each cycle-with-intermediary library deliverable, if it was not released during that milestone timeframe, the release team would automatically generate a release request early in the week of the milestone deadline.
  • Proposed changes for cycle-with-milestones deliverables: to summarize the discussion:
    • No longer be required to request a release for each milestone
    • Beta releases would be optional
    • Release candidates would still require a tag. Need PTL or release liaison's "+1"
    • Requiring a person for each team to add their name to a "manifest" of sorts for the release cycle
    • Rename the cycle-with-milestones release model to something like cycle-with-rc

This week we will continue working on these tasks:

1. Complete these stories:

by Trinh Nguyen (noreply@blogger.com) at October 15, 2018 06:47 AM

October 12, 2018

OpenStack Superuser

Kayobe and Rundeck: Operational hygiene for infrastructure as code

Rundeck is an infrastructure automation tool, aimed at simplifying and streamlining operational process when it comes to performing a particular task, or ‘job’. That sounds pretty grand, but basically what it boils down to is being able to click a button on a web page or hit a simple API in order to drive a complex task. For example, something that would otherwise involve SSH’ing into a server, setting up an environment, and then running a command with a specific set of options and parameters which, if you get them wrong, can have catastrophic consequences.

This can be the case with a tool as powerful and all-encompassing as Kayobe. The flexibility and agility of the CLI is wonderful when first configuring an environment, but what about when it comes to day-two operations and business-as-usual (BAU)? How do you ensure that your cloud operators are following the right process when re-configuring a service? Perhaps you introduced ‘run books’, but how do you ensure a rigorous degree of consistency to this process? And how do you glue it together with some additional automation? So many questions!

Of course, when you can’t answer any or all of these questions, it’s difficult to maintain a semblance of ‘operational hygiene’. Not having a good handle on whether or not a change is live in an environment, how it’s been propagated, or by whom, can leave infrastructure operators in a difficult position. This is especially true when it’s a service delivered on a platform as diverse as OpenStack.

Fortunately, there are applications which can help with solving some of these problems – and Rundeck is precisely one of those.

Integrating Kayobe

Kayobe has a rich set of features and options, but often in practice – especially in BAU – there’s perhaps only a subset of these options and their associated parameters that are required. For our purposes at StackHPC, we’ve mostly found those to be confined to:

  • Deployment and upgrade of Kayobe and an associated configuration;
  • Sync. of version controlled kayobe-config;
  • Container image refresh (pull);
  • Service deployment, (re)configuration and upgrade.

This isn’t an exhaustive list, but these have been the most commonly run jobs with a standard set of options, i.e. those targeting a particular service. A deployment will eventually end up with a ‘library’ of jobs in Rundeck that are capable of handling the majority of Kayobe’s functionality, but in our case and in the early stages we found it useful to focus on what’s immediately required in practical terms, refactoring and refining as we go.

Structure and usage

Rundeck has no shortage of options when it comes to triggering jobs, including the ability to fire off Ansible playbooks directly – which in some ways makes it a poor facsimile of AWX. Rundeck’s power, though, comes from its flexibility, so having considered the available options, the most obvious solution seemed to be a simple wrapper script around kayobe itself, which would act as the interface between the two – managing the initialization of the working environment and passing a set of options based on the selections presented to the user.

Rundeck allows you to call jobs from other projects, so we started off by creating a library project which contains common jobs that will be referenced elsewhere such as this Kayobe wrapper. The individual jobs themselves then take a set of options and pass these through to our script, with an action that reflects the job’s name. This keeps things reasonably modular and is a nod towards DRY principles.
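As an illustration only (not our actual script), such a wrapper might look something like the sketch below; the paths, action names and environment setup are assumptions, while the Kayobe sub-commands are examples of the ones listed above:

#!/bin/bash
# Minimal Rundeck-to-Kayobe wrapper (illustrative sketch; paths are assumptions)
set -euo pipefail

ACTION="$1"

# Initialise the Kayobe working environment for this run
source /opt/kayobe/venvs/kayobe/bin/activate
source /opt/kayobe/src/kayobe-config/kayobe-env

case "$ACTION" in
  sync-config)  git -C /opt/kayobe/src/kayobe-config pull --ff-only ;;
  pull-images)  kayobe overcloud container image pull ;;
  reconfigure)  kayobe overcloud service reconfigure ;;
  deploy)       kayobe overcloud service deploy ;;
  *)            echo "Unknown action: $ACTION" >&2; exit 1 ;;
esac

Each Rundeck job then simply invokes the wrapper with the action matching its name, which is what keeps the job definitions themselves thin.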

The other thing to consider is the various ‘roles’ of operators (and I use this in the broadest sense of the term) within a team, or the different hats that people need to wear during the course of their working day. We’ve found that three roles have been sufficient up until now – the omnipresent administrator, a role for seeding new environments, and a ‘read-only’ role for BAU.

Finally it’s worth mentioning Rundeck’s support for concurrency. It’s entirely possible to kick off multiple instances of a job at the same time, however this is something to be avoided when implementing workflows based around tools such as Kayobe.

With those building blocks in place we were then able to start to build other jobs around these on a per-project (environment) basis.

Example

Let’s run through a quick example, in which I pull in a change that’s been merged upstream on GitHub and then reconfigure a service (Horizon).

The first step is to synchronize the version-controlled configuration repository from which Kayobe will deploy our changes. There aren’t any user-configurable options for this job (the ‘root’ path is set by an administrator) so we can just go ahead and run it:

 

The default here is to ‘follow execution’ with ‘log output’, which will echo the (standard) output of the job as it’s run:

Note that this step could be automated entirely with webhooks that call out to Rundeck to run that job when our pull request has been merged (with the requisite passing tests and approvals).
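As a rough sketch, such a webhook handler would only need to hit Rundeck's job-run API; the token, API version and host below are placeholders, and the job UUID is the one shown in the CLI example further down:

curl -X POST \
  -H "X-Rundeck-Auth-Token: $RUNDECK_TOKEN" \
  -H "Accept: application/json" \
  "http://rundeck.example.com:4440/api/24/job/2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3/executions"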

With the latest configuration in place on my deployment host, I can now go ahead and run the job that will reconfigure Horizon for me:

 

And again, I can watch Kayobe’s progress as it’s echoed to stdout for the duration of the run:

Note that jobs can be aborted, just in case something unintended happens during the process.

Of course, no modern DevOps automation tool would be complete without some kind of Slack integration. In our #rundeck channel we get notifications from every job that’s been triggered, along with its status:

Once the service reconfiguration job has completed, our change is then live in the environment – consistency, visibility and ownership maintained throughout.

CLI

For those with an aversion to using a GUI, as Rundeck has a comprehensive API you’ll be happy to learn that you can use a CLI tool in order to interact with it and do all of the above from the comfort of your favourite terminal emulator. Taking the synchronisation job as an example:

[stack@dev-director nick]$ rd jobs list | grep -i sync
2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3 Kayobe/Configuration/Synchronise

[stack@dev-director nick]$ rd run -j Kayobe/Configuration/Synchronise -f
# Found matching job: 2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3 Kayobe/Configuration/Synchronise
# Execution started: [145] 2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3 Kayobe/Configuration/Synchronise <http://10.60.210.1:4440/project/AlaSKA/execution/show/145>
Already on 'alaska-alt-1'
Already up-to-date.

Conclusions and next steps

Even with just a relatively basic operational subset of Kayobe’s features being exposed via Rundeck, we’ve already added a great deal of value to the process around managing OpenStack infrastructure as code. Leveraging Rundeck gives us a central point of focus for how change, no matter how small, is delivered into an environment. This provides immediate answers to those difficult questions posed earlier, such as when a change is made and by whom, all the while streamlining the process and exposing these new operational functions via Rundeck’s API, offering further opportunities for integration.

Our plan for now is to try and standardise – at least in principle – our approach to managing OpenStack installations via Kayobe with Rundeck. Although it’s already proved useful, further development and testing are required to refine the workflow and expand its scope to cover operational outliers. On the subject of visibility, the next thing on the list for us to integrate is ARA.

If you fancy giving Rundeck a go, getting started is surprisingly easy thanks to the official Docker images as well as some configuration examples. There’s also this repository, which comprises some of our own customisations, including a minor fix for the integration with Ansible.

Kick things off via docker-compose and in a minute or two you’ll have a couple of containers, one for Rundeck itself and one for MariaDB:

nick@bluetip:~/src/riab> docker-compose up -d
Starting riab_mariadb_1 ... done
Starting riab_rundeck_1 ... done
nick@bluetip:~/src/riab> docker-compose ps
     Name                  Command             State                Ports
---------------------------------------------------------------------------------------
riab_mariadb_1   docker-entrypoint.sh mysqld   Up      0.0.0.0:3306->3306/tcp
riab_rundeck_1   /opt/boot mariadb /opt/run    Up      0.0.0.0:4440->4440/tcp, 4443/tcp

Point your browser at the host where you’ve deployed these containers on port 4440, and all being well you’ll be presented with the login page.

Feel free to reach out on Twitter or via IRC (#stackhpc on Freenode) with any comments or feedback!

This post first appeared on the blog of Stack HPC.

Superuser is always interested in community content, email: editorATopenstack.org

// CC BY NC

The post Kayobe and Rundeck: Operational hygiene for infrastructure as code appeared first on Superuser.

by Superuser at October 12, 2018 02:06 PM

Opensource.com

From hype to action: Next steps for edge computing

Get an update on the status of edge computing from the OpenStack Foundation working group.

by ildikov at October 12, 2018 07:02 AM

How OpenStack Barbican deployment options secure your cloud

Choose the right OpenStack Barbican deployment option to protect the privacy and integrity of your cloud.

by vakwetu at October 12, 2018 07:01 AM

October 11, 2018

Ben Nemec

Validator Tool for oslo.config

This is an announcement that we recently merged a new feature to oslo.config for validating the contents of config files. This has been an oft-requested feature, but in the past it was difficult to implement because config opts are registered dynamically at runtime and there's no good way to know for sure when all of them are present.

To address that, we made use of the somewhat new feature to generate machine-readable sample config. That data should contain all of the options for each service, so it provides a complete (mostly - more on that later) list of options that we can use to validate config files. If any options are not being provided to the sample config generator then that is a bug and should be addressed in the service anyway.

The tool will warn about any deprecated options present in the file and error on any completely missing ones. It can either use the sample-config-generator configuration file directly or use a pre-generated machine-readable sample config. One limitation of the current iteration of the tool is that it doesn't handle dynamic groups, so for projects that use those it may report some false positives. This should be solvable, but for the moment it is something to be aware of.

If this is something you were interested in, please try it out and let us know how it works for you. The latest release of oslo.config on pypi should have the tool, and since it doesn't necessarily need to be run on the production system the bleeding edge version can be installed somewhere else. Only the machine-readable sample config needs to be generated based on the production version of the code, and that capability has been in oslo.config for a few cycles now.
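As a rough sketch of how that might look (the option names here are my assumptions from memory, so check the tools' --help output), you would generate the machine-readable data against the production code and then run the validator wherever is convenient:

# On a node running the production code: dump machine-readable option data
oslo-config-generator --config-file etc/nova/nova-config-generator.conf \
    --format yaml > nova-opts.yaml

# Anywhere (even with a newer oslo.config): validate the deployed config file;
# deprecated options produce warnings, unknown options produce errors
oslo-config-validator --opt-data nova-opts.yaml --input-file /etc/nova/nova.conf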

Hopefully this will be useful, but as mentioned above if you run into any issues with it please let the Oslo team know so we can get them addressed. Thanks.

by bnemec at October 11, 2018 07:43 PM

OpenStack Superuser

How one of Sweden’s largest online lenders optimizes for speed

STOCKHOLM – At OpenStack Day Nordics, Klas Ljungkvist and Patrice Mangelsdorf from Sweden’s SBAB gave the audience a roadmap for how they’ve doubled the features they deliver to customers year over year by moving to a micro-services architecture and OpenStack private cloud by City Network.

For a little background, state-owned SBAB has been around since the 80s focused on home mortgages, private savings and some business-to-business products. An early digital-first pioneer, they never built brick-and-mortar offices but operated strictly with phone services and limited online support as early as the 90s.

As the banking industry has digitized very rapidly, SBAB was well positioned to capitalize through online banking and mobile apps. In the most recent quarter available, lending rose to US$38.9 billion (SEK 351.52 billion) and the company continued its innovative path with “green mortgages,” or discounted interest rates for customers choosing houses with high energy ratings.

With about 100 of its 500 employees focused on technology, including data science and delivering new services, SBAB needs to move quickly and deliver value for its customers to compete. Prior to 2017, SBAB followed a pattern of large, monolithic monthly releases that required the team to take systems offline over the weekend to push everything into production and then turn the lights back on for customers come Monday morning.

Ljungkvist and Mangelsdorf presenting at OpenStack Days Nordics. Photo via @itsme_twi99y

SBAB  embarked down a micro-services path, organizing their technology department into about 12 teams that work autonomously to push features and achieve common goals. Each team of 10 counts about five or six engineers, a business analyst, a product owner and a specialist. They took a page from the agile playbook with an approach to build things small and get them out to the customers as soon as possible and then react quickly to feedback or failure.

But while it’s easy to say, it’s difficult to get people to change their mindset in practice. To help achieve the goals, each team owns their own component stack and 95 percent of the work doesn’t need to touch the other teams. That means if one team has a problem, the other teams can still keep running. There’s also a focus on automating and speeding up every process, starting with the OpenStack platform, as well as leveraging tools like Ansible, Docker and Jenkins.

“If you think things are going ‘well enough’ and you’re not continually pushing to move faster and deliver more value, you’re going to fall in the digital market” –Klas Ljungkvist

So why did SBAB choose OpenStack and specifically City Network? The financial services industry is regulated in Sweden, so they needed compliant infrastructure, but they were suffering from vendors who were not delivering fast enough. They wanted more control, but primarily they wanted more speed. SBAB now has a team of four-to-five people who work with City Network to manage the OpenStack infrastructure, but after investing time to set the groundwork it “generally flows pretty well on its own. They’re not having to muddle in the infrastructure on a regular basis.”

Fast forward to September 2018, SBAB has pushed more than 300 features, doubling their productivity year over year and blowing out their key performance indicators. The team has now increased their goal to 400 features per month by December and is making good progress toward the new goal.

Klas closed the session by warning not to become complacent. If you think things are going ‘well enough’ and you’re not continually pushing to move faster and deliver more value, you’re going to fall in the digital market. It’s all about speed.

The post How one of Sweden’s largest online lenders optimizes for speed appeared first on Superuser.

by Lauren Sell at October 11, 2018 02:04 PM

Aptira

SDN: Repave vs Update in an SD-WAN Environment

Aptira SDN: Repave vs Update in an SD-WAN Environment

Critical Differences Between Deploying SD-LAN and SD-WAN in the SDN World

As we saw in part 1 of this series, there are challenges with country-scale SD-WANs in addition to those faced by their more data centre-focused SD-LAN counterparts. This post covers how two common maintenance paradigms from the server infrastructure automation space, Repave and Update, are being used in both of these SDN scenarios.

First, a quick overview of the two widely used methods of maintaining the state of larger IT systems. The aim is to provide organizations a way of installing and updating their infrastructure easily without adversely affecting users in the process. 

Repaving

This method of system maintenance sees all intended changes as a trigger to reboot and reinstall operating systems and software. Updates are implemented in the same way as fresh installs, and follow these general steps: 

  1. Migrate all user services (VMs, networks, HA configurations) seamlessly away from the host in question
  2. Reboot and reinstall the server with a new known image, including software changes 
  3. Migrate user services back to the host after it joins the pool of resources again

In environments at scale, where there is redundancy and high availability built in, this method can be extremely efficient and pose less risk than constant small updates. The infrastructure supporting the servers to be updated (load balancers, shared storage) render the hosts themselves a fungible resource. 

Updating 

In contrast, rolling out small changes to hosts over time without completely rebuilding generally means that user services don’t need to go through a migration step before or after updates. There are obviously circumstances where a repave or migration is necessary, but updates are more normally of the following form: 

  1. System state is checked against new baseline configuration 
  2. Updates to services and software are applied to live hosts 

This approach can work extremely well, but also increases the risk of systems diverging from the known baseline over time. For this reason, in systems at scale, this method is less regularly used once the appropriate support services are in place. 

These practices are well-known in the server automation spaces, but how are they applied in SDN environments? 

If we look once again to OpenFlow based SDNs, similar concepts are being applied to the way flows are handled when a switch connects or reconnects to a controller. 

OpenDaylight views these connection events as a trigger to remove the existing flow rules and repave the flows from their in-memory model. Any existing flows are deleted, affecting network traffic until new ones are laid down. In a Data Centre centric installation, this is not so much of a problem – the chances of losing connectivity between controllers and switches are low. Between negligible latency and robust redundancy at a local scale, instances of falsely flagging a switch as offline and starting a repave are rare. The resulting outages can be transparently handled by redundant links from the connected server infrastructure. 

In a SD-WAN however, the outcome can be catastrophically different. 

Controllers are generally more centrally located, with child switches distributed over large management networks. Multiple co-located switches sit in clusters in these management networks, receiving updates from their remote SDN controllers. Customer services connecting to these SD-WAN sites are typically minimally redundant edge connections using a local POP to hook into the larger infrastructure. 

Now imagine an SDN management connection is broken or congested to these remote sites. The switches will happily continue using in-memory flow tables up until the point when the management connection is restored. At this point, the controller views the servers as having reconnected, and starts repaving the entire remote site in a series of parallel updates to all the switches. Not only does this kill all traffic flows, it can easily lead to management network congestion, and full network failure cascades as thousands of flows are laid in. 

OpenKilda and similar controllers take the update approach in this scenario.  

As switches reconnect after an outage (real or perceived by the controller), an inventory of the current flow table is created, while leaving the current table intact. This allows for a much subtler approach to manipulating the flows in the event of an outage. Firstly, the controller can take its time to reconcile the actual outage state. Extra telemetry from neighboring switches or the management network itself can provide critical context, instead of a binary action taken on a reconnection. 

If updates are required, they can be managed individually, allowing much smaller sets of changes with a smaller blast radius. 

The differences between SD-LAN and SD-WAN are subtle but crucial, and context is important. 

Join us next week when we discuss the advantages of separating the OpenFlow control channel from the Path Computation Engine. 

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post SDN: Repave vs Update in an SD-WAN Environment appeared first on Aptira.

by Aptira at October 11, 2018 07:54 AM

October 10, 2018

OpenStack Superuser

Five ways to make your OpenStack Summit talk a standout

You prepared, you submitted, you were accepted; congratulations! The OpenStack community is intelligent and engaged, so expectations are always high. Whether this is your 50th or first talk at an OpenStack Summit, here’s five little ways to make sure your talk is a success.

Focus on the non-obvious

Assume your audience is smart and that they’ve heard a talk about your subject before. Even if it’s a 101 talk where your goal is educating about the basics, what can you say that will be unique to your presentation? What could they not find out by Googling your topic? Make sure to present something new and unexpected.

A good presentation sells better than a sales pitch

Unfortunately, the quickest way to empty a room—particularly in the OpenStack community—is to use talk time to push a service or product. This might conflict with company expectations––someone probably wants to see an ROI on your talk and maybe even sent over talking points. Instead, create interest in your company or product by being an outstanding representative and demonstrating smarts, innovation and the ability to overcome the inevitable challenges. The “sales pitch” is not what you say about a product, but it is you and how you present.

Shorten your career path story

It’s very common for talks to begin with “first, a little about me,” which often sounds like reading a resume. While this can create an audience connection, it eats up valuable presentation time and takes the focus off the topic. Instead, share only the relevant pieces of your career to set up your expertise and the audience’s expectations.

Take a look at the difference between these examples:

Frequently done: “My name is Anne and I’m currently a marketing coordinator at the OpenStack Foundation. I started off in renewable energy, focusing on national energy policy and community engagement; then I became a content writer for a major footwear brand; then worked at an international e-commerce startup; and now I’m here! In my free time I race bicycles and like riding motorcycles.”

The audience has learned a lot about me (probably too much!), but it doesn’t give them a single area of expertise to focus on. It distracts the audience from the topic of my talk.

Alternative: “My name is Anne and as the marketing coordinator at the OpenStack Foundation, I work on our social media team.”

I’ve established my professional connection to the topic, explained why they should listen and foreshadowed that we’ll be talking about social media marketing.

Conversation, not recitation

Memorizing a script and having the script in front of you (like on a phone) is a common device to try to soothe presentation nerves. Ironically this makes your presentation more difficult and less enjoyable for the audience. When you trip up on a word (and we all do!), it can cause you to lose the paragraph that precedes it. Reading off a device will make your presentation sound artificial.

Instead, rehearse your presentation but use slide graphics or brief bullets to keep you on message. Pretend you’re having a conversation with the audience; just a cup of coffee over a very large table.

P.S. Make sure you budget time for conversation with your audience, and bring a few thought-provoking questions of your own to get the discussion started.

Humor doesn’t always work in international audiences

OpenStack has a wonderfully international community, which means that many people in your audience may not be native or fluent in the language you are presenting in. Idioms, turns of phrase or plays on words can be particularly difficult to understand. Instead of leaning on humor, tell a story about how something came to be, or a critical error that we can all see the humor in.

Looking forward to the incredible talks slated for the upcoming Summit; good luck, presenters!

 

The post Five ways to make your OpenStack Summit talk a standout appeared first on Superuser.

by Anne Bertucio at October 10, 2018 04:11 PM

StackHPC Team Blog

Deploying Performant Parallel Filesystems: Ansible and BeeGFS

BeeGFS is a parallel file system suitable for High Performance Computing, with a proven track record in the scalable storage space. In this article, we explore how different components of BeeGFS are pieced together and how we have incorporated them into an Ansible role for a seamless storage cluster deployment experience.

BeeGFS logo

We've previously described ways of integrating OpenStack and High-Performance Data. In this post we'll focus on some practical details for how to dynamically provision BeeGFS filesystems and/or clients running in cloud environments. There are actually no dependencies on OpenStack here - although we do like to draw our Ansible inventory from Cluster-as-a-Service infrastructure.

As described here, BeeGFS has components which may be familiar concepts to those working in parallel file system solution space:

  • Management service: for registering and watching all other services
  • Storage service: for storing the distributed file contents
  • Metadata service: for storing access permissions and striping info
  • Client service: for mounting the file system to access stored data
  • Admon service (optional): for presenting administration and monitoring options through a graphical user interface.
Ansible logo

Introducing our Ansible role for BeeGFS...

We have an Ansible role published on Ansible Galaxy which handles the end-to-end deployment of BeeGFS. It takes care of details all the way from deployment of management, storage and metadata servers to setting up client nodes and mounting the storage point. To install, simply run:

ansible-galaxy install stackhpc.beegfs

There is a README that describes the role parameters and example usage.

An Ansible inventory is organised into groups, each representing a different role within the filesystem (or its clients). An example inventory-beegfs file with two hosts bgfs1 and bgfs2 may look like this:

[leader]
bgfs1 ansible_host=172.16.1.1 ansible_user=centos

[follower]
bgfs2 ansible_host=172.16.1.2 ansible_user=centos

[cluster:children]
leader
follower

[cluster_beegfs_mgmt:children]
leader

[cluster_beegfs_mds:children]
leader

[cluster_beegfs_oss:children]
leader
follower

[cluster_beegfs_client:children]
leader
follower

Through controlling the membership of each inventory group, it is possible to create a variety of use cases and configurations. For example, client-only deployments, server-only deployments, or hyperconverged use cases in which the filesystem servers are also the clients (as above).

A minimal Ansible playbook which we shall refer to as beegfs.yml to configure the cluster may look something like this:

---
- hosts:
  - cluster_beegfs_mgmt
  - cluster_beegfs_mds
  - cluster_beegfs_oss
  - cluster_beegfs_client
  roles:
  - role: stackhpc.beegfs
    beegfs_state: present
    beegfs_enable:
      mgmt: "{{ inventory_hostname in groups['cluster_beegfs_mgmt'] }}"
      oss: "{{ inventory_hostname in groups['cluster_beegfs_oss'] }}"
      meta: "{{ inventory_hostname in groups['cluster_beegfs_mds'] }}"
      client: "{{ inventory_hostname in groups['cluster_beegfs_client'] }}"
      admon: no
    beegfs_mgmt_host: "{{ groups['cluster_beegfs_mgmt'] | first }}"
    beegfs_oss:
    - dev: "/dev/sdb"
      port: 8003
    - dev: "/dev/sdc"
      port: 8103
    - dev: "/dev/sdd"
      port: 8203
    beegfs_client:
      path: "/mnt/beegfs"
      port: 8004
    beegfs_interfaces:
    - "ib0"
    beegfs_fstype: "xfs"
    beegfs_force_format: no
    beegfs_rdma: yes
...

To create a BeeGFS cluster spanning the two nodes as defined in the inventory, run a single Ansible playbook to handle the setup and the teardown of BeeGFS storage cluster components by setting beegfs_state flag to present or absent:

# build cluster
ansible-playbook beegfs.yml -i inventory-beegfs -e beegfs_state=present

# teardown cluster
ansible-playbook beegfs.yml -i inventory-beegfs -e beegfs_state=absent

The playbook is designed to fail if the path specified for the BeeGFS storage service under beegfs_oss is already being used for another service. To override this behaviour, pass an extra option as -e beegfs_force_format=yes. Be warned that this will cause data loss, as it formats the disk if a block device is specified, and it will also erase management and metadata server data if there is an existing BeeGFS deployment.

Highlights of the Ansible role for BeeGFS:

  • The idempotent role will leave state unchanged if the configuration has not changed compared to the previous deployment.
  • The tuning parameters for optimal performance of the storage servers recommended by the BeeGFS maintainers themselves are automatically set.
  • The role can be used to deploy both storage-as-a-service and hyperconverged architecture by the nature of how roles are ascribed to hosts in the Ansible inventory. For example, the hyperconverged case would have storage and client services running on the same nodes while in the disaggregated case, the clients are not aware of storage servers.

Other things we learnt along the way:

  • BeeGFS is sensitive to hostname. It prefers hostnames to be consistent and permanent. If the hostname changes, services refuse to start. As a result, this is worth being mindful of during the initial setup.
  • This is unrelated to BeeGFS specifically, but we had to set the -K flag when formatting NVMe devices (under instruction from Dell) in order to prevent them from discarding blocks; otherwise the disk would disappear with the following error message:
[ 7926.276759] nvme nvme3: Removing after probe failure status: -19
[ 7926.349051] nvme3n1: detected capacity change from 3200631791616 to 0

Looking Ahead

The simplicity of BeeGFS deployment and configuration makes it a great fit for automated cloud-native deployments. We have seen a lot of potential in the performance of BeeGFS, and we hope to be publishing more details from our tests in a future post.

We are also investigating the current state of Kubernetes integration, using the emerging CSI driver API to support the attachment of BeeGFS filesystems to Kubernetes-orchestrated containerised workloads.

Watch this space!

In the meantime, if you would like to get in touch we would love to hear from you. Reach out to us via Twitter or directly via our contact page.

by Bharat Kunwar at October 10, 2018 09:52 AM

October 09, 2018

RDO

Stein PTG Summary for Documentation and i18n

Ian Y. Choi and I already shared a summary of docs and i18n updates from the Stein Project Teams Gathering with the openstack-dev mailing list, but I also wanted to post the updates here for wider distribution. So, here comes what I found the most interesting out of our docs- and i18n-related meetings and discussions we had in Denver from 10 through 14 September.

The overall schedule for all our sessions with additional comments and meeting minutes can be found in OpenStack Etherpad.

First things first, so the following is our obligatory team picture (with quite a few members missing); picture courtesy of OpenStack Foundation folks:

Operators documentation

We met with the Ops community to discuss the future of Ops docs. The plan is for the Ops group to take ownership of the operations-guide (done), ha-guide (in progress), and the arch-design guide (to do).

These three documents are being moved from the openstack-manuals repository to their own repos, owned by the newly formed Operations Documentation SIG.

See also ops-meetup-ptg-denver-2018-operations-guide for more notes.

Documentation site and design

We discussed improving the docs.openstack.org site navigation, guide summaries (in particular, install-guide), adding a new index page for project team contrib guides, and more. We met with the OpenStack Foundation staff to discuss the possibility of getting assistance with site design work.

We are also looking into accepting contributions from the Strategic Focus Areas folks to make parts of the docs toolchain like openstackdocstheme more easily reusable outside of the official OpenStack infrastructure. Support for some of the external project docs has already landed in git.

We got feedback on our front page template for project team docs, with Ironic being the pilot for us.

We got input on restructuring and reworking specs site to make it easier for users to understand that specs are not feature descriptions nor project docs, and to make it more consistent in how the project teams publish their specs. This will need to be further discussed with the folks owning the specs site infra.

Support status badges showing at the top of docs.openstack.org pages may not work well for projects following the cycle-with-intermediary release model, such as swift. We need to rethink how we configure and present the badges.

There are also some UX bugs present in badges (for instance, bug 1788389).

Translations

We met with the infra team to discuss progress on translating project team docs and, related to that, generating PDFs.

With the Foundation staff, we discussed translating Edge and Container whitepapers and similar material.

More details in Ian’s notes.

Reference, REST API docs and Release Notes

With the QA team, we discussed the scope and purpose of the /doc/source/reference documentation area in project docs. Because the scope of /reference might be unclear and can be used inconsistently by project teams, the suggestion is to continue with the original migration plan and migrate REST API and possibly Release Notes under /doc/source, as documented in doc-contrib-guide.

Contributor Guide

The OpenStack Contributor Guide was discussed in a separate session, see FC_SIG_ptg_stein for notes.

Thanks!

Finally, I’d like to thank everybody who attended the sessions, and a special thanks goes to all the PTG organizers and the OpenStack community in general for all their work!

by Petr Kovar at October 09, 2018 02:14 PM

OpenStack Superuser

Inside telecom and NFV: Must-see sessions at the Berlin Summit

Join the people building and operating open infrastructure at the OpenStack Summit Berlin in November.  The Summit schedule features over 200 sessions organized by use cases including: artificial intelligence and machine learning, high performance computing, edge computing, network functions virtualization, container infrastructure and public, private and multi-cloud strategies.

Here we’re highlighting some of the sessions you’ll want to add to your schedule about NFV.  Check out all the sessions, workshops, lightning talks and working groups on this topic here.

Airship-Deckhand: Reliable, predictable configuration management

Airship is a collection of interoperable components which offers cloud providers a way to manage their cloud provisioning and lifecycle management in a declarative, reliable and predictable way. Airship-Deckhand epitomizes the platform’s declarative-driven approach by managing the life cycle of Airship configuration documents. It provides mechanisms for reliably storing, validating and rendering Airship documents in preparation for their deployment. It does this by leveraging OpenStack components such as Barbican for secret storage and Keystone for authentication and multi-tenancy.

In this talk, Felipe Monteiro and Matt McEuen of AT&T, provide an overview of the Airship-Deckhand project, its role within Airship and how it can be leveraged as a standalone component for realizing storage, configuration and life cycle management for documents, doing so with the scale, speed, resiliency and operation predictability demanded of network clouds at scale.
Details here.

The present and future of a fully open source smart OpenStack cloud

OVS hardware offloads to improve the efficiency of a virtualized OpenStack cloud has been presented in the past at various summits. The Queens release marked a major milestone in delivering smart offload technology to end users. During this talk Franck Baudin of Red Hat and Ash Bhalgat of Mellanox Technologies will recap what’s available to configure OVS hardware offloads using open source components, OpenDaylight and Neutron ML2 Plugin. They’ll also share early field guidance on deployment of a smart cloud. Details here.

The need to succeed: Tearing down NFV interoperability walls

While there’s little doubt that NFV can revolutionize the telecom industry, the final question marks have not vanished completely. Many industry professionals see that NFV is lagging behind its true potential. One sticking point is the lack of real multi-vendor interoperability. Due to this, the deployment of a new NFV can easily expand into months instead of the assumed easy and quick play. In order to take the technology to the next level, verified interoperability is a must.

This talk by Carsten Rossenhoevel of EANTC  hopes to resolve some of these questions. EANTC, a widely recognized independent test center, has been around long enough to see how important interoperability is.  Rossenhoevel will use real-life test examples featuring leading vendors to underline where interoperability is essential and why. Details here.

Using the Telekom OpenStack platform for radio access network virtualization

A future-facing mobile network architecture requires flexibility, adaptability and programmability to deliver services of expected quality. Radio Access Network (RAN) is a technology that connects individual devices to other parts of a network via radio links. The presentation by professor Michael Einhaus and Andreas Hartmann from Deutsche Telekom shows how they virtualized RAN software components using container technology, moving from a decentralized hardware-based platform into the Open Telekom Cloud. Thus, a flexible, resilient and scalable ecosystem has been created to perform further research on topics including network traffic performance. The demo session will show the concept and how to use the application in the OpenStack environment. Details here.

Verizon use case: Remote hardware lessons learned

Compute nodes can be deployed anywhere, ranging from a data center to a donut shop to the top of a light pole or the bottom of the ocean. An often overlooked consideration for such use cases is the significant difficulty of deploying, managing and servicing the hardware itself in remote locations. While we can be reasonably sure that the hardware deployed in a data center will be well cared for, many remote locations have substantial hardware and environmental constraints. In this intermediate session, Verizon’s Beth Cohen and Jason Kett, along with Dell’s Glen McGowan, will share Verizon’s experience in packaging and deploying the uCPE device hardware in support of its Virtual Network Service product. Details here.

See you at the OSF Summit in Berlin, November 13-15, 2018! Register here.

// CC BY NC

The post Inside telecom and NFV: Must-see sessions at the Berlin Summit appeared first on Superuser.

by Nicole Martinelli at October 09, 2018 02:02 PM

Cisco Cloud Blog

Cloud Unfiltered, Episode 57: Navigating the Waves of Innovation, with Lew Tucker

Do you know who Lew Tucker is? Of course you do. He’s the VP and CTO of Cloud Computing at Cisco, but that’s probably not why you know him. You...

by Ali Amagasu at October 09, 2018 01:20 PM

October 08, 2018

OpenStack Superuser

How to get high performance for network connected solid-state drives with NVMe over fabrics

What is NVMe over fabrics, anyway?

The evolution of the NVMe interface protocol is a boon to SSD-based storage arrays, enabling SSDs to deliver high performance and reduced latency for data access. These benefits are further extended by the NVMe over Fabrics (NVMe-oF) network protocol, which retains NVMe’s features over a network fabric when accessing a storage array remotely. Let’s take a look at how.

While the NVMe protocol works well with storage arrays built from high-speed NAND and SSDs, latency appears when NVMe-based storage arrays are accessed through shared storage or storage area networks (SANs). In a SAN, data has to be transferred between the host (initiator) and the NVMe-enabled storage array (target) over Ethernet, RDMA technologies (iWARP/RoCE) or Fibre Channel, and latency is introduced by the translation of SCSI commands into NVMe commands as data is transported. To address this bottleneck, NVM Express introduced the NVMe over Fabrics protocol to replace iSCSI as the storage networking protocol. This brings NVMe’s benefits onto the network fabric in a SAN-style architecture, giving a complete end-to-end NVMe-based storage model that is highly efficient for new-age workloads. NVMe-oF supports all available network fabric technologies, such as RDMA (RoCE, iWARP), Fibre Channel (FC-NVMe), InfiniBand, future fabrics and the Intel Omni-Path architecture.

NVMe over fabrics and OpenStack

OpenStack consists of a library of open-source projects for the centralized management of data center operations, and it provides an ideal environment to implement an efficient NVMe-based storage model for high throughput. OpenStack Nova and Cinder are the components used in the proposed NVMe-oF with OpenStack solution, which consists of the creation and integration of a Cinder NVMe-oF target driver along with OpenStack Nova.

OpenStack Cinder is the block storage service project for OpenStack deployments, mainly used to create services that provide persistent storage to cloud-based applications. It provides APIs for users to access storage resources without disclosing storage location information.

OpenStack Nova is the component within OpenStack that provides on-demand access to compute resources like virtual machines, containers and bare metal servers. In NVMe-oF with OpenStack solutions, Nova handles attaching NVMe volumes to VMs.

Support for NVMe-oF in OpenStack has been available since the Rocky release. The proposed solution requires RDMA NICs and supports a kernel initiator and kernel target.

NVMe-oF targets supported

Based on the proposed solution above, there are two choices for implementing NVMe-oF with OpenStack. The first uses the kernel NVMe-oF target driver, supported from the OpenStack Rocky release onward. The second is Intel’s SPDK-based NVMe-oF implementation, consisting of the SPDK NVMe-oF target driver and an SPDK LVOL (logical volume manager) back end, which is anticipated in the upcoming OpenStack Stein release.

Kernel NVMe-oF Target (supported from the OpenStack Rocky release)

This implementation consists of support for a kernel target and kernel initiator. However, the kernel-based NVMe-oF target has limitations in terms of the number of IOPS per CPU core. It also suffers from latency due to CPU interrupts, the many system calls needed to read data and the time taken to transfer data between threads.

SPDK NVMe-oF Target (expected in the upcoming OpenStack Stein release)

Why SPDK?

The SPDK architecture achieves high performance for NVMe-oF with OpenStack by moving all of the necessary drivers into user space (out of the kernel), operating in polled mode rather than interrupt mode, and using lockless processing (avoiding spending CPU cycles synchronizing data between threads).

Let’s take a look what that means.

In the SPDK implementation, the storage drivers used for operations like storing, updating and deleting data are isolated from the kernel space where general-purpose computing processes run. This isolation saves the time required for processing in the kernel and lets CPU cycles be spent executing the storage drivers in user space, avoiding interrupts and locking between the storage drivers and other general-purpose drivers in kernel space.

In a typical I/O model, an application requests read/write access and waits for the I/O cycle to complete. In polled mode, once the application places a request for data access, it carries on with other work and comes back after a defined interval to check whether the earlier request has completed. This reduces latency and processing overhead and further improves the efficiency of I/O operations.

To summarize, SPDK is designed specifically to extract performance from non-volatile media. It contains tools and libraries for building scalable, efficient storage applications out of user-space, polled-mode components that enable millions of I/Os per second per core. The SPDK architecture is a set of open-source, BSD-licensed building blocks optimized for extracting high throughput from the latest generation of CPUs and SSDs.

Why an SPDK NVMe-oF Target?

According to a performance benchmarking report of NVMe-oF using SPDK, it has been noted that:

  • Throughput scales up and latency decreases almost linearly as SPDK NVMe-oF target and initiator I/O cores are scaled.
  • The SPDK NVMe-oF target performed up to 7.3x better in IOPS per core than the Linux kernel NVMe-oF target while running a 4K, 100 percent random write workload with an increasing number of connections (16) per NVMe-oF subsystem.
  • The SPDK NVMe-oF initiator is up to 3x faster than the kernel NVMe-oF initiator at 50GbE with a null bdev-based back end.
  • SPDK reduces NVMe-oF software overhead by up to 10x.
  • SPDK saturates 8 NVMe SSDs with a single CPU core.

SPDK NVMe-oF implementation

This is the first implementation of NVMe-oF integrated with OpenStack (Cinder and Nova) that leverages the SPDK NVMe-oF target driver and an SPDK LVOL (logical volume manager)-based software-defined storage back end. It provides a high-performance alternative to the kernel LVM and the kernel NVMe-oF target.

Compared with the kernel-based implementation, SPDK reduces NVMe-oF software overhead and yields higher throughput and performance. Let's see how this gets added in the upcoming OpenStack Stein release.

This article is based on a session at OpenStack Summit 2018 Vancouver – OpenStack and NVMe-over-Fabrics – Network connected SSDs with local performance. The session was presented by Tushar Gohad (Intel), Moshe Levi (Mellanox) and Ivan Kolodyazhny (Mirantis). You can catch the demo on video here.

About the author

Sagar Nangare, a digital strategist at Calsoft Inc., is a marketing professional with over seven years of experience in strategic consulting, content marketing and digital marketing. He's an expert in technology domains like security, networking, cloud, virtualization, storage and IoT.

This post first appeared on the Calsoft blog. Superuser is always interested in community content, get in touch: editorATopenstack.org

The post How to get high performance for network connected solid-state drives with NVMe over fabrics appeared first on Superuser.

by Superuser at October 08, 2018 05:23 PM

David Moreau Simard

AnsibleFest 2018: Community project highlights

With two days of AnsibleFest instead of one this time around, we had 100% more time to talk about Ansible things! I got to attend great sessions, learn a bunch of things, and chat and exchange war stories about Ansible, ARA, Zuul, Tower and many other things. It was awesome and I wanted to take the time to share a bit about some of the great Ansible community projects that were featured during the event.

October 08, 2018 12:00 AM

October 07, 2018

Emilien Macchi

OpenStack Containerization with Podman – Part 3 (Upgrades)

For this third episode, here are some thoughts on how upgrades from Docker to Podman could work for us in OpenStack TripleO. Don’t miss the first and second episodes where we learnt how to deploy and operate Podman containers.

I spent some time this week investigating how we could upgrade the OpenStack Undercloud that is running Docker containers to run Podman containers, without manual intervention or service disruption. The way I see it at this time (the discussion is still ongoing) is that we could remove the Docker containers in Paunch just before starting the Podman containers and their services in SystemD. It would be done per container, in serial, roughly like this:

import subprocess

for container in containers:  # the container names managed by Paunch
    subprocess.run(["docker", "rm", "-f", container], check=True)
    subprocess.run(["podman", "run", "--detach", "--name", container,
                    image_for(container)], check=True)  # image_for() is a hypothetical lookup
    create_systemd_unit(container)  # hypothetical helper: write the unit file, then enable the service

In the following demo, you can see the output of openstack undercloud upgrade with a work-in-progress prototype. You can observe HAProxy running in Docker and, during Step 1 of the container deployment, the container being stopped (top right) and immediately started in Podman (bottom right).

You might think “that’s it?”. Of course not. There are still some problems that we want to figure out:

  • Migrating containers not managed by Paunch (Neutron containers, Pacemaker-managed containers, etc.).
  • Whether we want to remove the Docker containers or just stop them (in the demo, the containers are removed from Docker).
  • Stopping the Docker daemon at the end of the upgrade (this will probably be done by upgrade_tasks in the Docker service from TripleO Heat Templates).

The demo is a bit long as it shows the whole upgrade output. However, if you want to see when HAProxy is stopped in Docker and started in Podman, go to 7 minutes. Also don't miss the last minute of the video, where we see the results (Podman containers, no more Docker containers managed by Paunch, and SystemD services).

Thanks for following this series of OpenStack / Podman related posts. Stay in touch for the next one! By the way, did you know you could follow our backlog here? Any feedback on these efforts is warmly welcome!

by Emilien at October 07, 2018 04:49 PM

Aptira

OpenKilda – Distributed is the Future

Aptira + OpenKilda - Distributed

In a world where most scalable technologies have learned through hard experience that distributing functionality and data processing is the only viable way to handle large footprints, it is difficult to imagine why legacy SDN Controllers such as OpenDaylight would concentrate both control and telemetry processing in small clusters.

In light of the rapid advances in the SDN Controller market over the last five years, it is important to take a step back and look at how the landscape has changed. Platforms initially designed for Data Centre centric SD-LAN use cases are regularly being rolled out as solutions in the SD-WAN space, without bringing in complementary tools from fields such as Big Data and Distributed Processing to help with issues of scale.

When deployed to manage smaller Enterprise networks or Data Centres, maintaining the relatively few flows needed and handling the resulting stream of status updates over low latency control backplanes can be managed like any other monolithic platform.

Problems begin to present themselves where SD-WAN is implemented over broad areas with less reliable control networks. Many assumptions around network health in a Data Centre are simply not valid when networks are distributed across countries over lower bandwidth, high latency interconnections. Not receiving a status message from a local switch for 2 seconds may indicate a failure in a Data Centre, but indicate nothing if there is congestion across the Atlantic.

Additionally, in a network of many 10s or 100s of switches, the flow of telemetry and subsequent processing quickly overwhelms clusters where a single member is required to parse, interpret and respond to changes in topology and status.

There are also issues caused by the Model-Based Configuration paradigm in OpenDaylight and similar controllers, where having a switch diverge from the in-memory model causes all flows to be dropped and reprogrammed. In our experience this has been known to result in cascading network outages as switches madly play catchup with the controller, creating a feedback loop to be handled, in addition to the initial and subsequent changes.

Fortunately, a relatively new SDN Controller, OpenKilda, has been developed and successfully deployed at scale to solve these problems.

How does OpenKilda solve these problems? Stay tuned – we’ll cover that in the next post next week.

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post OpenKilda – Distributed is the Future appeared first on Aptira.

by Aptira at October 07, 2018 10:49 AM

October 06, 2018

Trinh Nguyen

Searchlight weekly report - Stein R-27



I think it's good to have a weekly report for others to know what is going on with Searchlight. So from now on, I will write a weekly report every Saturday starting this week as well as setting the goals for next week.

During the week of Oct 1 - Oct 5 or Stein R-27 we have fixed one big bug and several minor things:

1. The bug: nodejs-npm-run-test failed on master

Fix merged: https://review.openstack.org/#/c/607085/

2. Minor changes:
Our goals for the week of Oct 08 - Oct 12 or Stein R-26 are:

1. Complete these stories:

by Trinh Nguyen (noreply@blogger.com) at October 06, 2018 01:47 PM

October 05, 2018

Emilien Macchi

OpenStack Containerization with Podman – Part 2 (SystemD)

In the first post, we demonstrated that we can now use Podman to deploy a containerized OpenStack TripleO Undercloud. Let’s see how we can operate the containers with SystemD.

Podman, by design, doesn't have a daemon running to manage the container lifecycle, while Docker runs dockerd-current and docker-containerd-current, which take care of a bunch of things such as restarting containers when they fail (if they're configured to do so, with restart policies).

In OpenStack TripleO, we still want our containers to restart when they are configured to, so we thought about managing the containers with SystemD. I recently wrote a blog post about how Podman can be controlled by SystemD, and we finally implemented it in TripleO.

The way it works, as of today, is that any container managed by Podman with a restart policy in its Paunch container configuration will be managed by SystemD.

Let’s take the example of Glance API. This snippet is the configuration of the container at step 4:

    step_4:
      map_merge:
        - glance_api:
            start_order: 2
            image: *glance_api_image
            net: host
            privileged: {if: [cinder_backend_enabled, true, false]}
            restart: always
            healthcheck:
              test: /openstack/healthcheck
            volumes: *glance_volumes
            environment:
              - KOLLA_CONFIG_STRATEGY=COPY_ALWAYS

As you can see, the Glance API container was configured to always try to restart (so Docker would do so). With Podman, we re-use this flag and we create (+ enable) a SystemD unit file:

[Unit]
Description=glance_api container
After=paunch-container-shutdown.service
[Service]
Restart=always
ExecStart=/usr/bin/podman start -a glance_api
ExecStop=/usr/bin/podman stop -t 10 glance_api
KillMode=process
[Install]
WantedBy=multi-user.target

How it works underneath:

  • Paunch will run podman run to start the container, during the deployment steps.
  • If there is a restart policy, Paunch will create a SystemD unit file.
  • The SystemD service is named after the container, so if you were used to the old service names from before containerization, you'll have to refresh your mind. We deliberately chose the container name to avoid confusion with the podman ps output.
  • Once the containers are deployed, they need to be stopped / started / restarted by SystemD. If you use the Podman CLI to do it, SystemD will take over (see the demo).

Stay in touch for the next post in the series of deploying TripleO and Podman!

by Emilien at October 05, 2018 10:10 PM

Robert Collins

OpenStack and ease of development

In my last post, about cultural norms in OpenStack, I said that ease of development was a self-inflicted issue. This was somewhat contentious 🙂 and some people have expressed interest in a deeper dive. In that post I articulated three cultural problems and two technical ones.

What does success for developers look like?

I think independent of the scope of OpenStack, the experience for developers should have roughly the same features:

  1. global reasoning for changes should be rarely needed, (or put another way, the architecture should make it possible to think about changes without trying to consider all of OpenStack and still get high quality results). (this helps new developers make good decisions)
  2. the component being worked on should build quickly (keep local development cycles brisk)
  3. have comprehensive local unit tests (keep local development effective; low rate of defects escaping to functional/integration tests)
  4. be able to utilise project resources to perform adhoc exploration, integration, functional and scale tests (this allows developers to have sensibly sized development machines, while still ensuring what they build works in a system representative of our users).
  5. the lead time from getting the change finished locally to the developer no longer needing to shepherd the change through the system should be low (I won't scare people by saying what I think it should be 🙂). This keeps the cognitive load on developers from becoming a burden.
  6. failures after review should be a) localised, b) rare enough that the overhead of corrective action is tolerable and c) recovery should take place within a small number of hours at most (this keeps the project as a whole healthy and that means individual developers will be rarely impacted by failures from other developers changes)

We already do ok on a number of these things: the above is not a gap analysis.

Sidebar – Accelerate

About now I feel I have to mention Accelerate, a book that is the result of detailed research into software delivery performance – and its follow-up report, the DORA 2018 state of devops report. The Puppet state-of-devops report is useful as well, though they focus on different aspects – ones that are less generalisable to open source development in my view – and interestingly, they seem to have reached entirely different conclusions around team choice :).

The particularly interesting thing for me is that this is academic grade research, showing causation and tying that back to specific practices: this gives us a solid basis for planning changes, rather than speculation that something will work.

These reports and research are looking into software delivery – which for OpenStack spans organisations: we build, then users deploy. So it's not entirely clear that things generalise, nor is it clear how one might implement all the predictive practices because of that.

For instance, while Continuous Integration is something we can imagine doing in OpenStack (sorry folks, preflight testing and CI are really very, very different things), Continuous Deployment would be a much more ambitious undertaking. Imagine it though: commit through to deployed on users' clouds in a matter of hours. Wouldn't that be something. Chrome and Firefox are two open source projects that have been evolving in this direction for some time, and we could well study them to learn what they have found to work and not work.

All that said, the construct – the metrics – that predict software delivery performance are:

  1. Release frequency
  2. Mean time to recovery
  3. Lead time (commit to value consumable)

There’s a separate construct (the Westrum organisational culture construct) for culture, and they also measured the effect on e.g. implementing Continuous Delivery on those metrics.

I highly recommend reading the book – perhaps start with the 2018 report for a taste, but the book has much more detail.

Where are the gaps

I haven't looked particularly closely at the coupling in OpenStack recently, so for 1) I think folk actually landing changes should assess this. My sense is that we're ok on this, but not great. In particular, anytime there is a big cross-project effort – lots of involved commits, lots of sequencing – that's something that needed global reasoning.

For 2), most of our stuff is in Python today, so build times aren’t a big issue.

For 3), we're in pretty decent shape unit-test-wise, though the tests tend to be very slow (minutes or more to run), and I worry about skew between mocks and actual servers.

For 4) we do allow utilisation of project resources via gerrit pre-review tests and pre-merge tests, but there’s no provision for adhoc utilisation (that I know of), and as I described in my last post, I think we could get a lot more leverage out of the cloud resources if we had the ability to wire components under test into an existing, scaled, cloud.

For 5) I'd need to do some more detailed visualisation, or add a feature to stackalytics, but the sense from folk I speak to is that lead times are still enormous. I suspect there are two, or even three, distributions hiding in there (e.g. one for regular devs and one for infrequent/new contributors) – but we can gather data on this. One important aspect is whether we should measure from 'code committed (in dev branch) to merged to master', or 'code committed to delivered'. It's my view that measuring to delivery is critical, if we truly want to be driving benefits to our users. There is a corner case where those two things converge – trunk based development – but that is particularly challenging for open source projects. For instance, http://stackalytics.com/report/reviews/nova/open shows under the 'Change requests waiting for reviewers since the last vote or mark' an average age time of 144 days, with a max age time of 709 days: that's two years, four releases. That's measuring time to git; if we measure time to delivered, then we need to add the time that changes sit in git before being included in a release – up to six months, though the adhoc releases many projects are doing now are a great help. The stats shown aren't particularly useful though – a) reviews that have merged already are not included in the stats and b) there's not enough information to start reasoning about why they have the age they do.
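
As a rough sketch of how such data could be gathered (this is not an existing stackalytics feature; it simply queries the standard Gerrit REST API, and the field handling is a best-effort assumption), one could measure the distribution of creation-to-merge times for recently merged changes:

import json
import statistics
from datetime import datetime

import requests

GERRIT = "https://review.openstack.org"

def parse_ts(ts):
    # Gerrit timestamps look like "2018-10-01 12:34:56.000000000".
    return datetime.strptime(ts[:19], "%Y-%m-%d %H:%M:%S")

def merged_lead_times(project, count=100):
    # Query recently merged changes and measure creation-to-merge time in days.
    url = "%s/changes/?q=project:%s+status:merged&n=%d" % (GERRIT, project, count)
    raw = requests.get(url).text
    changes = json.loads(raw.split("\n", 1)[1])   # drop Gerrit's ")]}'" guard line
    days = [(parse_ts(c["submitted"]) - parse_ts(c["created"])).total_seconds() / 86400
            for c in changes if "submitted" in c]
    return statistics.median(days), max(days)

print(merged_lead_times("openstack/nova"))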

For 6), at the moment recovery is burdened by the slow merging process – the minimum time to recovery is the sum of the unavoidable steps in the merge / delivery process. Failure frequency (things breaking after the merge completes / is released) is fairly low, but we're not particularly good at blast radius management – the all-or-nothing nature of change rollout today means there is no mitigation when things go wrong.

So I think there are significant gaps with room to improve on three things there:

  1. More efficient test/adhoc project resource utilisation
  2. Lead times
  3. Blast radius

Smarter testing

I covered this in my previous post in moderate detail, but it's worth drilling in further at this point. I don't think there is a silver bullet here; the necessary machinery to test a new database engine version with an existing cloud is very different in detail to that required to test a new nova-compute build. Let's consider just being able to test a new nova-compute with an existing cloud. Essentially we want to wire in a new shard of nova-compute. Fortunately nova-compute is intrinsically sharded: that's its very model of operation.

[Figure: blog-testing.png]

Though it's not strictly relevant here, consider that other components (like the DB) have no sharding mechanism in place today, so wiring in a new shard for those would be “tricky”.

The details may have changed since I last dug deep, but from memory nova-compute needs access to the message bus to communicate with the rest of nova, access to glance and the swift or other store that images are in, and obviously nova-compute needs appropriate local resources to run whatever compute workload it is going to serve out.

So wiring that in from a test node to an existing cloud seems pretty simple. We probably don’t want the services listening unsecured on the internet, so we’ll need a credential distribution system (e.g. vault), and automation to look those up and wire in the nova-compute instance with appropriate credentials.

There may be trust issues: are all components equally privileged in the system? This also shows up as a bug risk – how much damage could a broken but not malicious nova-compute do?

Harder cases – DDL

One common harder case is DDL – schema changes at the DB layer. I don’t have a good canned answer here, but roughly speaking in the context of tests we need to be able to:

  1. Try applying the DDL across the whole DB
  2. Run the code that works with the DB with the modified schema
  3. Be able to do that for many different patches

Right now we have machinery to do 1) against a static copy of various clouds' DBs. 2) and 3) are almost at cross purposes: it may be necessary to serialise those tests, since they are fewer than other code changes. One possible implementation would be to use an expand-contract SQL server migration strategy: expand to a new server, run the DDL, verify the cloud metrics don't regress, then migrate back using the source server's schema (ignoring missing columns, because if they've been dropped in the new schema then code is already not querying them).

Another possibility, given that these changes are rarer, is not to optimise the testing of them.

Harder cases – exotic components

Power machines, ESXi hypervisors, and other not generally-available hypervisors would all be good to expose to developers – make it possible for them to verify changes to the code that interacts with them – in real time. Ideally with more access than the current hands-off gerrit-test-job only approach.

Lead times

Today, I’m going to treat ‘in a release’ as delivered. I’m picking this definition because:

  • We can choose to make more releases
  • We don’t need to build consensus or whole new delivery stacks to try and get customers upgraded
  • We can always come back and define 'delivered' with more scope later

Lean methodology provides a number of tools for analysing lead times – it has been used successfully in many organisations, and is sufficiently robust and consistent in its results that Accelerate even cites adopting lean management practices as being predictive for performance. And then there is the whole question of what 'delivered' means.

And yes, we are not a company, we are many volunteers, but that merely adds corner cases – most of our volunteers are given tasks to work on within OpenStack, and have the time to work with an effective SDLC and change management process.

As I mentioned above, without some more detailed modelling it's hard to say for sure what leads to the high lead times, but there are some things we can identify easily enough…

  1. We don’t treat each commit as a release. We do say that trunk should never be broken, but we’re not sure enough of our execution to actually tag each commit as a release and publish for consumption.
    1. Consider what we would need to solve to do this.
  2. We aren’t practicing CI. In particular:
    1. Merges (required to repair things that snuck in) often take much more than 10 minutes
    2. We’re not integrating the work-in-progress from developers early enough to avoid reintegration costs.
  3. We’re not practicing trunk based development: every outstanding patch chain is a branch, just in a different representation, and our branch lifetime clearly exceeds a day… and we have a large stabilisation period during the development cycle.
  4. Reviews – needs a deeper analysis to say if this is or isn’t a driver. I suspect it is, because nothing I hear or see shows this to have changed in any fundamental way.
  5. We don't work in small batches: six-month cycles are huge batches.
  6. We’re pretty poor at enabling team experimentation. I think this is due to layering: for example, we have N different API servers, so if one team wants to experiment, they create customer confusion due to yet-another-API idiom. If we had just one API server, changes to that would be happening from just one team, gaining much better integration and discussion characteristics. (For an example of having just one API server in a distributed system, consider k8s, which has just one primary API server – the kubelet API is not really customer facing.)
  7. We don't manage work in progress well: this may not seem important, but it's a Lean foundational practice. Think of it as a combination of not exceeding your bandwidth, and minimising context switches.

So what should we do to drive lead times down?

I propose setting a vision: 95% of patches that are either maintenance or part of an agreed current feature merge (or are completely rejected) the same day that they are uploaded to gerrit. (Patches that are for some completely random thing may obviously require considerably more effort to reason about.)

Then work back from that: what do we need to have in place to do that safely?
Yes, it's hard. That's more of a reason to do it.

Delivering that will require better safety ropes: clearer contracts for components, better linting (maybe mypy), more willingness to roll forward, and consistent review latency (this is more about scheduling than about how many reviews any one person does).

The benefits could be immense though: if OpenStack is a juggernaut today, consider what it could be if we could respond nimbly to new user demands.

Blast radius containment

So this is about things like making releases and deployments much more robust to mistakes. For instance, imagine if every server could run in a shadow mode – where it receives traffic, operates on it, but marks any external operations it does as not-real. Then if it blows up we can detect that without destabilising a running version. (And the long-running supported test cloud would give a perfect place to do this.) So rollouts, rather than being atomic, become a series of small steps. The simplest form is just taking a stateless scale-out service and running two builds in parallel. That's better than a binary old/new. Canary builds and rolling upgrades similarly.

Now, since we defined ‘delivered’ as in a release, not ‘in use’, maybe we should ignore that operational blast radius and instead limit ourselves to the development side.

Even here there is a lot more sophistication that we can add: consider that for libraries our 'fleet' is basically every developer. Pinning all those dependencies like we do is a good step. What if we actually could deliver updates to 1% of our devs, then 10%, then all?

So we could have a pipeline:

  1. Unit test a consumer, raise its version for 1% of consumers.
  2. Watch for failures, raise the % until 100%

This would require a metrics channel (opt-in!), and some way of signalling the versions to choose from to development environments.
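
To make the idea concrete, here is a hedged sketch (not an existing OpenStack mechanism) of how a development environment could pick which version to install based on a stable hash of an opt-in identifier, so the rollout percentage can be raised gradually without contacting every consumer:

import hashlib

def bucket(consumer_id: str) -> int:
    # Map an opt-in identifier to a stable bucket in [0, 100).
    digest = hashlib.sha256(consumer_id.encode()).hexdigest()
    return int(digest, 16) % 100

def pick_version(consumer_id: str, stable: str, candidate: str, rollout_pct: int) -> str:
    # Consumers whose bucket falls under the rollout percentage get the candidate version.
    return candidate if bucket(consumer_id) < rollout_pct else stable

# Example: raise rollout_pct from 1 to 10 to 100 as confidence grows.
print(pick_version("dev-laptop-42", stable="1.4.0", candidate="1.5.0", rollout_pct=10))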

We could use multiple branches as another mechanism: if everyone works off of trunk, we optimise trunk merges to be no more than (say) 20 minutes, and code self-promotes to a tested branch, then to a release branch, over a couple of hours. Failures would generate a proposed rollback straight into gerrit.

Wrapup

There’s a high cost of change in OpenStack – I don’t mean individual code changes, I mean changing e.g. policies, languages, architecture – lots of code, and thousands of affected people. A result of a high cost of change is a high risk of change: if a change makes things worse, it can take as long to back it out as it took to bring it in.

I’ll freely admit that I’m partly off in architecture-astronaut land here: there’s a huge gap of detail between what I’m describing and what would be needed to make it happen.

I have confidence in the community though, if we can just pull some vision together about what we want, we have the people and knowledge to execute on it.

by rbtcollins at October 05, 2018 09:26 PM

OpenStack Superuser

Modern cloud native architectures: Microservices, Containers and Serverless – Part 2

This is part two of a series that aims to shed some light and provide practical exposure on key topics in the modern software industry, namely cloud native applications. This post covers containers and serverless applications. Part one covers architecture for micro-services and cloud native applications.

Containers

The technology of software containers is the next key technology that needs to be discussed to explain cloud native applications. A container is simply the idea of encapsulating some software inside an isolated user space or “container.”

For example, a MySQL database can be isolated inside a container where the environmental variables and the configurations that it needs will live. Software outside the container will not see the environmental variables or configuration contained inside the container by default. Multiple containers can exist on the same local virtual machine, cloud virtual machine, or hardware server.

Containers provide the ability to run numerous isolated software services, with all their configurations, software dependencies, runtimes, tools and accompanying files, on the same machine. In a cloud environment, this ability translates into saved cost and effort, as the need to provision and buy server nodes for each micro-service diminishes, since different micro-services can be deployed on the same host without disrupting each other. Containers combined with micro-services architectures are powerful tools to build modern, portable, scalable and cost-efficient software. In a production environment, more than a single server node combined with numerous containers would be needed to achieve scalability and redundancy.

Containers also add benefits to cloud native applications beyond micro-services isolation. With a container, you can move your micro-services, with all the configuration, dependencies and environment variables they need, to fresh server nodes without having to reconfigure the environment, achieving powerful portability.

Due to the power and popularity of the software containers technology, some new operating systems like CoreOS, or Photon OS, are built from the ground up to function as hosts for containers.

One of the most popular software container projects in the software industry is Docker. Major organizations such as Cisco, Google, and IBM utilize Docker containers in their infrastructure as well as in their products.

Another notable project in the software containers world is Kubernetes. Kubernetes is a tool that allows the automation of deployment, management and scaling of containers. It was built by Google to facilitate the management of its containers, which are counted by the billions per week. Kubernetes provides some powerful features such as load balancing between containers, restarting failed containers and orchestrating the storage utilized by the containers. The project is part of the Cloud Native Computing Foundation, along with Prometheus.

Container complexities

In the case of containers, the task of managing them can sometimes get rather complex, for the same reasons that managing an expanding number of micro-services does. As the number of containers or micro-services grows, there needs to be a mechanism to identify where each container or micro-service is deployed, what its purpose is and what resources it needs to keep running.

Serverless applications

Serverless architecture is a new software architectural paradigm that was popularized with the AWS Lambda service. To fully understand serverless applications, it helps to go over an important concept known as function-as-a-service, or FaaS for short. FaaS is the idea that a cloud provider such as Amazon, or even a local piece of software such as Fission.io or funktion, can provide a service where a user can request a function to run remotely in order to perform a very specific task. After the function concludes, the results are returned to the user. No services or stateful data are maintained, and the function code is provided by the user to the service that runs the function.
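
For instance, the code a user hands to a FaaS service is typically just a small, stateless handler. The sketch below follows the AWS Lambda Python handler convention; the event fields and the booking scenario are made up for illustration:

import json

def lambda_handler(event, context):
    # A stateless function: it receives an event, performs one specific task
    # and returns the result; nothing persists after it finishes.
    name = event.get("attendee_name", "guest")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Booking confirmed for {name}"}),
    }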

The idea behind properly designed cloud native production applications that utilize the serverless architecture is that instead of building multiple micro-services expected to run continuously in order to carry out individual tasks, you build an application that has fewer micro-services combined with FaaS, where FaaS covers the tasks that don't need a continuously running service.

FaaS is a smaller construct than a micro-service. For example, in the case of the event booking application we covered earlier, there were multiple micro-services covering different tasks. If we use a serverless application model, some of those micro-services would be replaced with a number of functions that serve the same purpose.

Here’s a diagram that showcases the application utilizing a serverless architecture:

In this diagram, the event handler micro-service as well as the booking handler micro-service were replaced with a number of functions that provide the same functionality. This eliminates the need to run and maintain those two micro-services.

Serverless architectures have the advantage that no virtual machines and/or containers need to be provisioned to build the part of the application that utilizes FaaS. The computing instances that run the functions cease to exist from the user point of view once their functions conclude. Furthermore, the number of micro-services and/or containers that need to be monitored and maintained by the user decreases, saving cost, time, and effort.

Serverless architectures provide yet another powerful software building tool in the hands of software engineers and architects to design flexible and scalable software. Well-known FaaS offerings are AWS Lambda by Amazon, Azure Functions by Microsoft, Cloud Functions by Google and many more.

Another definition for serverless applications is the applications that utilize the BaaS or backend as a service paradigm. BaaS is the idea that developers only write the client code of their application, which then relies on several software pre-built services hosted in the cloud, accessible via APIs. BaaS is popular in mobile app programming, where developers would rely on a number of backend services to drive the majority of the functionality of the application. Examples of BaaS services are: Firebase and Parse.

Disadvantages of serverless applications

Similarly to micro-services and cloud native applications, the serverless architecture is not suitable for all scenarios.

The functions provided by FaaS don't keep state by themselves, which means special considerations need to be observed when writing the function code. This is unlike a full micro-service, where the developer has full control over the state. One approach to keeping state with FaaS, in spite of this limitation, is to propagate the state to a database or a memory cache like Redis.
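
Here is a minimal sketch of that approach using the redis-py client, assuming a reachable Redis instance; the host, key name and handler shape are placeholders:

import json
import redis

# The function itself stays stateless; state lives in Redis between invocations.
cache = redis.Redis(host="redis.example.com", port=6379, decode_responses=True)

def handle_booking(event, context):
    # Persist a counter externally so repeated invocations can build on it.
    bookings = cache.incr("event:123:bookings")
    return {"statusCode": 200,
            "body": json.dumps({"total_bookings": bookings})}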

The startup times for the functions are not always fast, since there is the time taken to send the request to the FaaS provider and, in some cases, the time needed to start a computing instance that runs the function. These delays have to be accounted for when designing serverless applications.

FaaS functions do not run continuously like micro-services, which makes them unsuitable for any task that requires the software to run continuously.

Serverless applications have the same limitation as other cloud native applications, where portability of the application from one cloud provider to another, or from the cloud to a local environment, becomes challenging because of vendor lock-in.

Conclusion

Cloud computing architectures have opened avenues for developing efficient, scalable, and reliable software. This paper covered some significant concepts in the world of cloud computing such as micro-services, cloud native applications, containers, and serverless applications. Microservices are the building blocks for most scalable cloud native applications; they decouple the application tasks into various efficient services. Containers are how micro-services could be isolated and deployed safely to production environments without polluting them.  Serverless applications decouple application tasks into smaller constructs mostly called functions that can be consumed via APIs. Cloud native applications make use of all those architectural patterns to build scalable, reliable, and always available software.

If you are interested in learning more, check out his book Cloud Native programming with Golang to explore practical techniques for building cloud-native apps that are scalable, reliable, and always available.

About the author

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go’s open source projects. He has written numerous Go applications with varying degrees of complexity.

Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled with the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.

The post Modern cloud native architectures: Microservices, Containers and Serverless – Part 2 appeared first on Superuser.

by Superuser at October 05, 2018 02:17 PM

Chris Dent

Placement Update 18-40

Here's this week's placement update. We remain focused on specs and pressing issues with extraction, mostly because until the extraction is "done" in some form doing much other work is a bit premature.

Most Important

There have been several discussions recently about what to do with options that impact both scheduling and configuration. Some of this was in the thread about intended purposes of traits, but more recently there was discussion on how to support guests that want an HPET. Chris Friesen summarized a hangout that happened yesterday that will presumably be reflected in an in-progress spec.

The work to get grenade upgrading to placement is very close. After several iterations of tweaking, the grenade jobs are now passing. There are still some adjustments to get devstack jobs working, but the way is relatively clear. More on this in "extraction" below, but the reason this is a most important is that this stuff allows us to do proper integration and upgrade testing, without which it is hard to have confidence.

What's Changed

In both placement and nova, placement is no longer using get_legacy_facade(). This will remove some annoying deprecation warnings.

The nova->placement database migration script for MySQL has merged. The postgresql version is still up for review.

Consumer generations are now being used in some allocation handling in nova.

Questions

  • What should we do about nova calling the placement db, like in nova-manage and nova-status?

  • Should we consider starting a new extraction etherpad? The old one has become a bit noisy and out of date.

Bugs

Specs

Many of these specs don't seem to be getting much attention. Can the dead ones be abandoned?

So many specs.

Main Themes

Making Nested Useful

Work on getting nova's use of nested resource providers happy and fixing bugs discovered in placement in the process. This is creeping ahead. There is plenty of discussion going along nearby with regards to various ways they are being used, notably GPUs.

I feel like I'm missing some things in this area. Please let me know if there are others. This is related:

Extraction

There continue to be three main tasks in regard to placement extraction:

  1. upgrade and integration testing
  2. database schema migration and management
  3. documentation publishing

The upgrade aspect of (1) is in progress with a patch to grenade and a patch to devstack. This is very close to working. The remaining failures are with jobs that do not have openstack/placement in $PROJECTS.

Once devstack is happy then we can start thinking about integration testing using tempest. I've started some experiments with using gabbi for that. I've explained my reasoning in a blog post.

Successful devstack is dependent on us having a reasonable solution to (2). For the moment a hacked up script is being used to create tables. This works, but is not sufficient for deployers nor for any migrations we might need to do.

Moving to alembic seems a reasonable thing to do, as a part of that.

We have work in progress to tune up the documentation but we are not yet publishing documentation (3). We need to work out a plan for this. Presumably we don't want to be publishing docs until we are publishing code, but the interdependencies need to be teased out.

Other

Going to start highlighting some specific changes across several projects. If you're aware of something I'm missing, please let me know.

End

I'm going to be away next week, so if any of my pending code needs some fixes and is blocking other stuff, please fix it. Also, there will be no pupdate next week (unless someone else does one).

by Chris Dent at October 05, 2018 01:30 PM

John Likes OpenStack

Updating ceph-ansible in a containerized undercloud

Update

What's below won't be the case for much longer because ceph-ansible will become a dependency of TripleO and the mistral-executor container will bind-mount the ceph-ansible source directory from the container host. What's in this post could still be used as an example of updating a package in a TripleO container, but don't be misled into thinking it is still the way to update ceph-ansible.

Original Content

In Rocky the TripleO undercloud will run containers. If you're using TripleO to deploy Ceph in Rocky, this means that ceph-ansible shouldn't be installed on your undercloud server directly, because your undercloud server is a container host. Instead, ceph-ansible should be installed on the mistral-executor container because, as per config-download, that is the container which runs ansible to configure the overcloud.

If you install ceph-ansible on your undercloud host it will lead to confusion about what version of ceph-ansible is being used when you try to debug it. Instead install it on the mistral-executor container.

So this is the new normal in Rocky on an undercloud that can deploy Ceph:


[root@undercloud-0 ~]# rpm -q ceph-ansible
package ceph-ansible is not installed
[root@undercloud-0 ~]#

[root@undercloud-0 ~]# docker ps | grep mistral
0a77642d8d10 192.168.24.1:8787/tripleomaster/openstack-mistral-api:2018-08-20.1 "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_api
c32898628b4b 192.168.24.1:8787/tripleomaster/openstack-mistral-engine:2018-08-20.1 "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_engine
c972b3e74cab 192.168.24.1:8787/tripleomaster/openstack-mistral-event-engine:2018-08-20.1 "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_event_engine
d52708e0bab0 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1 "kolla_start" 4 hours ago Up 4 hours (healthy) mistral_executor
[root@undercloud-0 ~]#

[root@undercloud-0 ~]# docker exec -ti d52708e0bab0 rpm -q ceph-ansible
ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch
[root@undercloud-0 ~]#

So what happens if you're in a situation where you want to try a different ceph-ansible version on your undercloud?

In the next example I'll update my mistral-executor container from ceph-ansible rc18 to rc21. These commands are just variations of the upstream documentation but with a focus on updating the undercloud, not overcloud, container. Here's the image I want to update:


[root@undercloud-0 ~]# docker images | grep mistral-executor
192.168.24.1:8787/tripleomaster/openstack-mistral-executor 2018-08-20.1 740bb6f24755 2 days ago 1.05 GB
[root@undercloud-0 ~]#
I have a copy of ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm in my current working directory

[root@undercloud-0 ~]# mkdir -p rc21
[root@undercloud-0 ~]# cat > rc21/Dockerfile <<EOF
> FROM 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1
> USER root
> COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
> RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
> USER mistral
> EOF
[root@undercloud-0 ~]#
So again that file is (for copy/paste later):

[root@undercloud-0 ~]# cat rc21/Dockerfile
FROM 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1
USER root
COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
USER mistral
[root@undercloud-0 ~]#
Build the new container

[root@undercloud-0 ~]# docker build --rm -t 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1 ~/rc21
Sending build context to Docker daemon 221.2 kB
Step 1/5 : FROM 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1
---> 740bb6f24755
Step 2/5 : USER root
---> Using cache
---> 8d7f2e7f9993
Step 3/5 : COPY ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm .
---> 54fbf7185eec
Removing intermediate container 9afe4b16ba95
Step 4/5 : RUN yum install -y ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm
---> Running in e80fce669471

Examining ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm: ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch
Marking ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch.rpm as an update to ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch
Resolving Dependencies
--> Running transaction check
---> Package ceph-ansible.noarch 0:3.1.0-0.1.rc18.el7cp will be updated
---> Package ceph-ansible.noarch 0:3.1.0-0.1.rc21.el7cp will be an update
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
Package
Arch Version Repository Size
================================================================================
Updating:
ceph-ansible
noarch 3.1.0-0.1.rc21.el7cp /ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch 1.0 M

Transaction Summary
================================================================================
Upgrade 1 Package

Total size: 1.0 M
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Updating : ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch 1/2
Cleanup : ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch 2/2
Verifying : ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch 1/2
Verifying : ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch 2/2

Updated:
ceph-ansible.noarch 0:3.1.0-0.1.rc21.el7cp

Complete!
---> 41a804e032f5
Removing intermediate container e80fce669471
Step 5/5 : USER mistral
---> Running in bc0db608c299
---> f5ad6b3ed630
Removing intermediate container bc0db608c299
Successfully built f5ad6b3ed630
[root@undercloud-0 ~]#
Upload the new container to the registry:

[root@undercloud-0 ~]# docker push 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1
The push refers to a repository [192.168.24.1:8787/tripleomaster/openstack-mistral-executor]
606ffb827a1b: Pushed
fc3710ffba43: Pushed
4e770d9096db: Layer already exists
4d7e8476e5cd: Layer already exists
9eef3d74eb8b: Layer already exists
977c2f6f6121: Layer already exists
00860a9b126f: Layer already exists
366de6e5861a: Layer already exists
2018-08-20.1: digest: sha256:50aae064d930e8d498702673c6703b70e331d09e966c6f436b683bb152e80337 size: 2007
[root@undercloud-0 ~]#
Now we see the new f5ad6b3ed630 image in addition to the old one:

[root@undercloud-0 ~]# docker images | grep mistral-executor
192.168.24.1:8787/tripleomaster/openstack-mistral-executor 2018-08-20.1 f5ad6b3ed630 4 minutes ago 1.09 GB
192.168.24.1:8787/tripleomaster/openstack-mistral-executor 740bb6f24755 2 days ago 1.05 GB
[root@undercloud-0 ~]#
The old container is still running though:

[root@undercloud-0 ~]# docker ps | grep mistral
373f8c17ce74 192.168.24.1:8787/tripleomaster/openstack-mistral-api:2018-08-20.1 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_api
4f171deef184 192.168.24.1:8787/tripleomaster/openstack-mistral-engine:2018-08-20.1 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_engine
8f25657237cd 192.168.24.1:8787/tripleomaster/openstack-mistral-event-engine:2018-08-20.1 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_event_engine
a7fb6df4e7cf 740bb6f24755 "kolla_start" 6 hours ago Up 6 hours (healthy) mistral_executor
[root@undercloud-0 ~]#
Merely updating the image doesn't restart the container and neither does `docker restart a7fb6df4e7cf`. Instead I need to stop it and start it but there's a lot that goes into starting these containers with the correct parameters.

The upstream docs section on Debugging with Paunch shows me a command to get the exact command that was used to start my container. I just needed to use `paunch list | grep mistral` first to know I need to look at the tripleo_step4.


[root@undercloud-0 ~]# paunch debug --file /var/lib/tripleo-config/docker-container-startup-config-step_4.json --container mistral_executor --action print-cmd
docker run --name mistral_executor-glzxsrmw --detach=true --env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS --net=host --health-cmd=/openstack/healthcheck --privileged=false --restart=always --volume=/etc/hosts:/etc/hosts:ro --volume=/etc/localtime:/etc/localtime:ro --volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro --volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume=/dev/log:/dev/log --volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro --volume=/etc/puppet:/etc/puppet:ro --volume=/var/lib/kolla/config_files/mistral_executor.json:/var/lib/kolla/config_files/config.json:ro --volume=/var/lib/config-data/puppet-generated/mistral/:/var/lib/kolla/config_files/src:ro --volume=/run:/run --volume=/var/run/docker.sock:/var/run/docker.sock:rw --volume=/var/log/containers/mistral:/var/log/mistral --volume=/var/lib/mistral:/var/lib/mistral --volume=/usr/share/ansible/:/usr/share/ansible/:ro --volume=/var/lib/config-data/nova/etc/nova:/etc/nova:ro 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1
[root@undercloud-0 ~]#
Now that I know the command I can see my six-hour-old container:

[root@undercloud-0 ~]# docker ps | grep mistral_executor
a7fb6df4e7cf 740bb6f24755 "kolla_start" 6 hours ago Up 12 minutes (healthy) mistral_executor
[root@undercloud-0 ~]#
stop it

[root@undercloud-0 ~]# docker stop a7fb6df4e7cf
a7fb6df4e7cf
[root@undercloud-0 ~]#
ensure it's gone

[root@undercloud-0 ~]# docker rm a7fb6df4e7cf
Error response from daemon: No such container: a7fb6df4e7cf
[root@undercloud-0 ~]#
and then run the command I got from above to start the container and finally see my new container

[root@undercloud-0 ~]# docker ps | grep mistral-executor
d8e4073441c0 192.168.24.1:8787/tripleomaster/openstack-mistral-executor:2018-08-20.1 "kolla_start" 14 seconds ago Up 13 seconds (health: starting) mistral_executor-glzxsrmw
[root@undercloud-0 ~]#
Finally I confirm that my container has the new ceph-ansible package:

(undercloud) [stack@undercloud-0 ~]$ docker exec -ti d8e4073441c0 rpm -q ceph-ansible
ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch
(undercloud) [stack@undercloud-0 ~]$
I was then able to deploy my overcloud and see that the rc21 version fixed a bug.

by John (noreply@blogger.com) at October 05, 2018 12:13 PM

Opensource.com

How to 'Kubernetize' an OpenStack service

Kuryr-Kubernetes provides networking for Kubernetes pods by using OpenStack Neutron and Octavia.

by dulek at October 05, 2018 07:01 AM

October 04, 2018

Emilien Macchi

OpenStack Containerization with Podman – Part 1 (Undercloud)

In this series of blog posts, we’ll demonstrate how we can replace Docker by Podman when deploying OpenStack containers with TripleO.

Group of seals, also named as a pod
Group of seals, also named as a pod

This first post will focus on the Undercloud (the deployment cloud) which contains the necessary components to deploy and manage an “Overcloud” (a workload cloud). During the Rocky release, we switched the Undercloud to be containerized by default, using the same mechanism as we did for the Overcloud. If you need to be convinced by Podman, I strongly suggest to see this talk but in short, Podman bring more security and make systems more lightweight. It also brings containers into a Kubernetes friendly environment.

Note: Deploying OpenStack on top of Kubernetes isn’t in our short-term roadmap and won’t be discussed during these blog posts for now.

To reproduce this demo, you’ll need to follow the official documentation which explains how to deploy an Undercloud but change the undercloud.conf to have container_cli = podman (instead of default docker for now).

In the next post, we’ll talk about operational changes when containers are managed with Podman versus Docker.

by Emilien at October 04, 2018 11:22 PM

NFVPE @ Red Hat

A Kubernetes Operator Tutorial? You got it, with the Operator-SDK and an Asterisk Operator!

So you need a Kubernetes Operator Tutorial, right? I sure did when I started. So guess what? I got that b-roll! In this tutorial, we’re going to use the Operator SDK, and I definitely got myself up-and-running by following the Operator Framework User Guide. Once we have all that setup – oh yeah! We’re going to run a custom Operator. One that’s designed for Asterisk, it can spin up Asterisk instances, discover them as services and dynamically create SIP trunks between n-number-of-instances of Asterisk so they can all reach one another to make calls between them. Fire up your terminals, it’s time to get moving with Operators.

by Doug Smith at October 04, 2018 04:55 PM

OpenStack Superuser

How open source projects are pushing the shift to edge computing

Gnanavelkandan Kathirvel of AT&T is sure of one thing: it will take a large group of open-source projects working together to push computing closer to the edge.

He's behind the telecom's efforts at Akraino Edge Stack, a Linux Foundation project that aims to create open-source software for the edge. The AT&T contribution is designed for carrier-scale edge computing applications running in virtual machines and containers to support reliability and performance requirements.

To accomplish this, Akraino will count on collaboration from other open source projects including ONAP, OpenStack, Airship, Kubernetes, Docker, Ceph, ONF, EdgeXFoundry and more. Ensuring that there are no holes in the functionality will require strong collaboration between Akraino and the upstream open-source communities. “Besides the upstream work, the community will innovate and develop solutions that don't belong in the upstream communities to support a broad spectrum of edge use cases,” he adds.

The push for edge is driven by “clear business value”: by combining edge cloud services with 5G networks, next-generation applications will have access to near-real-time processing, leading to new opportunities. Telecom providers also have brick-and-mortar assets like central offices that could provide prime locations to allow third-party cloud providers to host their edge clouds, he notes.

As for the eternal question of edge pushing out the cloud, Kathirvel says there’s room for everyone.

“You could contrast edge computing with the highly-centralized computing resources of cloud computing supported by service providers and web companies,” Kathirvel says. “There will still be a need for centralized cloud computing, but we will need complementary edge computing to help enable next-generation edge technologies.”

Get involved

Projects

OpenStack Edge Working group offers a number of resources and points of contact:

EdgeX Foundry

Here’s a guide on how to get started, plus a calendar packed with community meetings (from devices to dev-ops) – here’s the full monthly schedule.

Airship Project
Airship in a Bottle lets you try all of the services in a single environment appropriate for testing.

Mailing lists: lists.airshipit.org

Freenode IRC: #airshipit

Akraino Edge Stack

Mailing lists
Wiki

Events

At the upcoming Berlin Summit, there’s an entire track dedicated to edge, plus a pre-Summit hackathon.

Of particular interest, a session titled “Comparing Open Edge Projects” offers  a detailed look into the architecture of Akraino, StarlingX and OpenCord and compares them with ETSI MEC RA. Speakers include 99cloud’s Li Kai, Shuquan Huang and Intel’s Jianfeng JF Ding.

 

Via CIO

// CC BY NC

The post How open source projects are pushing the shift to edge computing appeared first on Superuser.

by Nicole Martinelli at October 04, 2018 02:08 PM

Chris Dent

Gabbi in the Gate

Imagine being able to add integration API tests to an OpenStack project by creating a directory and adding a YAML file in that directory that is those tests. That's the end game of what I'm trying to do with gabbi-tempest and some experiments with zuul jobs.

Gabbi is a testing tool for HTTP APIs that models the requests and responses of a series of HTTP requests in a YAML file. Gabbi-tempest integrates gabbi with tempest, an integration test suite for OpenStack, to provide some basic handling for access to the service catalog and straightforward authentication handling within the gabbi files. Tempest sets up the live services, and provides some images and flavors to get you started.

Here's a simple example that sets some defaults for all requests and then verifies the version discovery doc for the placement service using JSONPath:

defaults:
    request_headers:
        x-auth-token: $ENVIRON['SERVICE_TOKEN']
        content-type: application/json
        accept: application/json
        openstack-api-version: 'compute latest, placement latest'
        verbose: True

tests:

    - name: get placement version
      GET: $ENVIRON['PLACEMENT_SERVICE']
      response_json_paths:
          $.versions[0].id: v1.0

This is used in a work-in-progress change to placement. It is a work in progress because there are several pieces which need to come together to make the process as clean and delightful as possible.

The desired end state is that, for a project to turn this kind of testing on, they would:

  1. Add an entry to templates in their local .zuul.yaml, something like openstack-tempest-gabbi (see the sketch after this list). This would cause a few different things:

    • openstack-tempest-gabbi jobs added to both gate and check.
    • That job would set or extend the GABBI_TEMPEST_PATH environment variable to include a gabbits directory from the code checkout of the current service. That environment variable defines where the gabbi-tempest plugin looks for YAML files.
    • And then run tempest:
      • with the gabbi-tempest plugin
      • With tempest_test_regex: 'gabbi' to limit the tests to just gabbit tests (not necessary if other tempest tests are desired).
      • With tox_envlist: all, which is the tox environment that is the current correct choice when wanting to use a test regex.
  2. Create a directory in their repo, perhaps gabbits, and put one or more gabbi YAML files like the example above in there.
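
Since none of this exists yet, here is a purely speculative sketch of what it might look like once the pieces land; the parent job and variable names are assumptions, not settled configuration:

# Speculative: a consuming project's .zuul.yaml once the template exists.
- project:
    templates:
      - openstack-tempest-gabbi

# ...and the job behind that template might be defined roughly as
# (parent and variable names are illustrative):
- job:
    name: openstack-tempest-gabbi
    parent: devstack-tempest
    vars:
      tox_envlist: all
      tempest_test_regex: gabbi
    # The job would also set or extend GABBI_TEMPEST_PATH so the
    # gabbi-tempest plugin finds the gabbits directory in the
    # checked-out service repository.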

We're some distance from that, but there are pieces in progress that move things in that direction. I hope the above gives a better description of what I'm trying to achieve and encourages people to help, because I need some.

Some of the in-progress pieces:

by Chris Dent at October 04, 2018 12:30 PM

Opensource.com

What we learned building a Zuul CI/CD cloud

Learn about valuable insight one team gained while developing a community cloud to support OpenStack.

by studarus at October 04, 2018 07:03 AM

October 03, 2018

NFVPE @ Red Hat

Interconnecting VNFs on your laptop

For the past few weeks, I have been working on creating a virtualized environment that would allow me to test different configurations and topologies with Virtualized...

by Ricky Noriega at October 03, 2018 10:00 PM

OpenStack Superuser

Why now is the time to automate faster than evil

The demand for open source is growing exponentially. But some see threats growing along with the opportunity. The fourth edition of Sonatype’s 2018 “State of the Software Supply Chain Report”  offers a look at the risks and benefits. This time around, the 38-page report also highlights new methods hackers are using to infiltrate software supply chains, adds analysis across languages and ecosystems as well as explores how government regulations may impact the future of software development. You can download the report, free with email registration, here.

Research conducted by Sonatype shows a hockey-stick of demand for open source. Using download requests for Java components as a proxy, those requests started at a billion in 2008 and reached 87 billion in just nine years. Still, the first chapter of the report titled “We’re all Equifax” highlights a couple of key findings from their 2018 “DevSecOps Community Survey” of 2,076 IT professionals, namely that 30 percent of respondents claimed a breach stemming from the use of vulnerable open source components and that since 2014 (when Heartbleed made headlines) open-source related breaches have climbed 121 percent.

“Dev-ops-native organizations with the ability to continuously deploy software releases have an automation advantage that allows them to stay one step ahead of the hackers,” note the report authors. In a chapter dedicated to automation, the study finds that in mature dev-ops organizations, there’s been a 15 percent year-over-year jump to 57 percent in automated security, implemented throughout each stage of the software development life cycle. The top three investments for automated security noted were web application firewalls, container security and open-source governance.

The idea of automating ahead of evil comes from  Forrester’s March 2018 “Top Recommendations For Your Security Program,” where the analysts advised security teams to think about what will happen when the volume of attacks is bolstered by artificial intelligence and machine learning. Manual methods to detect, investigate and respond to threats will guarantee failure in the near future, they predict.

“When it comes to using open source components to manufacture modern software, the bottom line is this — precise intelligence is critical,” say the Sonatype authors. They back up this claim with original research analyzing 6,000 open source components to understand the efficacy of CPE (common platform enumeration)-based vulnerability matching, identifying 1,034 true positives, 5,330 false positives (when CPE was part of the coordinate name) and 2,969 false negatives. Crucial to stemming these problems are employing newer components and managed software supply chains, they note.

Check out the full report here.

The post Why now is the time to automate faster than evil appeared first on Superuser.

by Nicole Martinelli at October 03, 2018 04:16 PM

October 02, 2018

OpenStack @ NetApp

My Name is Rocky

The Rocky release of OpenStack went public on August 28th and NetApp® is proud to have contributed to its development. NetApp continues to cement its place as a leading storage solutions provider for OpenStack and has added the following new features in Cinder and Manila. Cinder Multiattachment of volumes: NetApp drivers can now support multiattachment ... Read more

The post My Name is Rocky appeared first on thePub.

by Bala RameshBabu at October 02, 2018 09:43 PM

OpenStack Superuser

Jump-start your cloud skills with an Outreachy internship

Outreachy offers three-month paid internships for people from groups that are traditionally underrepresented in tech.

Outreachy interns work remotely with mentors from Free and Open Source Software (FOSS) communities on projects ranging from programming, user experience, documentation, illustration and graphical design, to data science. Which organizations could you pair up with? An all-star group of FOSS organizations support Outreachy, including Mozilla, The Linux Foundation, Red Hat, Gnome and The Cloud Native Computing Foundation, among others.

The deadline for applications for the December-March internships is Oct. 30, 2018. Anyone who faces systemic bias or discrimination in the technology industry of their country is invited to apply.

The OpenStack coordinators for this round of internships are Mahati Chamarthy and Samuel de Medeiros Queiroz. OpenStack has two projects that need help this time around:

  • Create Redfish Ansible module
    Required skills: Python, Ansible
    Optional skills: Redfish
  • Extend sushy to support RAID
    Required skills: Python, OpenStack
    Optional skills: Redfish

Selected applicants are paired with a mentor, usually a full-time contributor to the project, and will spend three months learning what it’s like to work in the open source world. Interns should be available for a full-time, 40 hours a week internship from December 4, 2018 to March 4, 2019.

Here’s what some past Outreachy interns have to say about the program:


Pay it forward

If you would like to help out as a mentor, Outreachy is currently looking for FOSS communities and mentors to participate in the next internship round. Read the community application guide for more information about participating in the program.

You can find out more about OpenStack’s Outreachy program here.

 

The post Jump-start your cloud skills with an Outreachy internship appeared first on Superuser.

by Superuser at October 02, 2018 04:44 PM

Chris Dent

TC Report 18-40

I'm going to take a break from writing the TC reports for a while. If other people (whether on the TC or not) are interested in producing their own form of a subjective review of the week's TC activity, I very much encourage you to do so. It's proven an effective way to help at least some people maintain engagement.

I may pick it up again when I feel like I have sufficient focus and energy to produce something that has more value and interpretation than simply pointing at the IRC logs. However, at this time, I'm not producing a product that is worth the time it takes me to do it and the time it takes away from doing other things. I'd rather make more significant progress on fewer things.

In the meantime, please join me in congratulating and welcoming the newly elected members of the TC: Lance Bragstad, Jean-Philippe Evrard, Doug Hellmann, Julia Kreger, Ghanshyam Mann, and Jeremy Stanley.

by Chris Dent at October 02, 2018 03:22 PM

October 01, 2018

StackHPC Team Blog

Kayobe and Rundeck

Operational Hygiene for Infrastructure as Code

The Rundeck logo

Rundeck is an infrastructure automation tool, aimed at simplifying and streamlining operational processes when it comes to performing a particular task, or ‘job’. That sounds pretty grand, but basically what it boils down to is being able to click a button on a web page or hit a simple API in order to drive a complex task. For example, something that would otherwise involve SSH’ing into a server, setting up an environment, and then running a command with a specific set of options and parameters which, if you get them wrong, can have catastrophic consequences.

This can be the case with a tool as powerful and all-encompassing as Kayobe. The flexibility and agility of the CLI are wonderful when first configuring an environment, but what about when it comes to day-two operations and business-as-usual (BAU)? How do you ensure that your cloud operators are following the right process when reconfiguring a service? Perhaps you introduced ‘run books’, but how do you ensure a rigorous degree of consistency to this process? And how do you glue it together with some additional automation? So many questions!

Of course, when you can't answer any or all of these questions, it's difficult to maintain a semblance of 'operational hygiene'. Not having a good handle on whether or not a change is live in an environment, how it's been propagated, or by whom, can leave infrastructure operators in a difficult position. This is especially true when it's a service delivered on a platform as diverse as OpenStack.

Fortunately, there are applications which can help with solving some of these problems - and Rundeck is precisely one of those.

Integrating Kayobe

Kayobe has a rich set of features and options, but often in practice - especially in BAU - there's perhaps only a subset of these options and their associated parameters that are required. For our purposes at StackHPC, we've mostly found those to be confined to:

  • Deployment and upgrade of Kayobe and an associated configuration;
  • Sync. of version controlled kayobe-config;
  • Container image refresh (pull);
  • Service deployment, (re)configuration and upgrade.

This isn't an exhaustive list, but these have been the most commonly run jobs with a standard set of options, i.e. those targeting a particular service. A deployment will eventually end up with a 'library' of jobs in Rundeck that are capable of handling the majority of Kayobe's functionality, but in our case and in the early stages we found it useful to focus on what's immediately required in practical terms, refactoring and refining as we go.

Structure and Usage

Rundeck has no shortage of options when it comes to triggering jobs, including the ability to fire off Ansible playbooks directly - which in some ways makes it a poor facsimile of AWX. Rundeck's power, though, comes from its flexibility, so having considered the available options, the most obvious solution seemed to be a simple wrapper script around kayobe itself, which would act as the interface between the two: managing the initialisation of the working environment and passing a set of options based on the selections presented to the user.

Rundeck allows you to call jobs from other projects, so we started off by creating a library project which contains common jobs that will be referenced elsewhere such as this Kayobe wrapper. The individual jobs themselves then take a set of options and pass these through to our script, with an action that reflects the job's name. This keeps things reasonably modular and is a nod towards DRY principles.
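
Purely as an illustration, a minimal wrapper along these lines might look like the following; the paths, environment file and example options are assumptions rather than the actual script:

#!/bin/bash
# Sketch of a Kayobe wrapper that a Rundeck job could invoke.
set -euo pipefail

ACTION="$1"              # e.g. "overcloud service reconfigure"
shift
EXTRA_ARGS="$*"          # e.g. "--kolla-tags horizon"

# Initialise the working environment for this deployment
# (locations are illustrative).
source /opt/kayobe/venvs/kayobe/bin/activate
source /etc/kayobe/kayobe-env

# Run the requested Kayobe action with whatever options Rundeck passed in.
# ACTION and EXTRA_ARGS are deliberately unquoted so multi-word actions split.
kayobe ${ACTION} ${EXTRA_ARGS}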

The other thing to consider is the various 'roles' of operators (and I use this in the broadest sense of the term) within a team, or the different hats that people need to wear during the course of their working day. We've found that three roles have been sufficient up until now - the omnipresent administrator, a role for seeding new environments, and a 'read-only' role for BAU.

Finally, it's worth mentioning Rundeck's support for concurrency. It's entirely possible to kick off multiple instances of a job at the same time; however, this is something to be avoided when implementing workflows based around tools such as Kayobe.

With those building blocks in place we were then able to start to build other jobs around these on a per-project (environment) basis.

Example

Let's run through a quick example, in which I pull in a change that's been merged upstream on GitHub and then reconfigure a service (Horizon).

The first step is to synchronise the version-controlled configuration repository from which Kayobe will deploy our changes. There aren't any user-configurable options for this job (the 'root' path is set by an administrator) so we can just go ahead and run it:

Screenshot showing image sync options

The default here is to 'follow execution' with 'log output', which will echo the (standard) output of the job as it's run:

Screenshot showing completed image sync

Note that this step could be automated entirely with webhooks that call out to Rundeck to run that job when our pull request has been merged (with the requisite passing tests and approvals).
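
As a rough sketch of what such a hook could call, the request is a single POST to Rundeck's job-run API; the API token here is a placeholder, and the job UUID and endpoint reuse the values from the CLI example further down:

# Trigger the synchronisation job via the Rundeck API (token is a placeholder).
curl -X POST \
  -H "X-Rundeck-Auth-Token: ${RUNDECK_API_TOKEN}" \
  -H "Accept: application/json" \
  "http://10.60.210.1:4440/api/18/job/2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3/run"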

With the latest configuration in place on my deployment host, I can now go ahead and run the job that will reconfigure Horizon for me:

Screenshot showing service reconfiguration options

And again, I can watch Kayobe's progress as it's echoed to stdout for the duration of the run:

Screenshot showing service reconfiguration output

Note that jobs can be aborted, just in case something unintended happens during the process.

Of course, no modern DevOps automation tool would be complete without some kind of Slack integration. In our #rundeck channel we get notifications from every job that's been triggered, along with its status:

Screenshot showing Rundeck job status in Slack

Once the service reconfiguration job has completed, our change is then live in the environment - consistency, visibility and ownership maintained throughout.

CLI

For those with an aversion to using a GUI, as Rundeck has a comprehensive API you'll be happy to learn that you can use a CLI tool in order to interact with it and do all of the above from the comfort of your favourite terminal emulator. Taking the synchronisation job as an example:

[stack@dev-director nick]$ rd jobs list | grep -i sync
2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3 Kayobe/Configuration/Synchronise

[stack@dev-director nick]$ rd run -j Kayobe/Configuration/Synchronise -f
# Found matching job: 2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3 Kayobe/Configuration/Synchronise
# Execution started: [145] 2d917313-7d4b-4a4e-8c8f-2096a4a1d6a3 Kayobe/Configuration/Synchronise <http://10.60.210.1:4440/project/AlaSKA/execution/show/145>
Already on 'alaska-alt-1'
Already up-to-date.

Conclusions and next steps

Even with just a relatively basic operational subset of Kayobe's features being exposed via Rundeck, we've already added a great deal of value to the process around managing OpenStack infrastructure as code. Leveraging Rundeck gives us a central point of focus for how change, no matter how small, is delivered into an environment. This provides immediate answers to those difficult questions posed earlier, such as when a change is made and by whom, all the while streamlining the process and exposing these new operational functions via Rundeck's API, offering further opportunities for integration.

Our plan for now is to try and standardise - at least in principle - our approach to managing OpenStack installations via Kayobe with Rundeck. Although it's already proved useful, further development and testing are required to refine the workflow and expand its scope to cover operational outliers. On the subject of visibility, the next thing on the list for us to integrate is ARA.

If you fancy giving Rundeck a go, getting started is surprisingly easy thanks to the official Docker images as well as some configuration examples. There's also this repository, which comprises some of our own customisations, including a minor fix for the integration with Ansible.

Kick things off via docker-compose and in a minute or two you'll have a couple of containers, one for Rundeck itself and one for MariaDB:

nick@bluetip:~/src/riab> docker-compose up -d
Starting riab_mariadb_1 ... done
Starting riab_rundeck_1 ... done
nick@bluetip:~/src/riab> docker-compose ps
     Name                  Command             State                Ports
---------------------------------------------------------------------------------------
riab_mariadb_1   docker-entrypoint.sh mysqld   Up      0.0.0.0:3306->3306/tcp
riab_rundeck_1   /opt/boot mariadb /opt/run    Up      0.0.0.0:4440->4440/tcp, 4443/tcp

Point your browser at port 4440 on the host where you've deployed these containers, and all being well you'll be greeted with the login page.

Feel free to reach out on Twitter or via IRC (#stackhpc on Freenode) with any comments or feedback!

by Nick Jones at October 01, 2018 04:00 PM

OpenStack Superuser

Zuul case study: Packet Host

Zuul drives continuous integration, delivery and deployment systems with a focus on project gating and interrelated projects. In a series of interviews, Superuser asks users about why they chose it and how they’re using it.

John Studarus, OpenStack Ambassador, contributed this case study.

Contributing to open source projects, such as OpenStack, traditionally involves individuals and companies alike providing code contributions adding new features and fixing bugs. For almost two years, I’ve been running one-off OpenStack clouds for demos and labs at user group meetings across the United States on hardware donated from a bare metal service provider, Packet Host. Six months ago, they asked how they could make a larger donation to the community which led us to our path: building a community cloud to support OpenStack.

Every day, there are hundreds of code commits to the OpenStack code base that need to be tested as part of the continuous integration system managed by Zuul. Each commit runs through a series of tests (or gates) before a human review and then the gates run again before a code merge. All of these gates run across a pool of virtual machine instances (over 900 instances at peak times) donated by a number of public cloud providers. All of the OpenStack CI is dependent on donated computing resources. The OpenStack Infra team coordinates all of these cloud providers and served as our point of contact for donating these resources.

We set out to build a cloud where all the computing resources would be dedicated to the OpenStack Infra program. Building out our cloud, we had to meet the minimum requirements set by the OpenStack Infra team, namely support for 100 concurrent VM instances, each with 8 GB RAM, eight vCPUs and 80 GB storage. Packet Host allocated us 11 bare metal servers and an IPv4 /29 subnet to be used for floating IPs. With the computing and network resources in place, we moved ahead with the OpenStack architecture and implementation.
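
To give a sense of how a donation like this gets consumed, here is a rough sketch of what a Nodepool provider entry for a cloud of this shape might look like; the cloud, image and label names are illustrative, not the actual OpenStack Infra configuration:

# Illustrative Nodepool (v3) provider entry for a donated cloud.
providers:
  - name: packethost
    driver: openstack
    cloud: packethost          # matches an entry in clouds.yaml
    diskimages:
      - name: ubuntu-bionic
    pools:
      - name: main
        max-servers: 100       # the concurrency minimum discussed above
        labels:
          - name: ubuntu-bionic
            diskimage: ubuntu-bionic
            min-ram: 8192      # 8 GB RAM per test instance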

Since the test instance runs use ephemeral storage and the local mirror utilizes AFS, a distributed file system, the decision was made not to set up any persistent storage on the cloud. By eliminating the need for the Cinder storage service, more bare metal resources could be allocated to the Nova compute service, thus supporting more concurrent test instances.

Working with the OpenStack Infra team has really opened my eyes up to the capabilities of Zuul and the frameworks they’ve put together. I had the opportunity to catch up with the OpenInfra team at the most recent PTG. They realize that Zuul can put a strain on any cloud and are happy to work through issues that arise. Best of all, they run a great set of tools providing metrics such as failed launch attempts and time to ready, allowing me to identify issues as soon as possible.

A CI system, such as Zuul, puts an extreme load on a cloud environment as it continuously spins up and down virtual instances. While a typical instance might be up for weeks or months, a CI instance through Zuul lives, on average, a few hours. This means the control plane will always be busy stopping and starting services. Through the tools provided by the OpenStack Infra team, I was able to identify performance issues. In the first few months of operations, we quickly realized we had to upsize the control plane to handle the workload and reconfigure the image storage space to handle the disk images created daily by Zuul.

One of the limiting factors of this cloud is the availability of IPv4 addressing. Each test instance requires a floating IP address to communicate back to Zuul. Since we do have the compute resources, RAM and CPU, to grow the cloud, we intend to start provisioning test instances with IPv6 addresses. Zuul and the OpenStack Infra project both already support IPv6.

While we’re continuing to improve this community-run cloud, we’re also looking forward to what else we can provide across this donated hardware. Nodepool has driver capabilities to handle resources outside of OpenStack, which we’re very interested in using for automated bare metal support. We’re also hoping to extend CI resources to other open source projects through this same Zuul and Nodepool framework.

Technically and philanthropically, setting up and running this cloud has been rewarding! It’s been a great experience working with the OpenStack-Infra team and seeing everything they’re doing with Zuul. The knowledge I’ve gained running a cloud to support the OpenStack Infra team has far exceeded my experience running one-off clouds for user group demonstrations. If you’re an OpenStack cloud provider (public or private) and have an interest in donating resources to OpenStack, I encourage you to reach out to me or the OpenStack Infra team for more information.


The post Zuul case study: Packet Host appeared first on Superuser.

by Superuser at October 01, 2018 02:02 PM

September 28, 2018

Chris Dent

Placement Update 18-39

Welcome to a placement update. This week is mostly focused on specs and illuminating some of the pressing issues with extraction.

Most Important

Last week's important tasks remain important:

  • Work on specs and setting priorities.
  • Working towards upgrade tests (see more on that in the extraction section below).

What's Changed

Tetsuro is a core reviewer in placement now. Yay! Welcome.

Mel produced a summary of the PTG with some good links and plans.

Questions and Links

No answer to last week's question that I can recall, so here it is again:

  • [Last week], belmoreira showed up in #openstack-placement with some issues with expected resource providers not showing up in allocation candidates. This was traced back to max_unit for VCPU being locked at == total and hardware which had had SMT turned off now reporting fewer CPUs, thus being unable to accept existing large flavors. Discussion ensued about ways to potentially make max_unit more manageable by operators. The existing constraint is there for a reason (discussed in IRC) but that reason is not universally agreed.

    There are two issues with this: The "reason" is not universally agreed and we didn't resolve that. Also, management of max_unit of any inventory gets more complicated in a world of complex NUMA topologies.

Eric has raised a question about the intended purpose of traits.

Bugs

Specs

Main Themes

Making Nested Useful

Work on getting nova's use of nested resource providers happy and fixing bugs discovered in placement in the process.

Consumer Generations

gibi is still working hard to drive home support for consumer generations on the nova side. Because of some dependency management that stuff is currently in the following topic:

Extraction

There are few large-ish things in progress with the extraction process which need some broader attention:

  • Matt is working on a patch to grenade to deal with upgrading, with a migration of data.

  • We have work in progress to tune up the documentation but we are not yet publishing documentation. We need to work out a plan for this. Presumably we don't want to be publishing docs until we are publishing code, but the interdependencies need to be teased out.

  • We need to decide how we are going to manage database schema migrations (alembic is the modern way) and we need to create the tooling for running those migrations (as well as upgrade checks). This includes deciding how we want to manage command line tools (using nova's example or something else).

Until those things happen we don't have a "thing" which people can install and run, unless they do some extra hacking around which we don't want to impose upon people any longer than necessary.

Other

As with last time, I'm not going to make a list of links to pending changes that aren't already listed above. I'll start doing that again eventually (once priorities are more clear), but for now it is useful to look at open placement patches and patches from everywhere which mention placement in the commit message.

End

Taking a few days off is a great way to get out of sync.

by Chris Dent at September 28, 2018 04:00 PM

OpenStack Superuser

Drawing up the architecture for mobile edge and OpenStack

The main goal for the Edge Computing Group at the PTG was to draw an overall architecture diagram to capture the basic setup and requirements towards building an environment suitable for Edge use cases from a set of OpenStack services. Our primary focus was around Keystone and Glance, but discussions with other project teams such as Nova, Ironic and Cinder also took place.

The edge architecture diagrams we drew are part of a minimum viable product (MVP) that defines a minimum set of services and requirements to create a functional system. This architecture will evolve as we collect more use cases and requirements.

To describe edge use cases on a higher level using mobile edge as a use case, we identified three main building blocks:

  • Main or regional data center (DC)
  • Edge sites
  • Far edge sites or cloudlets

We examined these architecture diagrams with the following user stories in mind:

  • As an OpenStack deployer, I want to minimize the number of control planes necessary to manage across a large geographical region.
  • As an OpenStack deployer, I want disk images to be pulled to a cluster on demand, without needing to sync every disk image everywhere.
  • As an OpenStack user, I expect that instance autoscale continues to function in an edge site if connectivity is lost to the main data center.
  • As an OpenStack user, I want to manage all of my instances in a region (from a regional DC to far-edge cloudlets) via a single API endpoint.

The above list is not finalized and not exclusive, rather a starting point that we captured during the discussions as a first step.

We concluded by talking about service requirements in two major categories:

  1. Edge sites that are fully operational in case of a connection loss between the regional data center and the edge site, which requires control plane services running on the edge.
  2. Having full control on the edge site is not critical in the event of a connection loss between the regional data center and an edge site, which can be satisfied by having the control plane services running only in the regional data center.

In the first case, the orchestration of the services becomes harder and is not necessarily solved yet, while in the second example users have centralized control but lose functionality on the edge sites without access back to the regional DC.

We did not discuss further items such as high availability (HA) at the PTG or go into details about networking during the architectural discussion.

We agreed to prefer Federation for Keystone and came up with two work items to cover missing functionality:

  • Keystone to trust a token from an ID Provider master and when the auth method is called, perform an idempotent creation of the user, project and role assignments according to the assertions made in the token
  • Keystone should support the creation of users and projects with predictable UUIDs (e.g.: hash of the name of the users and projects). This greatly simplifies image federation and telemetry gathering.

For Glance, we explored image caching and spent some time discussing the option to also cache metadata so a user can boot new instances at the edge in case of a network connection loss which would result in being disconnected from the registry:

  • As a Glance user, I want to upload an image in the main data center and boot that image in an edge data center. Fetch the image to the edge data center with its metadata

We’re still in the progress of documenting these discussions and drawing architecture diagrams and flows for Keystone and Glance.

In addition to the above, we also went through the Dublin PTG Wiki capturing requirements:

  • We agreed to consider the list of requirements on the Wiki finalized for now.
  • We agreed to move the additional requirements listed on the Use Case Wiki page over there as well.

For details on discussions about related OpenStack projects, check out the following Etherpads for notes:

In addition, here are notes from the StarlingX sessions.

We’re still working on cleaning up the MVP architecture and discussing comments and questions before moving it to a Wiki page.

You can find out more about how to get involved — meetings, IRC channel and mailing lists — on the Edge Computing Group Wiki.

 

The post Drawing up the architecture for mobile edge and OpenStack appeared first on Superuser.

by Ildiko Vancsa at September 28, 2018 02:13 PM

September 27, 2018

OpenStack Superuser

Zuul case study: The OpenStack Foundation

Zuul drives continuous integration, delivery and deployment systems with a focus on project gating and interrelated projects. In a series of interviews, Superuser asks users about why they chose it and how they’re using it.

The OpenStack Foundation has been using Zuul for about six years now. Here Superuser talks to Clark Boylan, infrastructure engineer at the OSF, and Doug Hellmann, developer, editor, author and veteran OSF community member, about the origin story of Zuul.

How would you describe the days before Zuul?

CB: Prior to Zuul we ran a Jenkins master. We enforced gating with Jenkins, but to avoid one change breaking another we had to serialize the gating of each change. Since this gating took about an hour per change that meant that we could only merge 24 changes per day. This bottleneck became a pain point we knew we would have to address.

DH: Before Zuul, we would regularly land patches that worked in isolation, but that would then break things when combined with changes in other repositories. When I first started working on OpenStack, I could run devstack to set up a local test environment and it would work in the morning but if I wasn’t careful and updated in the afternoon it would fail, at least for an hour or so until someone fixed the problem. We no longer have that problem, in part due to better test coverage, but largely due to Zuul’s speculative merging and multi-repository testing features that allow us to ensure that changes across several repositories are tested together.

What’s the origin story for Zuul – when was it clear it was needed and what problems inside OpenStack was it started to solve?

CB: OpenStack ended up in a situation where it wanted to keep gating changes to OpenStack but have the ability to merge more than 24 changes a day. Running tests more quickly, or running fewer tests were either not possible or less than ideal options. Instead we (mostly Jim) set out to build a system (Zuul) which could parallelize the serial testing of OpenStack. The trick here was to build potential future states and if those passed we could merge them in aggregate rather than waiting for each to pass in succession. If changes failed testing they are evicted from the aggregate and tests are restarted without the failing change.

How did you switch from Jenkins to Zuul?

CB: The initial Zuul system relied on Jenkins to execute jobs for us. This meant that Zuul was the coordinator for Jenkins and the two systems worked together. Around the 2016 Austin Summit, we realized that we could replace Jenkins with an Ansible-based execution system to improve performance and reliability. (We had essentially gotten tired of restarting our Jenkins masters every week to clear out a thread leak). The success of this Ansible based system highly influenced the decision to do the major Zuul v3 rewrite which also uses Ansible to execute jobs.

When was it clear that it was useful to the larger community?

CB: We’ve seen various entities try to take Zuul and use it either as an internal developer tool or as a CI product for their customers. HP actually did both, with forj.io as a product including Zuul and Gozer as an internal CI system for HP developers. I want to say this happened very early on, like within the first year or so.

DH: Red Hat also has a product called Software Factory which includes Zuul, along with several other components. Having multiple product offerings built around Zuul tells me that it’s definitely ready to be used by more than the OpenStack community.

When did you realize it would’ve been a successful spin-off – more like “Breaking Bad” and “Better Call Saul” than “Brady Brides?”

CB: We had seen people trying to use it and others talking about using it but being worried about it being OpenStack specific or requiring specific tools like Gerrit. I think we figured we could accommodate those needs and address those concerns by adding support for common tools like Github and other Nodepool cloud drivers while continuing to support the existing needs of OpenStack.

What can you share about metrics?

CB: I usually go to Grafana for this stuff.

DH: Clark mentioned that early on we were able to land about 24 patches in a day. During the Rocky development cycle (roughly February through August, 2018) we have been averaging over 180 patches approved and merged per day. That does not include patches that were proposed and tested, but not approved and merged.

How do you handle complexity across so many different sub-projects under the OpenStack umbrella?

CB: We try to make things as consistent as possible and provide pre-canned job definitions. This is made somewhat easier than expected because OpenStack, as a whole, is pretty consistent (same programming language, same documentation system, most projects provide a REST API and so on). Zuul supports plenty more and we run tests for Go and Java and other languages and projects as well.

DH:  The enhancements in Version 3, especially the ability to manage test job definitions and other settings in the same repository as the application source code, make it possible for multiple unrelated teams to use a single Zuul deployment without relying on a large group of dedicated operators to write and manage the test jobs.

Several of our OpenStack teams have written their own complex functional and integration test jobs. They didn’t have to start from scratch, because they could build on the common job definitions as a foundation and they could extend those jobs independently to test anything they needed.
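
To make that layering concrete, here is a hedged illustration of how a project’s .zuul.yaml can combine shared templates with its own jobs (the template and job names are examples rather than a canonical list):

# Illustrative project configuration; names vary by project.
- project:
    templates:
      - openstack-python-jobs     # pre-canned unit test jobs
      - publish-to-pypi
    check:
      jobs:
        - my-project-functional   # a project-specific job layered on
                                  # top of the common definitions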

What are the takeaways for other users about the power and current limitations?

CB: As for limitations, Github reporting and general dashboard information could be improved. The job execution itself is quite powerful.

DH: Zuul is flexible, and as with any flexible system one needs to take care to plan well to minimize complexity and maximize reuse. It can be easy to fall into a trap of taking the expedient approach when creating similar but slightly different jobs by copying the job definitions. Taking a little bit of time to treat the jobs like you would any other source code by building a library of reusable tools will pay off in terms of reuse, letting you build new jobs more quickly.

What are you hoping the Zuul community will focus on / deliver in the next year?

DH: I’m looking forward to some of the planned API enhancements to let us query the job configuration for a repository. We look at that data directly today, when we enforce policies related to the standard set of jobs that the community has agreed that all projects need to run for minimal testing. Having an API to query that data will let us simplify the Zuul jobs we have that enforce the policies today.

The post Zuul case study: The OpenStack Foundation appeared first on Superuser.

by Nicole Martinelli at September 27, 2018 04:38 PM

Ghanshyam Mann

OpenStack QA Stein PTG Summary

 

We had another great PTG in Denver for the Stein cycle. It was a good time to meet all co-developers and discuss the Stein cycle plan. Here is my QA PTG summary report. Detailed discussion can be found in the respective topic Etherpads or in the main Etherpad.

  QA Help Room

The QA team was present in the Help Room on Monday. We were happy to help with a few queries about the Octavia multinode job and Kuryr-kubernetes testing. Other than that, there was not much that day except a few other random queries.

  Rocky Retrospective

We discussed the Rocky retrospective first thing on Tuesday. We went through (1) what went well and (2) what needs to improve, and gathered some concrete action items.

Patrole made good progress in the Rocky cycle with code as well as documentation. We were also able to fill almost all of the compute microversion gap up to Rocky.

    Action Items:
  • Need to add Tempest CLI documentation and other useful stuff from the tripleo docs to the Tempest docs – chandankumar
  • Run all tests in tempest-full-parallel job and move it to periodic job pipeline – afazekas
  • Need to merge the QA office hour, check with andrea for 17 UTC office hour and if ok then, close that and modify the current office hour from 9 UTC to 8 UTC . – gmann
  • Need to ask chandankumar or manik for bug triage volunteer. – gmann
  • Create the low hanging items list and publish for new contributors – gmann

We will be tracking the above action items in our QA office hour to finish them on time.

Detailed Discussion

  Stable interfaces from Tempest Plugins

We discussed having stable interfaces in Tempest plugins, like Tempest has, so that other plugins can consume them. Service clients are a good example; they are required for cross-project testing. For example, the congress tempest plugin needs to use mistral service clients for integration testing of congress+mistral. Similarly, Patrole needs to use the neutron tempest plugin service clients (for n/n-1/n-2).

The idea here is to have a lib or stable interface on the Tempest plugins side, like Tempest, so that other plugins can use them. We will start with some documentation about the use case and benefits and then work with the neutron-tempest-plugin team to expose their service clients as a stable interface. Once that is done, we can suggest the same to other plugins.

    Action Items:
  • Need some documentation and guidance with use case and example, benefits for plugins. – felipemonteiro
  • mailing list discussions on making specific plugins stable that are consumed by other plugins – felipemonteiro
  • check with requirement team to add the tempest plugin in g-r and then those can be added on other plugins requirement.txt – gmann

Detailed Discussion 

  Tempest Plugins CI setup & Plugins release and tagging  clarification

We discussed how other projects or plugins can set up CI to cover stable branch testing on their master changes. The solution can be as simple as defining the supported stable branches and running them on the master gate (the same way Tempest does). The QA team will start the guidelines on this.

The other part we need to cover is release and tagging guidelines. There was a lot of confusion about the release of Tempest plugins in Rocky. To make it better, the QA team will write guidelines and document a clear process.

    Action Items:
  • move/update documentation on branchless considerations in tempest to somewhere more global so that it covers plugins documentation too – gmann
  • Add tagging and release clarification for plugins.
  • talk with neutron team about moving in-tree tempest plugins of stadium projects to neutron-tempest-plugin or separate tempest-plugins repositories – slaweq
  • Add config options to disable the plugins load – gmann

Detailed Discussion 

  Tempest Cleanup Feature

The current Tempest CLI for cleaning up test resources is not so good. It cleans up resources based on a saved_state.json file, which saves the difference in resources before and after the Tempest run. This can end up cleaning up other non-test resources which got created during the Tempest run.

There is a QA spec proposing different approaches for cleanup. After discussing those approaches, we decided to go with resource_prefix. We will bring back the resource_prefix approach (which got removed after deprecation) and modify the “tempest cleanup” CLI to clean up resources based on resource_prefix. The complete discussion can be found in the etherpad. As of now felipemonteiro is the owner, but he will check with nicholashelgeson or AT&T folks to work further on this.

    Action Item:
  • Update spec with idea from 0.0.2 (because it’s relatively easy to implement) – get merged – felipemonteiro/nicholashelgeson
  • Add back resource_prefix config option and add back to data_utils.rand_name – felipemonteiro/nicholashelgeson
  • Go through all tempest tests and make sure data_utils.rand_name is used for resources – felipemonteiro/nicholashelgeson
  • Update tempest cleanup – felipemonteiro/nicholashelgeson
  • Update documentation – felipemonteiro/nicholashelgeson

Detailed Discussion

  Tempest conf Plugin Discovery process

This topic is about generating the Tempest conf from plugin config options too. This idea is more for python-tempestconf than for QA as such. But python-tempestconf folks were present in the QA room and agreed this is doable in python-tempestconf itself.

There is nothing from the QA side on this, so I would like to drop this item from QA tracking.

Detailed Discussion 

  Proper handling of interface/volume/pci device attach/detach hotplug/unplug

Tempest tests do not handle hotplug/unplug events properly. The guest does not check for the button press event at early boot time.

Hotplug events sent before the kernel is fully initialized can be lost. test_stamp_pattern.py could be unskipped if we tried to ssh to the VM before the hotplug event (volume attach). Also, there are API tests which know nothing about the guest state, and therefore cannot know when the guest is ready to check for the button press. The detailed problem can be found here.

The idea is to perform the ssh validation step before we consider the test server ready to use in a test.

    Action Items:
  • Adding ssh validation steps for api/scenrio tests where is required – afazekas
  • making the run_validation default true – afazekas
  • soft reboot , is nova event tells was it soft reboot (check) , is some special register on the machine can tell it ? – afazekas

Detailed Discussion

  Shared network handling

Attila observed a few tests failing when using a shared network. But it looks like the only 100% reproducible issue is test_create_list_show_delete_interfaces_by_fixed_ip.

There should not be an issue with shared networks either, and for now we will just fix the failing tests.

    Action Items:
  • Fix the failing tests – afazekas
  • Try to run tempest in parallel with shared network create/delete tests to search for other incidents locally – afazekas

Detailed Discussion

  Planning for Patrole Stable release

We continued the Patrole stable release discussion at the PTG. We prepared a concrete list of items needed to release it as stable, targeted for the Stein cycle.

Along with multi-policy and system scope support, we will also check framework stability. Documentation is already in better shape.

TODO items before stable release:

  1. multi-policy
  2. system scope support:
  3. Better support for generic check policy rules (e.g. user_id:%(user.id))
  4. Iterate through all the patrole interface/framework which needs to be used outside of patrole
    Action Items:
  • let’s finish the above planned items in Stein. – felipemonteiro

Detailed Discussion

  Proposal for New QA project: Harbinger

OpenStack QA currently lacks a dataplane testing framework that can be easily consumed, configured and maintained. To fill that gap, there is a new project proposal called “Harbinger”. Harbinger allows execution of various OpenStack data plane testing frameworks while maintaining a single standardized interface and input format.

Currently it covers Shaker and Yardstick, and Kloudbuster is WIP. This can be useful to consume in Eris (the extreme testing idea). There are a few points which need more clarification, like standardization of output, whether it can cover control plane testing, etc. IMO, this is a good project to start and can be consumed in Eris and cross-community testing. The author of this project was not present at the PTG and felipemonteiro proxied this too. We would like to extend this discussion on the ML and with the extreme testing stakeholders, and also start the QA spec.

    Action Items:
  • There are many items we planned as action items, but the first step will be to start the ML discussion and spec.

Detailed Discussion 

  Clean up the tempest documentation

This is always an outstanding item :-). We discussed more improvements in documentation, like a better doc structure, CLI docs, and consolidating the Tempest-related docs in a central place, which is Tempest. We made a list of items to cover with different assignees.

    Action Items:
  • Complete all the documentation points written in etherpad. – tosky, gmann, masayukig

Detailed Discussion

  Consuming all Tempest CLI in gate

Tempest has many CLIs and, due to a lack of unit tests, there are chances where we can/did break them. The idea is to exercise all the CLIs in gate jobs so that we can improve their testing coverage. A few CLIs will be covered in the main gate job and others as functional testing.

    Action Items:
  • Continue this patch on zuul v3 – https://review.openstack.org/#/c/355666/ – masayukig
  • add functional tests and a new functional job – gmann

Detailed Discussion

  Migration from Launchpad to storyboard

We discussed the possibility of migrating to StoryBoard. Patrole is the first QA project we are trying, and based on that we can proceed with other projects.

    Action Items:
  • Wait for BP script from storyboard team and then finish the Patrole projects migration. – gmann
  • Feedback or request for Storyboard:
    • Create a story for adding some field to indicate user interest (heat / vote / points) – afazekas
    • Gerrit automation work for stories, e.g. automatic assignee, or adding the Topic field on Gerrit.
    • Have a way to keep the set of used tags from diverging too much – possible solution: sort proposed tags by popularity

Detailed Discussion

  QA Stein Priority:

We discussed the priority items for Stein and listed the items and the owner of each item in etherpads.

 

by Ghanshyam Mann at September 27, 2018 10:13 AM

Samuel Cassiba

Five Years in the Stacks

Y’all don’t through one fifth of the tribulation weh I go through. ‘Cause I know, only a name would be here to represent ya.

The year was 2013…

We were in the Good Ol’ Days, before we knew they were the good old days. OpenStack was just taking off, the darling of Rackspace and NASA. I was already out of the Rack by that time, which had since morphed into the powerhouse RAX. Dedicated server gives way to cloud instance, with more and more people eyeballing the warchest AWS was amassing.

In that day, OpenStack was, to put it lightly, a pain in the ass to install. It was cantankerous, difficult to manage, and just not delightful. A requisite was to have a fleet of machines at the ready, because you can just get a change order ayyyy.

My first taste of life in the stacks was not in 2013, but in 2012, where I had a radical idea to suggest OpenStack for a contract in lieu of VMware. We went with VMware and the bid was not accepted for unrelated reasons, but I did get to see how far the tendrils of cloud had crept and taken hold. That life did not further itself, and gave way to other possibilities.

It’s about this time you may notice the story is not five years in length, but six. You, my friend, would be correct in this observation.

I returned to startup life, with OpenStack being a distant thought in my mind; the journey to startup life itself being its own story. I submersed myself in the ins-and-outs of the existing stack. For startup reasons, there was never enough time to do things Properly.

The highlight of the first year was the building doing an e-waste event. In the shuffle, we rescued roughly a rack’s worth of servers to strip for parts, most just for parts. I earmarked some additional machines in one of the DCs that were doing nothing, but too old for production. In the end, we had enough for a small pile of VMs.

The half that stayed offsite in the racks got OpenStack, originally starting with Debian’s default packages of the time: Folsom. Because documentation was a work in progress for the vintage, getting it working took a while. I connected the dots to find StackForge, and, with that, breadcrumbs from old colleagues who had beaten the grass down in the initial wave of contributors.

Startups do as startups do, and the winds changed direction. I found myself on the outside looking in, charting my next course. I departed the southern waters for the north, and joined forces with a team building a production-grade private cloud.

When I found IRC, I began engaging the developers of an automation flavor that was the least painful for me to piece together. That flavor turned into what is now called Chef OpenStack.

As my time with the project progressed, I became responsible for more and more. At first, my efforts were solely to make sure one part kept building. As my time with the project progressed, I grew my abilities to work across more and more of the codebase, while keeping the amount of operational churn at a manageable rate.

And then, it happened. Peak OpenStack. Austin 2016 was a party among parties, putting Austin to its test. It was at that time, walking down a crowded Rainey Street, I learned I would be shouldering a massive undertaking, with the news that 80% of my fellow developers would be moving on.

With election time looming, I knew that rudderless projects would be first on the chopping block. With that, I became a Project Lead. The argument is still out over if the T stands for Technical or Team, since both are used by PTLs. And, subsequently, I became a public maintainer. The sum of my decisions, good and bad, forever woven within someone’s digital tapestry.

Let’s be real: y’all know someone is still going to be running Ocata in 20 years.

Over time, I found new friends within OpenStack, some of which still exist in and around OpenStack. Sometimes we (we is me upside down) talk about Chef OpenStack, sometimes about other things. Many of these friends have since moved on to the next new shiny, or old new shiny, but some are still around.

Since assuming the mantle, I have served five releases at the helm of one of the paths to OpenStack, forging through uncharted waters to light a way to open infrastructure.

I don’t use many public social media outlets since becoming a public maintainer, as the irony would go. You can find remnants of experiments with them here and there, some attributed with my name. It’s easier to find me closer to the source.

-s

September 27, 2018 12:00 AM

September 26, 2018

OpenStack Superuser

How to make your open source project more inclusive

If the tech sector has a problem attracting and keeping half of the population working in it, open source has an even bigger problem.

Camila Ayres, Jona Azizaj and Jan Christoph Borchardt combed through current reports and found that women make up between one and 11 percent of those working in open source – compared to about 25 percent for the tech sector in general.

“We have this ambition that we want to change the world, and we want to make it better for everyone,” says Borchardt, Nextcloud design lead, speaking at his company’s conference. “We really think that representation is key for that, so if half of society is so underrepresented we can’t really be innovative in a way that we really need to be.”

Build a bridge

For starters, a code of conduct. It may seem standard and, unless you have been on a news fast recently, you might think that every major open source project already has one. Not so.

Time to change that: People from underrepresented groups are the ones most likely to encounter rudeness and denigrating terms; consequently, they’re the ones who suffer most if there is no clear code of conduct or guidelines, notes Ayres, a software engineer.

Just add

Once you’ve established a code of conduct, having less formal communication guidelines for all of the channels (IRC, mailing lists, etc.) where people are contributing also helps.
A few simple but time-tested ones include:

  • Ask, don’t tell
  • Be specific
  • Explain yourself
  • Offer solutions
  • Avoid hyperbole
  • Use emojis

Other ways to be more inclusive include programs like Google Summer of Code, providing travel grants and tickets for conferences and setting up a Wiki or repo where people can contribute ideas and offer suggestions.

There are a number of reasons why it makes sense to cultivate inclusivity in  your open source project.

Referring to the “Harvard Business Review” research on how diverse teams can be smarter, Borchardt says it’s also worth considering “that the bottom line is better. So if you like money, it’s a pretty good also to have a diverse team — and it’s also more innovative.”

The post How to make your open source project more inclusive appeared first on Superuser.

by Superuser at September 26, 2018 04:10 PM

About

Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.

Subscriptions

Last updated:
October 23, 2018 06:38 PM
All times are UTC.

Powered by:
Planet