July 22, 2019

Galera Cluster by Codership

Galera Cluster with new Galera Replication Library 3.27 and MySQL 5.6.44, MySQL 5.7.26 is GA

Codership is pleased to announce a new Generally Available (GA) release of Galera Cluster for MySQL 5.6 and 5.7, consisting of MySQL-wsrep 5.6.44-25.26 and MySQL-wsrep 5.7.26-25.18 with a new Galera Replication library 3.27 (release notes, download), implementing wsrep API version 25. This release incorporates all changes into MySQL 5.6.44 (release notes, download) and MySQL 5.7.26 (release notes, download) respectively.

Compared to the previous 3.26 release, the Galera Replication library includes a fix to prevent a protocol downgrade during a rolling upgrade, as well as improvements to GCache page storage on NVMFS devices.

One point of note is that this is also the last release for SuSE Linux Enterprise Server 11, as upstream has also put that release into End-of-Life (EOL) status.

You can get the latest release of Galera Cluster from http://www.galeracluster.com. There are package repositories for Debian, Ubuntu, CentOS, RHEL, OpenSUSE and SLES. The latest versions are also available via the FreeBSD Ports Collection.

by Colin Charles at July 22, 2019 02:04 PM

OpenStack Superuser

Using Istio’s Mixer for network request caching: What’s next

The humble micro-service gets a lot of love, says Zach Arnold, DevOps engineering manager. In theory, micro-services are great with absolutely no drawbacks, offering loose coupling, independent management and “all the other stuff that we say we love,” he adds.

In his experience working with them at financing startup Ygrene Energy Fund, that love doesn’t exactly come for free. He tallies up the costs for adding network hops, complex debugging scenarios, authorization and authentication, version coordination and a management burden for third-party dependencies, especially when security patches come in for other frameworks.

“I’m not leading a revolution in micro-services, I’m just hoping that maybe one less thing becomes a problem for people,” he says in a talk at KubeCon + CloudNativeCon.

Specifically, that means employing a network tool like Istio to handle request caching. Right now the service-mesh project handles request routing, retries, fault tolerance, authentication and authorization, but it doesn’t handle request caching — yet.

Currently, Istio acts as a harness for Envoy. Istio uses an extended version of the Envoy proxy, a high-performance proxy developed in C++, to mediate all inbound and outbound traffic for all services in the service mesh. A team is at work building eCache, a multi-backend HTTP cache for Envoy; check out their efforts here.

Once this work is completed, it will be upstreamed into Istio, become configurable using the same Policy DSL, and will likely also offer support for TTLs, L1/L2 caching and cache warming.
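
To make those terms concrete, here is a toy Python sketch of what TTL-based, two-tier (L1/L2) caching means in practice. It is purely illustrative and has no relation to eCache's actual design or Envoy's internals:

```python
import time

class TieredTTLCache:
    """Toy two-tier cache: a small, fast L1 in front of a larger L2.

    Illustrates L1/L2 tiering and TTL expiry only; not eCache.
    """

    def __init__(self, l1_size=2, ttl=60.0, clock=time.monotonic):
        self.l1 = {}              # hot entries, bounded in size
        self.l2 = {}              # larger, colder tier
        self.l1_size = l1_size
        self.ttl = ttl
        self.clock = clock

    def _fresh(self, entry):
        value, stored_at = entry
        return self.clock() - stored_at < self.ttl

    def get(self, key):
        for tier in (self.l1, self.l2):
            entry = tier.get(key)
            if entry is not None:
                if self._fresh(entry):
                    self._promote(key, entry)   # warm L1 on an L2 hit
                    return entry[0]
                del tier[key]                   # expired: drop it
        return None

    def put(self, key, value):
        self._promote(key, (value, self.clock()))

    def _promote(self, key, entry):
        self.l1[key] = entry
        while len(self.l1) > self.l1_size:
            # demote the oldest L1 entry down to L2
            oldest = min(self.l1, key=lambda k: self.l1[k][1])
            self.l2[oldest] = self.l1.pop(oldest)
```

A real HTTP cache adds validation, range handling and backend plugability on top of this shape, which is exactly the hard part the eCache effort is tackling.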

Check out his full talk here and the slides here.

Get involved

Istio

Check out the documentation, join the Slack channel and get up to speed with the roadmap by reading the feature stages page and release notes.

Related projects
The Git repository for eBay’s Envoy caching interface to ATS (Apache Traffic Server)’s cache back end.
Varnish, the de facto standard for HTTP caching in OSS.
The caching system for mod_pagespeed [blog] [code] is one implementation of an open-source multi-cache infrastructure.
Casper: Caching HTTP proxy for internal Yelp API calls.

Photo // CC BY NC

The post Using Istio’s Mixer for network request caching: What’s next appeared first on Superuser.

by Nicole Martinelli at July 22, 2019 02:03 PM

Aptira

Automated Network Traffic Engineering and Tunneling

Previously, network engineers were required to provision network services and keep track of changes in real time in order to implement Network Traffic Engineering (TE). This process was entirely manual – until we set up a process for automated Network Traffic Engineering and Tunneling.


The Challenge

One of our customers wanted to automate and manage their network services at the Service Orchestration level. They intended to build an orchestration platform to automate network services and remove manual processes. One of the key capabilities they were seeking to validate was the automation of network traffic engineering.

With the advent of Software Defined Networking (SDN) and its ability to provide a global view of the network, the provisioning of network services is now possible in real time. The challenge was to validate that the designed components could not only respond to traffic demands in real time but also be programmed to respond to future traffic demand.


The Aptira Solution

Aptira’s team of world-class SDN, Service Orchestration and Cloud engineers analysed the customer’s problem and designed a solution using a combination of SDN and Service Orchestration techniques.

To solve this challenge, we demonstrated how the combination of technologies such as Service Orchestration (i.e. Cloudify), SDN controller (i.e. ODL), and TICK stack can be used to implement network traffic engineering.

Aptira designed a Software Defined Networking Wide Area Network (SDN-WAN) topology and employed OpenDayLight (ODL) as an SDN controller to manage network resources. We then configured Cloudify as a Service Orchestrator (SO) to implement new service designs using TOSCA blueprints.

We designed the TOSCA blueprints in order to get updated information about the network topology based on recent updates in the network. The TOSCA blueprint triggered Cloudify to send a REST API request to OpenDayLight, querying the network topology and receiving the topology data of any changes. The TOSCA blueprint was then able to design a new network service based on these changes.
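
For readers curious what that REST exchange looks like, here is a simplified Python sketch. The RESTCONF path is the usual OpenDaylight operational-topology endpoint, but the URL, credentials and exact JSON layout are assumptions that depend on the ODL version and deployment:

```python
import json
from urllib.request import Request, urlopen

# Typical OpenDaylight RESTCONF endpoint for the operational topology;
# adjust host, port and credentials for your deployment.
ODL_TOPOLOGY_URL = ("http://odl-controller:8181/restconf/operational/"
                    "network-topology:network-topology")

def fetch_topology(url=ODL_TOPOLOGY_URL,
                   auth_header="Basic YWRtaW46YWRtaW4="):  # admin:admin
    """GET the current topology document from the controller."""
    req = Request(url, headers={"Accept": "application/json",
                                "Authorization": auth_header})
    with urlopen(req) as resp:
        return json.load(resp)

def link_ids(topology_doc):
    """Extract the set of link identifiers from a topology document."""
    links = set()
    for topo in topology_doc.get("network-topology", {}).get("topology", []):
        for link in topo.get("link", []):
            links.add(link["link-id"])
    return links

def topology_changed(old_doc, new_doc):
    """Report links that appeared or disappeared between two snapshots."""
    old, new = link_ids(old_doc), link_ids(new_doc)
    return {"added": sorted(new - old), "removed": sorted(old - new)}
```

In the solution described here it is the TOSCA blueprint, executed by Cloudify, that drives this query and feeds the diff into the service design step.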

As an example of Traffic Engineering in real time, our solution performed the following steps (as shown in the figure) in a fully automated process, without human intervention:

  • Step 1: SDN switches (OVSes) send an update of the network topology in certain intervals to ODL
  • Step 2: The Telegraf agent (TICK stack’s module) running on the ODL detects a change in the network topology and sends an event to the TICK stack’s Policy Engine
  • Step 3: This “changed topology” event in turn triggers PCE blueprint in the Cloudify Service Orchestrator
  • Step 4: The PCE blueprint in Cloudify activates the path computation engine (PCE) module (developed by Aptira)
  • Step 5: The PCE module requests an updated network topology from ODL
  • Step 6: The PCE module then computes a new traffic engineering path that optimises network performance and guarantees the SLA, and passes the computed path to the SDN controller
  • Step 7: The SDN controller installs new rules on the switches included in the path and removes other rules from the switches if required
Aptira Automated Traffic Engineering and Tunneling: OpenFlow Diagram
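
Aptira's PCE module itself is not public, but the essence of step 6 is a shortest-path computation over the current topology, weighted by whatever metric the SLA dictates (latency, utilisation, or a composite). A minimal illustrative sketch in Python:

```python
import heapq

def compute_path(links, src, dst):
    """Dijkstra over a list of (node_a, node_b, cost) links.

    A stand-in for a real PCE: 'cost' could be latency,
    utilisation, or any composite SLA metric.
    """
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]   # (cost so far, node, path taken)
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbour, weight in graph.get(node, []):
            if neighbour not in seen:
                heapq.heappush(queue, (cost + weight, neighbour,
                                       path + [neighbour]))
    return None  # no path: raise an alarm / trigger re-optimisation
```

The resulting node list is what gets handed to the SDN controller in step 7, which translates it into flow rules on the switches along the path.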

This solution is self-healing and self-optimising in the case of link or network device failure. Moreover, it is not dependent on a particular SDN controller, which means any customer can adapt it to their production network with their own controller.


The Result

The solution was designed, implemented and tested by Aptira and configured into the evaluation platform, passing all use case scenarios devised by the customer. By automating the network traffic engineering and tunneling, this solution not only reduces manual intervention, but also reduces operational and development costs.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Automated Network Traffic Engineering and Tunneling appeared first on Aptira.

by Aptira at July 22, 2019 01:59 PM

Johan Guldmyr

Contributing To OpenStack Upstream

Recently I had the pleasure of contributing upstream to the OpenStack project!

A link to my merged patches: https://review.opendev.org/#/q/owner:+guldmyr+status:merged

Before a previous OpenStack Summit (these days called OpenInfra Summits) – Vancouver 2018 – I went there a few days early and attended the Upstream Institute: https://docs.openstack.org/upstream-training/
It was about 1.5 days long, if I remember right. Looking up my notes from it, these were the highlights:

  • Best way to start getting involved is to attend weekly meetings of projects
  • Stickersssss
  • A very similar process to RDO with Gerrit and reviews
  • Underlying tests are all done with ansible and they have ARA enabled so one gets a nice Web UI to view results afterward. Logs are saved as part of the Zuul testing too so one can really dig into and see what is tested and if something breaks when it’s being tested.

Even though my patches came one baby and a bit over a year after the Upstream Institute, I could still figure things out quite quickly with the help of the guides, get bugs created and patches submitted. My general plan when first attending wasn’t to contribute code changes, but rather to start reading code, perhaps find open bugs and so on.

The thing I wanted to change in puppet-keystone was apparently also possible to change in many other puppet-* modules, and less than a day after my puppet-keystone change got merged into master someone else picked up the torch and made PRs to like ~15 other repositories with similar changes :) Pretty cool!

Testing is hard! https://review.opendev.org/#/c/669045/1 is one backport I created for puppet-keystone/rocky, and the Ubuntu testing was not working initially (started with an APT mirror issue and later it was slow and timed out)… After 20 rechecks and two weeks, it still hadn’t successfully passed a test. In the end we got there though with the help of a core reviewer that actually updated some mirror and later disabled some tests :)

Now, the change itself was about “oslo_middleware/max_request_body_size”, so that we can increase it from the default of 114688. The Pouta Cloud had issues where our Federation User Mappings were larger than 114688 bytes and we couldn’t update them anymore; it turns out they were being blocked by oslo_middleware.

(Does anybody know where 114688 bytes comes from? Some internal speculation has been that it is 128 kilobytes minus some headers.)
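
For reference, the option ends up in the service's oslo.config file. A hedged example for keystone.conf (the doubled value here is an arbitrary illustration; pick whatever fits your mappings):

```ini
[oslo_middleware]
# Default is 114688 bytes; raise it so that large federation
# mapping updates are no longer rejected by the size-limit middleware.
max_request_body_size = 229376
```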

Anyway, the mapping we have now is simplified: just a long [ list ] of “local_username”: “federation_email”, domain: “default”. I think the next step might be to figure out whether we can write the rules using something like the below instead of hardcoding the values

"name": "{0}" 

It’s been quite hard to find examples that match our use-case exactly (and playing about with it is not a priority right now, just something in the backlog, but it could be interesting to look at when we start accepting more federations).

All in all, I’m really happy to have gotten to contribute something to the OpenStack ecosystem!

by guldmyr at July 22, 2019 05:57 AM

July 19, 2019

OpenStack Superuser

Why open source today is necessary but not sufficient – and what we should do about it

You could say that open source was born out of both frustration and necessity.

In 1980, Richard Stallman, irked after being blocked from modifying the program for a glitchy new laser printer, kicked off the movement. He launched the GNU operating system which, combined with Linux, runs on tens of millions of computers today.

Stallman’s four essential freedoms – the right to run, copy, distribute, study, change and improve the software – define the free software world. Then open source was coined as a term to soothe the suits, promoting the practical values of freedom and downplaying the ethical principles that drove Stallman.

A generation later, necessity is not enough. “Despite being more business-friendly, open source was never a ‘business model,’” writes Thierry Carrez, VP of engineering at the OpenStack Foundation, in a three-part series on the topic. “Open source, like free software before it, is just a set of freedoms and rights attached to software. Those are conveyed through software licenses and using copyright law as their enforcement mechanism. Publishing software under a F/OSS license may be a component of a business model, but if it is the only one, then you have a problem.”

Call it the free beer conundrum: the idea that you pay nothing for the product means that it’s free (as in free speech) but not something that comes without cost. The price tag, or lack thereof, has always been a red herring, Carrez notes. It’s really more about allowing the user to kick the tires before going all in. “You don’t have to ask anyone for permission (or enter any contractual relationship) to evaluate the software for future use, to experiment with it, or just to have fun with it. And once you are ready to jump in, there’s no friction in transitioning from experimentation to production.”

What really sets open source apart?  Sustainability, transparency, its appeal to developers and the community that supports it.

“With open source you have the possibility to engage in the community developing the software and to influence its direction by contributing directly to it. This is not about ‘giving back…’ Organizations that engage in open-source communities are more efficient, anticipate changes and can voice concerns about decisions that might adversely affect them. They can make sure the software adapts to future needs by growing the features they’ll need tomorrow.”

His series comes at a critical time for open source. It’s undeniably a success — powering everything from TVs, smartphones, supercomputers, servers and desktops to a $17,000 rifle — but that’s exactly why it’s so easy to take for granted. And notable companies still find ways to profit from the code while flipping the bird at the open-source ethos. It’s bigger than a few bad actors or lawsuits: some say there’s a fight on for the very soul of open source.

So now what? Carrez closes with a call to action: “Open-source advocates and enthusiasts need to get together, defining clear, standard terminology on how open source software is built and start communicating heavily around it with a single voice.  Beyond that, we need to create forums where those questions on the future of open source are discussed. Because whatever battles you win today, the world does not stop evolving and adapting.”

Check out his whole series here.

Photo // CC BY NC

The post Why open source today is necessary but not sufficient – and what we should do about it appeared first on Superuser.

by Nicole Martinelli at July 19, 2019 02:01 PM

Aptira

DevConf.IN 19: Cloud Orchestration using Cloudify

DevConf.IN is the annual developer’s conference organised by Red Hat, India. The conference provides a platform to the FOSS community participants and enthusiasts to come together and engage in knowledge sharing activities through technical talks, workshops, panel discussions, hackathons and much more.

The primary tracks for this year are:

  • Trending Tech
  • AI / ML
  • Storage
  • Networking
  • Open Hybrid Cloud
  • Kernel
  • FOSS Community & Standards
  • Academic Research/White Paper, and
  • Security

Our Senior Software Engineer, Alok Kumar, will be presenting at DevConf.IN, discussing Cloud Orchestration using Cloudify. Alok has been working with OpenStack, k8s and many telco tools, and loves to share and gather knowledge from folks of different backgrounds.

Application deployment, configuration management and system orchestration is easy now with Ansible and other tools. But multi-Cloud Orchestration can still be a challenging task and the tools aren’t very mature yet. During this session, Alok would like to share his experiences with one such Open Source automation tool – Cloudify – to help users understand how easy it can be. The session can be divided into the following topics:

  • Different types of orchestration and the most suitable tool for each
  • The current problem with bashifying all your tasks
  • The solution
  • Additional details about Cloudify and some other use cases

If you are a technology enthusiast interested in the latest trends in Open Source and emerging digital technologies, this is the place for you to be.

When: August 2nd -3rd, 2019

Venue: Christ University – Bengaluru, India

Click here for more updates about DevConf.

Last Date of Registration: 31st July, 2019

Let us make your job easier.
Find out how Aptira's managed services can work for you.

Find Out Here

The post DevConf.IN 19: Cloud Orchestration using Cloudify appeared first on Aptira.

by Jessica Field at July 19, 2019 01:44 PM

Chris Dent

Placement Update 19-28

This is pupdate 19-28. Next week is the Train-2 milestone.

Most Important

Based on the discussion on the PTG attendance thread and the notes on the related etherpad I'm going to tell the Foundation there will be approximately seven Placement team members at Shanghai but formal space or scheduling will not be required. Instead any necessary discussions will be arranged on premise. If you disagree with this, please speak up soon.

The main pending feature is consumer types, see below.

What's Changed

  • A bug in the resource class cache used in the placement server was found and fixed. It will be interesting to see how this impacts performance. While it increases database reads by one (for most requests) it removes a process-wide lock, so things could improve in threaded servers.

  • os-resource-classes 0.5.0 was released, adding FPGA and PGPU resource classes.

    It's been discussed in IRC that we may wish to make 1.x releases of both os-resource-classes and os-traits at some point to make it clear that they are "real". If we do this, I recommend we do it near a cycle boundary.

  • An integrated-gate-placement zuul template has merged. A placement change to use it is ready and waiting to merge. This avoids running some tests which are unrelated; for example, cinder-only tests.

Specs/Features

Since spec freeze for most projects is next week and placement has merged all its specs, until U opens, I'm going to skip this section.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 22 (-1) stories in the placement group. 0 (0) are untagged. 2 (-1) are bugs. 5 (0) are cleanups. 10 (-1) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 11 microversions.

  • https://review.opendev.org/666542 Add support for multiple member_of. There's been some useful discussion about how to achieve this, and a consensus has emerged on how to get the best results.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As mentioned for a few weeks, one of the important cleanup tasks that is not yet in progress is updating the gabbit that creates the nested topology that's used in nested performance testing. We've asked the StarlingX community for input.

Another cleanup that needs to start is satisfying the community wide goal of PDF doc generation. I did some experimenting on this and while I was able to get a book created, the number of errors, warnings, and manual interventions required meant I gave up until there's time to do a more in-depth exploration and learn the tools.

Other Placement

Miscellaneous changes can be found in the usual place.

There are two os-traits changes being discussed. And zero os-resource-classes changes (yay!).

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

If you've done any performance or scale testing with placement, I'd love to hear about your observations. Please let me know.

by Chris Dent at July 19, 2019 11:13 AM

July 18, 2019

OpenStack Superuser

Help bring key contributors to the Open Infrastructure Summit with the Travel Support Program

Most of us are more frequently awash in the glow of a monitor than actual sunlight, but we all know how much gets done talking to people in real life.

You can help bring more key contributors face-to-face at the Open Infrastructure Summit in Shanghai by donating to the Travel Support Program. Individual donors can help out by donating at registration for the November 4-6 event. If your organization can contribute to the Travel Support Program, check out the sponsor prospectus.

“It takes somewhere between $2,000 and $3,000 to send someone via the TSP program to an Open Infra Summit. If 20ish people donate $100, that’s one more person able to attend,” the OSF’s upstream developer advocate Kendall Nelson said in an impromptu fund-raising thread on Twitter. “Consider donating when you go to register.”

For every Summit, the OpenStack Foundation funds attendance for about 30 dedicated contributors from the open infrastructure community. These include contributors to projects like Kubernetes, Kata Containers, Airship, StarlingX, Ceph, Cloud Foundry, OVS, OpenContrail, OpenSwitch and OPNFV. In addition to developers and reviewers, the program welcomes documentation writers, organizers of user groups around the world, translators, forum moderators and even first-time attendees. Applications for Shanghai are open until August 8, 2019.

TSP grantees from the previous Summit in Denver provide a typical snapshot. The committee picked a diverse group of five nationalities: five are Active Technical Contributors, two are Active User Contributors, four are Active User Group members and the group as a whole contributes to 11 projects.

After every Summit, Superuser profiles the TSP grantees and asks them how it went. Invariably, they say the real-world connections and interactions make them more productive community members.

“The OpenStack community is spread all over the world. We work every day using mostly IRC and experiencing the difficulties of interacting with people in different time zones,” said Rossella Sblendido, software engineer at SUSE and Neutron core reviewer, who traveled from her native Italy to the Tokyo Summit on a TSP grant.

“The Summit is the only time when we’re all there in person at the same time. It’s really crucial to exchange ideas, coordinate and get ready for the next release. Being there makes the difference.”

Photo // CC BY NC

The post Help bring key contributors to the Open Infrastructure Summit with the Travel Support Program appeared first on Superuser.

by Superuser at July 18, 2019 02:01 PM

Aptira

Multi-Cloud Orchestration with Kubernetes, ONAP, Cloudify, Azure & OpenStack

This customer is the research and development arm of a major Australian telecommunications company.

They wanted to evaluate options to deploy and manage Virtual Network Function (VNF) workloads across a Public Cloud platform (Microsoft Azure) and a Private Cloud platform (OpenStack) to determine whether these platforms could be adopted within the organisation. The customer wanted to validate the operation of Service Chaining of VNFs deployed across these market-leading Private and Public Cloud platforms.

The customer also wanted to evaluate a new open-source automation platform called ONAP (Open Network Automation Platform). In particular, the customer wanted to assess its ability to integrate with the selected VIM (OpenStack) and to orchestrate VNF workloads using a market-leading product called Cloudify.


The Challenge

The network infrastructure of telecommunications organisations is increasingly virtualised in the same way that enterprise servers have been virtualised. Such virtualised network components are called Virtual Network Functions (VNFs). Once running on virtual resources, it is possible to move these VNFs between Cloud platforms as required by customer needs. For example, a firewall might be deployed onto a Microsoft Azure Cloud to connect Azure resources into a customer’s enterprise network.

As the overall profitability of the telecom market declines, cost control is of vital importance to telecommunications companies and automating the deployment of VNFs onto virtual resources is an important step in reducing operating costs while also improving agility. Overall, automating these services reduces a process that used to take days or weeks to something which takes minutes, and therefore enables customers to build their systems more efficiently.

Our client sought to evaluate the capabilities of Cloudify and the Open Source ONAP network orchestrator to automate the deployment and provisioning of VNFs to customer Clouds.

They had identified ONAP as a potentially strategic product but needed to evaluate its current state and maturity, given that it was relatively new. Cloudify, by contrast, is a relatively mature product and has been working with the ONAP project on a number of integrations.

The customer needed to run up an instance of ONAP to make this assessment. Installing ONAP is a challenge because it is not a packaged product with a general-purpose deployment process. ONAP also consumes a significant amount of resources which must be present for the deployment to succeed, even if they are not necessary for the particular use-case.

Overall this configuration required integration between multiple components, some for the first time.


The Aptira Solution

The end deployment was complex, since it required integration between different components, some of which had never been integrated before. The following integrations were completed as part of the evaluation:

  • Integrated Cloudify as Service Orchestrator with Azure to deploy Clearwater vIMS
  • Integrated Cloudify as Service Orchestrator with ONAP to deploy the F5 vLB VNF
  • Deployed ONAP atop Kubernetes and integrated OpenStack (as the VIM) with ONAP to deploy F5 vLB VNFs
  • Adapted the existing Clearwater TOSCA blueprint to model and deploy the vIMS Telco service on Microsoft Azure using Cloudify
  • Modelled a service chaining blueprint to direct SIP traffic to Clearwater vIMS via the F5 load balancer
  • Orchestrated connections between the OpenStack and Azure Clouds to route the SIP traffic

Below is a high-level design diagram, detailing the multi-Cloud orchestration and service chaining:

Aptira deployed the Beijing release of ONAP onto an OpenStack Cloud run by the customer. Kubernetes with Helm were used to deploy ONAP and manage the deployment post installation. This ONAP instance was configured with Cloudify to provision Virtual Network Functions (VNFs) onto an OpenStack Cloud, as well as Microsoft Azure.

The VNFs selected for this evaluation included:

  • The Open Source Clearwater IP Multimedia Subsystem (vIMS), which is used to deliver voice, video and multimedia services to mobile telephony users
  • A virtual load balancer from F5

We also demonstrated service chaining between F5 vLB VNF running on OpenStack and Clearwater vIMS running on Azure.


The Result

During the validation process, Aptira successfully:

  • Deployed and managed an F5 virtual load balancer on OpenStack and the Clearwater vIMS system on Microsoft Azure using Cloudify TOSCA blueprints and ONAP Artifacts
  • Registered multiple SIP clients and made calls between them
  • Triggered autoscaling of the virtual service elements to validate the ability of the configuration to handle Closed-loop automation based on increased traffic load

With the validation process complete, the customer is now able to orchestrate and service-chain VNF workloads across multiple Clouds.


Take control of your Cloud.
Get a customised Cloud strategy today.

Learn More

The post Multi-Cloud Orchestration with Kubernetes, ONAP, Cloudify, Azure & OpenStack appeared first on Aptira.

by Aptira at July 18, 2019 01:37 PM

Mirantis

Quick tip: Enable nested virtualization on a GCE instance

There are times when you need to run a virtual machine -- but you're already ON a virtual machine.  Fortunately, it's possible, but you need to enable nested virtualization.

by Nick Chase at July 18, 2019 02:39 AM

July 17, 2019

OpenStack Superuser

Report: Open-source object storage “more mainstream than ever”

It’s taken a while, but object storage has moved out of the margins. The roots of the computer data storage architecture that manages data as objects stretch back to 1994 — the same year eBay was founded and the DVD launched — but now it’s “becoming more mainstream than ever,” according to a recent GigaOm report.

Typically employed for second-tier storage, backup and long-term archives, object storage is now supporting cloud-native workloads that require data to remain always and quickly accessible. The proliferation of devices and the data streaming from them is a key driver of this change. The 12-page report, titled “Enabling Digital Transformation with Hybrid Cloud Data Management,” outlines new use cases along with a few inevitable buzzwords.

Author Enrico Signoretti notes that even the most conservative execs are “now confidently building hybrid cloud infrastructures for multiple use cases.”

Some key examples:

  • Cloud bursting: leveraging the vast amounts of available computing power in the cloud for highly-demanding workloads and fast analysis, while keeping full control over data and paying only for the time required.
  • Cloud tiering: offloading cold data to the cloud to take advantage of the low $/GB while maintaining flexibility.
  • Business continuity (BC) and disaster recovery (DR): eliminating the expense of a secondary DR site without sacrificing data protection or infrastructure resiliency.
  • Advanced data management and governance: complying with increasingly demanding regional regulations while serving global customers.
  • The proliferation of edge services: supporting users, applications, and data generators that are pushing and pulling data to and from core and cloud infrastructures.

The report, sponsored by Scality, spells out the benefits of open source object storage.

Object storage with the right orchestration solution can manage huge amounts of data safely and cost-effectively, Signoretti says, making it accessible from everywhere and from every device.
An object storage solution should:

  • Allow data mobility across cloud and on-premises infrastructure to support the use cases described earlier
  • Possess strong security features
  • Provide advanced data management capabilities to enable both architectural and business flexibility.

While the report specifically examines the merits of Scality’s products Zenko and RING8, the characteristics outlined above apply to projects in the open-source object storage panorama such as Ceph, Swift and Minio. For more, you can read the full report here.

Cover image // CC BY NC

The post Report: Open-source object storage “more mainstream than ever” appeared first on Superuser.

by Superuser at July 17, 2019 02:01 PM

Aptira

System Integration

Why is System Integration Important?

Building a solution out of one technology is generally not going to give you the best results. By integrating specific technologies to build a customised solution, you’re able to solve problems in new and innovative ways. In Engineering, System Integration is defined as the process of bringing together the component sub-systems into one system. In our experience, integration extends to more than just technologies – it also involves the integration of many practices, occupations and organisational units into one discipline which previously were quite distinct and separate. It can be challenging, but the results are worth it.

We understand that you are probably looking for integration with your existing billing, monitoring and provisioning systems rather than having yet another system pushed on you. Your day-to-day processes can be streamlined, increasing efficiency and enabling you to focus on what’s actually important for your business.

System Integration Training

In order to help businesses successfully integrate new technologies into their organisations, we've specifically designed a course that covers all the core features of Agile System Integration for Open Networking Projects. This course has been custom designed by our team, who have successfully integrated new technology solutions for some of Australia's largest and best-known brands.

This course will enable attendees to understand the holistic end-to-end scope of complex technical projects and the pressures that these projects place on existing methodologies. Graduates of this course will be able to select the right tools, processes and operating paradigms to manage or participate in these projects and contribute to high levels of success. We’ll cover a range of topics, including:

  • What is Agile System Integration?
  • Why is Agile System Integration necessary?
  • What problems indicate that I need Agile System Integration?
  • Definitions and context
  • Scope of concerns
  • Stakeholder management
  • Dealing with “multi-everything”
  • Managing precision with uncertainty
  • Reconciling different viewpoints, processes and paradigms
  • Practical considerations

The course will be delivered in workshop format as a combination of lectures and other media. Attendees work individually and in groups, and will take part in practical exercises, enabling students to gain real-life experience with System Integration. As we have designed this course from the experiences of our customers, we can fully customise the content to suit your requirements. Generally, this course can be completed in 3-5 days; however, as with all of our training courses, we can tailor it to focus on particular technologies, needs and learning outcomes.

Enabling System Integration

We can work alongside you to provide mentoring and lead your team to achieve your integration goals. Or if you’d rather someone else take care of the hard part for you, our team can develop an integration plan that will drive innovation and increase efficiency for your organisation. We can work with your existing infrastructure teams to get the desired level of integration between your new and existing systems. Each solution is comprehensive and unique to fit your requirements and can easily be integrated into your business workflow, resulting in reduced cost and complexity. Chat with a Solutionaut today for more info.

Become more agile.
Get a tailored solution built just for you.

Find Out More

The post System Integration appeared first on Aptira.

by Jessica Field at July 17, 2019 01:01 PM

July 16, 2019

OpenStack Superuser

Running an OpenStack cloud? Check out the next Operators Meetup

If you run an OpenStack cloud, attending the next Ops Meetup is a great way to trade best practices and share war stories.

Ops Meetups give people who run clouds a place to meet face-to-face, share ideas and give feedback. The vibe is more round table-working group-unconference, with only a small number of presentations. The aim is to gather feedback on frequent issues and work to communicate them across the community, offer a forum to share best practices and architectures and increase constructive, proactive involvement from those running clouds. Ops Meetups are typically held as part of the six-monthly design summit and also once mid-cycle.

This time around, it’ll be held September 3-4 in New York City, hosted by Bloomberg LP at the company’s super-central Park Avenue offices.

You still have time to influence the sessions, so check out the Etherpad. Current session topics include deployment tools, long-term support, RDO and TripleO, Ceph and the always popular tracks dedicated to architecture show-and-tell and war story lightning talks. (Stay tuned for ticket info.)

In the meantime, if you have questions or want to get involved, the Ops Meetup Team holds meetings Tuesdays at 10:00 a.m. EST (UTC -5) and welcomes items added to the meeting agenda on an Etherpad. You’ll also find folks on the #openstack-operators IRC channel or post to the unified OpenStack mailing list using the [ops] tag.

“It’s another way Stackers come together to keep the momentum going as a community, in between Summits, in smaller, focused groups,” says OSF COO Mark Collier.

His experience of attending one of the early ones? “I was really struck by the atmosphere: all focus, no flash…Real life superusers from companies like Comcast, Time Warner Cable, GoDaddy, Yahoo, Sony Playstation, Symantec, Cisco, Workday, IBM, Bluebox, Intel and Paypal made the time to attend and collaborate.”

And, more importantly, the operators came ready to work, to talk about the pain points they’d encountered running thousands of nodes, battles with upgrades and making tough configuration decisions.

For more on what to expect from an Ops Meetup, check out these reports from previous editions held in Manchester and Mexico City.

Photo // CC BY NC

The post Running an OpenStack cloud? Check out the next Operators Meetup appeared first on Superuser.

by Superuser at July 16, 2019 02:02 PM

Aptira

Replacing an OSD in Nautilus

Now that you’ve upgraded Ceph from Luminous to Nautilus, what happens if a disk fails or the administrator needs to convert from filestore to bluestore? The OSD needs to be replaced.

The OSD to be replaced was created by ceph-disk in Luminous. But in Nautilus, things have changed: the ceph-disk command has been removed and replaced by ceph-volume, which by default deploys OSDs on logical volumes. We'll largely follow the official instructions here. In this example, we are going to replace OSD 20.

On the MON, check if the OSD is safe to destroy:


[root@mon-1 ~]# ceph osd safe-to-destroy osd.20
OSD(s) 20 are safe to destroy without reducing data durability.

If it is, destroy it from the MON:


[root@mon-1 ~]# ceph osd destroy 20 --yes-i-really-mean-it
destroyed osd.20

The OSD will be shown as destroyed:


[root@mon-1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 66.17834 root default
-7 22.05945 host compute-1
......
19 hdd 1.83829 osd.19 up 1.00000 1.00000
20 hdd 1.83829 osd.20 destroyed 0 1.00000
22 hdd 1.83829 osd.22 up 1.00000 1.00000

On the OSD node, after replacing the faulty disk, use perccli to create a new VD with the same sdX device name. Then zap it:


[root@compute-3 ~]# ceph-volume lvm zap /dev/sdl
--> Zapping: /dev/sdl
--> --destroy was not specified, but zapping a whole device will remove the partition table
Running command: /usr/sbin/wipefs --all /dev/sdl
Running command: /bin/dd if=/dev/zero of=/dev/sdl bs=1M count=10
stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied
stderr: , 0.0634846 s, 165 MB/s
--> Zapping successful for:

Next, find out the existing db/wal partitions used by the old OSD. Since the ceph-disk command is no longer available, I have written a script to show the current mappings of OSD data and db/wal partitions.

First, run "ceph-volume simple scan" to generate OSD JSON files in /etc/ceph/osd/. Then run this script.


[root@compute-3 ~]# cat ceph-disk-list.sh
#!/bin/bash
# List each OSD's data partition and the real db/wal devices behind it,
# based on the JSON files generated by "ceph-volume simple scan".
JSON_PATH="/etc/ceph/osd/"
for i in "$JSON_PATH"*; do
    OSD_ID=$(jq '.whoami' "$i")
    DATA_PATH=$(jq -r '.data.path' "$i")
    DB_PATH=$(jq -r '."block.db".path' "$i")
    WAL_PATH=$(jq -r '."block.wal".path' "$i")
    echo "OSD.$OSD_ID: $DATA_PATH"
    DB_REAL=$(readlink -f "$DB_PATH")
    WAL_REAL=$(readlink -f "$WAL_PATH")
    echo " db: $DB_REAL"
    echo " wal: $WAL_REAL"
    echo "============================="
done

It will show the mappings between the existing OSDs (created by ceph-disk) and their db/wal partitions.


[root@compute-3 ~]# ./ceph-disk-list.sh
OSD.1: /dev/sdb1
db: /dev/nvme0n1p27
wal: /dev/nvme0n1p28
=============================
OSD.11: /dev/sdg1
db: /dev/nvme0n1p37
wal: /dev/nvme0n1p38
=============================
OSD.13: /dev/sdh1
db: /dev/nvme0n1p35
wal: /dev/nvme0n1p36
=============================
OSD.14: /dev/sdi1
db: /dev/nvme0n1p33
wal: /dev/nvme0n1p34
=============================
OSD.18: /dev/sdj1
db: /dev/nvme0n1p51
wal: /dev/nvme0n1p52
=============================
OSD.22: /dev/sdm1
db: /dev/nvme0n1p29
wal: /dev/nvme0n1p30
=============================
OSD.3: /dev/sdc1
db: /dev/nvme0n1p45
wal: /dev/nvme0n1p46
=============================
OSD.5: /dev/sdd1
db: /dev/nvme0n1p43
wal: /dev/nvme0n1p44
=============================
OSD.7: /dev/sde1
db: /dev/nvme0n1p41
wal: /dev/nvme0n1p42
=============================
OSD.9: /dev/sdf1
db: /dev/nvme0n1p39
wal: /dev/nvme0n1p40
=============================

Compare this list with the output of lsblk to identify free db/wal devices, then create a new OSD with them:


[root@compute-3 ~]# ceph-volume lvm create --osd-id 20 --data /dev/sdl --bluestore --block.db /dev/nvme0n1p49 --block.wal /dev/nvme0n1p50
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new e795fd7b-df8d-48d7-99d5-625f41869e7a 20
Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e /dev/sdl
stdout: Physical volume "/dev/sdl" successfully created.
stdout: Volume group "ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e" successfully created
Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e
stdout: Logical volume "osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-20
Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-20
Running command: /bin/chown -h ceph:ceph /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a
Running command: /bin/chown -R ceph:ceph /dev/dm-2
Running command: /bin/ln -s /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a /var/lib/ceph/osd/ceph-20/block
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-20/activate.monmap
stderr: got monmap epoch 9
Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-20/keyring --create-keyring --name osd.20 --add-key AQD38iNdxf89GRAAO6HbRFcgCj6HSuyOsJRGeA==
stdout: creating /var/lib/ceph/osd/ceph-20/keyring
added entity osd.20 auth(key=AQD38iNdxf89GRAAO6HbRFcgCj6HSuyOsJRGeA==)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20/keyring
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20/
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p50
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p49
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 20 --monmap /var/lib/ceph/osd/ceph-20/activate.monmap --keyfile - --bluestore-block-wal-path /dev/nvme0n1p50 --bluestore-block-db-path /dev/nvme0n1p49 --osd-data /var/lib/ceph/osd/ceph-20/ --osd-uuid e795fd7b-df8d-48d7-99d5-625f41869e7a --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: /dev/sdl
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20
Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a --path /var/lib/ceph/osd/ceph-20 --no-mon-config
Running command: /bin/ln -snf /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a /var/lib/ceph/osd/ceph-20/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-20/block
Running command: /bin/chown -R ceph:ceph /dev/dm-2
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20
Running command: /bin/ln -snf /dev/nvme0n1p49 /var/lib/ceph/osd/ceph-20/block.db
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p49
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-20/block.db
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p49
Running command: /bin/ln -snf /dev/nvme0n1p50 /var/lib/ceph/osd/ceph-20/block.wal
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p50
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-20/block.wal
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p50
Running command: /bin/systemctl enable ceph-volume@lvm-20-e795fd7b-df8d-48d7-99d5-625f41869e7a
stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-20-e795fd7b-df8d-48d7-99d5-625f41869e7a.service to /usr/lib/systemd/system/ceph-volume@.service.
Running command: /bin/systemctl enable --runtime ceph-osd@20
Running command: /bin/systemctl start ceph-osd@20
--> ceph-volume lvm activate successful for osd ID: 20
--> ceph-volume lvm create successful for: /dev/sdl

The new OSD will be started automatically, and backfill will start.

For further information, check out the official Ceph documentation to replace an OSD. If you’d like to learn more, we have Ceph training available, or ask our Solutionauts for some help.

The post Replacing an OSD in Nautilus appeared first on Aptira.

by Shunde Zhang at July 16, 2019 05:12 AM

July 15, 2019

OpenStack Superuser

How to run a simple function with Qinling

In my previous post on Qinling, I showed how to get started with OpenStack’s function-as-a-service project. Here, I’ll explain how to use it with a very simple function and show what Qinling does behind the scenes from a high-level perspective.

High level OpenStack Qinling workflow (thanks draw.io)

The diagram above offers a quick look at what happens when a function is executed. First of all, the execution can be triggered by multiple clients such as GitHub, Prometheus, Mistral, your dear colleague, etc. But keep in mind that there aren't really any limits on what can act as a trigger — a lollipop, your dog, E.T.

The sidecar container deployed within the pod is only used to download the code from the Qinling API; the /download endpoint is called by the runtime container via http://localhost:9091/download. You'll find the runtime here.

A very simple function

Let’s use a very simple function to demonstrate Qinling basics. The function is written in Python, so you’ll need a Python runtime. (Stay tuned: runtimes will be the subject of an upcoming post.)

def main(**kwargs):
    print("Hello Qinling \o/")

Here we have a function named main() that prints "Hello Qinling \o/". To interact with the Qinling API, a client is required; it could be python-qinlingclient, httpie or curl. I'm going with the easiest option, the official client, which I installed with pip.

$ openstack function create --name func-hq-1 \
--runtime python3 --file hello_qinling.py --entry hello_qinling.main

The command is simple — I asked Qinling to create a function. hello_qinling.py is a simple file (not a package), and I used the python3 runtime to execute my function. Last but not least, the entry point tells Qinling how to enter the function, which in the example is hello_qinling.main (the file name and the function to execute in that file).

Get the function ID returned by the command.

Run, Qinling, run… and show me the output!

When a function is executed it should return a value, depending on how it has been coded, of course. Qinling can provide two different outputs/results:

  • return: Terminates and returns a value from a function
  • print: Displays a value to the standard output/console
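These two behaviors can be illustrated with a pair of minimal functions (a sketch; the function names here are made up):

```python
def with_return(**kwargs):
    # The returned value shows up in the "output" field of the execution result.
    return "Hello Qinling \\o/"

def with_print(**kwargs):
    # Printed text goes to standard output, i.e. the execution log.
    print("Hello Qinling \\o/")
```

The first function's string appears under "output" in the result; the second one's message only appears in the execution log, as shown below.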

Let’s execute the function using the ID returned above and look at the result line.

$ openstack function execution create 484c45df-6637-4349-86cc-ecaa6041352e | grep result
| result           | {"duration": 0.129, "output": null}  |

By default, Qinling will display the return value under the output JSON key (yes, the result field is JSON formatted). Of course, if there is no return (as in our sample function), then the output value will be null. But then where will the print defined in our function be displayed?

Qinling provides a way to show messages printed during the function execution. The function execution returned an ID, which is required to display the function log.

$ openstack function execution log show 4a3cc6ae-3353-4d90-bae5-3f4bf89d4ae9
Start execution: 4a3cc6ae-3353-4d90-bae5-3f4bf89d4ae9
Hello Qinling \o/
Finished execution: 4a3cc6ae-3353-4d90-bae5-3f4bf89d4ae9

As expected, "Hello Qinling \o/" has been printed.

Your turn

Now that all the tools have been provided, let’s try a little test. What will be the result of running this function?

def main(**kwargs):
    msg = "Hello Qinling \o/"
    return msg

Just post the answer/output in the comments section and let's see if you got the concept; if not, then I failed!

About the author

Gaëtan Trellu is a technical operations manager at Ormuco. This post first appeared on Medium.

Superuser is always interested in open infra community topics, get in touch at editorATopenstack.org

Photo // CC BY NC

The post How to run a simple function with Qinling appeared first on Superuser.

by Gaëtan Trellu at July 15, 2019 02:02 PM

Aptira

Swinburne Ceph Upgrade and Improvement

Swinburne is a large and culturally diverse organisation. A desire to innovate and bring about positive change motivates their students and staff. The result is an institution that grows and evolves each year.


The Challenge

Aptira deployed SUSE Enterprise Storage 4 for Swinburne University a year ago. When SUSE Enterprise Storage 5 was released, Swinburne wanted to take advantage of its new features, like CephFS and BlueStore, so they planned to upgrade to this latest version. The Ceph Filesystem (CephFS) is a POSIX-compliant filesystem that uses a Ceph Storage Cluster to store its data.

BlueStore is a new back-end object store for the OSD daemons. The original object store, FileStore, requires a file system on top of raw block devices, with objects then written to the file system. BlueStore does not require a file system; it stores objects directly on the block device. Thus BlueStore provides a high-performance backend for OSD daemons in a production environment.

As a SUSE partner, we were called in to help them upgrade their existing Ceph storage system and expand it with more storage nodes.

Moreover, Swinburne was concerned about an emerging problem with their NetApp instance, which was soon to be out of warranty and therefore needed to be decommissioned, with its data migrated from NetApp to Ceph.


The Aptira Solution

Aptira’s solution was to upgrade and expand the storage cluster to Storage 5 using DeepSea, the tool originally used to deploy their Storage 4 cluster. DeepSea is a tool developed by SUSE, based on SaltStack, for deploying, managing and automating Ceph.

We set up an environment on our own lab equipment to simulate their environment and test the full upgrade process. This included installing the SUSE storage environment and corresponding services, like SUSE Manager and a local SMT server, on Aptira’s own infrastructure (“cloud-q”). On this test environment the full upgrade process was tested and proven to work.

For Swinburne’s production system upgrade, the base OS of each node was upgraded to SLES SP3 and SUSE Enterprise Storage 4.0 was upgraded to version 5.5. At the same time we also expanded both Ceph clusters: nine additional OSD nodes were added to each cluster, bringing the total storage of each cluster to 3 PB.

In addition, Aptira deployed NFS and Samba services on top of CephFS, in order to provide a seamless transition from Netapp to Ceph.

Samba is also integrated with Swinburne’s Windows Active Directory (AD), so users can easily access Ceph storage with their own Windows credentials. Since DeepSea does not allow fully customised installation for Samba and NFS, Aptira have written Ansible playbooks to install Samba and NFS.


The Result

Swinburne’s storage was successfully upgraded to SUSE Storage 5 and their node expansion has been successfully completed as well. Samba and NFS are running as gateways to CephFS.

Performance tests show that they are providing good performance, and the users are satisfied. An as-built document was written and handed to Swinburne to conclude the project.


Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Swinburne Ceph Upgrade and Improvement appeared first on Aptira.

by Aptira at July 15, 2019 01:14 PM

Stackmasters team

Mastering OpenStack as a Stackmasters intern

I was looking for an opportunity to gather practical knowledge of OpenStack and Ansible. As a final-year student and a member of the CONSERT lab at the University of West Attica, I had already had a first taste of cloud technology. And since Stackmasters is heavily involved in cloud management and OpenStack in particular, I chose to apply to them for my internship. A Stackmasters intern, then!

Mastering OpenStack as a Stackmasters intern

Joining the team as a Stackmasters intern, I anticipated a typical environment like most companies offer in the Greek labor market. But I was quickly proven wrong. Stackmasters, being part of the Starttech Ventures portfolio, is way different. The friendly atmosphere makes you feel at home from day one, and the open mindset of the people I had the chance to meet and work with makes for a unique working environment.

Experimenting with the Cloud

Starting with Ansible

My major goal for my three-month stay as a Stackmasters intern was to deepen my knowledge of cloud technology!
At first, I started working with Ansible. Having some experience beforehand, I realized really fast that there is much more to get from such a tool. Ansible is a simple, agent-less IT automation tool that helps you automate common or even more complex jobs. With Ansible you are able to automate all those tasks in a computer environment, like provisioning, deployment or even changing the behavior of services and resources. In Ansible’s absence you have to tackle such tasks manually, or with a bunch of scripts, along with whatever consequences in terms of delays, maintenance and human mistakes this method brings about.

My first project was to understand Ansible’s best practices for structuring tasks, so that I would get the hang of it. And that’s exactly what I did. Following my mentor Thanassis’ pertinent directions, I started with the basics. And I think I can proudly say that I finally managed to create a playbook that automatically handles the installation and configuration of an Apache server.
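For illustration, a playbook along those lines might look like the sketch below (my own reconstruction, assuming Debian/Ubuntu hosts, an inventory group called webservers and a site.conf.j2 template; it is not the exact playbook from the internship):

```yaml
# Hypothetical sketch: install and configure Apache with Ansible.
- name: Install and configure Apache
  hosts: webservers
  become: true
  tasks:
    - name: Install the Apache package
      ansible.builtin.apt:
        name: apache2
        state: present

    - name: Deploy the site configuration
      ansible.builtin.template:
        src: site.conf.j2
        dest: /etc/apache2/sites-available/000-default.conf
      notify: Restart Apache

  handlers:
    - name: Restart Apache
      ansible.builtin.service:
        name: apache2
        state: restarted
```

The handler pattern means Apache is only restarted when the configuration actually changes, which is one of the Ansible best practices the project was about.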

OpenStack Services’ turn

While setting up an Apache server felt like an achievement in the first few days, I was soon thirsty for a bigger challenge!
And so, after completing my first project, I moved on to OpenStack. I was keen to gain expertise in OpenStack deployments and the management of such environments. The first step in this journey was to study the OpenStack documentation guide and manually deploy a small lab, with its core services running on two virtual machines.
And the learning goal? Well, to understand the architecture of a cloud with OpenStack services — Keystone, Glance, Nova, Neutron, Horizon, Swift — as its components, and to comprehend how each of those services interconnects and contributes to the features it provides.

OpenStack-Ansible

I have to admit, I got frustrated with the complexity of such a system, and I was pretty unsure whether I could see the project through successfully.
The team at Stackmasters helped me understand the OpenStack architecture in great detail, and what options I had for moving forward with my project. I got acquainted with a few OpenStack projects developed by the community to ease the pain of management tasks such as deployment. Kolla and OpenStack-Ansible (aka OSA) were the next things I checked, and it felt natural to opt for OSA.

Preparing and applying an OpenStack deployment then became easier using OSA. As a next step, I practiced upgrades on the existing OpenStack installation.
Mission accomplished! I had gained a good understanding of how things are run when it comes to OpenStack!

Wrapping up my internship on Cloud Management, as a Stackmasters intern

I have to say, I have had the chance to gain great experience with OpenStack. Most of all, I learned from fantastic professionals at Stackmasters, and I met interesting people at Starttech Ventures.

Thanassis Parathyras, CTO at Stackmasters, helped me get off to a smooth start and guided me so that I could gradually delve into the concepts of the OpenStack technology and community. The Stackmasters team was very helpful throughout those three months.

The whole experience was definitely of benefit to me, not only at a professional level but also at a more personal one. For these reasons, I am confident that the skills I’ve gained as a Stackmasters intern will contribute greatly to my future career development.

As a final point, what I’d definitely recommend to young IT and Computer Science graduates is this:

Grab internship opportunities, get hands-on experience and explore how brilliant the Greek startup ecosystem is.

Mastering OpenStack as a Stackmasters intern was last modified: July 15th, 2019 by Nikos Kaftatzis

The post Mastering OpenStack as a Stackmasters intern appeared first on Stackmasters.

by Nikos Kaftatzis at July 15, 2019 09:46 AM

July 12, 2019

OpenStack Superuser

The ABCs of open-source license compliance

With open source software ubiquitous and irreplaceable, setting up a license compliance and procurement strategy for your business is indispensable. No software engineer I know wants to voluntarily talk about open source compliance, but avoiding those conversations can lead to a lot of pain. Remember the GPL-violation litigation involving D-Link, TomTom and many others in the early 2000s?

It’s better to keep open-source license compliance in mind from the early stages of development when creating a product: you want to know where all its parts come from and whether they’re any good. Nobody thinks they will be asked for the bill of materials for their software product until they are.

“Open source compliance is the process by which users, integrators and developers of open source software observe copyright notices and satisfy license obligations for their open source software components” — The Linux Foundation

Objectives for open source software (OSS) compliance in companies:

  • Protect proprietary IP
  • Facilitate the effective use of open source software
  • Comply with open source licensing
  • Comply with the third-party software supplier/customer obligations

What’s a software license, anyway?

Put very simply, a software license is a document that states what users are permitted to do with a piece of software. Open source software (OSS) licenses are licenses that the Open Source Initiative (OSI) has reviewed and approved as respecting the Open Source Definition. There are approximately 80 open source licenses (OSI maintains a list, and so does the Free Software Foundation, although it calls them "free software" licenses), split between two larger families:

  • So-called "copyleft" licenses (GPLv2 and GPLv3), designed to guarantee users long-term freedoms and make it harder to lock the code into proprietary/non-free applications. The most important clause in these is that if you modify software under a copyleft license, you have to share the modifications under a compatible license.
  • Permissive/BSD-like open source licenses, which guarantee the freedom to use, modify and redistribute the source code, including as part of a proprietary product (for example, MIT and Apache).

Despite the variety of existing licenses, companies sometimes invent new ones, modify them with new clauses and apply them to their products. This creates even more confusion among engineers. If your company is looking to use open source software, tracking and complying with every open source license and hybrid can be a nightmare.

Establish an open-source license compliance policy

The goal is to have a full inventory of all the open source components in use and their dependencies. It should be clear that there are no conflicts between licenses, that all clauses are met and that the necessary attributions to the authors are made.

Whether you have an open source project using other open source components, or a proprietary project using open source components, it is important to establish a clear policy regarding OSS compliance. You want a solid, repeatable policy outlining which licenses are acceptable for your specific project.

Ways to execute OSS compliance

Manual

A surprising number of companies are still using this approach. Basically, you create a spreadsheet, manually fill it out with components, versions and licenses, and analyze it against your policy.

This works out well for smaller projects if they’ve established a compliance policy (a list of licenses or clauses acceptable in the company) from the beginning to spare themselves trouble in the future. In this scenario, every developer must review and log a component’s license before introducing it.

The downside of this approach is that as the number of OSS components in the project grows, it becomes more difficult to keep track of the relationships between licenses (whether they all work together or there are conflicts). It is vital to list dependencies as well, as a dependency might have a different license than the actual library you are using.
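The spreadsheet approach boils down to checking a component inventory against an allow-list. A minimal sketch in Python (the component names and policy here are made up):

```python
# Sketch of a manual-style compliance check (hypothetical inventory and policy).
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}

inventory = [
    {"name": "libfoo", "version": "1.2.0", "license": "MIT"},
    {"name": "libbar", "version": "0.9.1", "license": "GPL-3.0-only"},
]

# Components whose license falls outside the policy need review before use.
violations = [c["name"] for c in inventory
              if c["license"] not in ALLOWED_LICENSES]
print(violations)  # → ['libbar']
```

Real inventories also need the dependency tree, since, as noted above, a dependency can carry a different license than the library itself.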

Semi-automated

This is a more reliable approach and is becoming more popular as the importance of open source compliance practices grows, along with the risks of ignoring them. There are so many tools available that the prospect of automating can feel overwhelming. Why semi-automated? Because there are always false positives when the license is not explicitly referenced in the header, and you still have to read through some of them to discover special terms or conditions.

Of the tools I’ve seen, there are four main approaches:

  1. File scanners – use all sorts of heuristics to detect licenses or components that would otherwise be missed by developers. These tools typically offer different formats for the output.
  2. Code scanners – exactly what it sounds like. You can run them periodically to check for new open-source components.
  3. Continuous integration (CI) scanners – these tools work with continuous integration or build tools, automatically detecting all open-source components in the code every time you run a build. The idea is to create a unique identifier for each open-source component in the build and reference it against a database of existing components. You can also set policies to break the build if a blacklisted license is found.
  4. Component identification tools – these tools can help you produce a software bill-of-material (SBOM), the list of OSS components in your product.

A good place to start? The tools highlighted by the Open Compliance Program, a Linux Foundation initiative.

Conclusions

For smaller projects, fully manual tracking might be sufficient to achieve license compliance. For more complex projects, especially ones built in an agile style with regular releases, automation is better. Whichever way you choose to handle OSS compliance, don’t ignore it, for the sake of your project and of sustaining the open-source community.

Dasha Gurova is the technical community manager at Scality, where a version of this post first appeared in the forum.

Superuser is always interested in open infrastructure community content, get in touch: editorATopenstack.org

Photo // CC BY NC

The post The ABCs of open-source license compliance appeared first on Superuser.

by Superuser at July 12, 2019 02:06 PM

Aptira

Upgrading Ceph from Luminous to Nautilus

Ceph Nautilus was released earlier in the year and it has many new features.

In CentOS, the Ceph Nautilus packages depend on CentOS 7.6 packages. Since we are using local YUM repo mirrors, we needed to download all CentOS 7.6 RPMs and Ceph Nautilus RPMs to our local YUM repo servers, and then update yum configs in all Ceph nodes to CentOS 7.6 and Ceph Nautilus.

Next we just followed the instructions as per the official Ceph documentation. The process is as follows:

Upgrade Ceph MONs:

  • yum update ceph mon packages
  • restart ceph mon service one by one

Upgrade Ceph Mgrs:

  • yum update ceph mgr packages
  • restart ceph mgr service one by one

Upgrade Ceph OSDs:

  • yum update ceph osd packages
  • restart ceph osd service one by one
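The "update packages, then restart one by one" pattern used for the MON, Mgr and OSD steps can be sketched as a loop. Host, package and unit names below are hypothetical and will vary by distribution; `RUN=echo` makes this a dry run that only prints the commands:

```shell
# Dry-run sketch of a rolling Ceph daemon upgrade (hypothetical hosts).
# RUN=echo prints each command; set RUN="" to actually execute over ssh.
RUN="echo"

rolling_upgrade() {
    local daemon="$1"; shift    # e.g. mon, mgr, osd
    local host
    for host in "$@"; do
        $RUN ssh "$host" "yum -y update ceph-$daemon"
        $RUN ssh "$host" "systemctl restart ceph-$daemon.target"
        # In a real run, wait for HEALTH_OK here before the next node.
    done
}

rolling_upgrade mon mon1 mon2 mon3
```

The same function covers the Mgr and OSD steps (`rolling_upgrade mgr ...`, `rolling_upgrade osd ...`); only the MDS step below needs a different ordering.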

Upgrade Ceph MDSs:

  • reduce the number of active MDS ranks by setting max_mds to 1
  • stop all standby mds services
  • restart the remaining active mds service
  • start all other mds services
  • restore original value of max_mds
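The MDS ordering above can be sketched the same way as a dry run. The filesystem name cephfs, the hosts mds1..mds3 and the original max_mds value of 2 are all assumptions to adapt:

```shell
# Dry-run sketch of the MDS step ordering (hypothetical names throughout);
# RUN=echo prints the commands instead of running them.
RUN="echo"
ORIG_MAX_MDS=2                                         # assumed original value

mds_upgrade() {
    $RUN ceph fs set cephfs max_mds 1                  # 1. single active rank
    $RUN ssh mds2 "systemctl stop ceph-mds.target"     # 2. stop the standbys
    $RUN ssh mds3 "systemctl stop ceph-mds.target"
    $RUN ssh mds1 "systemctl restart ceph-mds.target"  # 3. restart active MDS
    $RUN ssh mds2 "systemctl start ceph-mds.target"    # 4. start the others
    $RUN ssh mds3 "systemctl start ceph-mds.target"
    $RUN ceph fs set cephfs max_mds $ORIG_MAX_MDS      # 5. restore max_mds
}

mds_upgrade
```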

Upgrade Ceph RADOSGW:

  • update and restart radosgw

Update CRUSH buckets:

  • switch any existing CRUSH buckets to straw2


ceph osd getcrushmap -o backup-crushmap
ceph osd crush set-all-straw-buckets-to-straw2

Enable V2 Network Protocol:

  • enable v2 network protocol using “ceph mon enable-msgr2”

Configure the Ceph Dashboard:

Ceph’s dashboard has been changed to enable SSL by default, so it will not work without certificates. You will need to either create certificates, or disable SSL. Since our Ceph is running in an internal network, we disabled SSL using the following command:


ceph config set mgr mgr/dashboard/ssl false

For more details about creating certificates, see the dashboard documentation.

We also enabled the Prometheus plugin in Ceph Mgr to collect metrics. In order to enable the plugin, simply run this command:


ceph mgr module enable prometheus

Then configure the scrape targets and some rules in Prometheus to scrape data from Ceph. More details can be found here.
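Once the module is enabled, ceph-mgr serves metrics over HTTP; a minimal scrape job might look like the following fragment (the host name is hypothetical and 9283 is the module's default port):

```yaml
# prometheus.yml fragment (hypothetical mgr host; 9283 is the default port
# of the ceph-mgr prometheus module)
scrape_configs:
  - job_name: ceph
    static_configs:
      - targets: ['ceph-mgr1:9283']
```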

After the scraping is working, you can use Grafana to visualise the metrics. Grafana Dashboards can be installed from RPMs or downloaded from its github repository.

For more information, check out the official Ceph documentation, as it describes how to upgrade from Luminous in great detail. If you’d like to learn more, we have Ceph training available, or ask our Solutionauts for some help.

Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Upgrading Ceph from Luminous to Nautilus appeared first on Aptira.

by Shunde Zhang at July 12, 2019 01:50 PM

Chris Dent

Eight Hour Day Update

Since Denver in early May I've been running a timer to limit my work day to eight hours. I've stuck to it pretty well, long enough that I have a few observations.

Some of the expected positive outcomes are there: I have more time to attend to non-work tasks like feeding myself, getting a bit more exercise, and making plans and actions for things around my home.

But there are some negatives which suggest further effort is required.

The one that is on my mind today is that with only eight hours of continuous work in a day, it's been difficult to get anything of substance done. Not that I'm getting nothing done. Rather, the time I have available is mostly consumed by reputedly urgent requests—that are initially small but turn out not to be—from co-workers both internal to $EMPLOYER and in the OpenStack community.

In the past I would attend to these requests, clear them off the plate, and then do what I felt to be "the real work" (which could be defined, vaguely, as "improving OpenStack for the long term").

Now that I have a time limit, I rarely get to the point where I have a clean plate. If I do there's not enough time to gain the focus and flow required to do "the real work". As a result, my day to day satisfaction is poor.

I can think of two strategies for resolving this, but both will be difficult to integrate with social mores in my daily environments:

  • Reserve entire days without attention to IRC, Slack and perhaps even email. That is, avoid interruption by being unreachable. This will be hard to do. The majority of the population in these environments is addicted to 24 hour synchronous communication and expects the same from others. People get used to it in some kind of bizarre form of Stockholm Syndrome and synchronous becomes the only reliable way to reach them.

    24 hours, seven days a week contact is not how work is supposed to work. If you are a member of a team and you work this way, you are effectively encouraging, and in some cases requiring, other members of your team to work the same way.

  • Enforce a queuing mechanism. Don't let other people turn me into a stack on which they can put themselves.

These are both hard because there is always a perceived urgency. Sometimes real, sometimes not. Denying or resisting that is easily perceived as rude or unhelpful.

One of the reasons I write the placement updates is to make it clear there is a queue of placement-related work for people who either cannot or do not want to be a part of the synchronous flow of information.

I need more tools like that.

I'm not going to go back to greater than eight hour days. Until I find some better ways to manage tasks and inputs my apologies (to me and to you) for not getting the good stuff done.

by Chris Dent at July 12, 2019 12:00 PM

Placement Update 19-27

Pupdate 19-27 is here and now.

Most Important

Of the features we planned to do this cycle, all are done save one: consumer types (in progress, see below). This means we have a good opportunity to focus on documentation, performance, and improving the codebase for maintainability. You do not need permission to work on these things. If you find a problem and know how to fix it, fix it. If you are not sure about the solution, please discuss it on this email list or in the #openstack-placement IRC channel.

This also means we're in a good position to help review changes that use placement in other projects.

The Foundation needs to know how much, if any, Placement time will be needed in Shanghai. I started a thread and an etherpad.

What's Changed

  • The same_subtree query parameter has merged as microversion 1.36. This enables a form of nested provider affinity: "these providers must all share the same ancestor".

  • The placement projects have been updated (pending merge) to match the Python 3 test runtimes for Train community wide goal. Since the projects have been Python3-enabled from the start, this was mostly a matter of aligning configurations with community-wide norms. When Python 3.8 becomes available we should get on that sooner than later to catch issues early.
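As a rough, hedged sketch of the same_subtree feature mentioned above (the _COMPUTE and _ACCEL suffixes are invented for illustration), a granular request asking that two request groups be satisfied under a common ancestor provider looks something like:

```
GET /allocation_candidates?resources_COMPUTE=VCPU:1&resources_ACCEL=FPGA:1&same_subtree=_COMPUTE,_ACCEL
```

See the microversion 1.36 documentation for the authoritative syntax.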

Specs/Features

All placement specs have merged. Thanks to everyone for the frequent reviews and quick followups.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (0) stories in the placement group. 0 (0) are untagged. 3 (1) are bugs. 5 (0) are cleanups. 11 (0) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 11 microversions.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As mentioned last week, one of the important cleanup tasks that is not yet in progress is updating the gabbit that creates the nested topology that's used in nested performance testing. The topology there is simple, unrealistic, and doesn't sufficiently exercise the several features that may be used during a query that desires a nested response. This needs to be done by someone more closely connected to real-world use of nested providers than I am. efried? gibi?

Another cleanup that needs to start is satisfying the community wide goal of PDF doc generation. Anyone know if there is a cookbook for this?

Other Placement

Miscellaneous changes can be found in the usual place.

There are three os-traits changes being discussed. And one os-resource-classes change.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

A colleague suggested yesterday that the universe doesn't have an over subscription problem, rather there's localized contention, and what we really have is a placement problem.

by Chris Dent at July 12, 2019 09:56 AM

July 11, 2019

OpenStack Superuser

How to test your developer workflow with TripleO

In this post we’ll see how to use TripleO for developing and testing changes in OpenStack Python-based projects.

Even though Devstack remains a popular tool, it’s not the only one that can handle your development workflow.

TripleO wasn’t just built for real-world deployments but also for developers working on OpenStack related projects, like Keystone for example.

Let’s say the Keystone directory where I’m writing code is in /home/emilien/git/openstack/keystone.

Now I want to deploy TripleO with that change and my code in Keystone. For that I’ll need a server (or a virtual machine) with at least 8 GB of RAM, 4 vCPUs and 50 GB of disk, running CentOS 7 or Fedora 28.

First, prepare the repositories and install python-tripleoclient:

sudo yum install -y git python-setuptools
git clone https://github.com/openstack/tripleo-repos
cd tripleo-repos
python setup.py install
tripleo-repos current # use -b if you're deploying a stable version
sudo yum install python-tripleoclient

If you’re deploying on recent Fedora or RHEL8, you’ll also need to install python3-tripleoclient.

Now, let’s prepare your environment and deploy TripleO:



# Change the IP you have on your host
export IP=192.168.33.20
export NETMASK=24
export INTERFACE=eth1

# cleanup
rm -f $HOME/containers-prepare-parameters.yaml $HOME/standalone_parameters.yaml

cat <<EOF > $HOME/containers-prepare-parameters.yaml
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      name_prefix: centos-binary-
      name_suffix: ''
      namespace: docker.io/tripleomaster
      neutron_driver: ovn
      tag: current-tripleo
    tag_from_label: rdo_version
  - push_destination: true
    includes:
    - keystone
    modify_role: tripleo-modify-image
    modify_append_tag: "-devel"
    modify_vars:
      tasks_from: dev_install.yml
      source_image: docker.io/tripleomaster/centos-binary-keystone:current-tripleo
      python_dir:
        - /home/emilien/git/openstack/keystone
EOF

cat <<EOF > $HOME/standalone_parameters.yaml
parameter_defaults:
  CloudName: $IP
  ControlPlaneStaticRoutes: []
  Debug: true
  DeploymentUser: $USER
  DnsServers:
    - 1.1.1.1
    - 8.8.8.8
  DockerInsecureRegistryAddress:
    - $IP:8787
  NeutronPublicInterface: $INTERFACE
  # domain name used by the host
  NeutronDnsDomain: localdomain
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: $HOME
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  NovaComputeLibvirtType: qemu
EOF

sudo openstack tripleo deploy \
  --templates \
  --local-ip=$IP/$NETMASK \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  --output-dir $HOME \
  --standalone

Note: change the YAML for your own needs if needed. If you need more help on how to configure Standalone, please check out the official manual.

Now, let’s say your code needs a change and you need to retest it. Once you’ve modified your code, just run:

sudo buildah copy keystone /tmp/keystone /tmp/keystone
sudo podman exec -it -u root -w /tmp/keystone keystone python setup.py install
sudo systemctl restart tripleo_keystone 

At this stage, if you need to test a review that’s already pushed in Gerrit and you want to run a fresh deployment with it, here’s how you do it:

# Change the IP on your host
export IP=192.168.33.20
export NETMASK=24
export INTERFACE=eth1
# cleanup
rm -f $HOME/containers-prepare-parameters.yaml $HOME/standalone_parameters.yaml

cat <<EOF > $HOME/containers-prepare-parameters.yaml
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      name_prefix: centos-binary-
      name_suffix: ''
      namespace: docker.io/tripleomaster
      neutron_driver: ovn
      tag: current-tripleo
    tag_from_label: rdo_version
  - push_destination: true
    includes:
    - keystone
    modify_role: tripleo-modify-image
    modify_append_tag: "-devel"
    modify_vars:
      tasks_from: dev_install.yml
      source_image: docker.io/tripleomaster/centos-binary-keystone:current-tripleo
      refspecs:
        -
          project: keystone
          refspec: refs/changes/46/664746/3
EOF

cat <<EOF > $HOME/standalone_parameters.yaml
parameter_defaults:
  CloudName: $IP
  ControlPlaneStaticRoutes: []
  Debug: true
  DeploymentUser: $USER
  DnsServers:
    - 1.1.1.1
    - 8.8.8.8
  DockerInsecureRegistryAddress:
    - $IP:8787
  NeutronPublicInterface: $INTERFACE
  # domain name used by the host
  NeutronDnsDomain: localdomain
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: $HOME
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  NovaComputeLibvirtType: qemu
EOF

sudo openstack tripleo deploy \
  --templates \
  --local-ip=$IP/$NETMASK \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  --output-dir $HOME \
  --standalone

I hope these tips help you understand how to test any OpenStack Python-based project in a painless way — and pretty quickly. On my environment, the whole deployment takes less than 20 minutes.

About the author

Emilien Macchi, software engineer at Red Hat, describes himself as a French guy hiding somewhere in Canada. This post was first published on his blog.

Superuser is always interested in community content, get in touch: editorATopenstack.org

Photo // CC BY NC

The post How to test your developer workflow with TripleO appeared first on Superuser.

by Emilien Macchi at July 11, 2019 02:05 PM

Aptira

Lifecycle and Operational Management Orchestration

An international Telecommunications provider requires an Orchestration solution to perform a range of lifecycle and operational management functions.


The Challenge

One of our overseas customers has requested help with an orchestration solution, and had quite a list of requirements that this solution needed to address, including:

  • Deploy and configure Juniper vSRX as a vCPE (Virtual Customer Premises Equipment) on the customer's OpenStack cloud
  • Create a Multiprotocol Label Switching Layer 2 Virtual Private Network (MPLS L2VPN) between the vCPE and routers in the data center
  • Provision bandwidth-on-demand for the L2VPN tunnel, so the required bandwidth can be updated with zero downtime
  • Monitor the vCPE's operational performance
  • Auto-heal the vCPE in case of failure
  • Autoscale up/down depending on the load

In addition to their long list of requirements, being located overseas meant that we were required to operate remotely and across multiple time zones.


The Aptira Solution

In order to meet these specific requirements, Aptira proposed a solution based on the Cloudify Service Orchestrator product which has capabilities to provision both Virtual Network Functions (VNFs) and Physical Network Functions (PNFs).

Using Cloudify, we were able to:

  • Develop TOSCA templates to:
    • Provision the vCPE (Juniper vSRX) on customer OpenStack Cloud
    • Configure the Cisco switches and routers
    • Create a L2VPN tunnel between the on-premise Cisco routers and vCPE (Juniper vSRX) present on the customers OpenStack Cloud
  • Develop a custom Cloudify workflow to provision bandwidth on demand for MPLS L2vPN
  • Implement TOSCA templates for auto-scaling and auto-healing vCPE

As the client did not have their own lab for us to test this solution on, and given the restrictions of operating remotely, we developed the solution in our own internal lab.


The Result

Having implemented the fully-configured Cloudify solution to meet the requirements above, the Telecommunications provider is now able to successfully provision vCPE (Juniper vSRX) on their OpenStack Cloud, as well as provision the PNF (Cisco router and switch) configuration in their data center. In addition to this, a L2vPN tunnel has been created between the vCPE and PNF.

We have tested the L2VPN in both Port mode and VLAN mode, and also tested multiple scenarios of a production environment, including bandwidth on demand, auto scale up/down and auto healing.


How can we make OpenStack work for you?
Find out what else we can do with OpenStack.

Find Out Here

The post Lifecycle and Operational Management Orchestration appeared first on Aptira.

by Aptira at July 11, 2019 01:24 PM

July 10, 2019

Mirantis

C’mon! OpenStack ain’t that tough

OpenStack is still viewed as difficult to install and administer, but it's not that tough -- especially after you’ve taken OpenStack training and hands-on lab exercises.

by Paul Quigley at July 10, 2019 02:42 PM

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

Spotlight: Upstream investment opportunities launch with Glance

It takes a global village to develop the OpenStack open infrastructure platform. From time to time, the community identifies activities where additional volunteers could make a substantial impact. OpenStack recently started to revamp the help wanted list with a new process to better underscore the investment opportunities available upstream.

The first request for more brainpower comes from the Glance team. As OpenStack’s disk image management service, Glance provides a crucial component for a vast majority of deployments. Its source code is maintained by a small but dedicated team looking to expand their ranks to take on some additional challenges in upcoming development cycles.

Collaboration with a focused team like Glance can be a rewarding experience even for seasoned developers and provides a platform for newcomers to grow professionally due to its central nature and interdependence with other services. Working on a project like this also enhances an organization’s understanding of OpenStack (both the software and the people who come together to produce it) and can be instrumental in improving its own efficacy in the broader ecosystem.

Assistance is especially appreciated with code review, bug triage and fixes, development of new features, and bringing the software in line with emerging standards. If you or your employer want to help with Glance, please see the Glance contributors upstream investment opportunity for details on how to get involved.

OpenStack Foundation news

  • The OpenStack Foundation joined the Open Source Initiative as an affiliate member. This provides a unique opportunity to work together to identify and share resources that foster community and facilitate collaboration to support the awareness and integration of open-source technologies.

Open Infrastructure Summit Shanghai and Project Teams Gathering (PTG)

  • Registration is open. Summit tickets also grant access to the PTG. You can pay in U.S. dollars or yuan if you need an official invoice (fapiao.)
  • If your organization can’t fund your travel, apply for the Travel Support Program by August 8.
  • If you need a travel visa, get started now: Information here.
  • Put your brand in the spotlight by sponsoring the Summit: Learn more here.
  • Next week, we’ll be sending surveys to teams about their participation at the PTG.

OpenStack Foundation project news

OpenStack

  • The upcoming OpenStack release, Train, is planned to arrive October 16. But what about the name of the release after that? The release naming process calls for a moniker starting with the letter U, ideally related to a geographic feature close to Shanghai, China. Post your picks on the release naming Wiki page.
  • A great way to get involved in our community is to help run community elections. If you’re interested in helping for the next round in September, reach out to the current election officials.
  • The 2019 OpenStack User Survey is currently open. If you’re operating OpenStack, please share your deployment choices and feedback by August 22.

Airship

  • The first Airship Technical Committee elections are underway. With six nominations from six companies, the elections reflect how much the Airship community has grown since the project launch, just over a year ago. Polls close on July 9.
  • Drawing on the project’s telecom roots, the Airship community has proposed a new OPNFV project for an infrastructure deployment and lifecycle management tool to provide cloud and NFV infrastructure to support VNF testing and certification. The project already has a large cross-industry contributor base and plans to land a first release for fall 2019.

StarlingX

  • The community has been working hard on the second release of the project and recently reached its third milestone. This means the release timeline is on track and the community is focusing on bug fixes, testing and a few features that were granted exceptions so they can still fit into the release.
  • The community has also started planning for the third release, which will happen later this year and include features such as the Train versions of the OpenStack services, Time Sensitive Networking and Redfish support.

Zuul

  • Zuul 3.9.0 is released. Ansible 2.8 support is added to Zuul for job execution. Pipelines may be configured to “fail fast” stopping a buildset and reporting its results after the first failure. More details in the release notes.
  • Nodepool 3.7.0 launched. A new driver supporting unprivileged OpenShift clusters has been added. Improvements to networking and host key management have been added to the OpenStack driver. More details can be found in Nodepool’s release notes.
  • Join the Zuul community at AnsibleFest in Atlanta, September 24-26.

Upcoming Open Infrastructure Community Events

 July

August

September

October

November

OSF reception on Monday, November 18 at the Hilton Bayfront Hotel

OSF booth

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you!
If you have feedback, news or stories that you want to share, reach us through community@openstack.org . To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by OpenStack Foundation at July 10, 2019 02:03 PM

Aptira

Solutionaut Anniversaries

This week, two of our Solutionauts are celebrating their Aptira anniversaries – Lei and Ankit. So we thought we’d tell you a little bit about them and what they do at Aptira.

Aptira Staff Headshot - Lei Zhang

Lei Zhang – Cloud Engineer

“I joined Aptira as a Cloud Engineer four years ago. It is a unique and wonderful experience working at Aptira. In a small agile team, we solve big complex problems. We all work from home and help our customers from all over the world.

I never get bored because I am learning new things every day. I have great flexibility in terms of working hours as long as I get my job done. Aptira also provides us with many learning and training opportunities, sends us to conferences and helps us get certified.

Great people and great culture – it is like a family.”

Aptira Staff Headshot - Ankit Goel

Ankit Goel – DevOps Specialist

“I am a software professional with 10 years of industry experience majorly into virtualization, storage and cloud domain, I have been working with Aptira as a Senior DevOps Specialist for the past year.

It’s been an amazing journey so far to work with really great people and to learn from them. Aptira gave me an opportunity to work on new cutting edge technologies and they also provide lots of additional training and certification courses to continue learning new skills. Management is very supportive of your ideas and suggestions. Aptira provides flexible work timings and amazing work life balance.

Aptira is one of the best companies to work with.”

Lei and Ankit have been working on some exciting projects lately – check out these case studies for more details.

Let us make your job easier.
Find out how Aptira's managed services can work for you.

Find Out Here

The post Solutionaut Anniversaries appeared first on Aptira.

by Jessica Field at July 10, 2019 01:31 PM

July 09, 2019

OpenStack Superuser

When less open source is more: Report finds that fewer components work best

It turns out you can have too much of a good thing. According to the 2019 edition of the “State of the Software Supply Chain Report,” the latest boom in open-source software can lead to more vulnerabilities, technical debt and costs.

This is the fifth edition produced by Sonatype and, like previous versions, the sample size is gigantic: 36,000 open-source project teams, 3.7 million open-source releases, 12,000 engineering teams and two surveys for a combined participation of over 6,200 people. You can download the report, free with email registration, here.

Researchers found a 75 percent growth in supply of open-source component releases over the past two years, counterbalanced by a 71-percent increase in confirmed or suspected open-source related breaches since 2014.

“We’ve long advised organizations to rely on the fewest open-source component suppliers with the best track records in order to develop the highest quality and lowest risk software,” says Wayne Jackson, Sonatype CEO. The report recommends companies “tame their software supply chains” through better supplier choices, component selection and use of automation, thereby reducing vulnerable components by 55 percent.

Best practices

Beyond not overloading on components, researchers found a number of characteristics common to successful teams, which tended to be larger, release software twice as fast and work on projects that are downloaded six times more often than those of other teams. Less obvious? These teams were dedicated to the workaday drudgery of updating dependencies and pushing patches.

“Good development teams consider out-of-date libraries a code quality issue,” Jeremy Long, founder of the OWASP Dependency Check project says about the findings. “They build time into their schedule to upgrade their dependencies.”

The report found these open-source superstars 10 times more likely to schedule dependency updates as part of their daily work. They are also on top of vulnerabilities, clocking median times to remediate (MTTR) that are 3.4 times faster than less-successful teams, and they are 27 percent more likely than “laggard teams” to already be protected when new vulnerabilities crop up. Teams in the bottom 20 percent for median time to update (MTTU) and stale dependencies were the furthest behind in terms of “update hygiene,” the report found.

For projects looking to ramp up, the report still advises investing development effort on new features and bug fixes but committing similar resources to dependency management. “This means that developers maintaining open-source software projects who are considering adding a new dependency and looking for a metric to guide that choice should focus on those dependencies with fast MTTU,” report authors state.

In another interesting finding — contrary to the evergreen argument that there are too many — the report notes that successful teams are four times more likely to be housed in open-source foundations rather than traditional companies.

Check out the full report here.

The post When less open source is more: Report finds that fewer components work best appeared first on Superuser.

by Nicole Martinelli at July 09, 2019 02:02 PM

Aptira

Swinburne Ceph Deployment

Swinburne University of Technology is an Australian public university based in Melbourne, Victoria. They needed to set up a massive (think petabytes) storage system for their researchers to store valuable research data.


The Challenge

This storage system needed to be dynamic, scalable, reliable, easy to manage and resize, and able to run on commodity hardware. Large storage solutions are challenging to design and set up, partly due to the level of detail that has to be addressed in the design and configuration process.

For example, the design must capture details such as the addresses and device names of the individual servers; for a distributed storage system such as Ceph, these can run into the hundreds. Collecting this information and entering it manually into the design, and later into a configuration management tool, is prohibitively slow and error prone.


The Aptira Solution

Ceph was chosen as the storage solution because it meets all the functional requirements, as demonstrated in an evaluation phase completed before Aptira became involved. Ceph is an open-source software storage platform which implements object storage on a single distributed computer cluster. Ceph supports object storage, block storage and file-level storage. For file-level storage, Ceph provides a POSIX-compliant network file system that aims for high performance, large data storage and maximum compatibility with legacy applications. Ceph aims primarily for completely distributed operation without a single point of failure, is scalable to the exabyte level, and is freely available.

Several companies offer their own commercial products based on Ceph. After some evaluation, SUSE was chosen because of its competitive price and good service. Aptira partnered with SUSE in the deployment of SUSE Enterprise Storage (a SUSE commercial product based on Ceph), from the OS level (SLES) up to Ceph.

Two Ceph clusters were deployed on two data centres. Each cluster consists of 3 monitor (MON) nodes, 2 Gateway nodes and 12 object storage (OSD) nodes. Each OSD has 160 TB storage and thus each cluster provides 1.92 PB space. All Ceph nodes are managed by SUSE Manager, which acts as a local package repository and installs base OS on all Ceph nodes. After the nodes were installed, Puppet was used to do some customisation including configuration of network etc. Then DeepSea was used to deploy Ceph.

DeepSea is an Open Source tool specifically designed to make deploying Ceph easier. The goal of DeepSea is to save the administrator time and let them confidently perform complex operations on a Ceph cluster. The traditional method for deploying Ceph is ceph-deploy, which is functional and has a low overhead but requires administrators to obtain configuration data manually and make numerous configuration decisions external to the deployment script. DeepSea automates much of this and organises the Ceph deployment process into 6 structured stages:

  • Stage 0 – Provisioning
  • Stage 1 – Discovery
  • Stage 2 – Configure
  • Stage 3 – Deploy
  • Stage 4 – Services
  • Stage 5 – Removal
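DeepSea is built on Salt, and the stages above are driven from the admin node with Salt's orchestration runner. A hedged sketch of the sequence (assuming the standard `salt-run state.orch ceph.stage.N` entry points; these commands are only meaningful on a prepared Salt master):

```shell
# Build the orchestration command for a given DeepSea stage; printing the
# commands first makes the sequence easy to review before running them.
deepsea_stage_cmd() {
    echo "salt-run state.orch ceph.stage.$1"
}

# Stages 0-4 take a cluster from provisioning through to running services;
# stage 5 (removal) is only used when decommissioning.
for stage in 0 1 2 3 4; do
    deepsea_stage_cmd "$stage"   # run each printed command on the Salt master
done
```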

The Result

The deployment was completed in less than two months, sooner than Swinburne had planned. Aptira conducted several performance tests on the system to validate that it was running in a healthy state and met the performance benchmarks.

An “as-built” document was written and handed to Swinburne at the conclusion of the project. After deployment, the system went into production straightaway: Swinburne uses it as the backend storage for their OpenStack cloud, Commvault backup system (via the RADOS gateway) and Windows file systems (via the iSCSI gateway).


Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Swinburne Ceph Deployment appeared first on Aptira.

by Aptira at July 09, 2019 01:31 PM

July 08, 2019

Aptira

Ansible Automation and Deployment

Aptira Ansible Automation Deployment

Everyone knows that the right tool for the job makes work a whole lot easier. It’s a bonus for us when that tool is Open Source – we have a strong preference for Open Source solutions as they provide the most flexible and effective solutions without the vendor lock-in which is often expensive and restrictive.

Over the past few years, we’ve been playing with Ansible – an Open Source orchestration engine used for automation, configuration management and deployment.

ANSIBLE: AUTOMATION AND DEPLOYMENT

When it comes to customer projects, our aim is to provide custom, fully integrated, turn-key solutions encompassing various technologies to suit your business. Each piece of infrastructure is API driven, allowing complex deployments of network, compute and storage resources to be automated. The result: fast, repeatable deployments, minimised error rates and the flexibility to ensure that IT isn’t a drag on your business’ ability to deliver what your customers need.

We often use Ansible for customer projects, but our Solutionauts were so impressed with its capabilities that we also use it internally. We’ve built a custom OpenStack Lab with Ceph Storage and Ansible Playbooks, giving us access to resources on demand, rather than hosted externally. Win!

ANSIBLE TRAINING

In keeping with our “teach a man to fish” policy, we’re providing Ansible training to allow our customers to experience the benefits of Ansible for themselves.

This is a 3-day intermediate course covering all the core components of Ansible, as well as dealing with sensitive data via Ansible Vault. We’ll guide participants through installing and configuring Ansible, running ad-hoc commands, understanding modules, creating and using playbooks, variables and inclusion, task control, templates, roles and more. There are also extensive labs to provide hands-on experience with Ansible.
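As a taste of the material, a playbook of the kind written in the labs might look like this (an illustrative sketch only; the group and package names are made up):

```yaml
# site.yml - ensure a package is present on a group of hosts.
- hosts: webservers
  become: true
  vars:
    web_pkg: nginx
  tasks:
    - name: Ensure the web server package is installed
      package:
        name: "{{ web_pkg }}"
        state: present
```

Running it is a single command: `ansible-playbook site.yml`.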

We offer a wide range of technology courses designed to enable users to more efficiently manage their technology stacks – all of which are customisable to suit your specific requirements.

CUSTOM TOOLS

Alternatively, we can review your infrastructure to see what manual processes can be removed utilising a tool such as Ansible. If you have specific requirements, and there aren’t any tools on the market to suit your needs – let us know. Our Inventor of Solutionauting can provide you with a tailor-made solution to meet your requirements. Chat with us today to find out how we can make your job easier.

Automate your Application.
Make your job easier with DevOps tools.

Find Out More

The post Ansible Automation and Deployment appeared first on Aptira.

by Jessica Field at July 08, 2019 01:59 PM

July 05, 2019

OpenStack Superuser

A quickstart guide to deploying Qinling in production

I recently started working on a Qinling implementation for our OpenStack platforms; this is my first post in a series about it. But before going further, here’s a quick overview of what Qinling is and what it does.

Qinling is an OpenStack project to provide Function-as-a-Service. The project aims to provide a platform to support serverless functions (like AWS Lambda). Qinling supports different container orchestration platforms (Kubernetes, Swarm, etc.) and different function package storage backends (local/Swift/S3) via a plugin mechanism.

Basically, it allows you to trigger a function only when you need it, helping you to consume only the CPU and memory time that you really need without requiring you to configure any servers. In the end, this makes for a lighter bill, making everyone happy. (There’s a lot more about Qinling online if you want to take a deeper dive.)

Deploying Qinling in production

Our platforms are deployed and maintained by Kolla, an OpenStack project to deploy OpenStack within Docker, configured by Ansible. The first thing I checked was whether Qinling was integrated with Kolla; alas, no.

When you have to manage production you don’t want to deal with custom setups that are impossible to maintain or to upgrade (that little voice in your head knows what I mean), so I started working on integrating Qinling into Kolla, namely the Docker and Ansible parts.

The qinling_api and qinling_engine containers are now up and running, configured to communicate with RabbitMQ, MySQL/Galera, memcached, Keystone and etcd. The final important step is to authenticate qinling-engine to the Kubernetes cluster — I must admit this was the most complex to set up and that the documentation is a bit confusing.

Qinling and Magnum, for the win!

Our Kubernetes cluster has been provisioned by OpenStack Magnum, an OpenStack project used to deploy container orchestration engines (COE) such as Docker Swarm, Mesos and Kubernetes.

Basically, the communication between Qinling and Kubernetes is done by SSL certificates (the same ones used with kubectl), qinling-engine needs to be aware of the CA, the certificate and the key and the Kubernetes API endpoint.

Magnum provides a CLI which makes it easy to retrieve the certificates; just make sure that you have python-magnumclient installed.

# Get Magnum cluster UUID
$ openstack coe cluster list -f value -c uuid -c name
687f7476-5604-4b44-8b09-b7a4f3fdbd64 goldyfruit-k8s-qinling
# Retrieve Kubernetes certificates
$ mkdir -p ~/k8s_configs/goldyfruit-k8s-qinling
$ cd ~/k8s_configs/goldyfruit-k8s-qinling
$ openstack coe cluster config --dir . 687f7476-5604-4b44-8b09-b7a4f3fdbd64 --output-certs
# Get the Kubernetes API address
$ grep server config | awk -F"server:" '{ print $2 }'

Four files should have been generated in the ~/k8s_configs/goldyfruit-k8s-qinling directory:

  • ca.pem – CA – ssl_ca_cert (Qinling option)
  • cert.pem – Certificate – cert_file (Qinling option)
  • key.pem – Key – key_file (Qinling option)
  • config – Kubernetes configuration

Only ca.pem, cert.pem and key.pem will be useful in our case (the config file is only used to get the Kubernetes API address), which per the Qinling documentation map to these options:

[kubernetes]
kube_host = https://192.168.1.168:6443
ssl_ca_cert = /etc/qinling/pki/kubernetes/ca.crt
cert_file = /etc/qinling/pki/kubernetes/qinling.crt
key_file = /etc/qinling/pki/kubernetes/qinling.key

At this point, if qinling-engine has restarted, you should see a network policy created on the Kubernetes cluster under the qinling namespace (yes, you should see that too).

The network policy mentioned above can block incoming traffic to the pods inside the qinling namespace, which results in a timeout from qinling-engine. A bug has been opened about this issue and it should be solved soon; for now the “best” thing to do is to remove this policy (keep in mind that every time qinling-engine is restarted, the policy will be re-created).

$ kubectl delete netpol allow-qinling-engine-only -n qinling

Just a quick word about the network policy created by Qinling: its objective is to restrict pod access to a trusted CIDR list (192.168.1.0/24, 10.0.0.53/32, etc.), preventing connections from unknown sources.

One common issue is forgetting to open the Qinling API port (7070), which prevents the Kubernetes cluster from downloading the function code/package (it’s time to be nice to your dear network friend ^^).

Runtime, it’s time to run!

One of Qinling’s pitfalls is the “lack” of runtimes, which has kept Qinling from being widely adopted. The reason there are not that many is security (completely understandable).

Actually, in the production environment (especially in the public cloud), it’s recommended that cloud providers supply their own runtime implementation for security reasons. Knowing how the runtime is implemented gives the malicious user the chance to attack the cloud environment.

So far, “only” Python 2.7, Python 3 and Node.js runtimes are available. It’s a good start, but it would be nice to have runtimes for Golang and PHP too (just saying, not asking).
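For reference, the functions these runtimes execute are plain entry points. A minimal sketch of a Python function body (hedged: the module and entry names here are made up; Qinling invokes whichever entry point you configure when creating the function, passing execution input as keyword arguments):

```python
# hello.py - a minimal function of the kind a Qinling Python runtime invokes.
# The runtime calls the configured entry point (e.g. hello.main) and treats
# the return value as the execution result.
def main(name="world", **kwargs):
    return {"message": "Hello, %s!" % name}
```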

Conclusion

My journey has just begun and I think Qinling has huge potential, which is why I was a bit surprised to see the project isn’t as popular as it could be.

Having it in Kolla, improving the documentation for integration with Magnum, MicroK8s, etc., and providing more runtimes would help the project gain the popularity it deserves.

Thanks to Lingxian Kong and the community for making this project happen!

About the author

Gaëtan Trellu is a technical operations manager at Ormuco.

This post first appeared on Medium.

Superuser is always interested in open infra community topics, get in touch at editorATopenstack.org

The post A quickstart guide to deploying Qinling in production appeared first on Superuser.

by Gaëtan Trellu at July 05, 2019 02:07 PM

Aptira

Visualising Prometheus Data in Grafana

Grafana is a graphing, dashboarding and alerting tool which can take data from a large number of sources and can send alerts to various locations such as email, PagerDuty, Slack etc. Everything in Grafana is displayed in a graph. For example, if you want to setup an alert, you need to create the graph and then add alerting rules to it.

Deployment

For the purpose of this demo, we’re going to deploy Grafana into a Docker container on the same host machine where we previously installed Prometheus and the Black Box exporter.

Create this file tree in /srv/docker:


root@demo:/srv/docker# find grafana/
grafana/
grafana/etc
grafana/etc/grafana.ini
grafana/etc/provisioning
grafana/etc/provisioning/notifiers
grafana/etc/provisioning/datasources
grafana/etc/provisioning/dashboards
grafana/data
grafana/logs

Download defaults.ini from Grafana’s GitHub repo and rename it to grafana.ini:


root@demo:/srv/docker/grafana/etc# wget https://raw.githubusercontent.com/grafana/grafana/master/conf/defaults.ini
root@demo:/srv/docker/grafana/etc# mv defaults.ini grafana.ini

Note that /srv/docker/grafana/data and /srv/docker/grafana/logs need to be writable by the container, so ensure write permissions are enabled on those directories.

Use the below script to start and configure the Grafana container:


root@demo:/srv/docker# cat grafana.sh
#!/bin/bash
docker run \
-d \
--restart always \
--name grafana \
-p 3000:3000 \
-v /srv/docker/grafana/data:/var/lib/grafana \
-v /srv/docker/grafana/etc:/etc/grafana \
-v /srv/docker/grafana/logs:/var/log/grafana \
-e "GF_SERVER_ROOT_URL=http://IPorDOMAIN" \
-e "GF_SECURITY_ADMIN_PASSWORD=REPLACEME" \
grafana/grafana

Replace IPorDOMAIN with the actual host IP of your machine and REPLACEME with your chosen admin password.

Grafana runs on port 3000. Open the port if it’s blocked by your firewall and access Grafana from any browser. After you’ve logged in, you will see a screen like this:

Click “Add data source” and set the data source type to Prometheus.

Provide the Prometheus URL “HostIP:9090”, then click Save & Test.
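Since the file tree created earlier includes etc/provisioning/datasources, the same data source can alternatively be provisioned from a YAML file instead of through the UI. A minimal sketch (the file name is arbitrary and HostIP is a placeholder):

```yaml
# /srv/docker/grafana/etc/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://HostIP:9090
    isDefault: true
```

Grafana reads this directory at startup, so restart the container after adding the file.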

Next, import dashboards into Grafana. You can create your own graphs or use prebuilt ones. Prebuilt dashboards are available at https://grafana.com/dashboards. Search for “Node Exporter Full” or open https://grafana.com/dashboards/1860 and copy its ID.

On the Grafana dashboard, click the dropdown button under Home and you will see a screen like this:

Click on “Import Dashboard” and paste that copied ID here and click the Load button.

Once it’s loaded you should see a dashboard like this. It will automatically show the hosts which are configured with the Node Exporter:

You can search for other dashboards in https://grafana.com/dashboards or create your own custom dashboard.

Want to learn more about Monitoring with Prometheus and Grafana? Check out our 2 day Introduction to Monitoring course.

Monitoring and Machine Learning
Detect Anomalies within Complex Systems

Find Out Here

The post Visualising Prometheus Data in Grafana appeared first on Aptira.

by Salman Memon at July 05, 2019 01:07 PM

Chris Dent

Placement Update 19-26

Pupdate 19-26. Next week is R-15, two weeks until Train milestone 2.

Most Important

The spec for nested magic merged and significant progress has been made in the implementation. That work is nearly ready to merge (see below), after a few more reviews. Once that happens one of our most important tasks will be experimenting with that code to make sure it fully addresses the use cases, has proper documentation (including "how do I use this?"), and is properly evaluated for performance and maintainability.

What's Changed

  • The implementation for mappings in allocation candidates had a bug which Eric found and fixed and then I realized there was a tidier way to do it. This then led to the same_subtree work needing to manage less information, because it was already there.

  • The spec for Consumer Types merged and work has started.

  • We're using os-traits 0.15.0 now.

  • There's a framework in place for nested resource provider performance testing. We need to update the provider topology to reflect real world situations (more on that below).

  • The root_required query parameter on GET /allocation_candidates has been merged as microversion 1.35.

  • I've sent an email announcing my intent to not go to the Shanghai (or any other) summit, and what changes that could imply for how Placement does the PTG.

Specs/Features

All placement specs have merged. Thanks to everyone for the frequent reviews and quick followups. We've been maintaining some good velocity.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (3) stories in the placement group. 0 (0) are untagged. 2 (-2) are bugs. 5 (0) are cleanups. 11 (0) are rfes. 4 (1) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 11 microversions.

Main Themes

Nested Magic

These are the features required by modern nested resource provider use cases. We've merged mappings in allocation candidates and root_required. same_subtree and resourceless request groups are what's left, and that work is in progress.

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As mentioned above, one of the important cleanup tasks that is not yet in progress is updating the gabbit that creates the nested topology that's used in nested performance testing. The topology there is simple, unrealistic, and doesn't sufficiently exercise the several features that may be used during a query that desires a nested response.

Recently I've been seeing that the placement-perfload job is giving results that vary between N and N*2 (usually .5 and 1 seconds) and the difference that I can discern is the type of CPUs being presented by the host (same number of CPUs (8) but different type). This supports something we've been theorizing for a while: when dealing with large result sets we are CPU bound processing the several large result sets returned by the database. Further profiling required…

Another cleanup that needs to start is satisfying the community wide goal of PDF doc generation.

Other Placement

Miscellaneous changes can be found in the usual place.

There are three os-traits changes being discussed. And one os-resource-classes change.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

This space left intentionally blank.

by Chris Dent at July 05, 2019 01:00 PM

CERN Tech Blog

OpenStack Nova Scheduling Optimizations at CERN

CERN Cloud Infrastructure has been available since 2013. From the beginning we started to use Nova Cells. Initially with only 2 Cells, the Infrastructure has grown to more than 70 Cells. Cells is a Nova functionality that enables operators to shard the Infrastructure, which has numerous advantages in large deployments. For example:

  • Avoid the message broker and DB bottleneck
  • Isolate failure domains and consequently improve availability
  • Transparent to the end user

In 2018 we upgraded from what is now known as CellsV1 (Ocata) to CellsV2 (Queens).

by CERN (techblog-contact@cern.ch) at July 05, 2019 06:30 AM

July 04, 2019

Aptira

Prometheus Exporters

Now that Prometheus has been installed and configured, we’ll discuss some of the Exporters.

In Prometheus, the data providers (agents) are called Exporters. You can write your own exporter/custom collector or use one of the prebuilt exporters, which collect data from your infrastructure and expose it for Prometheus to scrape.

  • Node Exporter: The Prometheus Node Exporter exposes a wide variety of hardware- and kernel-related metrics such as disk usage, CPU performance and memory state. It’s for *nix systems only.
  • WMI Exporter: A Prometheus exporter for Windows machines, using WMI (Windows Management Instrumentation). It exposes system and process metrics.
  • Blackbox Exporter: The Blackbox Exporter probes endpoints over HTTP, HTTPS, DNS, TCP or ICMP and gives detailed metrics about the request. Its most common use is to get the status of web pages.
  • SNMP Exporter: This exporter exposes information gathered over SNMP for use by Prometheus. Its most common use is to get metrics from network devices like firewalls, switches and other devices which only support SNMP.

Install the Node Exporter

Download the latest node exporter package from the official Prometheus site and unpack it.


root@demo:~# wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz
root@demo:~# tar -xvf node_exporter-0.18.1.linux-amd64.tar.gz

Copy the node exporter binary to /usr/local/bin:


root@demo:~# cp node_exporter-0.18.1.linux-amd64/node_exporter /usr/local/bin/

Now add a node_exporter user to run the node exporter service.


root@demo:~# useradd -rs /bin/false node_exporter

Create a node_exporter service file under systemd with the below-listed content:


root@demo:~# cat /etc/systemd/system/node_exporter.service


[Unit]
Description=Node Exporter
After=network.target


[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter


[Install]
WantedBy=multi-user.target

Reload the system daemon and start the node exporter service.


root@demo:~# systemctl daemon-reload
root@demo:~# systemctl start node_exporter

Check the status of the node_exporter service, then run the below curl request; you should see a result like this:


root@demo:~# curl --silent http://localhost:9100/metrics | head
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.102e-05
go_gc_duration_seconds{quantile="0.25"} 2.1267e-05
go_gc_duration_seconds{quantile="0.5"} 3.1807e-05
go_gc_duration_seconds{quantile="0.75"} 3.4186e-05
go_gc_duration_seconds{quantile="1"} 3.7118e-05
go_gc_duration_seconds_sum 0.000158711
go_gc_duration_seconds_count 6
# HELP go_goroutines Number of goroutines that currently exist.

Next, add the target to the Prometheus server configuration. Open /srv/docker/prometheus/prometheus.yml and add the IP address or hostname of the machine you want to monitor (the one on which you installed the node_exporter).

A basic file looks like this:


root@demo:/srv/docker/prometheus# cat prometheus.yml
global:
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'Demo'

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'node'
    # nodeexporter
    static_configs:
      - targets: ['192.168.25.7:9100']

Reload the Prometheus configuration with the below curl request


curl -X POST http://localhost:9090/-/reload

Prometheus itself has a dashboarding and alerting language, but Grafana is much nicer and has prebuilt graphs so we’re going to use that instead. You can still use the Prometheus dashboard to verify whether the node has been successfully added:

http://PrometheusHostIP:9090/targets

Install and configure Blackbox Exporter

For the purpose of this demo, we will use a simple shell script to configure and run the Blackbox exporter. The Blackbox exporter will run as a container on the same host as the Prometheus install.

Create a directory named “blackbox-exporter” under /srv/docker.

Then create this file:


root@demo:/srv/docker/blackbox-exporter# cat blackbox.yml
modules:
  http_2xx:
    prober: http
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp

Now create and run the below script file:


root@demo:/srv/docker# cat blackbox-exporter.sh
#!/bin/bash
docker run \
-d \
--restart always \
--name blackbox-exporter \
-p 9115:9115 \
-h blackbox_exporter \
-v /srv/docker/blackbox-exporter/blackbox.yml:/etc/blackbox_exporter/config.yml \
prom/blackbox-exporter

Edit the Prometheus server configuration file to use the Blackbox exporter by adding this block:


- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
    - targets:
      - 'https://aptira.com/'
      - 'https://google.com/'
      - 'https://domains.aptira.com'
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 172.17.0.3:9115  # Blackbox exporter's docker container IP

You can obtain the IP address of the container with help of the Docker inspect command:


root@demo:/srv/docker# docker inspect blackbox-exporter | grep "IPAddress"
"SecondaryIPAddresses": null,
"IPAddress": "172.17.0.3",
"IPAddress": "172.17.0.3",

Reload the Prometheus configuration with the below curl request and check the UI:


curl -X POST http://localhost:9090/-/reload
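A probe can also be exercised by hand before (or after) wiring it into Prometheus: the exporter's /probe endpoint takes module and target parameters and returns metrics, including probe_success. A small sketch (the host and port assume the container mapping above):

```shell
# Build the Blackbox exporter probe URL for a module/target pair.
probe_url() {
    echo "http://localhost:9115/probe?module=$1&target=$2"
}

# Example manual check (uncomment on a host running the exporter):
# curl --silent "$(probe_url http_2xx https://aptira.com/)" | grep probe_success
probe_url http_2xx https://aptira.com/
```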

Next, we will visualise this Prometheus data in Grafana.

Want to learn more about Monitoring with Prometheus and Grafana? Check out our 2 day Introduction to Monitoring course.

Monitoring and Machine Learning
Detect Anomalies within Complex Systems

Find Out Here

The post Prometheus Exporters appeared first on Aptira.

by Salman Memon at July 04, 2019 03:18 AM

July 03, 2019

John Likes OpenStack

Notes on testing a tripleo-common mistral patch

I recently ran into bug 1834094 and wanted to test the proposed fix. These are my notes in case I have to do this again.

Get a patched container

Because the mistral-executor is running as a container on the undercloud, I needed to build a new container, and TripleO's Container Image Preparation helped me do this without too much trouble.

As described in the Container Image Preparation docs, I had already downloaded a local copy of the containers to my undercloud by running the following:


time sudo openstack tripleo container image prepare \
  -e ~/train/containers.yaml \
  --output-env-file ~/containers-env-file.yaml

where ~/train/containers.yaml has the following:

---
parameter_defaults:
  NeutronMechanismDrivers: ovn
  ContainerImagePrepare:
  - push_destination: 192.168.24.1:8787
    set:
      ceph_image: daemon
      ceph_namespace: docker.io/ceph
      ceph_tag: v4.0.0-stable-4.0-nautilus-centos-7-x86_64
      name_prefix: centos-binary
      namespace: docker.io/tripleomaster
      tag: current-tripleo

I now want to download the same set of containers to my undercloud, but I want the mistral-executor container to have the proposed fix. If I visit the review and click download I can see the patch is at refs/changes/60/668560/3, and I can pass this information to TripleO's Container Image Preparation so that it builds me a container with that patch applied.

To do this I update my containers.yaml to exclude the mistral-executor container from the usual tags with the excludes list directive and then create a separate section with the includes directive specific to the mistral-executor container.

Within this new section I ask that the tripleo-modify-image ansible role pull that patch and apply it to that source image.


---
parameter_defaults:
  NeutronMechanismDrivers: ovn
  ContainerImagePrepare:
  - push_destination: 192.168.24.1:8787
    set:
      ceph_image: daemon
      ceph_namespace: docker.io/ceph
      ceph_tag: v4.0.0-stable-4.0-nautilus-centos-7-x86_64
      name_prefix: centos-binary
      namespace: docker.io/tripleomaster
      tag: current-tripleo
    excludes: [mistral-executor]
  - push_destination: 192.168.24.1:8787
    set:
      name_prefix: centos-binary
      namespace: docker.io/tripleomaster
      tag: current-tripleo
    modify_role: tripleo-modify-image
    modify_append_tag: "-devel-ps3"
    modify_vars:
      tasks_from: dev_install.yml
      source_image: docker.io/tripleomaster/centos-binary-mistral-executor:current-tripleo
      refspecs:
      - project: tripleo-common
        refspec: refs/changes/60/668560/3
    includes: [mistral-executor]

When I then run the `sudo openstack tripleo container image prepare` command I see that it took a few extra steps to create my new container image.


Writing manifest to image destination
Storing signatures
INFO[0005] created - from /var/lib/containers/storage/overlay/10c5e9ec709991e7eb6cbbf99c08d87f9f728c1644d64e3b070bc3c81adcbc03/diff
and /var/lib/containers/storage/overlay-layers/10c5e9ec709991e7eb6cbbf99c08d87f9f728c1644d64e3b070bc3c81adcbc03.tar-split.gz (wrote 150320640 bytes)
Completed modify and upload for image docker.io/tripleomaster/centos-binary-mistral-executor:current-tripleo
Removing local copy of 192.168.24.1:8787/tripleomaster/centos-binary-mistral-executor:current-tripleo
Removing local copy of 192.168.24.1:8787/tripleomaster/centos-binary-mistral-executor:current-tripleo-devel-ps3
Output env file exists, moving it to backup.

If I were deploying the mistral container in the overcloud I could just use 'openstack overcloud deploy ... -e ~/containers-env-file.yaml' and be done, but because I need to replace my mistral-executor container on my undercloud I have to do a few manual steps.

Run the patched container on the undercloud

My undercloud is ready to serve the patched mistral-executor container but it doesn't yet have its own copy of it to run; i.e. I only see the original container:


(undercloud) [stack@undercloud train]$ sudo podman images | grep exec
docker.io/tripleomaster/centos-binary-mistral-executor current-tripleo 1f0ed5edc023 9 days ago 1.78 GB
(undercloud) [stack@undercloud train]$

However, the same undercloud will serve it from the following URL:

(undercloud) [stack@undercloud train]$ grep executor ~/containers-env-file.yaml
ContainerMistralExecutorImage: 192.168.24.1:8787/tripleomaster/centos-binary-mistral-executor:current-tripleo-devel-ps3
(undercloud) [stack@undercloud train]$

So we can pull it down and run it on the undercloud:

sudo podman pull 192.168.24.1:8787/tripleomaster/centos-binary-mistral-executor:current-tripleo-devel-ps3

I now want to stop the running mistral-executor container and start my new one in its place. As per Debugging with Paunch I can use the print-cmd action to extract the command which is used to start the mistral-executor container and save it to a shell script:

sudo paunch debug --file /var/lib/tripleo-config/container-startup-config-step_4.json --container mistral_executor --action print-cmd > start_executor.sh

I'll also add the exact container image name to the shell script:

sudo podman images | grep ps3 >> start_executor.sh

Next I'll edit the script to update the container image name and make sure the container is named mistral_executor:

vim start_executor.sh

Before I restart the container I'll prove that the current container isn't running the patch (the same command later will prove that it is):

(undercloud) [stack@undercloud train]$ sudo podman exec mistral_executor grep render /usr/lib/python2.7/site-packages/tripleo_common/utils/config.py
# string so it's rendered in a readable format.
template_data = deployment_template.render(
template_data = host_var_server_template.render(
(undercloud) [stack@undercloud train]$
Stop the mistral-executor container with systemd (otherwise it will automatically restart):

sudo systemctl stop tripleo_mistral_executor.service

Remove the container with podman to ensure the name is not in use:

sudo podman rm mistral_executor

Start the new container:

sudo bash start_executor.sh

Now I'll verify that my new container does have the patch:

(undercloud) [stack@undercloud train]$ sudo podman exec mistral_executor grep render /usr/lib/python2.7/site-packages/tripleo_common/utils/config.py
def render_network_config(self, stack, config_dir, server_roles):
# string so it's rendered in a readable format.
template_data = deployment_template.render(
template_data = host_var_server_template.render(
self.render_network_config(stack, config_dir, server_roles)
(undercloud) [stack@undercloud train]$
As a bonus, I can also see that it fixed the bug.

(undercloud) [stack@undercloud tripleo-heat-templates]$ openstack overcloud config download --config-dir config-download
Starting config-download export...
config-download export successful
Finished config-download export.
Extracting config-download...
The TripleO configuration has been successfully generated into: config-download
(undercloud) [stack@undercloud tripleo-heat-templates]$

by Unknown (noreply@blogger.com) at July 03, 2019 08:04 PM

Chris Dent

Remote Maintainership

This is a followup to More on Maintainership and OpenStack Denver Summit Reflection. Together these are starting to form the foundation of a series on open source collaboration in the face of the climate emergency and political and social inclusiveness. Starting...

I've decided, if possible, to stop flying to technology or other professional conferences. If I can get there by train or boat, perhaps, but all the evidence I can gather suggests that the environmental cost of conference attendance negates the value, especially when:

  • There are plenty of other costs, as discussed in my Denver report and further, below.
  • We have the technologies to collaborate remotely with greater benefits and productivity than having a self-congratulatory party with a few hundred or thousand of our not-so-closest friends, most of whom burn a bunch of energy to get there and be there.

There are—of course—advantages to being in person, especially in terms of relationship building, but what's the point of building a lasting friendship if you die soon after in a cataclysmic weather event?

I've been thinking about this for a long time, but a recent article in The Guardian, No flights, a four-day week and living off-grid: what climate scientists do at home to save the planet, crystallized the issues and options for me. The article includes a section from Dave Reay, the author of New Directions: Flying in the face of the climate change convention. He hasn't flown since 2004.

Further, I can't ignore the social and political implications of conference attendance. Wherever they are, conferences create a separate space of privileged attendees and sometimes (often?) they are held in places that I don't want to give economic or other support. China's recent actions in Hong Kong and Xinjiang are intolerable and the onerous list of actions colleagues have advised on how to safely use the internet (and thus "do work") there do not inspire a willingness to attend the next OpenStack summit in Shanghai. Not that many (any?) other countries are much better these days.

I'll need to consider if being unwilling to travel to summits takes me out of the running for being a PTL or otherwise relevant in the OpenStack community. I certainly hope not, both for my sake and the sake of the community. We need as many, and as many different, people as we can get and there are others like me.

I've been doing remote collaboration since the start of this century, I even co-founded a think tank focused on high-performance and frequently-remote collaboration. I think I can help make it go.

by Chris Dent at July 03, 2019 02:15 PM

OpenStack Superuser

How open-source tech can save bird life

Most people think the occasional chirps of birds sound sweet, but one nonprofit wants to bring a riot of birdcall to backyards.

The Cacophony Project aims to bring the native bird noise back by keeping non-native predators away.

Founder Grant Ryan says it all started when the house he bought in Akaroa, New Zealand turned out to have some unexpected guests: a large population of rats and possums.

“So like every good Kiwi, I just went about doing a bit of trapping,” he says. “Over the next couple of years, I started noticing that the bird population went up and I thought, ‘man that’s kind of cool.’”

Not content to leave the problem at his own doorstep, he did some research.

“New Zealand’s the second worst place in the world for species loss,” Ryan adds. “It’s not because we don’t care, it’s that we were our own island for 70 million years and they never learned to adapt. It’s really quite embarrassing how bad it is.” Two of the main threats — rats and possums — were introduced to the island country relatively late and have devastated local bird populations.

Animating the Cacophony Project is a host of open-source software and hardware technology aimed at increasing trapping efficiency by up to 80,000 times.

It all started with the Cacophonometer, an Android application that wakes up at regular intervals, records audio and then uploads it to the Cacophony Project API server. That runs on a Node.js platform with a PostgreSQL database and uses Minio for object storage. (The test infrastructure is implemented in Python, you can take a look at the source code on GitHub.)
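The record-and-upload cycle is simple enough to sketch. The snippet below is an illustrative Python model of that loop, not the app's actual code (the real client is an Android application, and the function names and interval here are hypothetical):

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the Cacophonometer's wake/record/upload cycle.
# RECORD_INTERVAL, record_audio and upload are illustrative stand-ins,
# not the project's real API.
RECORD_INTERVAL = timedelta(minutes=30)

def next_wake_times(start, count, interval=RECORD_INTERVAL):
    """Return the next `count` scheduled recording times after `start`."""
    return [start + interval * i for i in range(1, count + 1)]

def run_once(record_audio, upload):
    """One cycle: capture a clip, then push it to the API server."""
    clip = record_audio()
    upload(clip)
```

In practice the Android alarm scheduler plays the role of `next_wake_times`, and the upload lands at the Node.js API server mentioned above.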

Then there’s the Cacophonator, an embedded platform built from a Raspberry Pi 3 running Go and Python code, kitted out with a thermal camera, speakers and sensors to lure, identify and eliminate invasive predators.

And, finally, an as yet unnamed machine learning component for classifying predators. It runs on a TensorFlow-based machine learning model trained by a classifier pipeline. It relies on data science tools including NumPy, SciPy, OpenCV and HDF5. (Even the Cacophony Project’s cloud hosting is provided for free by open-source innovators Catalyst Cloud.)
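For a flavour of what the classifier’s final step looks like, here is a minimal pure-Python sketch of turning a model’s raw scores into a labelled prediction. The class names are hypothetical and the real pipeline runs on TensorFlow; this only illustrates the last mile:

```python
import math

# Hypothetical post-processing for a predator classifier: convert raw
# model scores into probabilities and pick the best class. The class
# list is illustrative, not the project's actual label set.
CLASSES = ["rat", "possum", "bird", "none"]

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]
```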

Get involved

The Cacophony Project is actively seeking contributors, whether experts with the open-source tech used in the project or people who are willing to learn them.  You can join the mailing list, check out the project’s GitHub (see the list of “good first issues” on where to jump in), or sync with the developers on Rocket Chat.

 

Catch Ryan’s full talk at the recent Boma New Zealand Agri Summit.

Photo // CC BY NC

The post How open-source tech can save bird life appeared first on Superuser.

by Nicole Martinelli at July 03, 2019 02:02 PM

Aptira

Monitoring your Infrastructure with Prometheus

Prometheus is an open source, next-generation monitoring and time series system that collects metrics from agents running on target hosts, storing the collected data onto the Prometheus server to be analysed for problem diagnosis.

In this series of posts, we will first install and configure Prometheus, then we look at the different exporters and visualising Prometheus data in Grafana.

Prometheus Installation

There are a couple of different methods to install Prometheus. For the purpose of this tutorial, we will install Prometheus in a Docker container running on an Ubuntu 18.04 server.

First, install Docker in your system and enable it.


sudo apt-get install docker.io
sudo systemctl start docker
sudo systemctl enable docker

Then, create this file tree in /srv/docker:


root@demo:/srv/docker# find prometheus
prometheus
prometheus/prometheus.yml
prometheus/data

Configuring Prometheus

Note that /srv/docker/prometheus/data needs to have permissions that will allow the container to write, so ensure write permissions are enabled for that directory. The Prometheus configuration file lives at /srv/docker/prometheus/prometheus.yml
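The configuration file itself can start out very small. The fragment below is an illustrative minimal prometheus.yml; the scrape interval and target are example values to adjust for your environment:

```yaml
# Minimal illustrative Prometheus configuration: scrape Prometheus itself.
global:
  scrape_interval: 15s    # how often to pull metrics from targets

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
```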

We will use a simple shell script to start Prometheus. Once it has been started you won’t need to start it again, because it’s set up as a daemon, which means Docker will restart it on boot.

Here is the shell script:


root@demo:/srv/docker# cat prometheus.sh
#!/bin/bash
docker run \
-d \
--restart always \
--name prometheus \
-p 9090:9090 \
-h prometheus \
-v /srv/docker/prometheus:/prometheus-data \
prom/prometheus --config.file=/prometheus-data/prometheus.yml --storage.tsdb.path=/prometheus-data/data --storage.tsdb.retention=3650d --web.enable-lifecycle

Verify that the Prometheus container is running using the docker ps -a command.


root@demo:/srv/docker# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
be6ba6bddffe prom/prometheus "/bin/prometheus --c…" 2 minutes ago Up 2 minutes 0.0.0.0:9090->9090/tcp prometheus

It should now be available on port 9090. Open this port if it’s blocked by your firewall.

Prometheus monitors targets over HTTP and expects an exporter to be embedded in, or running alongside, the application being monitored. The Prometheus community manages a list of well-known port numbers on GitHub, but here’s a list of the most useful ones:

  • 9090: prometheus
  • 9091: pushgateway
  • 9100: node_exporter
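Each exporter on those ports serves plain-text metrics over HTTP at /metrics. As a rough illustration, a single sample line can be parsed with a few lines of Python; this handles only the common `name{labels} value` shape, while the full exposition format also carries HELP/TYPE comments, timestamps and label escaping:

```python
import re

# Minimal sketch of parsing one line of the Prometheus text exposition
# format, e.g. a line fetched from http://host:9100/metrics.
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                     r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')

def parse_sample(line):
    """Return (metric_name, labels_dict, value) or None for non-samples."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    labels = {}
    if m.group('labels'):
        for pair in m.group('labels').split(','):
            k, v = pair.split('=', 1)
            labels[k.strip()] = v.strip().strip('"')
    return m.group('name'), labels, float(m.group('value'))
```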

Prometheus itself has a built-in expression and alerting language, but Grafana is much nicer and has prebuilt dashboards, so we’re going to use that instead.

In the next posts, we will configure an exporter and view the metrics of different nodes within Grafana.

Want to learn more about Monitoring with Prometheus and Grafana? Check out our 2 day Introduction to Monitoring course.

Monitoring and Machine Learning
Detect Anomalies within Complex Systems

Find Out Here

The post Monitoring your Infrastructure with Prometheus appeared first on Aptira.

by Salman Memon at July 03, 2019 01:24 AM

July 02, 2019

OpenStack Superuser

Meet magnum-auto-healer: A solution for OpenStack Magnum clusters

Kubernetes is a self-healing container orchestration platform that can detect failures from pods and redeploy those workloads, but magnum-auto-healer is a self-healing cluster management service that will automatically recover a failed master or worker node within your Magnum cluster.

Basically, magnum-auto-healer ensures that the Kubernetes nodes you’re running are healthy by monitoring their status periodically, searching for unhealthy instances and triggering replacements when needed, maximizing the cluster’s high availability and reliability and protecting applications from downtime when the nodes they run on fail.

Another common concern for Kubernetes clusters is scalability. Kubernetes cluster-autoscaler can scale the worker pools in your cluster automatically to increase or decrease the number of worker nodes based on the sizing needs of the scheduled workloads. cluster-autoscaler periodically scans the cluster to adjust the number of worker nodes in response to your workload resource requests and any custom settings that you configure, such as scanning intervals. The main purpose of cluster-autoscaler is autoscaling, not autohealing. There’s also a Magnum driver for cluster-autoscaler that can be deployed together with magnum-auto-healer.
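The sizing decision at the heart of an autoscaler can be sketched in a few lines. The real cluster-autoscaler simulates pod scheduling per node group, so the function below is only an illustration of the idea, with hypothetical parameter names:

```python
import math

# Illustrative sketch of an autoscaler's core sizing decision: pick the
# smallest worker count that satisfies the pods' aggregate CPU requests,
# clamped to the pool's min/max bounds. The real cluster-autoscaler
# simulates scheduling pod by pod; this only shows the shape of the idea.
def desired_workers(total_cpu_requested, cpu_per_node, min_nodes, max_nodes):
    needed = math.ceil(total_cpu_requested / cpu_per_node)
    return max(min_nodes, min(needed, max_nodes))
```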

Like cluster-autoscaler, magnum-auto-healer is designed to run together with a cloud provider; OpenStack Magnum is supported as the reference implementation.

What the magnum-auto-healer can do for you

In the current Kubernetes design, one major downside for developers is that Kubernetes can’t auto-manage its own machines. As a consequence, operations must get involved every time a worker node fails, such as when the kubelet service hangs or a random hardware failure occurs. So the company where I work, Catalyst Cloud, developed the magnum-auto-healer to enable a node auto-repair process. It’s similar to the auto-repair feature in GKE (Google Kubernetes Engine), but the magnum-auto-healer is fully open source and offers a pluggable mechanism that supports various cloud providers.

In addition to managed solutions like GKE node auto-repair, there are a few similar open-source projects, such as OpenShift’s machine health check controller. However, most of these existing solutions integrate tightly with Kubernetes, defining CRD resources and managing the node resources on their own. In contrast, the magnum-auto-healer is assumed to be running in a cloud environment, which means the Kubernetes cluster (and all its nodes) is created and managed by the cloud service API; the source of truth for cluster information comes from the cloud rather than from the Kubernetes cluster itself. As a result, magnum-auto-healer is designed as a lightweight service that coordinates with the cloud environment for the auto-healing.

Behind the design

A few considerations that were top of mind when we designed the service:

  • A single component for the cluster auto-healing purpose. There were already some other components dealing with specific tasks separately; combining them with some customization might work, but that leads to more complexity and maintenance overhead.
  • Support both master nodes and worker nodes.
  • Allow the cluster administrator to disable the autohealing feature on the fly, which is very important for cluster operations like upgrade or scheduled maintenance.
  • Give the option for the Kubernetes cluster to not be exposed to either the public or the OpenStack control plane. For example, in Magnum, the end user may create a private cluster that’s not accessible even from Magnum control services.
  • The health check should be pluggable. Deployers should be able to write their own health check plugin with customized health check parameters.
  • Support different cloud providers.

How to deploy and test magnum-auto-healer

Prerequisites

  1. A multi-node cluster (three masters and three workers) is created in OpenStack Magnum.
     $ openstack coe cluster list
     +--------------------------------------+-----------------------------+-----------------+------------+--------------+-----------------+
     | uuid                                 | name                        | keypair         | node_count | master_count | status          |
     +--------------------------------------+-----------------------------+-----------------+------------+--------------+-----------------+
     | c418c335-0e52-42fc-bd68-baa8d264e072 | lingxian_por_test_1.12.7_ha | lingxian_laptop |          3 |            3 | CREATE_COMPLETE |
     +--------------------------------------+-----------------------------+-----------------+------------+--------------+-----------------+
     $ openstack server list --name lingxian-por-test-1-12-7-ha
     +--------------------------------------+---------------------------------------------------+--------+-----------------------------------------+-------------------------+---------+
     | ID                                   | Name                                              | Status | Networks                                | Image                   | Flavor  |
     +--------------------------------------+---------------------------------------------------+--------+-----------------------------------------+-------------------------+---------+
     | 908957c2-ac88-4b54-a1fc-91f9cc8f98f1 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-2 | ACTIVE | lingxian_net=10.0.10.33, 150.242.42.234 | fedora-atomic-27-x86_64 | c1.c4r8 |
     | 8f0c3ad9-caf5-45b6-bf3a-97b3bb6de623 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-0 | ACTIVE | lingxian_net=10.0.10.32, 150.242.42.233 | fedora-atomic-27-x86_64 | c1.c4r8 |
     | a6ae4cee-7cf2-4b25-89bc-a5c6cb2c364d | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 | ACTIVE | lingxian_net=10.0.10.34, 150.242.42.245 | fedora-atomic-27-x86_64 | c1.c4r8 |
     | 2af96203-cc6f-4b55-8fb2-062340207ebb | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-2 | ACTIVE | lingxian_net=10.0.10.31, 150.242.42.226 | fedora-atomic-27-x86_64 | c1.c2r4 |
     | 10bef366-b5a8-4400-b2c3-82188ec06b13 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-1 | ACTIVE | lingxian_net=10.0.10.30, 150.242.42.22  | fedora-atomic-27-x86_64 | c1.c2r4 |
     | 9c17f034-6825-4e49-b3cb-0ecddd1a8dd8 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-0 | ACTIVE | lingxian_net=10.0.10.29, 150.242.42.213 | fedora-atomic-27-x86_64 | c1.c2r4 |
     +--------------------------------------+---------------------------------------------------+--------+-----------------------------------------+-------------------------+---------+
    
  2. The kubeconfig file of the cluster is in place.

Deploy magnum-auto-healer

We recommend running the magnum-auto-healer service as a DaemonSet on the master nodes; the service runs in active-passive mode using a leader-election mechanism. There is a sample manifest file here; change the variables as needed before actually running the kubectl apply command.
The following commands are just examples:

magnum_cluster_uuid=c418c335-0e52-42fc-bd68-baa8d264e072
keystone_auth_url=https://api.nz-por-1.catalystcloud.io:5000/v3
user_id=ceb61464a3d341ebabdf97d1d4b97099
user_project_id=b23a5e41d1af4c20974bf58b4dff8e5a
password=password
region=RegionOne
image=lingxiankong/magnum-auto-healer:0.1.0

cat <<EOF | kubectl apply -f -
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: magnum-auto-healer
  namespace: kube-system

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: magnum-auto-healer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: magnum-auto-healer
    namespace: kube-system

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: magnum-auto-healer-config
  namespace: kube-system
data:
  config.yaml: |
    cluster-name: ${magnum_cluster_uuid}
    dry-run: false
    monitor-interval: 15s
    check-delay-after-add: 20m
    leader-elect: true
    healthcheck:
      master:
        - type: Endpoint
          params:
            unhealthyDuration: 30s
            protocol: HTTPS
            port: 6443
            endpoints: ["/healthz"]
            okCodes: [200]
        - type: NodeCondition
          params:
            unhealthyDuration: 1m
            types: ["Ready"]
            okValues: ["True"]
      worker:
        - type: NodeCondition
          params:
            unhealthyDuration: 1m
            types: ["Ready"]
            okValues: ["True"]
    openstack:
      auth-url: ${keystone_auth_url}
      user-id: ${user_id}
      project-id: ${user_project_id}
      password: ${password}
      region: ${region}

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: magnum-auto-healer
  namespace: kube-system
  labels:
    k8s-app: magnum-auto-healer
spec:
  selector:
    matchLabels:
      k8s-app: magnum-auto-healer
  template:
    metadata:
      labels:
        k8s-app: magnum-auto-healer
    spec:
      serviceAccountName: magnum-auto-healer
      tolerations:
        - effect: NoSchedule
          operator: Exists
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      nodeSelector:
        node-role.kubernetes.io/master: ""
      containers:
        - name: magnum-auto-healer
          image: ${image}
          imagePullPolicy: Always
          args:
            - /bin/magnum-auto-healer
            - --config=/etc/magnum-auto-healer/config.yaml
            - --v
            - "2"
          volumeMounts:
            - name: config
              mountPath: /etc/magnum-auto-healer
      volumes:
        - name: config
          configMap:
            name: magnum-auto-healer-config
EOF
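As an illustration of what the Endpoint master check in the ConfigMap above evaluates, here is a rough Python sketch. `probe` is an injected stand-in for an HTTPS GET returning a status code; the real plugin lives in the magnum-auto-healer source and is more involved:

```python
# Illustrative sketch of the Endpoint health check: probe each configured
# path on the master's API port and compare the status code to okCodes.
# `probe` is a hypothetical callable injected for testability.
def endpoint_healthy(probe, host, port=6443, endpoints=("/healthz",),
                     ok_codes=(200,)):
    for path in endpoints:
        code = probe(f"https://{host}:{port}{path}")
        if code not in ok_codes:
            return False
    return True
```

In production, a node failing this check for longer than `unhealthyDuration` would be marked unhealthy and queued for replacement.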

Testing magnum-auto-healer

You can ssh into a worker node (lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 in this example) and stop the kubelet service to simulate a worker node failure. The node status check is implemented in the NodeCondition health check plugin (see the configuration above).

$ ssh fedora@150.242.42.245
[fedora@lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 ~]$ sudo systemctl stop kubelet

Now wait for the magnum-auto-healer to detect the node failure and trigger the repair process. Notice that the unhealthy node is shut down:

+--------------------------------------+---------------------------------------------------+---------+-----------------------------------------+-------------------------+---------+
| ID                                   | Name                                              | Status  | Networks                                | Image                   | Flavor  |
+--------------------------------------+---------------------------------------------------+---------+-----------------------------------------+-------------------------+---------+
| 908957c2-ac88-4b54-a1fc-91f9cc8f98f1 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-2 | ACTIVE  | lingxian_net=10.0.10.33, 150.242.42.234 | fedora-atomic-27-x86_64 | c1.c4r8 |
| a6ae4cee-7cf2-4b25-89bc-a5c6cb2c364d | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 | SHUTOFF | lingxian_net=10.0.10.34, 150.242.42.245 | fedora-atomic-27-x86_64 | c1.c4r8 |
| 8f0c3ad9-caf5-45b6-bf3a-97b3bb6de623 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-0 | ACTIVE  | lingxian_net=10.0.10.32, 150.242.42.233 | fedora-atomic-27-x86_64 | c1.c4r8 |
| 2af96203-cc6f-4b55-8fb2-062340207ebb | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-2 | ACTIVE  | lingxian_net=10.0.10.31, 150.242.42.226 | fedora-atomic-27-x86_64 | c1.c2r4 |
| 10bef366-b5a8-4400-b2c3-82188ec06b13 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-1 | ACTIVE  | lingxian_net=10.0.10.30, 150.242.42.22  | fedora-atomic-27-x86_64 | c1.c2r4 |
| 9c17f034-6825-4e49-b3cb-0ecddd1a8dd8 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-0 | ACTIVE  | lingxian_net=10.0.10.29, 150.242.42.213 | fedora-atomic-27-x86_64 | c1.c2r4 |
+--------------------------------------+---------------------------------------------------+---------+-----------------------------------------+-------------------------+---------+

Then a new node comes up:

+--------------------------------------+---------------------------------------------------+---------+-----------------------------------------+-------------------------+---------+
| ID                                   | Name                                              | Status  | Networks                                | Image                   | Flavor  |
+--------------------------------------+---------------------------------------------------+---------+-----------------------------------------+-------------------------+---------+
| 31d5e246-6f40-4e14-88a9-8cd86a19c75a | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 | BUILD   |                                         | fedora-atomic-27-x86_64 | c1.c4r8 |
| 908957c2-ac88-4b54-a1fc-91f9cc8f98f1 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-2 | ACTIVE  | lingxian_net=10.0.10.33, 150.242.42.234 | fedora-atomic-27-x86_64 | c1.c4r8 |
| a6ae4cee-7cf2-4b25-89bc-a5c6cb2c364d | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 | SHUTOFF |                                         | fedora-atomic-27-x86_64 | c1.c4r8 |
| 8f0c3ad9-caf5-45b6-bf3a-97b3bb6de623 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-0 | ACTIVE  | lingxian_net=10.0.10.32, 150.242.42.233 | fedora-atomic-27-x86_64 | c1.c4r8 |
| 2af96203-cc6f-4b55-8fb2-062340207ebb | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-2 | ACTIVE  | lingxian_net=10.0.10.31, 150.242.42.226 | fedora-atomic-27-x86_64 | c1.c2r4 |
| 10bef366-b5a8-4400-b2c3-82188ec06b13 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-1 | ACTIVE  | lingxian_net=10.0.10.30, 150.242.42.22  | fedora-atomic-27-x86_64 | c1.c2r4 |
| 9c17f034-6825-4e49-b3cb-0ecddd1a8dd8 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-0 | ACTIVE  | lingxian_net=10.0.10.29, 150.242.42.213 | fedora-atomic-27-x86_64 | c1.c2r4 |
+--------------------------------------+---------------------------------------------------+---------+-----------------------------------------+-------------------------+---------+

Finally, all the nodes are healthy again after the repair process. In Magnum, the new node has the same IP address and hostname as the previous one:

+--------------------------------------+---------------------------------------------------+--------+-----------------------------------------+-------------------------+---------+
| ID                                   | Name                                              | Status | Networks                                | Image                   | Flavor  |
+--------------------------------------+---------------------------------------------------+--------+-----------------------------------------+-------------------------+---------+
| 31d5e246-6f40-4e14-88a9-8cd86a19c75a | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-1 | ACTIVE | lingxian_net=10.0.10.34, 150.242.42.245 | fedora-atomic-27-x86_64 | c1.c4r8 |
| 908957c2-ac88-4b54-a1fc-91f9cc8f98f1 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-2 | ACTIVE | lingxian_net=10.0.10.33, 150.242.42.234 | fedora-atomic-27-x86_64 | c1.c4r8 |
| 8f0c3ad9-caf5-45b6-bf3a-97b3bb6de623 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-minion-0 | ACTIVE | lingxian_net=10.0.10.32, 150.242.42.233 | fedora-atomic-27-x86_64 | c1.c4r8 |
| 2af96203-cc6f-4b55-8fb2-062340207ebb | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-2 | ACTIVE | lingxian_net=10.0.10.31, 150.242.42.226 | fedora-atomic-27-x86_64 | c1.c2r4 |
| 10bef366-b5a8-4400-b2c3-82188ec06b13 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-1 | ACTIVE | lingxian_net=10.0.10.30, 150.242.42.22  | fedora-atomic-27-x86_64 | c1.c2r4 |
| 9c17f034-6825-4e49-b3cb-0ecddd1a8dd8 | lingxian-por-test-1-12-7-ha-bbgjts5g4xhb-master-0 | ACTIVE | lingxian_net=10.0.10.29, 150.242.42.213 | fedora-atomic-27-x86_64 | c1.c2r4 |
+--------------------------------------+---------------------------------------------------+--------+-----------------------------------------+-------------------------+---------+

The whole process is available in a video demo.

Get involved

Currently, the magnum-auto-healer is still in the prototype phase, meaning breaking changes may still be accepted over time. Catalyst Cloud will deploy the service in production, but as an alpha feature. Any feedback or contributions are welcome.

About the author

Lingxian Kong is a senior developer at Catalyst Cloud and frequent Superuser contributor. Follow him on Weibo or check out his LinkedIn profile. This post first appeared on GitHub.

 

Superuser is always interested in open infrastructure community content, get in touch: editorATopenstack.org

Photo // CC BY NC

The post Meet magnum-auto-healer: A solution for OpenStack Magnum clusters appeared first on Superuser.

by Lingxian Kong at July 02, 2019 02:05 PM

Aptira

Designing Cloud Architecture for High Performance GPU-powered Desktops

Designing Cloud Architecture for High Performance GPU

A local University with a strong research focus had initiated a successful pilot project to bring remote, on-demand, interactive graphically intensive computing environments to a limited number of research groups. The challenge was to take it to the next level into a sustainable production service utilising high performance GPU-powered desktops.


The Challenge

The University wanted a facility that would not only allow for the more efficient creation of high-impact research outcomes, but potentially also allow the University to become renowned as a leader in the provisioning of high performance GPU-based imaging and 3D visualization. Advanced research of this type has some unique characteristics: intense calculation loads and truly massive data storage and transfer requirements.

The GPUs support advanced mathematical and analytics calculations, most often enabled by proprietary software developed for specific research domains at world-class level, e.g. genome research. Virtualised GPUs enable these expensive components to be shared across multiple applications.

And just one scan from an advanced instrument can generate Terabytes of raw data.

The University is also taking the opportunity to consider creating a platform that supports other uses, such as a tie-in with its HPC system and integrating with national facilities. They needed a Cloud expert to work with the various stakeholders to define the architecture as input to a business case.


The Aptira Solution

Aptira ran a requirement gathering process in which more than 20 University staff gathered for various workshop meetings and whiteboard sessions over two solid days to collect requirements. The requirements were collated, and rapid technical feasibility studies were conducted to provide a choice of technologies for key areas. As is often the case, asking the right questions can really help us to focus on what is important.

Aptira produced an architecture design using the University’s standard templates. The architecture design gave the University a framework to understand the way in which the requirements could be implemented and to see how the new solution might integrate with other systems.

The architecture was an OpenStack based Cloud with enhanced high performance GPU support, authorisation and authentication, and user interfaces.


The Result

The University has been furnished with a high performance GPU-powered Cloud architecture they can use to plan the next steps of their infrastructure transformation and competitive vendor selection.

This architecture design not only demonstrated the feasibility of a platform that would enhance the user experience of the pilot users, but will also be extensible to cater for additional use cases and meet their massive data requirements.


Let us make your job easier.
Find out how Aptira's managed services can work for you.

Find Out Here

The post Designing Cloud Architecture for High Performance GPU-powered Desktops appeared first on Aptira.

by Aptira at July 02, 2019 01:10 PM

July 01, 2019

Aptira

Bare Metal High Performance Computing (HPC)

Aptira High Performance Computing HPC

One of our customers is facing a problem with efficiently provisioning bare metal servers to their High Performance Computing (HPC) cluster.


The Challenge

This challenge involved provisioning bare metal servers to a High Performance Computing (HPC) cluster, including installing the Operating System and configuring their networking. This customer’s provisioning at the time was performed using Cobbler. Although Cobbler can automate the process of installing the OS and configuring the network, it is not well integrated with their other existing platforms like NeCTAR and Object Storage. In particular, some information needed to be maintained manually.


The Aptira Solution

OpenStack is highly capable of provisioning flexible infrastructure, including high performance computing clusters and can solve the problems this customer was facing, so we planned to move them to OpenStack Ironic. We designed and deployed a minimal Highly Available OpenStack Cloud with Ironic installed. The solution was delivered by Kolla-Ansible, a tool to deploy a production-ready containerised OpenStack Cloud. The OpenStack instance was integrated with their existing Object storage cluster (Ceph Radosgw) to store bare metal images.

One issue that arose early was the lack of direct Internet access, which is required to download images from the Docker hub. The customer had Internet access only via an HTTP proxy, and the corresponding Ansible code for the glance-api and ironic-conductor containers didn’t support passing HTTP proxy environment variables to the containers, so these two containers deployed by Kolla-Ansible couldn’t access the Internet. Fortunately, Aptira was able to patch the Ansible code to add HTTP proxy support and we were able to continue.

Another challenge that arose was using an external Radosgw as the storage backend. Our engineers identified a bug that showed up when Ironic uses Ceph Radosgw as its backend: there was a bug in the format of the endpoint URL, specifically in this file: ironic/common/glance_service/v2/image_service.py

We were able to identify and correct this defect so that Ironic was able to download images from the Ceph Radosgw backed image service.


The Result

We successfully delivered an efficient mechanism for provisioning the customer’s HPC Bare metal servers, using the Highly Available OpenStack Cloud with ironic service that we implemented.

Aptira supported the customer post-deployment during their product test phases and catered to new requirements not specified in their original requirements documents. This is the nature of agile/flexible development, so this was not unexpected or a problem. For example, the customer realized that they needed to set an MTU of 9000 using a fixed IP address. The customer also needed to build GPT images for their bare-metal nodes.

After first passing extensive testing in the customer’s test environment, the solution was deployed in production and the customer is now completing their final testing of the bare metal High Performance Computing cluster.


How can we make OpenStack work for you?
Find out what else we can do with OpenStack.

Find Out Here

The post Bare Metal High Performance Computing (HPC) appeared first on Aptira.

by Aptira at July 01, 2019 01:20 PM

OpenStack Superuser

Takeaways from OpenInfra Days Krakow 2019

Every year in June, the team from Compendium Education Center organizes a community event in Krakow, Poland. This year signaled the first event under the new name, OpenInfra Days Krakow.

Roughly 200 people gathered at Forum Wydarzen, directly on the river Vistula, for the two-day event with three session streams. In the workshop stream the attendees played the Phoenix Project Game, a business simulation around DevOps and agile work.

The keynotes were from Thierry Carrez, VP of engineering at OpenStack Foundation and Simon Briggs, senior solutions architect for cloud at SUSE, also one of the main sponsors for the event. Both talked about open development and open source. Carrez emphasized the importance of integration as the last mile while Briggs traced the history of the SUSE mascot chameleon as a symbol of innovation. The fitting title for his session: “Evolve or Die.”

The breakout sessions toggled between Polish and English, with the slides predominantly in English. Stream two started after a break with Kenneth Tan from Sardina Systems advising attendees to “Be ready for OpenStack and Kubernetes highly efficient while optimizing your resource management.” It was a very deep dive into the business of OpenStack and Kubernetes operations, cost avoidance and optimization.

“Are your containers secure?” asked Michał Gutowski from Oracle Cloud Infrastructure in his talk. He touched on several risks, e.g. downloading containers from unknown sources or using unknown installations. Overall, a very practical talk with a lot of good advice.

After the lunch break, a full block of NFV sessions was served up. All of them focused on OpenStack, and in the demo sessions attendees could see the power of virtual networks.

At the end of the first day, it was my turn to talk about the business case for OpenStack, offering use cases on how to adapt working methods from the OpenStack community to enterprise outfits.

Next on the agenda in the evening was the party, started on a boat at the foot of the castle Wawel with the fire-breathing dragon:

Photo // CC BY NC

But that was only the starting point. To reach the next stop of the party, attendees had to solve some riddles and puzzles — each pub meant solving another riddle. A really fun event with lots of conversations, mostly in Polish.

After this scorching evening (about 35 degrees Celsius — that’s 95 in Fahrenheit), day two started at Forum Wydarzen.

“Beyond the hype: the real promise of AI” was the first talk, given by Kaimar Karu of Mindbridge. A very good overview of the current status of AI and what the future may hold.

After that, Szymon Datko of OVH, another event sponsor, talked about Zuul. The session was held in Polish, and lots of people were interested in Zuul, the gate automation system, a fairly new topic for the audience.

Szymon showed some nice demos with Zuul containers, which provide a complete Zuul installation in minutes. Very impressive! The session by Piotr Kopeć from Red Hat showed that OpenStack is still very much present at Open Infra Days: he explained configuration management in TripleO and the current status of the project.

Bartosz Żurkowski from Samsung also combined different OpenStack projects for his use case. He is an active Trove developer and showed a way to achieve self-healing database operations with Vitrage and Mistral.

At the end of the event we had a lot of winners:

The sponsors offered various prizes, but everyone took home a lot of new knowledge, new contacts and the possibility for new collaborations.

“We have a much wider range with the new title OpenInfra Days and are looking forward to next year,” said Bartosz Niepsuj, one of the event organizers.

A huge thanks to the organization team and of course the sponsors. See you next year in Krakow.

About the author

Frank Kloeker is a technology manager of cloud applications at Deutsche Telekom.

Photo // CC BY NC

The post Takeaways from OpenInfra Days Krakow 2019 appeared first on Superuser.

by Frank Kloeker at July 01, 2019 12:02 PM

June 28, 2019

Mirantis

Toward a pure devops world: why infrastructure must be treated as software

While the bleed of apps to infrastructure was inevitable, the move to cloud native architecture makes automation essential. The infrastructure is very much a part of the application.

by Nick Chase at June 28, 2019 04:30 PM

OpenStack Superuser

What Programming Committee members are looking for in 5G, NFV & edge content for the Open Infra Summit Shanghai

These days, there’s more intersection than ever between NFV, telecom, 5G and edge computing. The Open Infra Summit in Shanghai is combining these tracks to highlight those developments. Here’s what members of the Programming Committee are looking for. (For tips on the other tracks, check out this post.)
Submit your proposals by July 2.

Ian Jolliffe, director of engineering, Wind River


“I’d like to see talks relevant to the China market.  I’ve been to China many times and understand the market fairly well as an outsider.  I’d like to see a wide variety of talks on new edge use cases, complementary projects that could be relevant to Edge, NFV and 5G; The convergence of technology to come together to address the challenges of these three similar but different domains/technologies.”

Ben Silverman, chief cloud architect, Cincinnati Bell Technology Services (CBTS)

“I’m looking for interesting proposals that either use open infrastructure or open source, including OpenStack, in ways that are different, especially for use cases like 5G and edge technologies. Edge is quickly becoming the enabler for so many new technologies, so I’d love to see some specific architectures or engineering for particular large use cases like NFV, 5G and edge, but at the same time, could be used elsewhere for distributed computing that is not on the Edge. Of course no track would be complete without discussing containers, so I’m also looking for unique ideas regarding the operations, deployment and security of containers, VMs and bare metal with Edge, 5G and NFV use cases. Great questions to think about when putting together a proposal in this area might be: How do we deploy the infrastructure? What do we deploy it in/on? How is bare metal handled? Is latency our concern, is it high bandwidth, or is it both? Why does edge, 5G and NFV using open infrastructure make sense? How do companies have a win/win/win here?”

Shane Wang, engineering manager at Intel, individual board member of OSF, chair of China Open Infra meetup group 


“I’d like to see use case adoption of NFV and edge computing in telco carriers, especially based on OpenStack and related pilot projects, and a diversity of edge computing open source projects, instead of StarlingX only. I would like to see collaborations between many projects to make NFV and edge happen and become real.”

Qihui Zhao, project manager, China Mobile

“With the development of 5G, cloud computing has been gradually evolving to edge. On one hand, we could focus on MEC use cases and related implementations. On the other hand, we need to consider the collaboration between cloud and edge, IT and CT services, so that we could build an automatic ICT cloud. In addition, for infrastructure providers, how to achieve unified O&M of cloud resources and shield the difference of underlying infrastructures is also a hot topic.”

 

Get your talk submissions in by July 2!

The post What Programming Committee members are looking for in 5G, NFV & edge content for the Open Infra Summit Shanghai appeared first on Superuser.

by Ashlee Ferguson at June 28, 2019 02:01 PM

Galera Cluster by Codership

Galera Cluster 4 with MariaDB 10.4

Congratulations to Team MariaDB at MariaDB Corporation and MariaDB Foundation for releasing MariaDB 10.4.6 as Generally Available (GA) last week on 18 June 2019. This release is very exciting for Galera Cluster users as it comes with Galera 4 (it is now the first server to come with it!), with Galera wsrep library version 26.4.2.

What can Galera Cluster users expect from MariaDB 10.4? Some high level features include:

  • Streaming replication — a huge boost to large transaction support, since the node breaks transactions into fragments, replicates and certifies it across all secondary nodes while the transaction is still in progress. Read more about it in our dedicated documentation on streaming replication as well as a guide on using streaming replication (yes, you have to enable it first).
  • Galera System Tables — there are three new tables added to the mysql database: wsrep_cluster, wsrep_cluster_members, and wsrep_streaming_log. As a database administrator, you can see the cluster activity — again, please read the documentation on system tables, and note that if you do not have streaming replication enabled, you will not see anything in wsrep_streaming_log.
  • Synchronisation functions — these are SQL functions for use in wsrep synchronisation operations, like getting the GTID based on the last write or last seen transaction, as well as setting the node to wait for a specific GTID to replicate and apply, before executing the next transaction. 

But that is not all — a recent presentation by our CEO, Seppo Jaakola, also sheds some light on the new features and the roadmap. Please read Galera 4 in MariaDB 10.4 for more information.

Both Team Codership and Team MariaDB have worked hard to ensure that there can be rolling upgrades performed from Galera Cluster in MariaDB Server 10.3 to MariaDB Server 10.4, and we highly recommend that you read the upgrade documentation: Upgrading from MariaDB 10.3 to MariaDB 10.4 with Galera Cluster.

So what are you waiting for? Give MariaDB Server 10.4 with Galera Cluster 4 a try (download it), and provide us some feedback. Bugs can of course be reported to the MariaDB Jira. We will monitor the maria-discuss and maria-developers mailing lists but don’t forget to ask specific Galera Cluster questions at our Google Group.

by Sakari Keskitalo at June 28, 2019 08:59 AM

Aptira

EOFY Deals Ending Soon

ICYMI, We’re offering End of Financial Year discounts on all of our training courses, including: 

  • Discounts for Multiple Courses
  • Discounts for Pre-Payment
  • Discounts for Training Bundled with Software
  • Discounts for Training Bundled with Hardware
  • Discounts for Training Bundled with Services

Once purchased, these deals can be used at any time during the next financial year.

If you’re looking to refresh your hardware, need new software licenses or would like to upskill yourself or your team – let us know. We can build a customised bundle to suit your requirements – and save you $$$. If there’s a specific bundle you require for your business and we haven’t listed it above, feel free to ask.

For more details, check out this page or chat with us to find the best deal for you.

Don’t miss out!

Learn from instructors with real world expertise.
Start training with Aptira today.

View Courses

The post EOFY Deals Ending Soon appeared first on Aptira.

by Jessica Field at June 28, 2019 05:51 AM

June 27, 2019

Aptira

OpenStack Rules: How OpenVswitch works inside OpenStack

Understanding OpenFlow rules

OpenVSwitch (OVS) is a virtual switch that connects virtual machines together using virtual links and ports, a job traditionally done by a physical switch with physical links, network cards and switch ports. In OpenStack, OVS plays an important role in providing virtualised network services; both the Neutron node and the compute nodes run OpenVSwitch.

But what is most important about OVS is its role in manipulating and directing the traffic coming in and out. In this article we describe the flow rules installed on OVS inside OpenStack Mitaka.

Log in to the Mitaka node using the following:

ssh root@<Mitaka node IP address>

For example:

ssh root@192.168.127.101

Log in to the compute node:


[root@mitaka ~]# ssh compute
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-135-generic x86_64)
* Documentation: https://help.ubuntu.com/
Last login: Wed Sep 26 06:40:57 2018 from 10.20.0.2
root@node-4:~#

Dump the flows on the br-tun bridge of OpenStack, as this bridge provides communication into and out of the OpenStack overlay:


root@node-4:~# ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
1- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=0, n_packets=183, n_bytes=28498, idle_age=4, priority=1,in_port=1 actions=resubmit(,2)
2- cookie=0xbb7b3cdd52626a01, duration=9917.985s, table=0, n_packets=198, n_bytes=36045, idle_age=4, priority=1,in_port=2 actions=resubmit(,4)
3- cookie=0xbb7b3cdd52626a01, duration=13003.030s, table=0, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop
4- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=2, n_packets=1, n_bytes=42, idle_age=9913, priority=1,arp,dl_dst=ff:ff:ff:ff:ff:ff actions=resubmit(,21)
5- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=2, n_packets=168, n_bytes=26780, idle_age=4, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
6- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=2, n_packets=14, n_bytes=1676, idle_age=9904, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)
7- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=3, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop
8- cookie=0xbb7b3cdd52626a01, duration=9921.166s, table=4, n_packets=198, n_bytes=36045, idle_age=4, priority=1,tun_id=0x2 actions=mod_vlan_vid:1,resubmit(,10)
9- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=4, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop
10- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=6, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop
11- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=10, n_packets=198, n_bytes=36045, idle_age=4, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xbb7b3cdd52626a01,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
12- cookie=0xbb7b3cdd52626a01, duration=9917.984s, table=20, n_packets=102, n_bytes=14108, idle_age=9435, priority=2,dl_vlan=1,dl_dst=fa:16:3e:0b:cf:10 actions=strip_vlan,set_tunnel:0x2,output:2
13- cookie=0xbb7b3cdd52626a01, duration=9917.984s, table=20, n_packets=66, n_bytes=12672, idle_age=4, priority=2,dl_vlan=1,dl_dst=fa:16:3e:4a:10:2b actions=strip_vlan,set_tunnel:0x2,output:2
14- cookie=0xbb7b3cdd52626a01, duration=9913.613s, table=20, n_packets=0, n_bytes=0, hard_timeout=300, idle_age=9913, hard_age=4, priority=1,vlan_tci=0x0001/0x0fff,dl_dst=fa:16:3e:4a:10:2b actions=load:0->NXM_OF_VLAN_TCI[],load:0x2->NXM_NX_TUN_ID[],output:2
15- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=20, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=resubmit(,22)
16- cookie=0xbb7b3cdd52626a01, duration=9917.985s, table=21, n_packets=1, n_bytes=42, idle_age=9913, priority=1,arp,dl_vlan=1,arp_tpa=192.168.111.1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:0b:cf:10,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e0bcf10->NXM_NX_ARP_SHA[],load:0xc0a86f01->NXM_OF_ARP_SPA[],IN_PORT
17- cookie=0xbb7b3cdd52626a01, duration=9917.984s, table=21, n_packets=0, n_bytes=0, idle_age=9917, priority=1,arp,dl_vlan=1,arp_tpa=192.168.111.2 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:4a:10:2b,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e4a102b->NXM_NX_ARP_SHA[],load:0xc0a86f02->NXM_OF_ARP_SPA[],IN_PORT
18- cookie=0xbb7b3cdd52626a01, duration=13003.028s, table=21, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=resubmit(,22)
19- cookie=0xbb7b3cdd52626a01, duration=9917.956s, table=22, n_packets=10, n_bytes=1336, idle_age=9904, dl_vlan=1 actions=strip_vlan,set_tunnel:0x2,output:2
20- cookie=0xbb7b3cdd52626a01, duration=13003.002s, table=22, n_packets=4, n_bytes=340, idle_age=9920, priority=0 actions=drop

Explanation of the Rules:

Table 0:

1- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=0, n_packets=183, n_bytes=28498, idle_age=4, priority=1,in_port=1 actions=resubmit(,2)
2- cookie=0xbb7b3cdd52626a01, duration=9917.985s, table=0, n_packets=198, n_bytes=36045, idle_age=4, priority=1,in_port=2 actions=resubmit(,4)
3- cookie=0xbb7b3cdd52626a01, duration=13003.030s, table=0, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop

Rule 1 has priority=1 and matches packets arriving on in_port=1 (the “patch-int” port); the action is to resubmit to table 2.
Rule 2 matches packets arriving on in_port=2 (the vxlan-c0a80202 port); the action is to resubmit to table 4.
Rule 3 has priority=0 (lowest priority) and drops the packets that match neither rule 1 nor rule 2.
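For reference, entries like these are normally installed by the Neutron OVS agent, but they can be reproduced by hand with ovs-ofctl add-flow. The commands below are a hypothetical reconstruction of the three table 0 entries, assuming port 1 is patch-int and port 2 is the vxlan-c0a80202 tunnel port:

```shell
# Hypothetical reconstruction of the three table 0 entries shown above.
# Assumes port 1 is patch-int and port 2 is the vxlan-c0a80202 tunnel port;
# in a real deployment the Neutron OVS agent installs these automatically.
ovs-ofctl add-flow br-tun "table=0,priority=1,in_port=1,actions=resubmit(,2)"
ovs-ofctl add-flow br-tun "table=0,priority=1,in_port=2,actions=resubmit(,4)"
ovs-ofctl add-flow br-tun "table=0,priority=0,actions=drop"
```

These are configuration commands against a live br-tun bridge, so they only make sense on a node where OVS and the bridge already exist.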

Table 2:

4- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=2, n_packets=1, n_bytes=42, idle_age=9913, priority=1,arp,dl_dst=ff:ff:ff:ff:ff:ff actions=resubmit(,21)
5- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=2, n_packets=168, n_bytes=26780, idle_age=4, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
6- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=2, n_packets=14, n_bytes=1676, idle_age=9904, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)

Rule 4

Has priority=1 and matches ARP packets whose destination MAC address is the broadcast address (dl_dst=ff:ff:ff:ff:ff:ff); the action is to resubmit to table 21.

Rule 5

Has priority=0 and matches dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 (all unicast Ethernet packets); the action is to resubmit to table 20.

Rule 6

Has priority=0 and matches dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 (all multicast, including broadcast, Ethernet packets); the action is to resubmit to table 22.

Table 3:


7- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=3, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop

Rule 7 drops any packets that reach table 3.

Table 4:


8- cookie=0xbb7b3cdd52626a01, duration=9921.166s, table=4, n_packets=198, n_bytes=36045, idle_age=4, priority=1,tun_id=0x2 actions=mod_vlan_vid:1,resubmit(,10)
9- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=4, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop

Rule 8 has priority=1 and matches packets with tunnel ID tun_id=0x2; the action is to tag them with VLAN ID 1 (mod_vlan_vid:1) and resubmit to table 10.
Rule 9 has priority=0 (lowest priority) and drops the packets that don’t match rule 8.

Table 6:


10- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=6, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=drop

Rule 10 drops any packets that reach table 6.

Table 10:


11- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=10, n_packets=198, n_bytes=36045, idle_age=4, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xbb7b3cdd52626a01,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1

Rule 11

Has priority=1 and its action has two parts.

Part one:

Install a rule in table 20, which will act as a MAC learning table. The “learn” action modifies flow table 20 based on the content of the flow currently being processed by table 10. Here is how to interpret each part of the learn action above:

  • table=20 → modify flow table 20, the MAC learning table.
  • hard_timeout=300 → the learned flow expires after 300 seconds, regardless of activity.
  • priority=1 → the priority at which the learned (wildcarded) entry will match in comparison to others.
  • cookie=0xbb7b3cdd52626a01 → tag the learned flow with the same cookie.
  • NXM_OF_VLAN_TCI[0..11] → make the learned flow match the same VLAN ID as the packet currently being processed. This effectively scopes the MAC learning entry to a single VLAN, which is the ordinary behaviour for a VLAN-aware switch.
  • NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[] → make the learned flow match, as Ethernet destination, the Ethernet source address of the packet currently being processed.
  • load:0->NXM_OF_VLAN_TCI[] → strip off the VLAN ID by loading 0 as the VLAN ID.
  • load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[] → load the tunnel ID of the packet being processed as the tunnel ID of the learned flow.
  • output:NXM_OF_IN_PORT[] → send matching packets out via the port the current packet came in on.

Part two:

output:1 sends the packet out via port 1 (“patch-int”).

Table 20:

12- cookie=0xbb7b3cdd52626a01, duration=9917.984s, table=20, n_packets=102, n_bytes=14108, idle_age=9435, priority=2,dl_vlan=1,dl_dst=fa:16:3e:0b:cf:10 actions=strip_vlan,set_tunnel:0x2,output:2
13- cookie=0xbb7b3cdd52626a01, duration=9917.984s, table=20, n_packets=66, n_bytes=12672, idle_age=4, priority=2,dl_vlan=1,dl_dst=fa:16:3e:4a:10:2b actions=strip_vlan,set_tunnel:0x2,output:2
14- cookie=0xbb7b3cdd52626a01, duration=9913.613s, table=20, n_packets=0, n_bytes=0, hard_timeout=300, idle_age=9913, hard_age=4, priority=1,vlan_tci=0x0001/0x0fff,dl_dst=fa:16:3e:4a:10:2b actions=load:0->NXM_OF_VLAN_TCI[],load:0x2->NXM_NX_TUN_ID[],output:2
15- cookie=0xbb7b3cdd52626a01, duration=13003.029s, table=20, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=resubmit(,22)

Rules 12 and 13

Have priority=2 and match packets with VLAN ID 1 and a specific destination MAC address (dl_dst); the action is to strip the VLAN tag, set the tunnel ID to 0x2 and send the packets out via port 2 (vxlan-c0a80202).

Rule 14

This rule was installed by the learn action in table 10. It has priority=1 and matches packets with vlan_tci=0x0001/0x0fff (VLAN ID 1) and dl_dst=fa:16:3e:4a:10:2b; the action is to strip the VLAN tag, set the tunnel ID to 0x2 and send the packets out via port 2 (vxlan-c0a80202).

Rule 15 has priority=0 (lowest priority) and resubmits to table 22.

Table 21:

16- cookie=0xbb7b3cdd52626a01, duration=9917.985s, table=21, n_packets=1, n_bytes=42, idle_age=9913, priority=1,arp,dl_vlan=1,arp_tpa=192.168.111.1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:0b:cf:10,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e0bcf10->NXM_NX_ARP_SHA[],load:0xc0a86f01->NXM_OF_ARP_SPA[],IN_PORT
17- cookie=0xbb7b3cdd52626a01, duration=9917.984s, table=21, n_packets=0, n_bytes=0, idle_age=9917, priority=1,arp,dl_vlan=1,arp_tpa=192.168.111.2 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:fa:16:3e:4a:10:2b,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e4a102b->NXM_NX_ARP_SHA[],load:0xc0a86f02->NXM_OF_ARP_SPA[],IN_PORT
18- cookie=0xbb7b3cdd52626a01, duration=13003.028s, table=21, n_packets=0, n_bytes=0, idle_age=13003, priority=0 actions=resubmit(,22)

Rules 16 and 17

Have priority=1 and match ARP packets with a certain VLAN ID (e.g. dl_vlan=1) and a certain target IP address (e.g. arp_tpa=192.168.111.1). The actions rewrite the request into an ARP reply in place:

  • move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[] → use the Ethernet source address of the request as the Ethernet destination of the reply
  • mod_dl_src:fa:16:3e:0b:cf:10 → set the Ethernet source address to the MAC being resolved
  • load:0x2->NXM_OF_ARP_OP[] → set the ARP operation field to 0x2 (ARP reply)
  • move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[] → use the ARP source MAC address of the request as the ARP target MAC address of the reply
  • move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[] → use the ARP source IP address of the request as the ARP target IP address of the reply
  • load:0xfa163e0bcf10->NXM_NX_ARP_SHA[] → load 0xfa163e0bcf10 (fa:16:3e:0b:cf:10) as the ARP source MAC address
  • load:0xc0a86f01->NXM_OF_ARP_SPA[] → load 0xc0a86f01 (192.168.111.1) as the ARP source IP address
  • IN_PORT → send the reply back out of the port the request arrived on

Note: these flows implement a local ARP responder: the switch closest to the host answers ARP requests directly, rather than flooding them across the overlay.

Rule 18 has priority=0 (lowest priority) and resubmits to table 22.
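One way to check which of these rules a given packet would hit, without generating real traffic, is OVS’s built-in trace facility. The invocation below is a hypothetical example tracing a broadcast ARP request for 192.168.111.1 entering br-tun from patch-int (port 1); the output lists each table and rule the packet traverses, ending at the ARP responder in table 21:

```shell
# Hypothetical trace (requires a live br-tun bridge): a broadcast ARP
# request for 192.168.111.1, tagged with internal VLAN 1, arriving on port 1.
ovs-appctl ofproto/trace br-tun \
    "in_port=1,arp,dl_vlan=1,dl_dst=ff:ff:ff:ff:ff:ff,arp_tpa=192.168.111.1"
```

This is a diagnostic command against a live switch, so the exact port numbers and match fields need to be adapted to your deployment.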

Table 22:

19- cookie=0xbb7b3cdd52626a01, duration=9917.956s, table=22, n_packets=10, n_bytes=1336, idle_age=9904, dl_vlan=1 actions=strip_vlan,set_tunnel:0x2,output:2
20- cookie=0xbb7b3cdd52626a01, duration=13003.002s, table=22, n_packets=4, n_bytes=340, idle_age=9920, priority=0 actions=drop

Rule 19 matches packets with VLAN ID 1; the action is to strip the VLAN tag, set the tunnel ID to 0x2 and send the packets out via port 2 (vxlan-c0a80202).
Rule 20 has priority=0 (lowest priority) and drops the packets that don’t match rule 19.

A good understanding of these rules helps when troubleshooting network traffic. If there is a connectivity issue in the network (internal or external) that results in packet loss, we can follow the trail of packets through the matching flow rules to find where they are lost.

For example, if we run a ping between two OpenStack endpoints, we first need to work out which flow rules the ping packets should hit, and then observe whether the “n_packets” count of each rule increments. The “n_packets” counter tells us whether the packets are being forwarded toward the other endpoint or being dropped along the way.
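As a quick sketch of that workflow, the snippet below extracts the n_packets counter from two snapshots of a flow entry and prints the difference. The two sample lines are made up for illustration; in practice you would capture the real lines with `ovs-ofctl dump-flows br-tun` before and after the ping:

```shell
# Sketch: compare the n_packets counter of one flow entry between two
# snapshots to check whether the rule is being hit. The sample lines below
# are illustrative stand-ins for `ovs-ofctl dump-flows br-tun` output.
before='table=20, n_packets=102, n_bytes=14108, priority=2,dl_vlan=1 actions=strip_vlan,set_tunnel:0x2,output:2'
after='table=20, n_packets=134, n_bytes=18900, priority=2,dl_vlan=1 actions=strip_vlan,set_tunnel:0x2,output:2'

# Pull the n_packets value out of a dump-flows line.
count() { printf '%s\n' "$1" | sed -n 's/.*n_packets=\([0-9]*\).*/\1/p'; }

delta=$(( $(count "$after") - $(count "$before") ))
echo "packets matched since last snapshot: $delta"
```

If the delta stays at zero while the ping is running, the packets are being dropped (or matched by a different rule) before they reach this entry.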

How can we make OpenStack work for you?
Find out what else we can do with OpenStack.

Find Out Here

The post OpenStack Rules: How OpenVswitch works inside OpenStack appeared first on Aptira.

by Farzaneh Pakzad at June 27, 2019 01:45 AM

June 26, 2019

OpenStack Superuser

The OpenStack Foundation joins the Open Source Initiative as an affiliate member

Over the past year, the definition of open source has been challenged, as some companies wanted to change the licensing of their software while continuing to reap the benefits of calling it open source, or at least the benefits of being potentially confused with open source.

That makes the work of the Open Source Initiative more important than ever. For more than 20 years, the OSI has been a steadfast guardian of the Open Source Definition. They’ve kept it focused on user freedoms, evaluating new proposed software licenses against that definition, while discouraging further license proliferation. They’ve also been instrumental to the success of open source through their tireless advocacy and education work.

These objectives resonate with the work we do at the OpenStack Foundation (OSF). Today open source is necessary, but not sufficient: users of open-source licensed software are sometimes denied some of the original free and open source software benefits. We need to go beyond how the software is licensed and drive new standards on how open source should be built. Users should be able to easily tell the difference between a truly open collaboration guaranteeing all of the open source benefits and single-vendor or open core projects.

“Without this single, standard definition of [the kilogram] or other fundamental units, commerce as we know it would not be possible. There is no trust in a world where anyone can invent their own definition for units, items, and concepts on which others rely, and without trust there is no community, no collaboration, and no cultural or technological development,” OSI, affirmation of open source definition.

This work cannot happen unless we base it on a strong and steady open source definition, focused on user freedoms. That’s why two months ago the OpenStack Foundation joined other open source organizations in signing the affirmation of the open source definition. That’s also why today the OSF is joining the Open Source Initiative as an affiliate member.

I’m looking forward to working more closely with the OSI on these critical topics, and to discussing the future challenges of open source with them.

About the author

Thierry Carrez is the vice-president of engineering at the OSF and an OpenStack Technical Committee elected member. A long-time advocate of free and open source software and Python Software Foundation fellow, he was previously involved with Ubuntu server and Gentoo Linux security.

The post The OpenStack Foundation joins the Open Source Initiative as an affiliate member appeared first on Superuser.

by Thierry Carrez at June 26, 2019 02:42 PM

How to perfect your talk pitch for the Open Infra Summit Shanghai

If you’re ready to pitch your talk for the upcoming Summit, we’ve got you covered.
Before submitting a presentation for the Shanghai Summit, we asked the Programming Committee about the content they’d like to see for each Track. Read their expert opinions below and pitch your session in English or Mandarin by 11:59 p.m. PT July 2.

AI, Machine Learning & High Performance Computing (HPC)

    • Mohamed Elsakhawy, operational lead, national cloud team, compute Canada / SHARCNET
    • Fred Li, manager, open source development team, Huawei
    • Sundar Nadathur, accelerator architect, Intel
    • Liexiang Yue, senior software architect, China Mobile
      • Container-based HPC Cloud deployments
      • Bursting from supercomputers to the Cloud
      • Hybrid large-scale clouds: Bare metal, containers, VMs and HPC offerings
      • Performance benchmarks for various components, tips, tricks and best practices
      • Use cases for OpenStack in AI, machine learning and HPC
      • HPC in the cloud best practices
      • Forums that discuss opportunities to improve (including gaps in the current open infrastructure larger stack)
      • Forums that discuss resource provisioning best practices in large scale OpenStack deployments
      • Solutions  built on OpenStack core projects
      • How OpenStack enables AI/ML/HPC rather than the workloads themselves
      • New/different deployments (e.g. containers), hardware acceleration (esp. scaling with GPUs, non-GPU accelerators), quantitative data preferable

CI/CD

    • Qiao Fu, technical manager, China Mobile
    • ChangBo Guo, community director, EasyStack; OpenStack Board of Directors
    • Chris MacNaughton, software engineer, Canonical
      • Best practices with different tools like Zuul, Jenkins, Spinnaker, other tools that work well with Kubernetes
      • How CI/CD helps developers in daily work
      • Current challenges and obstacles for operator adoption, namely:
        • Challenges of large-scale deployment of OpenStack like  CI/CD, network orchestration
        • New issues and requirements for new Operator scenarios, e.g. edge cloud, AI
        • OpenStack day-two operations: experience, challenges, suggestions
        • Key issues and challenges for NFV networks, e.g. hardware acceleration, high availability
        • Adoption for containers: challenges, requirements, experiences

Containers

    • Ricardo Aravena, website operations manager, Rakuten Rewards
    • Lingxian Kong, senior cloud computing engineer, Catalyst Cloud
    • Hongbin Lu, software engineer, Huawei
    • Xu Wang, senior staff engineer, Ant Financial
      • OpenStack container projects: Magnum, Zun, Kuryr, Kolla
      • Kubernetes integration with OpenStack components including Keystone, Cinder, Neutron, Manila
      • Kata Containers
      • Serverless
      • Service mesh
      • Use cases: (e.g. Kubernetes on OpenStack, OpenStack on Kubernetes, etc.)
      • The strategy and business cases for deploying containers
      • Best practice for managing container infrastructure — deploying, upgrading, auto-healing, monitoring, etc.
      • Container security, especially in the public cloud
      • The latest development and deployment progress in the container infrastructure area, especially the production usage of secure containers and virtualization technologies
      • Content on Kata Containers use cases, serverless, Kubernetes and how all these technologies fit into the OpenStack ecosystem
      • Newer enhancements for sandboxing workloads

Hands-on workshops

    • Keith Berger, senior software engineer, SUSE
    • Florian Haas, VP education, City Network
    • Yujun Zhang, EasyStack
      • The value of the hands-on sessions is to provide attendees the opportunity to learn by “playing” with the content
      • Performing tasks that are representative of real OpenStack workflows and environments.
      • “I wish I had known” and “here’s how you do X” content, as well as troubleshooting workshops
      • Simplified topics, sharing experiences such as how to integrate Ironic with Nova and Neutron to enable multi-tenancy, how to consume vGPU managed by Nova

Open development

    • Tony Breeds, principal software engineer, Red Hat
    • Li Jiansheng, chief open source office, X-lab, TongJi University
    • Sean McGinnis, community representative on the OpenStack Foundation Board of Directors
      • Content showcasing the power of the “Four Opens”
        • What is open development?
        • Why do we need open development?
        • How does open development work?
        • What is the relationship of open development and Conway’s law?
        • How community success relates to successful development
        • How to participate in open development in modern open source software projects, especially in China

Private & hybrid cloud

    • Ruan He, chief architect, Tencent
    • Rico Lin, software engineer, EasyStack
    • Belmiro Moreira, cloud architect, CERN
    • Yih Leong Sun, cloud architect, Intel
      • Private cloud and hybrid cloud implementation success stories, from small to large-scale (20 to 1,000 nodes), especially for industries with stringent requirements such as financial services and government
      • Different methods to deploy OpenStack:
        • How to scale and what, if any, are the bottlenecks?
        • Large deployment references
        • Maintaining cloud deployments
        • User stories, devs updates
        • New approaches for services integration
      • Cross-community collaboration
      • Presentations from actual superusers (providing user experience, testing and upstream contributions at the same time)
      • OpenStack operational challenges

Public cloud

    • Zhipeng Huang, open source operation manager at Huawei
    • Frank Kloeker, technology manager cloud applications, Deutsche Telekom AG
    • Nils Magnus, cloud architect at Open Telekom Cloud, T-Systems International GmbH
      • Presentations that cover clever, innovative or useful solutions and approaches that improve visibility of OpenStack public clouds
      • Common use cases to bring more operators and vendors to the event

Security

    • Ashish Kurmi, senior cloud security engineer, Uber Technologies
    • Liang Chen, director of system engineering, EasyStack
    • Josephine Seifert, innovation assistant, SecuStack GmbH
    • Colleen Murphy, software engineer, SUSE
      • Actionable presentations that can help cloud customers detect various security threats and mitigate them effectively
      • How to make sure cloud platforms are immune to CPU security flaws – not only Spectre & Meltdown, but also newer threats like Microarchitectural Data Sampling (MDS)
      • What criteria people are using to decide whether or not a package needs to be patched/updated, and how often that action takes place
      • At the coding level (specific to OpenStack), what practices people are using to ensure they don’t introduce security breaches to the platform or automation scanning process
      • Raising awareness of existing security enhancements and how to use them, as well as pointing out potential pitfalls and possible measures for securing infrastructures
      • Presentations from operators and developers about security-enhancing techniques for configuring OpenStack
      • New open source projects geared toward security
      • New features in OpenStack that enhance security

5G, NFV & Edge

Stay tuned for a post focusing on this Track.
Get your talk submissions in by July 2!

The post How to perfect your talk pitch for the Open Infra Summit Shanghai appeared first on Superuser.

by Ashlee Ferguson at June 26, 2019 02:02 PM

Aptira

How To Hybrid Cloud

Aptira Hybrid Cloud

Different organisations have unique experiences with their Cloud implementations. And no two organisations are the same. They range in size and the extent of their Cloud infrastructure, their level of Cloud skill, and whether they are operating a private or public Cloud … or a combination of the two – hybrid Clouds.

Due to the rapidly expanding technical requirements for businesses today, we have seen many organisations using more than one Cloud to run their workloads. Organisations are also running workloads in a hybrid scenario by consuming both private and public Clouds to meet their needs.

It’s important to get value out of your Cloud platform, and this is simply not possible with a standard cookie-cutter approach to Cloud. What you need is a personalised strategy to leverage a Cloud platform that will provide maximum value for your organisation. Your Cloud solution should be optimised, providing a custom, fully integrated, turn-key solution encompassing various technologies to suit your business.

There are also a lot of different approaches to delivering a Cloud, ranging from simple deployments through to complex ecosystems and multi-Cloud deployments. With so many options available, how do Cloud admins manage them efficiently?

Multi/Hybrid Cloud Orchestration Training

In order to help users wrap their heads around a multi/hybrid Cloud environment, we’ve created a training course designed to teach developers and architects the core features of multi/hybrid Cloud orchestration using Cloudify.

This course will focus on multi/hybrid Cloud use cases, the building blocks to realise those use cases, and the different technology options with which they can be deployed. The course will also look at how to orchestrate a use case across a multi-Cloud environment to provide a service. Almost half of this course is delivered in hands-on labs, giving participants real-world experience working with Cloud orchestration.

This course (as with all of our training courses) can be completely customised to suit your needs, so feel free to chat with us about your requirements so we can deliver a course that best suits you.

Fully Managed Clouds

If you’d prefer to let someone else do the hard work for you, we can help there also. Our managed Cloud solution provides a complete hands-off Cloud experience, utilising state of the art Cloud infrastructure combined with well-honed traditionally engineered best practice solutions. From Cloud planning and strategy, through to the migration, testing and optimisation – this is the whole package.

Your infrastructure can be deployed on any Cloud platform, including OpenStack, AWS, Azure, Google Cloud Platform, Rackspace, Vexxhost, any Container Orchestration platform of your choice, or in-house – we can run your infrastructure in your DC as easily as we can in ours.

We’re also offering end of financial year discounts on all of our technology training – including Cloud Training. This discount applies to pre-paid training, booking multiple courses, bundling with your hardware, software licences and any of our services (including Managed Cloud). So if you’re looking to upgrade your infrastructure, or learn how to manage it more efficiently – now is the time.

This deal is running until the end of June, but can be used at any time during the next 12 months. For more information on our DevOps services, or to get the best discount for you – chat with our Solutionauts today. The results will speak for themselves.

Take control of your Cloud.
Get a customised Cloud strategy today.

Learn More

The post How To Hybrid Cloud appeared first on Aptira.

by Jessica Field at June 26, 2019 06:42 AM

June 25, 2019

OpenStack Superuser

OpenStack Ironic: Outreachy project works to extend sushy support

It’s only been a few weeks since I embarked on my Outreachy internship at OpenStack and I already feel like my brain has evolved so much with the tremendous amount of learning that has taken place. This post is aimed at sharing some of that knowledge. Before we begin, here’s a short introduction to the organization that I am working with, OpenStack.

OpenStack is a set of open source, scalable software tools for building and managing cloud computing platforms for public and private clouds. OpenStack is managed by the OpenStack Foundation, a non-profit that oversees both development and community-building around the project.

There are many teams working on different projects under the banner of OpenStack, one of which is Ironic. OpenStack Ironic is a set of projects that perform bare metal (no underlying hypervisor) provisioning, for use cases where a virtualized environment might be inappropriate and a user would prefer an actual, physical, bare-metal server. Bare metal provisioning thus means a customer can use hardware directly, deploying the workload (image) onto a real physical machine instead of a virtualized instance on a hypervisor.

The title of the project that I am working on is: “Extend sushy to support RAID”. Reading the topic, the first word that jumps out is RAID, an acronym for Redundant Array of Independent Disks. It’s a technique of storing the same data in different places on multiple disks instead of a single disk, for increased I/O performance, data storage reliability or both. There are different RAID levels that use a mix of striping, parity and mirroring, each optimized for a specific situation.
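To make the trade-offs between the levels concrete, here is a small, self-contained sketch (not part of sushy or Ironic; the function name and figures are purely illustrative) of how much usable capacity each common level yields:

```python
# Illustrative sketch only: usable capacity for common RAID levels,
# given n identical disks of size_gb each.
def usable_capacity_gb(level: str, n_disks: int, size_gb: int) -> int:
    if level == "RAID0":            # striping only: every disk stores data
        return n_disks * size_gb
    if level == "RAID1":            # mirroring: each disk holds a full copy
        return size_gb
    if level == "RAID5":            # striping + distributed parity: one
        return (n_disks - 1) * size_gb  # disk's worth of space goes to parity
    if level == "RAID10":           # striped mirrors: half the raw capacity
        return n_disks * size_gb // 2
    raise ValueError(f"unsupported RAID level: {level}")

# Four 500 GB disks under different levels:
for level in ("RAID0", "RAID1", "RAID5", "RAID10"):
    print(level, usable_capacity_gb(level, 4, 500))
```

RAID0 maximizes capacity and performance but offers no redundancy, while RAID1, RAID5 and RAID10 give up capacity in exchange for fault tolerance.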

The main theme behind the existence of Ironic is to allow its users (mostly system admins) to remotely access the hardware running their servers. This gives them the ability to manage and control the servers 24/7, which is crucial in case of any server failures at odd hours. Here’s where the Baseboard Management Controller (BMC) comes to the rescue. It’s an independent satellite computer, typically a system-on-chip microprocessor or a standalone computer, that’s used to perform system monitoring and management tasks.

At the heart of a BMC lies Redfish, a protocol used by BMCs (on bare metal machines) to communicate remotely via JSON and OData. It’s a standard RESTful API defined by the DMTF to get and set hardware configuration items on physical platforms. Sushy is a client-side Python library used by Ironic to communicate with Redfish-based systems.
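To make this concrete, here is a trimmed, hand-written sketch of the JSON a Redfish BMC returns for a System resource. The field names follow the Redfish schema, but the ids and values below are invented; a client library such as sushy essentially walks this JSON tree and exposes the resources as Python objects:

```python
import json

# A trimmed, invented example of a Redfish System resource payload.
system_doc = """
{
  "@odata.id": "/redfish/v1/Systems/1",
  "Id": "1",
  "Name": "bare-metal-node-0",
  "PowerState": "Off",
  "Storage": { "@odata.id": "/redfish/v1/Systems/1/Storage" }
}
"""

system = json.loads(system_doc)
print(system["PowerState"])            # current power state of the node
print(system["Storage"]["@odata.id"])  # link to the storage collection,
                                       # where RAID volumes would be managed
```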

To test and support the development of the sushy library, a set of simple simulation tools called sushy-tools is used, since it’s not always possible to have an actual bare metal machine with Redfish at hand.

The package offers two emulators, a static emulator and a dynamic emulator. The static emulator simulates a read-only BMC: it’s a simple REST API server that returns the same canned responses to client queries. The dynamic emulator simulates the BMC of a Redfish bare metal machine and is used to manage libvirt VMs (mimicking actual bare metal machines), resembling how a real Redfish BMC manages actual bare metal machine instances. Like its counterpart, the dynamic emulator also exposes REST API endpoints that can be consumed by a client using the sushy library.

The aim of my project is to add functionality to the existing sushy library so that clients can configure RAID-based volumes for storage on their bare metal instances remotely. There are two aspects to this project, one is the addition of code in the sushy library and the other is adding support for emulating the storage subsystem in sushy-tools so that we’re actually able to test the added functionality in sushy against something. Since there is already one contributor working on adding RAID implementation to sushy, my task in the project involves adding the RAID support for emulation to the sushy-tools dynamic emulator (more specifically the libvirt driver).

Early challenges

The first task that the mentors asked me to do was learn how to get and receive data from the libvirt VMs using sushy via the sushy-tools dynamic emulator. Honestly, the task did feel a bit intimidating in the beginning. I had to go back to the networking basics and brush up on them. After that, I spent quite some time reading blogs and following videos on libvirt VMs. I encountered some problems while setting up and creating the VMs using the virt-install command and had to ultimately fall back to the virt-manager GUI to spin up the VMs.

I also wasn’t able to set up the local development environment for sushy-tools due to the absence of setup instructions in the README. I have to admit that I thought ‘maybe the instructions aren’t mentioned because they are too obvious’ and felt a bit embarrassed about them not being obvious to me. Then I reminded myself that I’m here to learn, so I went ahead and asked my mentor, who very warmly helped me out with the necessary commands to set up the repository. After successfully setting up the environment, I even submitted a patch adding the corresponding instructions to the docs to make it a bit easier for newcomers like me to start contributing to the project.

While working on the first task, I came across some anomalies in the behavior of the dynamic emulator which led me to dig into the code to find out what was happening. I ultimately found out that it was actually a bug and submitted a patch for it.

Right now, I’m working on exposing the volume API in the dynamic emulator. There are a lot of decisions to be made, related to the mapping of the sushy storage resources to the libvirt VMs. I’ve been researching them, am regularly discussing them with my mentors, and will hopefully include the final implementation mapping in my next blog post.

This post first appeared on Varsha Verma’s blog.

Superuser is always interested in open infrastructure community content. Get in touch: editorATopenstack.org.

Photo // CC BY NC

The post OpenStack Ironic: Outreachy project works to extend sushy support appeared first on Superuser.

by Varsha Verma at June 25, 2019 02:02 PM

Aptira

Virtual Network Function (VNF) Service Function chaining using Cloudify

Virtual Network Function (VNF) Service Function Chaining using Cloudify

The linkage of network functions to form a service is often a very complex procedure. We were asked to validate service function chaining on a Virtual Network Function (VNF) using Cloudify.


The Challenge

As part of a recent PoC (Proof of Concept) exercise, one of our Customers asked us to validate the operation of a Service Function Chaining concept by deploying Telco workloads on a private cloud. By definition, Service Function Chaining (SFC) is the instantiation of multiple service functions to form an end-to-end chain and steering the traffic through them, thereby creating a Service Function Path.


The Aptira Solution

To demonstrate SFC functionality, we used two network services, available as Virtual Network Functions (VNFs): a Clearwater virtual IP Multimedia Subsystem (vIMS), plus F5 VNFs comprising a vFirewall and a virtual Local Traffic Manager (vLTM) load balancer.

The end-to-end service objective was to enable SIP calls between SIP clients on different infrastructure. For validation purposes, the configuration simulated two data centres using one OpenStack cloud instance at each site. This configuration enables SIP traffic between SIP clients to pass through the F5 VNFs deployed in one data centre and the vIMS deployed in the other, via an SDN-WAN network.

The SDN-WAN topology connecting these two data centres was simulated using a set of Open vSwitch instances controlled by an SDN controller (in this case, OpenDaylight).

A high-level diagram is shown below.

Virtual Network Function (VNF) Service Function Chaining using Cloudify

Cloudify was configured as the Network Functions Virtualisation Orchestrator (NFVO) to model and control the entire configuration, which was modelled using a TOSCA template. The TOSCA template includes the node types and node template definitions for each VIM resource, such as subnets, floating IPs, VMs and security groups.

Once these VNFs are orchestrated using Cloudify as the NFVO, it uses the deployment proxy mechanism/plugin to set up an SFP to enable the SIP traffic. The SFP is modelled using a separate TOSCA blueprint that includes the deployment details of the service functions that are to be chained:

  • Deployment details (or ids) of F5 VNFs and vIMS
  • Traffic policies to allow SIP traffic
  • Waypoints of the Software Defined Networking Wide Area Network (SDN-WAN) topology

The SFP blueprint also contains the traffic type/rule classifiers that determine how the SIP traffic must be routed. Using the deployment proxy plugin, Cloudify performs resource discovery by fetching the deployment/infrastructure-level details of the service functions.
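As a purely illustrative sketch, a blueprint of this general shape could look roughly like the fragment below. The custom node type, inputs and property keys are invented for illustration; a real blueprint depends entirely on the plugins in use.

```yaml
# Illustrative fragment only – not the actual blueprint from this PoC.
tosca_definitions_version: cloudify_dsl_1_3

inputs:
  f5_deployment_id:
    description: Deployment id of the already-orchestrated F5 VNFs
  vims_deployment_id:
    description: Deployment id of the Clearwater vIMS

node_templates:

  f5_vnfs:
    # Deployment proxy node pointing at the existing F5 VNF deployment
    type: cloudify.nodes.DeploymentProxy
    properties:
      deployment_id: { get_input: f5_deployment_id }

  vims:
    type: cloudify.nodes.DeploymentProxy
    properties:
      deployment_id: { get_input: vims_deployment_id }

  sip_sfp:
    # Hypothetical node type modelling the service function path itself
    type: custom.nodes.ServiceFunctionPath
    properties:
      classifier:
        protocol: udp
        port: 5060          # SIP signalling traffic
    relationships:
      - { type: cloudify.relationships.connected_to, target: f5_vnfs }
      - { type: cloudify.relationships.connected_to, target: vims }
```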

Cloudify performs the following functions using the SFP blueprint:

  • Publishes the traffic rules in F5 VNFs to allow SIP traffic on a specific domain
  • Configures the network in such a way that all outgoing SIP packets are routed to the SDN-WAN network
  • Configures the OpenFlow rules in the SDN-WAN topology so that traffic is routed to the vIMS instance

Aptira’s specialist technical team designed and implemented the TOSCA blueprints which were set up in the PoC environment and tested for compliance with the validation requirements. Various tweaks and improvements were made before a final configuration was established.


The Result

The configuration was executed in the PoC environment and validated all the required functions for the Virtual Network Function Service Chaining. This configuration was a very good illustration of the breadth and depth of Cloudify capabilities and the strength of TOSCA as a modeling language.

The result is also a reusable SFP TOSCA model that can not only be used to orchestrate complex services and chain them, but also allows run-time decision making so that operators can steer traffic seamlessly without any manual intervention – thereby demonstrating zero-touch orchestration.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Virtual Network Function (VNF) Service Function chaining using Cloudify appeared first on Aptira.

by Aptira at June 25, 2019 01:39 PM

June 24, 2019

OpenStack Superuser

What your open-source project can learn from how OpenStack communicates

Fear the word salad. That’s when you think you’re saying one thing but find random phrases coming out of your mouth, says Nicola Heald who worked at IBM, HP, Canonical and “a million different contract roles” before landing as a core developer at Automattic, the parent company of WordPress.

Speaking at the recent WordCamp Bristol, her talk gave tips on communicating based on how the OpenStack community works that provides pointers for any open-source community. She kicked things off by introducing OpenStack and why it could serve as a good model.

“OpenStack is big, really big. You might think WordPress is big, but it’s just peanuts,” Heald notes, comparing the 560,000 lines of code at WordPress to the 9.5 million at OpenStack. When you take into account that it’s written in Python (a more concise language than PHP), it’s far more complex, she says. So how does OpenStack keep the lines of communication open and land releases at a sustainable rate? How are these releases communicated to clients and users? And how do all the project teams cross-coordinate?

Each one of the OpenStack projects represents a team of developers and testers who all talk to each other, “and they do a pretty good job.” The key rests with the Four Opens: open source, open design, open development and open community.

There’s more to communication than words: there’s tooling, the behavior of teams, the onboarding of new contributors, and the “loose tiles” (the snags you hit when you introduce someone to the project or ask them to contribute something to it). Heald found it difficult to begin using WordPress because of all the jargon involved, adding that having project-specific jargon “also communicates something about the project.”

OpenStack ensures easy access to communication, namely with IRC and text files. “There’s no secret sauce about these, but they are completely open and accessible to anyone. There are a million IRC clients and about as many text editor programs. Anyone can access these.” While Heald admits that IRC is not the easiest thing to connect to, the protocol is open, meaning it’s not tied to any one company, so no one can deprecate the features you use, require fees or API keys.

Using the walled gardens of proprietary tools for communications goes beyond shutting people out of meetings, Heald adds, it bars them from documentation, leadership, governance and, ultimately, the direction of the projects.

Communicating open leadership

Transparent governance also makes a difference: you know who is in charge of the projects and which companies have an interest in developing them, she says. There’s a Board of Directors, Technical Committee and a User Committee, representing the downstream users of the OpenStack software who can influence the direction of the project before it’s released.

Users are part of the leadership; they meet at the Forum, “which sounds very sci-fi I think, like the council of OpenStack, on planet Forum.” It’s held every six months at the Summit, embodies part of the Four Opens, closes the feedback loop and improves the next release.

“In OpenStack, it’s all open: you can find out exactly who to go to, who to ask, who the end users influencing the future path of the product are.” Otherwise, developers only hear these questions at a conference, get some input and take it back, and tutorials aren’t developed beforehand. “They make sure that when a product goes out to end users, those things are covered.”

Catch the full 35-minute talk here.

The post What your open-source project can learn from how OpenStack communicates appeared first on Superuser.

by Nicole Martinelli at June 24, 2019 02:04 PM

Aptira

Orchestrating and Managing a Wide Area Network Software Defined Network (WAN-SDN) using a Cloudify Service Orchestrator

Orchestrating and Managing a Wide Area Network Software Defined Network (WAN-SDN) using a Cloudify Service Orchestrator

A use case that is often requested by our customers is managing and orchestrating a Wide Area Network Software Defined Network (WAN-SDN) in order to connect services located in different geographic locations.


The Challenge

This customer’s SDN-WAN network connects two separate datacenters. Therefore, any communication between these two datacenters needs to take place via the WAN transport network, which is managed by a Software Defined Networking (SDN) controller.

The integration of a WAN-SDN controller with their Service Orchestrator (SO), as well as management of the WAN network using the Service Orchestrator, was a challenge for our client. We also needed to simulate a lab environment aligned with their requirements.

Therefore, Aptira’s team of experts in Cloud, SDN & Network Functions Virtualisation (NFV) and Service Orchestration gathered together to solve the problem, build the required lab environment and make a plan to resolve the challenges.


The Aptira Solution

We configured two OpenStack Cloud instances as our data centers and created test VMs in each data center. The purpose of the VMs was to send traffic from one datacenter to the other one via the WAN network.

Then, we created a WAN network consisting of Open vSwitch instances in a pentagon-like topology. Each datacenter was connected to the WAN topology via an edge Open vSwitch. The WAN topology is managed by an SDN controller: in this case, OpenDaylight (ODL).

After preparing all the required resources, the Service Orchestrator (in this case Cloudify) was integrated with ODL via the RESTCONF API.

The SO was able to send requests to ODL to perform required operations, such as:

  • Creation of a VPN tunnel between 2 data centers
  • Installation of new flow rules on OVS for traffic engineering and load balancing
  • Deletion of flows and VPN tunnels if required
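As an illustration of the kind of call involved in the second operation, the sketch below builds (without sending) a RESTCONF URL and flow body of the general shape OpenDaylight’s config datastore API expects. The controller endpoint, node/table/flow ids and match fields are invented, and the exact schema varies between ODL releases:

```python
import json

ODL = "http://odl.example.com:8181"          # invented controller endpoint
node, table, flow = "openflow:1", 0, "sip-to-wan"

# RESTCONF path addressing one flow entry in one OpenFlow table of one node
url = (f"{ODL}/restconf/config/opendaylight-inventory:nodes/"
       f"node/{node}/table/{table}/flow/{flow}")

body = {
    "flow": [{
        "id": flow,
        "table_id": table,
        "priority": 100,
        # Match UDP traffic (SIP signalling typically runs over UDP)
        "match": {"ip-match": {"ip-protocol": 17}},
        "instructions": {"instruction": [{
            "order": 0,
            "apply-actions": {"action": [{
                "order": 0,
                "output-action": {"output-node-connector": "2"},
            }]},
        }]},
    }]
}

print(url)
print(json.dumps(body, indent=2)[:80])
```

In a setup like the one described, an authenticated HTTP PUT of such a body to the corresponding URL would install the flow on the targeted OVS.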

The Result

This solution fills the gap between Software Defined Networking and legacy equipment by bringing automation and on-demand services to data centers. This in turn improves the end-user experience and decreases overall costs.

By orchestrating and managing a Wide Area Network Software Defined Network (WAN-SDN) using a Cloudify Service Orchestrator, along with the help of OpenStack and SDN, operations that take place in the different layers of a telecommunications network can be automated. Consequently, this saves time and reduces operational load by removing manual steps, saving the client money and resources in the long term.


Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Orchestrating and Managing a Wide Area Network Software Defined Network (WAN-SDN) using a Cloudify Service Orchestrator appeared first on Aptira.

by Aptira at June 24, 2019 01:00 PM

Cloudwatt

Introducing nixpkgs-tungsten: The most convenient way of working with Tungsten Fabric

TungstenFabric (formerly OpenContrail, the open source version of Contrail by Juniper Networks) is one of the most established SDN solutions on the market today. At Cloudwatt, the SDN features of TungstenFabric provide the networking layer of our IaaS offer. TungstenFabric allows us to provide virtual private networks, security groups and also more advanced features such as SNAT and LBaaS. We have been using OpenContrail ever since 2014 (using version 1.6 back then) and over time have also made some upstream contributions.

Last month we published nixpkgs-tungsten - a Nix based collection of tools and packages for working with and deploying TungstenFabric services.

In this blog post we will explore nixpkgs-tungsten, explain the motivation behind it, present its main features and show how to get started using the please script that we provide.

Motivation

While the commercial version of TungstenFabric comes with paid support from Juniper, we decided to use the open source version instead. Technical support is convenient, but in our case we prefer the level of freedom and control that we get from using the open source version. Driving factors in this decision were:

  • Having full control over the packaging process
  • Being able to test recent features or apply security patches
  • Being able to customize TungstenFabric packages to our specific needs
  • Contributing to, and benefiting from the F/OSS ecosystem

The sources of all TungstenFabric components are available under the Juniper organization on GitHub. The juniper/contrail-packages repository provides scripts organized around a Makefile for creating binary packages. There are a few shortcomings, however:

  • Only supports creating rpm packages (deb support has been deprecated)
  • No tools for provisioning build time requirements
  • No straight-forward way to customize
  • No granular builds (thus not appropriate for an active development workflow)
  • Long build times (caused by the lack of control over the build granularity)

For these reasons we decided instead to invest in building our own TungstenFabric distribution based on Nix. We decided to open source nixpkgs-tungsten because we like to give back to the community we benefit so much from. We hope that it might provide value to others interested in TungstenFabric as well.

Note that nixpkgs-tungsten is not our only Nix based project. We also maintain other projects which in turn import nixpkgs-tungsten in order to:

  • Build debian packages used in production
  • Apply Cloudwatt specific patches to TungstenFabric sources
  • Run additional tests specific to our production environment

We will however only cover nixpkgs-tungsten itself in this blog post.

About Nix

Nix enables us to record and provision all dependencies of a project in a declarative and composable manner. The information is maintained as part of the version controlled source code of the project. This is achieved through several inter-connected solutions that are all part of the Nix ecosystem:

  • The Nix language: A simple, domain specific functional programming language for describing packages
  • The nixpkgs collection of Nix expressions: Thousands of Nix expressions describing programs, libraries and services
  • The Nix package manager: Command-line tools for building and installing nix expressions
  • NixOS: An operating system built on top of Nix package manager

This article will not explore the Nix language itself in much more detail, focusing instead on what we have built with it. If you want to learn more, there are many external resources available.
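To give a flavour of the language nonetheless, here is a minimal, purely illustrative package description. The mkDerivation arguments (pname, version, src, buildInputs) are the standard ones, but the package, repository and hash below are invented for this sketch:

```nix
# Illustrative only – not an actual nixpkgs-tungsten expression.
{ stdenv, fetchFromGitHub, scons }:

stdenv.mkDerivation {
  pname = "example-agent";
  version = "1.0";

  # Sources are pinned to an exact revision and content hash
  src = fetchFromGitHub {
    owner = "example";
    repo = "example-agent";
    rev = "v1.0";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };

  # Build-time dependencies, declared explicitly
  buildInputs = [ scons ];
}
```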

What’s In The Box

nixpkgs-tungsten provides the following, most relevant features:

  • A collection of easy-to-change, installable and configurable components, tools and services
  • Lightweight development environments to work on any of these packages
  • Creation, provisioning and running of QEMU-based virtual machines to test all packages

Requirements for using nixpkgs-tungsten:

  1. A Linux setup with permissions to run sudo
  2. A Linux kernel with kvm support if you want to use the virtualization features
  3. curl

The included ./please bash script exposes the main functionality of nixpkgs-tungsten. Neither an existing Nix installation nor prior experience in using Nix is required to start using nixpkgs-tungsten.

Getting Started

To get started, clone the repository and run the ./please init script:

$ git clone https://github.com/cloudwatt/nixpkgs-tungsten
$ cd nixpkgs-tungsten
$ ./please init

This will install Nix and configure it for use with nixpkgs-tungsten. Most importantly, this will ensure that Nix uses a TungstenFabric specific binary cache. With the binary cache configured, packages that have already been built on our CI server can be retrieved from there and don’t have to be built again locally.

To verify that the setup was successful you can run the doctor command:

$ ./please doctor
[please]: Running sanity checks:

- Nix installed :  OK
- contrail channel: OK
- contrail cache: OK
- kvm support: OK

All essential tests passed.

Using Completions

The please script supports context-sensitive bash completion. While using it isn’t strictly necessary, it does add a level of convenience. The completion can be enabled with a single command as follows:

$ source <(./please completions)

Afterwards typing ./please<TAB> should yield several completion options. You will also get completions per command. Try ./please install <TAB>.

Finding And Installing Packages

nixpkgs-tungsten provides various TungstenFabric (called contrail for historical reasons) related packages. Use the list command to get an overview of everything that is available:

$ ./please list
contrailApiCliWithExtra
contrailGremlin
contrailIntrospectCli
gremlinChecks
gremlinConsole
gremlinFsck
gremlinServer
contrail32.analyticsApi
contrail32.apiServer
contrail32.collector
contrail32.configUtils
contrail32.control
contrail32.discovery
...

While most of the TungstenFabric components are probably not that interesting to install on your development machine, some others are. One example is the contrail-api-cli. In the output above it is listed as contrailApiCliWithExtra. We can use the install command to install it:

$ ./please install contrailApiCliWithExtra
[please]: Running "nix-env -f default.nix -iA contrailApiCliWithExtra"

installing 'contrail-api-cli-with-extra-0.4.0rc1'
these paths will be fetched (1.66 MiB download, 79.71 MiB unpacked):
  /nix/store/004n2glrkaa1y4p5v296h9bw57j7rb61-python2.7-contrail-api-cli-extra-0.5.9
  /nix/store/0994dq5q810d8bb8f3nyhpg3lf4awm9r-python2.7-debtcollector-1.21.0
  /nix/store/0inigr6iwym7vl8vpnjjig6fs6c7grya-python2.7-six-1.11.0
  /nix/store/0kswkvj5ycbkrcfvhk2pjw8jgsqg76qj-python2.7-certifi-2018.8.24
  /nix/store/1xqi4pljh380fi0ynps40r2ysrflibff-python2.7-wcwidth-0.1.7
  /nix/store/3g4zz45zr6lgz56pb0n2f9zl0hpr8xnl-python2.7-funcsigs-1.0.2

[...]

building '/nix/store/6a9wnl4jci3xf9bp4w7iqb2yqi5v86rq-user-environment.drv'...
created 5364 symlinks in user environment

After the installation is complete, you can start using the tool:

$ contrail-api-cli
usage: contrail-api-cli [-h] [--debug]
                        [--schema-version {5.0,4.1,1.10,3.0,2.21,3.2,3.1}]
                        [--logging-conf LOGGING_CONF]
                        [--config-dir CONFIG_DIR] [--host HOST] [--port PORT]
                        [--protocol PROTOCOL] [--base-uri BASE_URI]
                        [--insecure] [--os-cacert <ca-certificate>]
                        [--os-cert <certificate>] [--os-key <key>]
                        [--timeout <seconds>] [--collect-timing]
                        [--os-auth-type <name>] [--os-username OS_USERNAME]
                        [--os-password OS_PASSWORD] [--ns COMMAND_NAMESPACE]
                        {schema,shell,exec,edit,tree,batch,cat,relative,ln,kv,man,rm,python,du,ls,find-orphaned-projects,fix-subnets,fix-vn-id,apply-sg,fix-fip-locks,reschedule-vm,rpf,fix-sg,graph,fix-zk-ip,check-bad-refs,manage-rt,provision,dot,fix-ri,fsck}
                        ...

Of course you can also uninstall packages again:

$ ./please uninstall contrailApiCliWithExtra
[please]: Running "nix-env -e contrail-api-cli-with-extra-0.4.0rc1"

uninstalling 'contrail-api-cli-with-extra-0.4.0rc1'

$ contrail-api-cli
contrail-api-cli: command not found

A Development Workflow For TungstenFabric Packages

Let’s say you want to add a new feature to the TungstenFabric vrouter agent. Using please, this is a straightforward process where you don’t have to spend any time setting up the required environment with the necessary tools and libraries - they will be provided for you.

1. Finding the right package

We may start by using please list to look for the right package:

$ ./please list Agent
contrail32.vrouterAgent
contrail41.vrouterAgent
contrail50.vrouterAgent

2. Using the shell command

Assuming we are working on version 5.0 we may then continue with the shell command:

$ ./please shell contrail50.vrouterAgent
[please]: Running "nix-shell default.nix -A contrail50.vrouterAgent"

Nix will now download all build dependencies as defined in the corresponding vrouter-agent.nix file. Once all downloads have been completed you will be placed in a shell.

3. Getting the sources

The next step is to retrieve the sources of the vrouter agent using unpackPhase:

$ unpackPhase
unpacking source archive /nix/store/25j5fv2d1qvlhlrqsryg1s8nsd4c8q9a-contrail-workspace
source root is contrail-workspace

Every Nix shell environment provides a command called unpackPhase. Running it unpacks the sources and prints the name of the source directory - in our case, contrail-workspace.

4. Building

The vrouter agent is built using scons. It can be built as follows:

$ scons -j4 --optimization=production contrail-vrouter-agent

The build will start right away.

5. Summary

With a few simple steps we were able to trigger a build of the vrouter agent. The same applies to any other package provided by nixpkgs-tungsten. Note that you of course only have to run unpackPhase once, not before every build.
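
The steps above can be condensed into a short sketch. It is wrapped in a shell function so that sourcing it runs nothing by accident; keep in mind that in practice steps 3 and 4 are typed inside the interactive shell that step 2 opens, so this is an illustration of the sequence rather than a script to execute as-is:

```shell
# Sketch of the full vrouter-agent workflow from the steps above.
build_vrouter_agent() {
  ./please shell contrail50.vrouterAgent   # 2. enter the build environment
  unpackPhase                              # 3. fetch and unpack the sources
  cd contrail-workspace                    # directory printed by unpackPhase
  scons -j4 --optimization=production contrail-vrouter-agent   # 4. build
}
```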

Starting a VM for testing purposes

nixpkgs-tungsten comes with a couple of test scenarios which make use of the NixOS testing framework. In these tests we define and build machine configurations, start them using QEMU and carry out various assertions. We use these tests for quality control to ensure that the packages we are creating are actually behaving as they should. We also run them on our CI to validate all our changes.

In addition to just executing predefined test scenarios we can also start these virtual machines in an interactive fashion. This way we can get a fully working TungstenFabric setup in a single virtual machine to conduct any kind of tests.

The run-vm command can be used for this purpose:

$ ./please run-vm contrail50.test.allInOne

The VM is configured to automatically forward SSH traffic from port 2222 on the host machine. This allows you to run commands such as:

$ ssh -p 2222 root@localhost

or

$ ssh -p 2222 root@localhost -t contrail-api-cli ls virtual-network

Note that there are other test setups (as you may discover by using the bash completion for the run-vm command), but the allInOne is the one you most probably want to use since it includes all available TungstenFabric packages.

Beyond The Basics

The goal of this blog post was to introduce nixpkgs-tungsten, briefly explain our motivation for developing it and to showcase how easy it is to get started with it.

What we haven’t looked at are the NixOS modules that we are providing for multiple TungstenFabric packages and how we make use of them. This however requires some more context and is best left for another blog post.

by Tobias Pflug at June 24, 2019 12:00 AM

June 21, 2019

Emilien Macchi

Developer workflow with TripleO

In this post we’ll see how one can use TripleO for developing & testing changes into OpenStack Python-based projects (e.g. Keystone).

 

Even if Devstack remains a popular tool, it is not the only one you can use for your development workflow.

TripleO hasn’t only been built for real-world deployments but also for developers working on OpenStack related projects like Keystone for example.

Let’s say the Keystone directory where I’m writing code is /home/emilien/git/openstack/keystone.

Now I want to deploy TripleO with my Keystone change included. For that I will need a server (it can be a VM) with at least 8GB of RAM, 4 vCPUs and 50GB of disk, with CentOS7 or Fedora28 installed.

Prepare the repositories and install python-tripleoclient:
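
The original command listing did not survive aggregation here, so the following is a hedged sketch for CentOS 7 (the tripleo-repos RPM URL and the repo selection are assumptions; adjust them for your release). It is wrapped in a function so sourcing this file has no side effects:

```shell
# Enable the TripleO/RDO repositories, then install the client.
install_tripleoclient() {
  # URL is an assumption: pick the tripleo-repos RPM matching your distro
  sudo yum install -y https://trunk.rdoproject.org/centos7/current/python2-tripleo-repos-*.rpm
  sudo -E tripleo-repos current              # enable the "current" repos
  sudo yum install -y python-tripleoclient   # python3-tripleoclient on Fedora/RHEL8
}
```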

If you’re deploying on recent Fedora or RHEL8, you’ll need to install python3-tripleoclient.

Now, let’s prepare your environment and deploy TripleO:

Note: change the YAML for your own needs if needed. If you need more help on how to configure Standalone, please check out the official manual.
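
Since the command listing was lost in aggregation, here is a hedged sketch of a Standalone deployment (the IP address, network size and template paths are assumptions; the official manual has the authoritative version). It is wrapped in a function so nothing runs on sourcing:

```shell
# Minimal Standalone deployment sketch; adjust IP and paths for your host.
deploy_standalone() {
  IP=192.168.24.2   # assumption: pick a free local IP on your machine
  sudo openstack tripleo deploy \
    --templates \
    --local-ip "${IP}/24" \
    -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
    --output-dir "$HOME" \
    --standalone
}
```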

Now let’s say your code needs a change and you need to retest it. Once you’ve modified your code, just run:
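
The exact commands from the post were lost in aggregation; one hedged possibility (not necessarily the author's exact approach - the container name and site-packages path are assumptions, check docker ps on your host) is to sync the modified tree into the running Keystone container and restart it:

```shell
# Copy local Keystone changes into the running container and restart it.
# Container name "keystone" and the Python path are assumptions.
refresh_keystone() {
  docker cp /home/emilien/git/openstack/keystone/keystone \
    keystone:/usr/lib/python2.7/site-packages/
  docker restart keystone
}
```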

Now, if you need to test a review that is already pushed to Gerrit and you want to run a fresh deployment with it, you can do so with:
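
The command listing here was also lost; a hedged sketch for fetching a Gerrit review into the local tree follows (the change number in the example is purely illustrative). Gerrit change refs have the form refs/changes/&lt;last two digits&gt;/&lt;change number&gt;/&lt;patchset&gt;:

```shell
# Build a Gerrit change ref from a change number and patchset
# (assumes a change number with at least two digits).
gerrit_ref() {
  change=$1; patchset=$2
  last2=$(printf '%s' "$change" | tail -c 2)   # last two digits of the change
  printf 'refs/changes/%s/%s/%s' "$last2" "$change" "$patchset"
}

# Fetch the ref from the opendev Keystone repository and check it out.
checkout_review() {
  git fetch https://review.opendev.org/openstack/keystone "$(gerrit_ref "$1" "$2")" \
    && git checkout FETCH_HEAD
}
```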

I hope these tips helped you understand how you can develop and test any OpenStack Python-based project quickly and without pain. On my environment, the whole deployment takes less than 20 minutes.

Please give any feedback in comment or via email!

by Emilien at June 21, 2019 05:07 PM

Aptira

Kubernetes Training: Prep for the Certified Kubernetes Administrator (CKA) Exam

Aptira Kubernetes Training - Certified Kubernetes Administrator Exam

Kubernetes: Container Orchestration for Modern Applications

Container Orchestration platforms such as Kubernetes provide users with greater flexibility when running applications on both virtual infrastructure and physical hardware. It is specifically designed for deploying and managing containerised applications at scale across all major public Clouds and private infrastructure. Kubernetes provides on-demand scalability, simple roll-outs and roll-backs, quality control and monitoring to ensure the health of your application.

Kubernetes Training

In order to help System Administrators and DevOps professionals who would like to learn more about using Kubernetes in Cloud environments, we’ve put together a 2 day Kubernetes training course designed to teach the essentials of Kubernetes container management. Students will learn how to setup Kubernetes, as well as how to use it for automated deployment, scaling and management of containerised applications.

But all this flexibility can be complex to manage. Kubernetes is large and, for inexperienced users, it can be quite difficult to deploy to production.

Topics include:

  • Kubernetes architecture
  • Installing Kubernetes
  • Creating pods, deployments and services
  • Automated deployments, scaling and management of containerised applications

In order to fully grasp the complexity of Kubernetes, hands-on experience is a necessity. This course contains both theory, and extensive labs to give participants real life experience with Kubernetes.

Passing the Certified Kubernetes Administrator (CKA) Exam

Once you’re confident with Kubernetes, why not get yourself certified?

The Cloud Native Computing Foundation offers a certification program that allows users to demonstrate their competence in a hands-on, command-line environment. The purpose of the Certified Kubernetes Administrator (CKA) program is to provide assurance that CKAs have the skills, knowledge, and competency to perform the responsibilities of Kubernetes administrators.

It is an online, proctored, performance-based test that requires solving multiple issues from a command line.

We currently have 4 Solutionauts who are CKA certified (with more planning to sit the exam). They’ve put together a few notes to help you prepare for – and pass – the exam:

  • Practice a lot. Students need to be very proficient in using Kubernetes before undertaking the exam. Undertaking Kubernetes training before the exam would be ideal.
  • Familiarise yourself with the official docs. When in the exam, you will need to quickly jump to the page you want. For example, search by keywords so you know which page contains the content you want to look up.
  • Avoid manually typing as this is time consuming. Copy and paste wherever possible.
  • Use ‘kubectl run --dry-run -o yaml …’ to generate a template for pods, jobs or deployments so you can modify them for your tasks. This can save a lot of time.
  • Read the tasks carefully – don’t make stupid mistakes.
  • Use your time wisely. Skip some difficult/time-consuming tasks and make a note to come back to them later. In some cases, you may have to give up some of the harder tasks and save the time for the tasks where you are more likely to earn points.
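
The --dry-run trick from the tips above looks like this in practice (the resource names are illustrative; it is wrapped in a function so sourcing this file doesn't require a live cluster):

```shell
# Generate a pod manifest skeleton without touching the cluster,
# then edit the resulting YAML for the task at hand.
make_pod_template() {
  name=$1; image=$2
  # --restart=Never makes kubectl run emit a Pod rather than a Deployment
  kubectl run "$name" --image="$image" --restart=Never --dry-run -o yaml
}

# Example: make_pod_template web nginx > pod.yaml
```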

If you’re planning on deploying Kubernetes in your organisation, increasing your Kubernetes skills and taking the CKA exam – let us know. Our Solutionauts can help.

We’re also offering end of financial year discounts on all of our technology training – including Kubernetes. This discount applies to pre-paid training, booking multiple courses, bundling with your hardware, software licences and any of our services (including Kubernetes and Container Orchestration). So if you’re looking to upgrade your infrastructure, or learn how to manage it more efficiently – now is the time.

This deal is running until the end of June, but can be used at any time during the next 12 months. For more information on our DevOps services, or to get the best discount for you – chat with our Solutionauts today.

The post Kubernetes Training: Prep for the Certified Kubernetes Administrator (CKA) Exam appeared first on Aptira.

by Jessica Field at June 21, 2019 01:59 PM

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

Spotlight on KubeCon + CloudNativeCon Europe

Chris Hoge reports on the ongoing collaboration between the OpenStack and Kubernetes communities at the Barcelona event. The OpenStack community was present in both the SIG-Cloud-Provider working session and in the SIG-Cluster-Lifecycle at the contributor’s summit as well as a SIG deep dive and cloud provider sessions. More on his takeaways here.

OpenStack Foundation news

Open Infrastructure Summit Shanghai

      • The Call for Presentations is now open. Submit your session ideas in English or Mandarin before July 2.
      • Registration is also open, buy now to take advantage of early bird prices. You can pay in U.S. dollars or yuan if you need an official invoice (fapiao.)
      • If your organization can’t fund your travel, apply for the Travel Support Program by August 8.
      • Interested in sponsoring the Summit? Learn more here.

Open Infrastructure Project Teams Gathering (PTG)

    • Both SIGs/WGs and project teams can get together at the PTG. The highlights include:
      • Quarter day slots are available
      • The PTG will last three-and-a-half days while the Forum will last one-and-a-half.
      • Project onboarding will be incorporated in the PTG. Updates will remain part of the Summit.
      • There will be more shared space than previous events.
    • Registration for the PTG is included in the cost of registration for the Summit.

OpenStack Foundation project news

OpenStack

  • If you’re running OpenStack, please share your feedback and deployment information in the 2019 OpenStack User Survey. It only takes 20 minutes and anonymous feedback is shared directly with developers!
  • We’re now past the Train-1 development milestone, chugging toward the final Train release October 16. Train community goals have now been confirmed: Support for IPv6-only deployments (led by Ghanshyam Mann), enabling PDF generation support for project docs (led by Alexandra Settle) and updating Python 3 test runtimes (led by Corey Bryant).
  • Matt Riedemann started a thread to ask Nova operators about compute service delete behavior with respect to resource providers with allocations. Please add your comments if you disagree with the current proposed plan.

Airship

  • Airship reached a critical growth milestone as an open-source project: open elections for project governance. In the first step to holding elections, the Airship 2019 Technical Committee nominations are now open. Anyone who’s demonstrated a commitment to Airship including community building, communications, or direct code contributions within the last 12 months is eligible.
  • With the Airship 1.0 release landing just a few months ago, the design and development crew is ready to embark on a new journey. Join the Airship community as it plans and develops Airship 2.0 through its new Airship Evolution blog series. In the coming months it will cover the existing design, how that design will evolve, and the anticipated benefits of the design changes.
  • If you’re a SUSE Linux user, you can now participate in the SUSE Cloud Technology Preview. Following the principles of upstream first and community involvement, the technology preview offers SUSE users an opportunity to give direct feedback that will help guide the upstream development of Airship.

StarlingX

  • StarlingX just completed its first TSC election. For the new TSC membership, please see the governance webpage.
  • The community is working on finalizing the release content for the second release of the project planned for August this year. Check out the release plan wiki page for more information.

Kata Containers

Zuul

Look for OSF at these community events

July

August

September

October

November

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you!
If you have feedback, news or stories that you want to share, reach us through community@openstack.org . To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by OpenStack Foundation at June 21, 2019 01:42 PM

Chris Dent

Placement Update 19-24

Here's a 19-24 pupdate. Last week I said there wouldn't be one this week. I was wrong. There won't be one next week. I'm taking the week off to reset (I hope). I've tried to make sure that there's nothing floating about in Placement land that is blocking on me.

Most Important

The spec for nested magic is close. What it needs now is review from people outside the regulars to make sure it doesn't suffer from tunnel vision. The features discussed form the foundation for Placement being able to service queries for real world uses of nested providers. Placement can already model nested providers but asking for the right one has needed some work.

Editorial

I read an article this morning which touches on the importance of considering cognitive load in software design. It's full of glittering generalities, but it reminds me of some of the reasons why it was important to keep a solid boundary between Nova and Placement and why, now that Placement is extracted, the complexity of this nested magic is something we need to measure against the cognitive load it will induce in the people who come after us as developers and users.

What's Changed

  • Support for mappings in allocation candidates has merged as microversion 1.34.

  • Gibi made it so OSProfiler works with placement again.
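
For readers who want to try the new microversion, here is a hedged sketch of an allocation-candidates request (the $TOKEN and $PLACEMENT_URL variables are assumptions and must be set for your cloud):

```shell
# Request allocation candidates, opting in to microversion 1.34 so the
# response includes the new "mappings" key.
list_candidates() {
  curl -s \
    -H "X-Auth-Token: ${TOKEN}" \
    -H "OpenStack-API-Version: placement 1.34" \
    "${PLACEMENT_URL}/allocation_candidates?resources=VCPU:1,MEMORY_MB:512"
}
```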

Specs/Features

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (3) stories in the placement group. 0 (0) are untagged. 4 (1) are bugs. 5 (-1) are cleanups. 11 (0) are rfes. 3 (1) are docs.

If you're interested in helping out with placement, those stories are good places to look.

1832814: Placement API appears to have issues when compute host replaced is an interesting bug. In a switch from RDO to OSA, resource providers are being duplicated because of a change in node name.

osc-placement

osc-placement is currently behind by 11 microversions.

Main Themes

Nested Magic

The overview of the features encapsulated by the term "nested magic" are in a story and spec.

Code related to this:

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.

Cleanup

We continue to do cleanup work to lay in reasonable foundations for the nested work above. As a nice bonus, we keep eking out additional performance gains too.

Other Placement

Miscellaneous changes can be found in the usual place.

There are five os-traits changes being discussed. And one os-resource-classes change.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

Go outside. Reflect a bit. Then do.

by Chris Dent at June 21, 2019 11:20 AM

June 20, 2019

OpenStack Superuser

OpenStack and Kubernetes show the power of open collaboration at KubeCon + CloudNativeCon Europe

BARCELONA — Under beautiful blue Mediterranean skies, members of the OpenStack and Kubernetes communities once again demonstrated the power of open collaboration at KubeCon EU.

Setting the stage for the big show, the event kicked off with the contributor’s summit, where the OpenStack community was present in both the SIG-Cloud-Provider working session and in the SIG-Cluster-Lifecycle session. The Cloud Provider team continued working through the process of transitioning cloud-specific SIGs to working groups under the governance of SIG-Cloud-Provider. This aligns with the goal of removing cloud-specific code from upstream Kubernetes, while still giving cloud integrators a means of supporting their developers and users within the Kubernetes community.

Meanwhile, work continued on managing the complete life cycle of Kubernetes clusters on various cloud architectures using the Kubernetes API in the “Cluster Lifecycle” session. OpenStack’s representation in this work is strong, with an OpenStack implementation of the new Cluster API under active development. With the Cluster API implementation, users will be able to manage the entire life cycle of Kubernetes on OpenStack with basic declarations of machines and nodes. While installing Kubernetes is the first goal, its flexible and generic API will allow for more advanced management features like upgrades, auto-scaling and auto-healing, all in a generic way that still takes advantage of the OpenStack API. Adding to this exciting work is the introduction of Metal3 to the Kubernetes community. Metal3 uses the Cluster API to deliver Kubernetes on bare metal using OpenStack Ironic.

The main event kicked off the next day, with several sessions devoted to cloud provider integrations. SIG-OpenStack community leaders including myself, Aditi Sharma and Christoph Glaubitz coordinated an 80-minute session that covered a diverse set of topics, including the OpenStack cloud provider, the OpenStack Cluster API provider and a cluster autoscaler for the Magnum cluster management project.

Working in collaboration with the cloud-provider community, SIG-Cloud-Provider met for another 80-minute session to cover shared goals. This included the primary goal of removing upstream provider code from Kubernetes (a note to Kubernetes users: if you’re depending on the upstream code, start migrating now to external providers, including OpenStack, before they disappear in late 2019!), code organization for sub-projects and the future of user and working groups for cloud providers.

On the last day, I teamed up with Fabio Rapposelli for a dive into the technical depths of Building a Controller Manager for cloud platforms, offering up tips and tricks drawn from the OpenStack provider.

About the author

Chris Hoge is strategic program manager at the OSF.

Photo // CC BY NC

The post OpenStack and Kubernetes show the power of open collaboration at KubeCon + CloudNativeCon Europe appeared first on Superuser.

by Chris Hoge at June 20, 2019 02:04 PM

About

Planet OpenStack is a collection of thoughts from the developers and other key players of the OpenStack projects. If you are working on OpenStack technology you should add your OpenStack blog.

Last updated:
July 23, 2019 01:07 AM
All times are UTC.
