August 16, 2019

Chris Dent

Placement Update 19-32

Here's placement update 19-32. There will be no update 33; I'm going to take next week off. If there are Placement-related issues that need immediate attention please speak with any of Eric Fried (efried), Balazs Gibizer (gibi), or Tetsuro Nakamura (tetsuro).

Most Important

Same as last week: The main things on the Placement radar are implementing Consumer Types and cleanups, performance analysis, and documentation related to nested resource providers.

A thing we should place on the "important" list is bringing the osc-placement plugin up to date. We also need to discuss what we would like the plugin to be. Is it required that it have ways to perform all the functionality of the API, or is it about providing ways to do what humans need to do with the placement API? Is there a difference?

We decided that consumer types is medium priority: The nova-side use of the functionality is not going to happen in Train, but it would be nice to have the placement-side ready when U opens. The primary person working on it, tssurya, is spread pretty thin so it might not happen unless someone else has the cycles to give it some attention.

On the documentation front, we realized during some performance work last week that it is easy to have an incorrect grasp of how same_subtree works when there are more than two groups involved. It is critical that we create good "how to use" documentation for this and other advanced placement features. Not only can it be easy to get wrong, it can be a challenge to see that you've got it wrong (the failure mode is "more results, only some of which you actually wanted").

What's Changed

  • Yet more performance fixes are in the process of merging. Most of these are related to getting _merge_candidates and _build_provider_summaries to have less impact. The fixes are generally associated with avoiding duplicate work by generating dicts of reusable objects earlier in the request. This is possible because of the relatively new RequestWideSearchContext. In a request that returns many provider summaries _build_provider_summaries continues to have a significant impact because it has to create many objects but overall everything is much less heavyweight. More on performance in Themes, below.

  • The combination of all these performance fixes and placement's use of microversions makes it reasonable for anyone running placement in a resource-constrained environment (or simply wanting things to be faster) to consider running Train placement with any release of OpenStack. Obviously you should test it first, but it is worth investigating. More information on how to achieve this can be found in the upgrade to Stein docs.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (1) stories in the placement group. 0 (0) are untagged. 4 (1) are bugs. 4 (0) are cleanups. 11 (0) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 12 microversions.

  • https://review.opendev.org/666542 Add support for multiple member_of. There's been some useful discussion about how to achieve this, and a consensus has emerged on how to get the best results.

  • https://review.opendev.org/640898 Adds a new '--amend' option which can update resource provider inventory without requiring the user to pass a full replacement for inventory. This has been broken up into three patches to help with review.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

As mentioned above, this is currently paused while other things take priority. If you have time that you could spend on this please respond here expressing that interest.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As said above, there's lots of performance work in progress. We'll need to make a similar effort with regard to docs. For example, all of the coders involved in the creation and review of the same_subtree functionality struggle to explain, clearly and simply, how it will work in a variety of situations. We need to enumerate the situations and the outcomes, in documentation.

One outcome of this work will be something like a Deployment Considerations document to help people choose how to tweak their placement deployment to match their needs. The simple answer is use more web servers and more database servers, but that's often very wasteful.

On the performance front, there is one major area of impact which has not received much attention yet. When requesting allocation candidates (or resource providers) that will return many results, the cost of JSON serialization is just under one quarter of the processing time. This is to be expected when the response body is 2379k in size and 154000 lines long (when pretty printed) for 7000 provider summaries and 2000 allocation requests.

But there are ways to fix it. One is to ask more focused questions (so fewer results are expected). Another is to limit the results with limit=N (but this can lead to issues with migrations).

Another is to use a different JSON serializer. Should we do that? It makes a big difference with large result sets (which will be common in big and sparse clouds).
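
As a rough illustration of the difference a serializer swap can make, here is a minimal benchmark sketch. It is hedged: orjson is just one example of a faster serializer, and the payload shape and sizes are made up rather than taken from a real placement response.

import json
import timeit

import orjson  # assumed to be installed; any faster serializer would do

# Build a payload loosely shaped like a large allocation candidates
# response: many provider summaries, each with a few resource classes.
payload = {
    "provider_summaries": {
        "provider-%d" % i: {
            "resources": {"VCPU": {"capacity": 64, "used": 2},
                          "MEMORY_MB": {"capacity": 65536, "used": 1024}},
            "traits": ["COMPUTE_VOLUME_MULTI_ATTACH"],
        }
        for i in range(7000)
    }
}

stdlib = timeit.timeit(lambda: json.dumps(payload), number=10)
fast = timeit.timeit(lambda: orjson.dumps(payload), number=10)
print("stdlib json: %.2fs, orjson: %.2fs for 10 serializations" % (stdlib, fast))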

Other Placement

Miscellaneous changes can be found in the usual place.

There are two os-traits changes being discussed. And zero os-resource-classes changes.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

Have a good next week.

by Chris Dent at August 16, 2019 02:34 PM

August 13, 2019

CERN Tech Blog

Nova support for a large Ironic deployment

CERN runs OpenStack Ironic to provision all the new hardware deliveries and the on-demand requests for baremetal instances. It has already replaced most of the workflows and tools to manage the lifecycle of physical nodes, but we continue to work with the upstream community to improve the pre-production burn-in, the up-front performance validation and the integration of retirement workflows. During the last 2 years the service has grown from 0 to ~3100 physical nodes.

by CERN (techblog-contact@cern.ch) at August 13, 2019 02:00 PM

RDO

Community Blog Round Up 13 August 2019

Making Host and OpenStack iSCSI devices play nice together by geguileo

OpenStack services assume that they are the sole owners of the iSCSI connections to the iSCSI portal-targets generated by the Cinder driver, and that is fine 98% of the time, but what happens when we also want to have other non-OpenStack iSCSI volumes from that same storage system present on boot? In OpenStack the OS-Brick […]

Read more at https://gorka.eguileor.com/host-iscsi-devices/

Service Assurance on small OpenShift Cluster by mrunge

This article is intended to give an overview on how to test the

Read more at http://www.matthias-runge.de/2019/07/09/Service-Assurance-on-ocp/

Notes on testing a tripleo-common mistral patch by JohnLikesOpenStack

I recently ran into bug 1834094 and wanted to test the proposed fix. These are my notes if I have to do this again.

Read more at http://blog.johnlikesopenstack.com/2019/07/notes-on-testing-tripleo-common-mistral.html

Developer workflow with TripleO by Emilien

In this post we’ll see how one can use TripleO for developing & testing changes into OpenStack Python-based projects (e.g. Keystone).

Read more at https://my1.fr/blog/developer-workflow-with-tripleo/

Avoid rebase hell: squashing without rebasing by OddBit

You’re working on a pull request. You’ve been working on a pull request for a while, and due to lack of sleep or inebriation you’ve been merging changes into your feature branch rather than rebasing. You now have a pull request that looks like this (I’ve marked merge commits with the text [merge]):

Read more at https://blog.oddbit.com/post/2019-06-17-avoid-rebase-hell-squashing-wi/

Git Etiquette: Commit messages and pull requests by OddBit

Always work on a branch (never commit on master) When working with an upstream codebase, always make your changes on a feature branch rather than your local master branch. This will make it easier to keep your local master branch current with respect to upstream, and can help avoid situations in which you accidentally overwrite your local changes or introduce unnecessary merge commits into your history.

Read more at https://blog.oddbit.com/post/2019-06-14-git-etiquette-commit-messages/

Running Keystone with Docker Compose by OddBit

In this article, we will look at what is necessary to run OpenStack’s Keystone service (and the requisite database server) in containers using Docker Compose.

Read more at https://blog.oddbit.com/post/2019-06-07-running-keystone-with-docker-c/

The Kubernetes in a box project by Carlos Camacho

Implementing cloud computing solutions that run in hybrid environments might be the final solution when it comes to finding the best benefits/cost ratio.

Read more at https://www.anstack.com/blog/2019/05/21/kubebox.html

Running Relax-and-Recover to save your OpenStack deployment by Carlos Camacho

ReaR is a pretty impressive disaster recovery solution for Linux. Relax-and-Recover creates both a bootable rescue image and a backup of the associated files you choose.

Read more at https://www.anstack.com/blog/2019/05/20/relax-and-recover-backups.html

by Rain Leander at August 13, 2019 08:00 AM

August 12, 2019

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

Spotlight on: The Open Infrastructure Summit Shanghai Agenda

The agenda for the Open Infrastructure Summit Shanghai went live this week! Join the global community in Shanghai from November 4-6 to experience:

  • Keynote and breakout sessions spanning 30+ open source projects from technical community leaders and organizations including:
    • Managing a growing OpenStack cloud in production at ByteDance (creator of TikTok), which runs an OpenStack environment of 300,000 cores and is still growing rapidly at a rate of 30,000 CPU cores per month
    • Monitoring and Autoscaling Features for Self-Managed Kubernetes clusters at WalmartLabs
    • Secured edge infrastructure for Contactless Payment System with StarlingX at China UnionPay
    • How to run a public cloud on OpenStack from China Mobile
    • Integrating RabbitMQ with OpenStack at LINE, the most popular messaging app in Japan
  • Project updates and onboarding from OSF projects: Airship, Kata Containers, OpenStack, StarlingX, and Zuul.
  • Collaborative sessions at the Forum, where open infrastructure operators and upstream developers will gather to jointly chart the future of open source infrastructure, discussing topics ranging from upgrades to networking models and how to get started contributing.
  • Hands-on training around open source technologies directly from the developers and operators building the software.
  • The Summit will be followed by the Project Teams Gathering (PTG): various open source contributor teams and working groups will meet to get work done, with a special focus this PTG around onboarding new team members.

Now, it’s time to register you and your team for the Shanghai Summit before prices increase next week on August 14 at 11:59pm PT (August 15 at 2:59pm China Standard Time). If your organization is recruiting new talent or wanting to share news around a new product launch, join the Summit as a sponsor by reaching out to summit@openstack.org.

OpenStack Foundation:

Open Infrastructure Summit:

  • The Community Contributor Award nominations are open until October 20th at 7:00 UTC. Community members from any Foundation project (Airship, Kata Containers, OpenStack, StarlingX and Zuul) can be nominated! Recipients will be announced in Shanghai at the Summit.
  • Registration is open. Summit tickets grant you access to the PTG. Save on tickets by purchasing them now at the early bird price. There are 2 ways to register – in USD or in RMB (with fapiao)
  • Know an organization that’s innovating with open infrastructure? Nominate them for the Superuser Awards by September 27.
  • Need a Chinese Visa? Start the process now! Information here.
  • Have your brand in the spotlight by sponsoring the Summit! Learn more here.
  • The Travel Support Program is also available. Apply before August 13!

Project Teams Gathering:

  • PTG attendance surveys have been sent out to project/group/team leads and responses are due August 11. If you are a team lead and missed the email with the survey, please contact Kendall Nelson (knelson@openstack.org) ASAP.
  • Registration is open. PTG tickets are included with Summit registration. Save on tickets by purchasing them now at the early bird price. There are 2 ways to register – in USD or in RMB (with fapiao)
  • The Travel Support Program is also available. Apply before August 13!

Airship: Elevate Your Infrastructure

  • Directly following the Technical Committee election, the Airship project is holding its first Working Committee election. The Working Committee is intended to help influence the project strategy, help arbitrate when there is a disagreement between Core Reviewers within a single project or between Airship projects, define the project core principles, perform marketing and communications, and finally help provide product management as well as ecosystem support. The close of the Working Committee polling will mark the full transition to Airship being a community governed open source project with 100% elected leadership.

Kata Containers: The speed of containers, the security of VMs

  • Kata Containers 1.8 release landed on July 24. This latest release upgrades the QEMU hypervisor from a QEMU-lite base to upstream QEMU 4.0. Kata templating code is updated to make use of the upstream x-ignored-shared. Firecracker hypervisor is also updated to 0.17, and Kata now has support for using Firecracker’s jailer, adding extra security isolation for the VMM on the host. Fixes and usability improvements for virtio-fs have also been introduced.
  • The Kata Containers 1.9 Alpha release was also created. In the upcoming 1.9 release, which is expected to land in mid-October, Kata will introduce support for a new hypervisor: ACRN. View the latest Kata Containers releases here.
  • The Kata community is excited to again have a significant presence at the upcoming Open Infrastructure Summit with 5 talks accepted. Check out the full line-up of Kata sessions here.

OpenStack: Open Source Software for Creating Private and Public Clouds

  • The OpenStack User Committee (UC) is tasked with representing OpenStack users in the project governance. Two UC seats will soon be renewed. The nomination period is currently underway.
  • The next OpenStack release (planned for October 16) is called Train. But what should be the name of the release after that? Our release naming process calls for a name starting with the letter U, ideally related to a geographic feature close to Shanghai, China. The community proposed several options, and a community poll will soon be opened. Watch out for it!
  • Each release cycle, we define common goals for the OpenStack project teams. The goal selection process for the ‘U’ release has started: please read Ghanshyam Mann’s openstack-discuss email if you want to make suggestions.
  • A security vulnerability in Nova Compute has been announced for all current versions, so anyone running it should make sure their deployment is updated with the corresponding release’s fix as soon as possible.

StarlingX: A Fully Featured Cloud for the Distributed Edge

  • See the list of StarlingX sessions at the upcoming Open Infrastructure Summit in Shanghai here!
  • In preparation for the 2.0 release, the community cut RC1 with a new branch this week. Testing of the stable codebase is ongoing to ensure high code quality when the release comes out at the end of August.

Zuul: Stop Merging Broken Code

Find OSF at these Open Infrastructure Community Events

August

September

October

November

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you! If you have feedback, news or stories that you want to share, reach us through community@openstack.org. To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by Allison Price at August 12, 2019 02:00 PM

August 09, 2019

Chris Dent

Placement Update 19-31

Pupdate 19-31. No bromides today.

Most Important

Same as last week: The main things on the Placement radar are implementing Consumer Types and cleanups, performance analysis, and documentation related to nested resource providers.

We need to decide how much of a priority consumer types support is. I've taken the task of asking around with the various interested parties.

What's Changed

  • A more complex nested topology is now being used in the nested-perfload check job, and both that and the non-nested perfload run Apache benchmark at the end. When you make changes you can have a look at the results of the placement-perfload and placement-nested-perfload gate jobs to see if there has been a performance impact. Keep in mind the numbers are only a guide. The performance characteristics of VMs from different CI providers vary wildly.

  • A stack of several performance related improvements has merged, with still more to come. I've written a separate Placement Performance Analysis that summarizes some of the changes. Many of these may be useful for other services. Each iteration reveals another opportunity.

  • In some environments placement will receive a URL of '' when '/' is expected. Auth handling for version control needs to handle this.

  • osc-placement 1.6.0 is in the process of being released.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 22 (-1) stories in the placement group. 0 (0) are untagged. 3 (0) are bugs. 4 (-1) are cleanups. 11 (0) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 12 microversions.

  • https://review.opendev.org/666542 Add support for multiple member_of. There's been some useful discussion about how to achieve this, and a consensus has emerged on how to get the best results.

  • https://review.opendev.org/640898 Adds a new '--amend' option which can update resource provider inventory without requiring the user to pass a full replacement for inventory. This has been broken up into three patches to help with review.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As said above, there's lots of performance work in progress. We'll need to make a similar effort with regard to docs.

One outcome of this work will be something like a Deployment Considerations document to help people choose how to tweak their placement deployment to match their needs. The simple answer is use more web servers and more database servers, but that's often very wasteful.

Other Placement

Miscellaneous changes can be found in the usual place.

There are two os-traits changes being discussed. And zero os-resource-classes changes.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

Somewhere in this performance work is a lesson for life: Every time I think we've reached the bottom of the "easy stuff", I find yet another bit of easy stuff.

by Chris Dent at August 09, 2019 02:07 PM

Galera Cluster by Codership

Setting Up a Galera Cluster on Amazon AWS EC2

Through Amazon Web Services (AWS), you can create virtual servers (i.e., instances). You can install database and Galera software on them. In this article, we’ll create three nodes, the minimum recommended for a healthy cluster, and configure them to use Galera Cluster.

Incidentally, there is a more detailed version of this article in the Tutorial section of our Library.

Assumptions & Preparation

We’re assuming you have an AWS account and know the basics of the EC2 (Elastic Compute Cloud) platform.

To access the nodes, you’ll need an encryption key. Create a new one specifically for Galera, using a tool such as ssh-keygen. Add that key to AWS, under Key Pairs.

Creating AWS Instances

To start creating instances in AWS, click on Instances, then Launch Instances. First, choose the operating system distribution. We chose here “CentOS 7 (x86_64) – with Updates HVM”.

Next, choose an instance type. Because we’re using this cluster as a training tool, we chose t2.micro, which is free for a year.

Next is the instance details. In the first box, for the number of instances, enter 3. You can leave everything else at their default values.

Adding storage is next. If you chose the free tier, the default is 8 GB. For training, this is plenty. You can click past the screen on Adding Tags.

Next is Security Group (i.e., AWS’s firewall). Create a new one for Galera and add an SSH rule to allow you to log in. For the source, choose My IP.

With that done, click on Review and Launch to see the choices you made. If everything is fine, click Launch.

A message will ask for an encryption key. Click Choose an Existing Key Pair and select the Galera one. Read and accept the warning and then click Launch Instance.

When all three nodes are running, label them (e.g., galera1). Check each instance to get their external IP addresses.

Installing Software on Nodes

You’re now ready to install the database and Galera software. Use ssh to log into each node through their external IP addresses, using your encryption key.

Install rsync, which Galera uses to synchronize new nodes, and firewalld on each node with a package-management utility like yum:

sudo yum -y install rsync firewalld

The database is next. You might install MySQL or MariaDB, depending on your preferences. Both work well with Galera Cluster. There are several methods by which you may install the database and Galera software. For instructions on this, go to our documentation page on Installing Galera Cluster.

Configuring the Nodes

You’ll need to edit the database configuration file (i.e., /etc/my.cnf.d/server.cnf) on each node. There are some parameters related to MySQL or MariaDB and the InnoDB storage engine that you might want to add for better performance and troubleshooting. See the Tutorial for these. As for Galera, add a [galera] section to the configuration file:

[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so

wsrep_node_name='galera1'
wsrep_node_address="172.31.19.208"

wsrep_cluster_name='galera-training'
wsrep_cluster_address="gcomm://172.31.19.208,172.31.26.197,172.31.15.54"

wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_slave_threads=4
wsrep_sst_method=rsync

The wsrep_on option enables Galera. The file path for wsrep_provider may have to be adjusted for your server.

The wsrep_node_name needs to be unique for each node. The wsrep_node_address is the IP address for the node. For AWS, use the internal ones.

The wsrep_cluster_name is the cluster’s name. The wsrep_cluster_address contains the addresses of all nodes.
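
For example, on the second node only the node-specific lines change (assuming you labelled it galera2, and using the internal addresses from wsrep_cluster_address above):

wsrep_node_name='galera2'
wsrep_node_address="172.31.26.197"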

Security Settings

You now have to open certain ports. Galera Cluster uses four TCP ports: 3306 (MySQL’s default), 4444, 4567, and 4568. It also uses one UDP port: 4567. For SELinux, open these ports by executing the following on each node:

semanage port -a -t mysqld_port_t -p tcp 3306
semanage port -a -t mysqld_port_t -p tcp 4444
semanage port -a -t mysqld_port_t -p tcp 4567
semanage port -a -t mysqld_port_t -p udp 4567
semanage port -a -t mysqld_port_t -p tcp 4568
semanage permissive -a mysqld_t

You’ll have to do the same for the firewall:

systemctl enable firewalld
systemctl start firewalld

firewall-cmd --zone=public --add-service=mysql --permanent
firewall-cmd --zone=public --add-port=3306/tcp --permanent
firewall-cmd --zone=public --add-port=4444/tcp --permanent
firewall-cmd --zone=public --add-port=4567/tcp --permanent
firewall-cmd --zone=public --add-port=4567/udp --permanent
firewall-cmd --zone=public --add-port=4568/tcp --permanent

firewall-cmd --reload

Now you need to add some related entries to AWS. Click Security Groups and select the Galera group. Under the Actions, select Edit Inbound Rules.

Click Add Rule, select the type MySQL/Aurora, and enter the internal IP address for the first node (e.g., 172.31.19.208/32). Next, add another rule, but this time a Custom TCP Rule for port 4444 — using the same internal address. Now add another custom TCP entry, but for the port, enter “4567 – 4568”. Last, add a custom UDP entry for port 4567.

Repeat these four entries for each node, adjusting the IP addresses. When finished, click Save.

Starting Galera

When starting a new cluster, you tell the first node that it’s first by using the --wsrep-new-cluster option with mysqld. To make it easy, if you’re using MariaDB 10.4 with version 4 of Galera, you can use the galera_new_cluster script. Execute it only on the first node. This will start MySQL and Galera on that one node. On the other nodes, execute the following:

systemctl start mysql

Once MySQL has started on each, enter the line below from the command-line on one of the nodes. There’s no password yet, so just hit Enter.

mysql -p -u root -e "SHOW STATUS LIKE 'wsrep_cluster_size'"

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+

You can see here there are three nodes in the cluster. That’s what we want. Galera Cluster was successfully installed on AWS.

by Sakari Keskitalo at August 09, 2019 06:50 AM

August 08, 2019

Chris Dent

Placement Performance Analysis

Performance has always been important to Placement. In a busy OpenStack cloud, it will receive many hits per second. Any slowness in the placement service will add to the latency present in instance creation and migration operations.

When we added support for requesting complex topologies of nested resource providers, performance took an expected hit. All along, the plan was to make it work and then make it fast. In the last few weeks members of the Placement team have been working to improve performance.

Human analysis of the code can sometimes suggest obvious areas for performance improvement but it is also very easy to be misled. It's better to use profiling and benchmarking to get accurate measurements of what code is using the most CPU and to effectively compare different revisions of the code.

I've written two other postings about how to profile WSGI apps and analyse the results. Using those strategies we've iterated through a series of changes using the following process:

  1. profile to find the most expensive chunk of code
  2. determine if it can be improved and how
  3. change the code
  4. benchmark to see if it really helps, if it does, keep it, otherwise try something else
  5. repeat

The most recent big feature added to placement was called same_subtree. It adds support for requiring that a subset of the solution set for a request be under the same ancestor resource provider. This helps to support "affinity" within a compute host (e.g., "this FPGA is under the same NUMA node as this FPGA").

What follows are some comparison numbers from benchmarks run with the commit that added same_subtree and recent master (between which several performance tweaks have been added). The test host is a Linux VM with 16 GB of RAM, 16 VCPU. Placement is running standalone (without keystone), using PostgreSQL as its database and uwsgi as the web server with the following startup:

uwsgi --http :8000 --wsgi-file .tox/py37/bin/placement-api --processes 4 --threads 10

all on that same host.

Apache benchmark is run on an otherwise idle 8 core machine on the same local network. Headers are set with -H 'x-auth-token: admin' and -H 'openstack-api-version: placement latest' to drive appropriate noauth2 and microversion settings.

The server is preloaded with 7000 resource providers created using the nested-perfload topology.

The URL requested is:

GET /allocation_candidates?
     resources=DISK_GB:10&
     required=COMPUTE_VOLUME_MULTI_ATTACH&
     resources_COMPUTE=VCPU:1,MEMORY_MB:256&
     required_COMPUTE=CUSTOM_FOO&
     resources_FPGA=FPGA:1&
     group_policy=none&
     same_subtree=_COMPUTE,_FPGA

The Older Code

ab -c 1 -n 10 [the rest] (1 concurrency, 10 total requests):

Requests per second:    0.40 [#/sec] (mean)
Time per request:       2472.930 [ms] (mean)

ab -c 40 -n 400 [the rest] (40 concurrency, 400 total requests):

Requests per second:    1.46 [#/sec] (mean)
Time per request:       27454.696 [ms] (mean)

(For concerned benchmark purists: throughout this process I've also been running with thousands of requests instead of tens or hundreds to make sure that the mean values I'm getting here aren't because of the short run time. They are not. Also, not reported here, but I've also been doing benchmarks to compare how concurrent I can get before something explodes. As you might expect: as individual requests become lighter, the wider we can get.)

The New and Improved Code

(These numbers are not quite up to date. They are from a recent master but there are at least four more performance-related patches yet to merge. I'll update when that's all in.)

ab -c 1 -n 10 [the rest] (1 concurrency, 10 total requests):

Requests per second:    0.70 [#/sec] (mean)
Time per request:       1423.695 [ms] (mean)

ab -c 40 -n 400 [the rest] (40 concurrency, 400 total requests):

Requests per second:    2.90 [#/sec] (mean)
Time per request:       13772.054 [ms] (mean)

How'd We Get There?

This is a nice improvement. It may not seem like that much — over 1 second per request is rather slow in the absolute — but there is a lot happening in the background and a lot of data being returned.

One response is a complex nested JSON object of 2583330 bytes. It has 154006 lines when sent through json_pp.

There are several classes of changes that were made. These might be applicable to other environments (like yours!):

  • If using SQLAlchemy, using the RowProxy object directly, within the persistence layer, is okay and much faster than casting to a dict or namedtuple (which have interfaces the RowProxy already provides).

  • Use __slots__ in frequently used objects. It really does speed up attribute access time.

  • Profiling can often reveal sets of data that are retrieved multiple times. If you can find these and build them incrementally in the context of a single request/operation it can be a big win. See Add RequestWideSearchContext.summaries_by_id and Track usage info on RequestWideSearchContext for examples.

  • If you're doing membership checking against a list and you're able to make it a set, do.

  • When using SQLAlchemy's in_ operator with a large number of values, an expanding bindparam can make a big difference in performance.

  • Implement __copy__ on simple classes of objects that are copied many times in a single request. Python's naive copy is expensive, in aggregate.
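
To make a few of those points concrete, here is a minimal, generic sketch (illustrative only, not placement's actual classes) combining __slots__, a hand-written __copy__, and set-based membership checking:

import copy

class ProviderSummary:
    """A small value object, standing in for the kind of thing placement
    creates, copies and reads very frequently."""
    # __slots__ avoids a per-instance __dict__, making instance creation
    # and attribute access cheaper.
    __slots__ = ('uuid', 'resources', 'traits')

    def __init__(self, uuid, resources, traits):
        self.uuid = uuid
        self.resources = resources
        self.traits = traits

    def __copy__(self):
        # Python's generic copy machinery is comparatively slow; for a
        # simple object, rebuilding it directly is much cheaper.
        return ProviderSummary(self.uuid, self.resources, self.traits)


summaries = [ProviderSummary(str(i), {'VCPU': 4}, {'HW_CPU_X86_AVX2'})
             for i in range(10000)]

# Membership checks against a set are O(1) instead of O(n) for a list.
wanted = {s.uuid for s in summaries if int(s.uuid) % 2 == 0}
kept = [s for s in summaries if s.uuid in wanted]

clones = [copy.copy(s) for s in summaries]  # uses the cheap __copy__ above
print(len(kept), len(clones))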

Also, not based on the recent profiling, but in earlier work comparing non-nested setups (we've gone from 1.2 seconds for a GET /allocation_candidates?resources=DISK_GB:10,VCPU:1,MEMORY_MB:256 request against 1000 providers in early January to 0.53 seconds now) we learned the following:

  • Unless you absolutely must (perhaps because you are doing RPC), avoid using oslo versioned objects. They add a lot of overhead for type checking and coercing when getting and setting attributes.

What's Next?

I'm pretty sure there are a lot more improvements to be made. Each pass through the steps listed above exposes another avenue for investigation. Thus far we've been able to make improvements without too much awareness of the incoming request: we've not been adding conditionals or special-cases. Adding those will probably take us into a new world of opportunities.

Most of the application time is spent interacting with the database. Little has yet been done to explore tweaking the schema (things like de-normalization) or tweaking the database configuration (threads available, cache sizes, using SSDs). All of that will have impact.

And, in the end, because Placement is a simple web application over a database, the easiest way to get more performance is to add more web and database servers and load balance them. However, that's a cop-out; we should save cycles where we can. Everything is expensive at scale.

by Chris Dent at August 08, 2019 04:30 PM

Aptira

Designing & Building a Network Functions Virtualisation Infrastructure (NFVi) Orchestration Layer using Cloudify

One of our customers was building a greenfield Network Functions Virtualisation Infrastructure (NFVi) and required Orchestration capabilities, but lacked the skills to do this themselves. Designing an ideal deployment model of the Orchestration system using Cloudify is a major challenge, but this type of challenge is what the Aptira engineers relish.


The Challenge

This greenfield NFVi platform consists of a private Cloud with a high-fidelity full-stack configuration that includes a Cloud platform / Virtualised Infrastructure Manager (VIM), Orchestration, Software Defined Networking (SDN), and solution-wide alarming and monitoring, spread across multiple data centres in multiple regions.

Their internal team did not have a deep skill base in the area of Network Function Virtualisation Orchestration (NFVO) and so turned to Aptira to augment their core team with these skills.

In this engagement Aptira was responsible for designing and building the Orchestration layer using Cloudify in the NFVi platform. The requirements for the platform were world-class enterprise Telco standards, and presented multiple design challenges:

  • National service level scalability
  • High availability across geo-distributed NFV systems
  • Lack of concrete use case (since it was still early NFV days), and
  • The myriad technical and operational requirements associated with such a large-scale platform

The Aptira Solution

There were many stated requirements for the NFVi platform, but two would determine the success or failure of the design: Scalability and Performance, the key considerations for building large-scale, geo-distributed NFV systems.

Aptira’s analysis of the customer requirements zeroed in on one key factor: the number and distribution of VNFs deployed and managed in the platform, combined with the frequency of configuration change. New VNFs or changed orchestration models further increase the demand on the orchestration function.

Orchestration has been implemented in the customer’s NFVi platform using Cloudify to manage the VNF lifecycles. An increase in the number of VNF deployments may impact scalability and performance requirements in a non-linear manner. As such, these factors have shaped the design of the deployment architecture of the orchestration layer.

The key design considerations for the deployment architecture include (but are not limited to) the following:

  • Number of VNFs to be managed
  • Operational design of VNFs
  • Number of NFVi PoPs across which VNFs are to be orchestrated
  • Latency/delay between Cloudify and the VNFs
  • Number of technology domains across which Cloudify has to orchestrate
  • Envisioned roadmap of the expansion of NFVi deployments

Factoring all these design elements into our analysis, Aptira designed two deployment options for consideration by our customer:

  • Flat model: in which only one instance of Cloudify will be deployed. This Cloudify instance manages VNF instances and orchestration across different NFVi Domains/PoPs as shown in figure 1.
  • Hierarchical model: in which Cloudify is deployed in each of the NFVi domains managing VNFs and orchestrating resources across domain specific NFVi-PoPs. And then a Global orchestrator to handle the orchestration across multiple NFVi/technology domains as shown in figure 2.

Each model has its pros and cons:

Whilst the Flat model is simple to deploy and is able to handle most of the orchestration-related transactions, it suffers when transactions must be handled across multiple data centres, thereby introducing a dependency on WAN latency.

The Hierarchical model requires careful consideration of resource allocation and deployment across failure domains, but has significant advantages when handling operational aspects of VNFs such as Closed Loop Automation Policy (CLAMP). Localizing such actions increases the uptime of VNFs.

Aptira presented two options mainly due to the absence of defined tenant workloads and use cases. Our intent was to demonstrate to the customer the full range of possibilities and to work with the customer on how to choose the appropriate deployment model depending on the emerging tenant requirements.


The Result

Aptira was able to validate both deployment models for the customer using a real telco use case, and also prepared a design paper for use by the solution architects working on the entire NFVi solution. This allowed the customer to plan their deployments and talk to their tenants about the use cases that can be realized with such a model.


Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Designing & Building a Network Functions Virtualisation Infrastructure (NFVi) Orchestration Layer using Cloudify appeared first on Aptira.

by Aptira at August 08, 2019 01:30 PM

August 07, 2019

Aptira

Swinburne Nextcloud Storage

Aptira Swinburne Nextcloud Case Study

Aptira previously built a large Ceph storage cluster for Swinburne. While the Ceph storage has been reliable, Swinburne wanted to offer a Dropbox-like user experience for staff and students on top of this on-premises storage.


The Challenge

Swinburne wanted to improve access to the Ceph storage in a number of ways:

  • Improve ease-of-use and features for users: the standard storage protocols offered by Ceph are not readily accessible by less technical users. Swinburne wanted to make storage available to a broader cohort of users by adding a user-friendly interface with sharing and collaboration features.
  • Reduce the maintenance required of their IT services department to manage it: At the time Swinburne were using a variety of methods to provision access to storage – all of them requiring manual steps before the storage could be delivered to a user. Keeping track of the current storage allocations had also become a burden for staff.
  • Integrate authentication with their existing Azure AD system: Allow users to login via SSO.
  • Integrate storage account requests into their existing ITSM system to enable self-service provisioning for users.

Swinburne had identified a few candidate products that might fulfill their requirements, but had not looked at each in any great depth due to internal resourcing constraints.


The Aptira Solution

Aptira first undertook an evaluation of four candidate storage applications. We rapidly deployed each application in an environment within Swinburne so the features and functionality of each could be compared. We produced a detailed evaluation report that allowed Swinburne to make an informed decision about which application to move forward with. Two leading candidates were put forward by Aptira and those deployments were converted into a larger-scale proof-of-concept that included integration with the actual Ceph storage so Swinburne staff and IT services team could get a feel for using each application.

The Nextcloud application was eventually chosen as it met the majority of their user and business requirements. From here Aptira developed a comprehensive solution architecture, paying particular concern to high availability and the ability to scale as the user base increased.

According to the solution architecture, Aptira deployed:

  • A MariaDB Galera cluster
  • A Kubernetes cluster to host the Nextcloud platform
  • Nextcloud, Redis and a MariaDB proxy as containers

Kubernetes was selected as the container orchestration platform due to its self-healing and scaling capabilities, and its ability to simplify application deployment and configuration. While the Nextcloud community provides a pre-built container image, it was not suitable for a multi-node production deployment, so we developed a custom image using the existing image as a base.

Maintainability was a significant concern for Swinburne so we ensured that all components of the architecture were deployed using Ansible to eliminate any manual steps in the deployment. We integrated our Ansible work into Swinburne’s existing Ansible Tower deployment, creating job templates so that deployments could be triggered from the Tower server. Since all of our work was being stored in Git on Swinburne’s GitLab server, we also created CICD pipelines to both build the Nextcloud container image and to trigger deployment to their test and production environments via Ansible Tower. During handover, Swinburne IT staff were able to deploy changes to the test environment by simply committing code to the repository.

Finally, we worked with ITSM staff to integrate the new service into Swinburne’s self-service portal, so users can request access to storage and make changes to their allocated quota.


The Result

Swinburne staff now have a stable and performant web-based storage service where they can upload, manage and share on-premises data.

As the uptake of the service increases, IT staff also have the confidence that the service can be scaled out to handle the increasing interest from users.

By recommending applications with an external API, Aptira made sure that Swinburne’s ITSM system would easily integrate with Nextcloud and satisfy Swinburne’s requirement to have a single pane of glass for all user service requests. With ITSM integration, Swinburne IT have also gained a charge-back capability to recover costs from other departments.

The solution was built with 100% open source components, reducing vendor lock-in.

While Aptira is happy to recommend and deploy greenfield DevOps infrastructure to support a company’s CICD needs, this project showed that we can also customise our solutions to fit in with our customers’ existing DevOps infrastructure, configuring a complete deployment pipeline for provisioning the entire solution.


OTHER SWINBURNE CASE STUDIES

  • Swinburne Case Study 1: We teamed up with SUSE to build a very high-performing and scalable storage landscape at a fraction of the cost of traditional storage systems.
  • Swinburne Case Study 2: Swinburne needed to set up a massive (think petabyte) storage system for their researchers to store valuable research data.
  • Swinburne Case Study 3: As SUSE Storage 5 was released, Swinburne wanted to take advantage of its new features, so we planned an upgrade to this latest version.
  • Swinburne Case Study 4: Swinburne wanted to offer a Dropbox-like user experience for staff and students on top of this on-premises storage.

Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Swinburne Nextcloud Storage appeared first on Aptira.

by Aptira at August 07, 2019 01:41 PM

August 02, 2019

Chris Dent

Placement Update 19-30

Pupdate 19-30 is brought to you by the letter P for Performance.

Most Important

The main things on the Placement radar are implementing Consumer Types and cleanups, performance analysis, and documentation related to nested resource providers.

What's Changed

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (2) stories in the placement group. 0 (0) are untagged. 3 (1) are bugs. 5 (0) are cleanups. 11 (1) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 12 microversions.

  • https://review.opendev.org/666542 Add support for multiple member_of. There's been some useful discussion about how to achieve this, and a consensus has emerged on how to get the best results.

  • https://review.opendev.org/640898 Adds a new '--amend' option which can update resource provider inventory without requiring the user to pass a full replacement for inventory.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

I started some performance analysis this week. Initially I worked with placement master in a container, but as I started making changes I moved back to container-less. What I discovered was that there is quite a bit of redundancy in the code in the objects package that I was able to remove. For example we were creating at least twice as many ProviderSummary objects as required in a situation with multiple request groups. It's likely there would have been more duplicates with more request groups. That's improved in this change, which is at the end of a stack of several other like-minded improvements.
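
The shape of that kind of fix is simple request-scoped memoization. Here is a minimal, generic sketch of the idea (the class and method names are illustrative only; the real RequestWideSearchContext in placement does considerably more than this):

class RequestWideSearchContext:
    """Carries state for a single allocation candidates request.

    Summaries are built at most once per provider and then reused,
    rather than being rebuilt for every request group.
    """

    def __init__(self):
        self.summaries_by_id = {}

    def summary_for(self, provider_id, build):
        # 'build' is whatever (expensive) callable turns a database row
        # into a summary object; it runs at most once per provider.
        if provider_id not in self.summaries_by_id:
            self.summaries_by_id[provider_id] = build(provider_id)
        return self.summaries_by_id[provider_id]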

The improvements in that stack will not be obvious until the more complex nested topology is generally available. My analysis was based on that topology.

Not to put too fine a point on it, but this kind of incremental analysis and improvement is something I think we (the we that is the community of OpenStack) should be doing far more often. It is incredibly revealing about how the system works and opportunities for making the code both work better and be easier to maintain.

One outcome of this work will be something like a Deployment Considerations document to help people choose how to tweak their placement deployment to match their needs. The simple answer is use more web servers and more database servers, but that's often very wasteful.

Other Placement

Miscellaneous changes can be found in the usual place.

There is one os-traits change being discussed. And two os-resource-classes changes.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

I started working with around 20,000 providers this week. Only 980,000 to go.

by Chris Dent at August 02, 2019 12:35 PM

August 01, 2019

Thomas Goirand

My work during DebCamp / DebConf

Lots of uploads

Grepping my IRC log for the BTS bot output shows that I uploaded roughly 244 times in Curitiba.

Removing Python 2 from OpenStack by uploading OpenStack Stein in Sid

Most of these uploads were uploading OpenStack Stein from Experimental to Sid, with a breaking record of 96 uploads in a single day. As the work for Python 2 removal was done before the Buster release (uploads in Experimental), this effectively removed a lot of Python 2 support.

Removing Python 2 from Django packages

But once that was done, I started uploading some Django packages. Indeed, since Django 2.2 was uploaded to Sid with the removal of Python 2 support, a lot of dangling python-django-* packages needed to be fixed. Not only did Python 2 support need to be removed from them, but often patches were needed in order to fix at least unit tests, since Django 2.2 removed a lot of things that had been deprecated for a few versions. I went through all of the Django packages we have in Debian, and I believe I fixed most of them. I uploaded Django packages 43 times, fixing 39 packages.

Removing Python 2 support from non-django or OpenStack packages

During the Python BoF at Curitiba, we collectively decided it was time to remove Python 2, and that we’ll try to do as much of that work as possible before Bullseye. Details of this will come from our dear leader p1otr, so I’ll let him write the document and won’t comment (yet) on how we’re going to proceed. Anyway, we already have a “python2-rm” release tracker. After the Python BoF, I then also started removing Python 2 support on a few packages with more generic usage. Hopefully, touching only leaf packages, without breaking things. I’m not sure of the total count of packages that I touched, probably a bit less than a dozen.

Horizon broken in Sid since the beginning of July

Unfortunately, Horizon, the OpenStack dashboard, is currently still broken in Debian Sid. Indeed, since Django 1.11, the login() function in views.py has been deprecated in favor of a LoginView class. And in Django 2.2, support for the function has been removed. As a consequence, since the 9th of July, when Django 2.2 was uploaded, Horizon’s openstack_auth/views.py is broken. Upstream says they are targeting Django 2.2 for next February. That’s way too late. Hopefully, someone will be able to fix this situation with me (it’s probably a bit too much for my Django skills). Once this is fixed, I’ll be able to work on all the Horizon plugins which are still in Experimental. Note that I already fixed all of Horizon’s reverse dependencies in Sid, but some of the patches need to be upstreamed.
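
For readers unfamiliar with the Django change, the migration looks roughly like this at the URL configuration level. This is a generic sketch of the upstream Django API change, not the actual Horizon/openstack_auth patch:

# Before: the function-based view, deprecated in Django 1.11 and removed
# in later releases.
from django.conf.urls import url
from django.contrib.auth import views as auth_views

urlpatterns = [
    url(r'^login/$', auth_views.login,
        {'template_name': 'auth/login.html'}, name='login'),
]

# After: the class-based LoginView replaces the function.
from django.urls import path

urlpatterns = [
    path('login/',
         auth_views.LoginView.as_view(template_name='auth/login.html'),
         name='login'),
]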

Next work (from home): fixing piuparts

I’ve already written a first attempt at a patch for piuparts, so that it uses Python 3 and not Python 2 anymore. That patch has already been submitted as a merge request on Salsa, though I haven’t had the time to test it yet. What remains to do is: actually test piuparts with this patch, and fix debian/control so that it switches to Python 3.

by Goirand Thomas at August 01, 2019 11:34 AM

July 31, 2019

Mirantis

Can we stop pretending everything is going to run in containers?

Containers are not the only technology out there -- and they never will be.

by Nick Chase at July 31, 2019 04:05 PM

Osones

Multi-AZ, remote backend, cinder-volume with OpenStack-Ansible

This article describes a common pattern we've been using at Osones and alter way for our customers deploying OpenStack with OpenStack-Ansible.

This pattern applies to the following context:

  • Multi-site (let's consider two) deployment, each site having its own (remote) block-storage solution (could be NetApp or similar, could be Ceph)
  • Each site will be an availability zone (AZ) in OpenStack, and in Cinder specifically
  • The control plane is spread across the two sites (typically: two controllers on one site, one controller on the other)

Cinder is the OpenStack Block Storage component. The cinder-volume process is the one interacting with the storage backend. With some drivers, such as LVM, the storage backend is local to the node where cinder-volume is running, but in the case of drivers such as NetApp or Ceph, cinder-volume will be talking to a remote storage system. These two different situations imply different architectures: in the first case cinder-volume will be running on dedicated storage nodes, in the second case cinder-volume can perfectly well run alongside other control-plane services (API services, etc.), typically on controller nodes.

An important feature of Cinder is the fact that it can expose multiple volume types to the user. A volume type captures the idea of different technologies, or at least different settings and different expectations (imagine: more or less performance, more or fewer replicas, etc.). A Cinder volume type matches a Cinder backend as defined in a cinder-volume configuration. A single cinder-volume can definitely manage multiple backends, and that especially makes sense for remote backends (as defined previously).
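As a quick sketch, a single cinder-volume managing two remote backends would be configured along these lines in cinder.conf (the backend names and driver options here are hypothetical and depend on your storage solution; as explained next, this single-service approach cannot span availability zones):

[DEFAULT]
enabled_backends = netapp_site1,ceph_site2

[netapp_site1]
volume_backend_name = netapp_site1
# NetApp driver options for the first site's storage array would go here

[ceph_site2]
volume_backend_name = ceph_site2
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes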

Now when one wants to make use of the Cinder availability zones feature, it's important to note that a single cinder-volume instance can only be dedicated to a single availability zone. In other words, you cannot have a single cinder-volume part of multiple availability zones.

So in our multi-site context, with each site having its own storage solution (considered remote to Cinder) and with cinder-volume running on the control plane, we'd be tempted to configure one cinder-volume with two backends. Unfortunately, due to the limitation mentioned earlier, this is not possible if we want to expose multiple availability zones. It is therefore required to have one cinder-volume per availability zone. This is in addition to having cinder-volume running on all the controller nodes (typically: three) for obvious HA reasons. So we would end up with two cinder-volume (one per AZ) on each controller node; that would be six in total.

This is where OpenStack-Ansible and its default architecture come in handy. OpenStack-Ansible runs most of the OpenStack (and some non-OpenStack as well) services inside LXC containers. When using remote backends, it makes sense to run cinder-volume in LXC containers, on control plane nodes. Luckily, with OpenStack-Ansible it's easy to run one or more cinder-volume (or anything else, really) LXC containers per host (controller node), using the affinity option.

/etc/openstack_deploy/openstack_user_config.yml example to deploy two cinder-volume LXC containers per controller:

storage_hosts:
  controller-01:
    ip: 192.168.10.100
    affinity:
      cinder_volumes_container: 2
  controller-02:
    ip: 192.168.10.101
    affinity:
      cinder_volumes_container: 2
  controller-03:
    ip: 192.168.10.102
    affinity:
      cinder_volumes_container: 2

Then, thanks to the host_vars mechanism, it's also easy to push the specific availability zone configuration as well as the backend configuration to each cinder-volume. For example in the file /etc/openstack_deploy/host_vars/controller-01_cinder_volumes_container-fd0e1ad3.yml (name of the LXC container):

cinder_storage_availability_zone: AZ1
cinder_backends:
  # backend configuration
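For illustration, the backend section could be filled in like this for a hypothetical Ceph RBD backend serving AZ1 (the keys under the backend name end up in that backend's cinder.conf section; adjust them to your actual storage solution):

cinder_backends:
  rbd_az1:
    volume_backend_name: rbd_az1
    volume_driver: cinder.volume.drivers.rbd.RBDDriver
    rbd_pool: volumes
    rbd_ceph_conf: /etc/ceph/ceph.conf
    rbd_user: cinder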

You end up with each controller being able to manage both storage backends in the two sites, which is quite good from a cloud infrastructure HA perspective, while correctly exposing the availability zone information to the user.

by Adrien Cunin at July 31, 2019 03:00 PM

OpenStack Superuser

OpenStack Homebrew Club: Meet the sausage cloud

Like a lot of engineers, Nick Jones never met a piece of hardware that didn’t spark joy. Or at least enough interest to keep around collecting dust until the next idea sparked.

One night at the pub, Jones, community engineering lead at Mesosphere, and former colleague Matt Illingworth realized that if they combined parts, they could build a “very small but serviceable” cloud platform: “A little too much for ‘homelab’ meddling, but definitely enough to do something interesting,” Jones says.

Our Homebrew series highlights how OpenStack powers more than giant global data centers, showing how Stackers are using it at home. Here we're stretching the definition a little: this deployment is tucked away in a former bunker. Not exactly a cluster in the closet, but decidedly in line with the hacker-hobbyist spirit.

Matt Illingworth checks the innards.

Now that the flame was lit, the pair ticked over options for where to put it. As luck would have it, a friend had plunked down for a decommissioned nuclear bunker tucked into the southern highlands of Scotland near Comrie. In one epic weekend, these brave hearts drove from Manchester to build a rack, install the hardware, deploy bootstrap and out-of-band infrastructure, configure basic networking and test it enough to manage with some confidence from remote. All in time to drive back 252 miles for their day jobs.

Figuring out what to call their creation kept them occupied from Glasgow to Lancaster. Illingworth wanted something that was universally liked (skirting the horror of both vegans and many tech-conference attendees) and settled on sausage. That decision, in turn, flavored the names of the virtual machines: chipolata, hotdog, saveloy, cumberland and bratwurst.

“Anyway, it seemed like a good idea after having been awake for about 18 hours,” Jones says.

Once back home, with just a few more days of work using Canonical’s MAAS and the Kolla project, they had a functioning OpenStack platform up and running — as a public cloud.

“Anyone who still thinks OpenStack is hard to deploy and manage is dead wrong,” Jones says. Some 14 months and two upgrades later (made painless with Kolla, Jones adds) it’s still afloat. But perhaps not for long: they’re looking for folks to chip in to pay for costs and who might be interested in using it, too. (You can get in touch with Jones through his website.)

Superuser asked Jones a few questions about the particulars.

Tell us more about the hardware.

It’s a hobby project so hopefully we won’t be shamed for the pitiful state of the hardware but it’s good enough to be of use!
Right now it’s running on a selection of vintage HP BL460c G6 blades – 10 of them in total at the minute, each with 192GB RAM and a pair of mirrored SSDs. This gives us a reasonable amount of density and serviceable I/O, although they’re power hungry since they’re a very old generation of Xeon. Currently on 1GbE networking but we’re hoping to switch that out to 10GbE soon.

What are you running on it?

In terms of services, aside from the ’standard’ OpenStack services, it also runs Magnum for Kubernetes clusters on demand and Designate for DNS-as-a-Service. The one service we don’t yet run is Cinder so there’s no persistent storage available, but as with the networking upgrade we’re hoping to add a small amount of that in the not-too-distant future, again probably on donated hardware. No object storage either. Given the hardware we’d probably deploy Ceph to take care of both those.

Who’s using it and what are they doing with it?

Most of the users of the platform have found it useful to be able to spin up a handful of pretty big (over 16GB ram) VMs in order to be able to do remote development work. It’s really handy for people who don’t want to run big Devstack or Minikube (for example) clusters on their laptops locally and who’d rather just SSH into somewhere else to do that sort of thing, but not worry about a really expensive bill which would be the case pretty much everywhere else. With enough of us who find such a service useful all clubbing together, it just about covers the costs of running it.

What’s next

Longer-term plans are to put the configuration for the whole platform online and welcome pull requests to add or change configuration for various services – along with comprehensive testing, of course. This would probably appeal to a subset of OpenStack developers who’d like to test how their service runs on a public cloud.

More on the specifics of the deployment on his blog.

Got an OpenStack homebrew story? Get in touch: editorATopenstack.org

All photos courtesy Nick Jones.

The post OpenStack Homebrew Club: Meet the sausage cloud appeared first on Superuser.

by Nicole Martinelli at July 31, 2019 02:01 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 7: Comparison and Product Rating

This final part of our Software Defined Networking (SDN) Controller comparison series includes an in-depth evaluation and product rating for each of the most popular Open Source SDN controllers in industry and academia, including: the Open Network Operating System (ONOS), OpenDaylight (ODL), OpenKilda, Ryu and Faucet.
It is important to understand the motivations behind the available platforms. Each design suits different use cases, as the choice depends not only on the capability matrix, but also on the cultural fit between the organisation and the project.

Architecture

As with most platforms, there are trade-offs to be considered when comparing a centralised, tightly coupled control plane to a decentralised, scalable and loosely coupled alternative SDN controller.

Centralised architectures such as ONOS and ODL tend to be easier to maintain and confer lower latency between the tightly coupled southbound API, PCE and Northbound APIs. However, as the scale increases, centralised controllers can become a bottleneck. In an SD-WAN context this can increase control plane latency but can be mitigated in a distributed architecture.

Distributed architectures such as OpenKilda and Faucet are generally more complex to maintain and deploy but can allow the platform to scale more effectively. By decoupling the processing of PCE, Telemetry and Southbound interface traffic, each function can be scaled independently to avoid performance bottlenecks. Additionally, specialised tools to handle big datasets, time series databases or path computation at scale become viable without adversely impacting southbound protocol performance.

Ryu is different from the other options: although it has a core set of programs that are run as a ‘platform’, it is better thought of as a toolbox with which SDN controller functionality can be built.

Modularity and Extensibility

The modularity of each controller is governed by the design focus and programming languages. Platforms such as ONOS and ODL have built-in mechanisms for connecting code modules, at the expense of centralising processing to each controller. These two Java-based controllers take advantage of OSGi containers for loading bundles at runtime, allowing a very flexible approach to adding functionality.

Python based controllers such as Ryu provide a well-defined API for developers to change the way components are managed and configured.

Adding functionality to Faucet and OpenKilda is achieved through modifying the systems that make use of their northbound interfaces, such as the Apache Storm cluster or equivalent. This provides the added flexibility of using different tools and languages depending on the problem being solved. Additionally, increasing the complexity of northbound interactions does not negatively impact on the SDN directly.

Scalability

Of the options being considered, only ONOS and ODL contain internal functionality for maintaining a cluster. Each of these platforms is backed by a distributed datastore that shares the current SDN state and allows for controllers to failover in the event of a cluster partition. As new releases of each of the controllers emerge, this functionality looks to be evolving.

OpenKilda approaches cluster scalability in a modular way. While Floodlight is used as a southbound interface to the switch infrastructure, responsibility for PCE and telemetry processing is pushed northward into a completely separate Apache Storm based cluster. Each Floodlight instance is idempotent, with no requirement to share state. The Apache Storm cluster is by design horizontally scalable and allows throughput to be increased by adding nodes.

Both Ryu and Faucet contain no intrinsic clustering capability and require external tools such as Zookeeper to distribute a desired state. With both of these platforms, extra instances of the controller can be started independently as long as the backing configuration remains identical. PCE functionality for these controllers could be pushed down to the instance in the form of modules, or implemented in a similar manner to OpenKilda, backed by a processing cluster of choice.

As the scale of the SDN grows, it becomes untenable for a single localised cluster to handle the load from every switch on the network. Leaving aside geographic distribution of the controllers, breaking the network into smaller logical islands decreases the need for a single southward looking cluster to be massively scalable. With this design, coordination between the islands becomes critical and while a centralised view of the network is still required, the absence of PCE and telemetry processing should not affect data plane stability once flows are configured.

Ryu, Faucet, ODL and ONOS all look to scale in this way by including native BGP routing capabilities to coordinate traffic flows between the SDN islands. Universal PCE and telemetry processing will need to be developed for each of these cases, with OpenKilda providing a working reference architecture for achieving this. Given the state of OpenKilda’s documentation, its BGP capability would need to be developed.

Interfaces

Considering future compatibility requirements for southbound control, ONOS, ODL and Ryu include protocols beyond just OpenFlow. P4, Netconf and OF-Config could enable additional switch hardware options moving forward should it be required.

The northbound API turns out to be one of the key differentiators between the platforms on offer. ONOS and ODL offer the largest set of northbound interfaces, with gRPC and RESTful APIs (among others) available, making them the easiest to integrate. Ryu and OpenKilda offer more limited RESTful APIs compared to ONOS and ODL. Faucet takes a completely different approach to applying changes, relying on configuration files to track intended system state instead of instantaneous API calls. This approach will require external tools for dynamically applying configuration but does open the SDN to administration by well-understood CI/CD pipelines and testing apparatus.

Telemetry

One of the primary problems with maintaining an SDN is extracting and using any available telemetry to infer system state and help remediate issues. On this front, ODL lacks functionality, with telemetry still being an experimental module in the latest upstream version. ONOS has modules available to allow telemetry to be used through Grafana or InfluxDB.

Faucet can export telemetry into Influxdb, Prometheus or flat text log files. While Prometheus saves data locally, it can also be federated, allowing centralised event aggregation and processing, while maintaining a local cache to handle upstream processing outages and maintenance.

OpenKilda uses Storm, which provides a computation system that can be used for real-time analytics. Storm passes the time-series data to OpenTSDB for storage and analysis. Neo4j, a graph analysis and visualisation platform, initially provided the PCE functionality.

Ryu doesn’t provide any telemetry functionality. This needs to be provided via external tools.

Resilience and Fault Tolerance

The ONOS and ODL platforms implement native clustering as part of their respective offerings. Both provide fault tolerance by running an odd number of SDN controllers. In the event of master node failure, a new leader is elected to take control of the network. The mechanism for choosing a leader is slightly different in these two controllers: ONOS focuses on eventual consistency, while ODL focuses on high availability.

The remaining controllers (OpenKilda, Ryu and Faucet) have no inbuilt clustering mechanism, instead relying on external tools to maintain availability. This simplifies the architecture of the controllers and releases them from the overhead of maintaining distributed databases for state information. High availability is achieved by running multiple, identically configured instances, or a single instance controlled by an external framework that detects and restarts failed nodes.

For Ryu, fault tolerance can be provided by Zookeeper, which monitors the controllers in order to detect controller failures and shards state between cluster members. For Faucet in particular, which is designed to sit in a distributed, shared SDN and be controlled by static configuration files, restarting a controller is a quick, stable exercise that has no reliance on upstream infrastructure once the configuration is written.

Programming Language

ONOS, ODL and OpenKilda are written in Java, for which development resources are abundant in the market, with good supporting documentation and libraries available. While using Java should not be seen as a negative, Java processes can tend to be heavyweight and require resource and configuration management to keep them lean and responsive.

Ryu and Faucet are written in Python, a well-supported language with an active community developing each framework. The documentation is concise and technical, aimed at developers who want to maximise the utility of the system. Python is not a fast language and has inherent limitations due to both its dynamic type representations and its limited multi-threading capabilities (when compared with Java, Golang or C++).

Community

Both ODL and ONOS benefit from large developer and user communities under the Linux Foundation Networking banner. Many large international players are involved in the development and governance of these projects, which could add to the longevity and security over time. A possible downside is, as with any large project, there are many voices trying to be heard and stability can be impacted by feature velocity. This has occurred with similar projects such as OpenStack in the immediate past.

OpenKilda has a small but active community, which can limit the supportability, velocity and features of the platform. OpenKilda needs your support – chat with us to get involved.

Between these two extremes are Ryu and Faucet. Both are well-supported, targeted controllers. Due to the emerging nature of the field, both options look to have a bright future, with a simpler, streamlined approach to change submission and testing.

Evaluation Scoring Table

Based on the above criteria, we’ve scored each product against each weighted criterion. The results are below:

Criterion                            Weight   ONOS    ODL     OK    Ryu  Faucet
OpenFlow Support                       20.0   20.0   19.0   12.0   20.0    20.0
Northbound API support                 20.0   20.0   20.0   12.0   16.0     8.0
Southbound API support                 10.0   10.0   10.0    6.0    8.0     8.0
Programming Language                    5.0    4.0    4.0    4.5    4.5     4.5
Core Components features / services     5.0    4.5    4.5    3.5    2.0     3.5
Native Clustering Capabilities         10.0    9.0    7.0   10.0    2.0     5.0
Typical Architecture                    3.0    2.7    2.4    2.7    2.4     2.7
Horizontal Scalability                  5.0    3.5    3.0    4.5    1.0     4.0
Vertical Scalability                    5.0    3.5    3.0    5.0    4.5     0.5
Extensibility                           2.0    1.8    1.6    1.8    1.8     1.6
Community Size & Partnerships           5.0    4.5    4.5    1.0    4.5     3.5
Resilience and Fault Tolerance          5.0    4.0    3.0    4.5    4.0     4.5
Operations Support                      5.0    4.5    2.5    4.0    2.5     3.5
Weighted Score                        100.0   92.0   84.5   71.5   73.2    69.3

Product Rating

Based on our weighted criteria-based scoring, the evaluation ranks the products as per the below table:

Rank Product Score
1 ONOS  92.0%
2 ODL 84.5%
3 Ryu 73.2%
4 OK 71.5%
5 Faucet 69.3%

Conclusion

The effort spent investigating the current Software Defined Networking (SDN) Controller platforms provides insight into the available Open Source SDN controllers, and should help users choose the SDN controller which best matches their network design and requirements.

SDN Controller Comparisons:

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Comparison of Software Defined Networking (SDN) Controllers. Part 7: Comparison and Product Rating appeared first on Aptira.

by Farzaneh Pakzad at July 31, 2019 01:02 PM

Chris Dent

Profiling Placement in Docker

Back in March, I wrote Profiling WSGI Apps, describing one way to profile the placement service. It was useful enough that a version of it was added to the docs.

Since then I've wanted something a bit more flexible. I maintain a container for placement on Docker hub. When I want to profile recent master instead of code in a proposed branch, using a container can be tidier. Since this might be useful to others I thought I'd better write it down.

Get and Confirm the Image

Make sure you are on a host with docker and then get the latest version of placedock:

docker pull cdent/placedock

We can confirm the container is going to work with a quick run:

docker run -it --env OS_PLACEMENT_DATABASE__SYNC_ON_STARTUP=True \
               --env OS_API__AUTH_STRATEGY=noauth2 \
               -p 127.0.0.1:8080:80 \
               --rm --name placement cdent/placedock

In another terminal check it is working:

curl -s http://127.0.0.1:8080/ |json_pp

should result in something similar to:

{
   "versions" : [
      {
         "min_version" : "1.0",
         "links" : [
            {
               "href" : "",
               "rel" : "self"
            }
         ],
         "status" : "CURRENT",
         "max_version" : "1.36",
         "id" : "v1.0"
      }
   ]
}

Ctrl-c in the terminal that is running the container. We don't want to use that invocation because we need one that will persist data properly.

Dockerenv for Convenience

To enable persistence, a dockerenv file will be used to establish configuration settings. The one I use, with comments:

# Turn on debug logging
OS_DEFAULT__DEBUG=True
# Don't use keystone for authentication, instead pass
# 'x-auth-token: admin' headers
OS_API__AUTH_STRATEGY=noauth2
# Make sure the database has the right tables on startup
OS_PLACEMENT_DATABASE__SYNC_ON_STARTUP=True
# Connection to a remote database. The correct database URI depends on
# your environment.
OS_PLACEMENT_DATABASE__CONNECTION=postgresql+psycopg2://cdent@192.168.1.76/placement?client_encoding=utf8
# The directory where profile output should go. Leave this commented until
# sufficient data is present for a reasonable test.
# OS_WSGI_PROFILER=/profiler

Create your own dockerenv file, set the database URI accordingly (if you're not sure about what this might be, Quick Placement Development may be useful), and start the container back up. This time we will use the dockerenv file and put the container in the background:

docker run -idt -p 127.0.0.1:8080:80 \
       --env-file dockerenv \
       --rm --name placement \
       -v /tmp/profiler:/profiler \
       cdent/placedock

We've added a volume so that, eventually, profiler output can be saved to the disk of your container host, rather than the container itself. For the time being profiling is turned off because we don't want to slow things down while loading data into the system.

If you want to, confirm things are working again with:

curl -s http://127.0.0.1:8080/ |json_pp

Load Some Data

In some cases you don't need pre-existing data when profiling. If that's the case, you can skip this step. What you need to use to set up data may be very different from what I'm doing here.

Loading up the service with a bunch of data can be accomplished in various ways. I use a combination of shell scripts and gabbi. The shell script is responsible for the dynamic data while gabbi is responsible for the static structure. Some pending changes use the same system for doing some performance testing for placement. We will borrow that system. To speed things up a bit we'll use parallel.

seq 1 100 | \
   parallel "./gate/perfload-nested-loader.sh http://127.0.0.1:8080 gate/gabbits/nested-perfload.yaml"

Note: This will not work on a Mac if you are using the built-in uuidgen. You may need to pipe its output through tr '[A-Z]' '[a-z]'.
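For instance, the substitution looks like this (purely illustrative; where exactly it belongs depends on the loader script, which is not shown here):

# macOS uuidgen emits uppercase UUIDs; lowercase them before handing them to the loader
uuidgen | tr '[A-Z]' '[a-z]'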

You can see how many providers you've created with a request like:

curl -s -H 'x-auth-token: admin' \
     -H 'openstack-api-version: placement latest' \
     http://127.0.0.1:8080/resource_providers | \
     json_pp| grep -c '"name"'

Once you have a sufficient number of resource providers and anything else you might like (such as allocations) in the system, you can start profiling.

Profiling

When we were loading data we had profiling turned off. Now we'd like to turn it on. Edit the dockerenv file to uncomment the OS_WSGI_PROFILER line, then docker kill placement and run the container again (using the same args as above). A restart will not work because we've changed the environment.

Make a query, such as:

curl http://127.0.0.1:8080/

and look in /tmp/profiler for the profile output; the filename should look something like this:

GET.root.19ms.1564579909.prof

If you have snakeviz installed you can inspect the profiling info from a browser (as described in the previous blog post):

snakeviz GET.root.19ms.1564579909.prof

That's not a very interesting request. It doesn't exercise much of the code nor access the database. If the system has been loaded with data as above, the following will query it:

curl -H 'x-auth-token: admin' \
     -H 'openstack-api-version: placement latest' \
"http://127.0.0.1:8080/allocation_candidates?\
resources=DISK_GB:10&\
required=COMPUTE_VOLUME_MULTI_ATTACH&\
resources_COMPUTE=VCPU:1,MEMORY_MB:256&\
required_COMPUTE=CUSTOM_FOO&\
resources_FPGA=FPGA:1&\
group_policy=none&\
same_subtree=_COMPUTE,_FPGA"

and then:

snakeviz GET.allocation_candidates.792ms.1564581384.prof

snakeviz sunburst

by Chris Dent at July 31, 2019 01:00 PM

StackHPC Team Blog

CloudKitty and Monasca: OpenStack charging without Telemetry

CloudKitty and Monasca project mascots

Tracking resource usage, and charging for it, is a requirement for many cloud deployments. Public clouds obviously need to bill their customers, but private clouds can also use chargeback and showback policies to encourage more efficient use of resources. In the OpenStack world, CloudKitty is the standard rating solution. It works by applying rating rules, which turn metric measurements into rated usage information.

For several years, gathering metrics in OpenStack has been implemented by two separate project teams: Telemetry and, more recently, Monasca. The future of Telemetry, which produces the Ceilometer software, is uncertain: historical contributors have stopped working on the project and its de-facto back end for measurements, Gnocchi, is also seeing low activity. Although Telemetry users have volunteered to maintain the project, the Monasca project appears to be healthier and more active.

Since deploying Monasca is our preferred choice to monitor OpenStack, we asked ourselves: can we use CloudKitty to charge for usage without deploying a full Telemetry software stack?

Ceilometer + Monasca = Ceilosca

Ceilometer is well integrated in OpenStack and can collect usage data from various OpenStack services, either by polling or listening for notifications. Ceilometer is designed to publish this data to the Gnocchi time series database for storage and querying.

In Monasca, metrics collected by the Monasca Agent focus more on monitoring the health and performance of the infrastructure and its services, rather than resource usage from end users (although it can gather instance metrics via the Libvirt plugin). Monasca stores these metrics in a time series database, with support for InfluxDB and Cassandra.

Despite this, we are not required to deploy and maintain Gnocchi just to collect usage data via Ceilometer: monasca-ceilometer, also known as Ceilosca, enables Ceilometer to publish data to the Monasca API for storage in its metrics database. Although Ceilosca currently lives in its own repository and must be installed by adding it to the Ceilometer source tree, there is an ongoing effort to integrate it directly into Ceilometer.

By default, Ceilosca will push several metrics based on detailed instance information, such as disk.root.size, memory, and vcpus, to Monasca under the service tenant. Each metric will be associated with a specific instance ID via the resource_id dimension. Metric dimensions also include user and project IDs. For example, to retrieve metrics associated with the p3 project, we can use the Monasca Python client:

monasca metric-list \
--tenant-id $(openstack project show service -c id -f value) \
--dimensions project_id=$(openstack project show p3 -c id -f value)

Once stored in Monasca, these metrics can be used by CloudKitty, thanks to the inclusion of a Monasca collector since the Queens release.

Let's see how we can apply a charge to the vcpus metric. We need to configure CloudKitty with the metrics.yml file to know about our metric:

metrics:
  vcpus:
    unit: vcpus
    groupby:
      - resource_id
    extra_args:
      resource_key: resource_id

Then, we configure the hashmap rating rules to apply a rate to CPU usage. We create a vcpus service and then create a mapping with a cost of 0.5 per CPU hour:

$ cloudkitty hashmap service create vcpus
+-------+--------------------------------------+
| Name  | Service ID                           |
+-------+--------------------------------------+
| vcpus | cb72cd89-43ef-46b9-b047-58e0b5335992 |
+-------+--------------------------------------+
$ cloudkitty hashmap mapping create 0.5 -s cb72cd89-43ef-46b9-b047-58e0b5335992 -t flat
+--------------------------------------+-------+------------+------+----------+--------------------------------------+----------+------------+
| Mapping ID                           | Value | Cost       | Type | Field ID | Service ID                           | Group ID | Project ID |
+--------------------------------------+-------+------------+------+----------+--------------------------------------+----------+------------+
| 68465dad-7c68-4f8e-a256-6a62735c1e3b | None  | 0.50000000 | flat | None     | cb72cd89-43ef-46b9-b047-58e0b5335992 | None     | None       |
+--------------------------------------+-------+------------+------+----------+--------------------------------------+----------+------------+

We then launch an instance. Once the instance becomes active, a notification is processed by Ceilometer and published to Monasca, recording that instance b7d926a8-cd63-4205-8f90-e3c610aeaad5 has 64 vCPUs.

$ monasca metric-statistics --tenant-id $(openstack project show service -c id -f value) vcpus avg "2019-07-30T14:00:00" --merge_metrics --group_by resource_id --period 1
+-------+---------------------------------------------------+----------------------+--------------+
| name  | dimensions                                        | timestamp            | avg          |
+-------+---------------------------------------------------+----------------------+--------------+
| vcpus | resource_id: b7d926a8-cd63-4205-8f90-e3c610aeaad5 | 2019-07-30T14:43:01Z |       64.000 |
+-------+---------------------------------------------------+----------------------+--------------+

With the default Kolla configuration, Nova also sends a report notification every hour, which is also stored in Monasca. Similarly, when an instance is terminated, a notification is published and converted into a final measurement in Monasca. However, using the default CloudKitty configuration, every instance measurement is interpreted as if the associated instance ran for the whole hour. For example, an instance launched at 10:45 and terminated at 11:15 would result in two whole hours being charged, instead of just 30 minutes. This can be mitigated by reducing the [collect]/period setting in cloudkitty.conf, for example down to one minute, and adjusting the charge rate to match the new period. For this approach to work, we need to have at least one measurement stored for each period. This isn't possible with audit notifications sent by Nova because one hour is the lowest possible period. An alternative is to rely on continuously updated metrics collected by Ceilometer, such as CPU utilisation. However, this kind of Ceilometer metric is unavailable in our bare metal environment.
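For example, the relevant cloudkitty.conf change is sketched below with a one-minute collection period (remember to scale the hashmap cost to match the shorter period):

[collect]
# collect and rate usage every 60 seconds instead of the default hourly period
period = 60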

Once CloudKitty has analysed usage metrics, we can extract rated data to CSV format. As can be seen below, two whole hours have been charged at 0.5 per vCPU-hour. In this case, the instance had been launched around 14:45 and terminated around 15:20. We have compared using pure Ceilometer and Gnocchi instead of Ceilosca and Monasca and noticed the exact same issue.

$ cloudkitty dataframes get -f df-to-csv --format-config-file cloudkitty-csv.yml
Begin,End,Metric Type,Qty,Cost,Project ID,Resource ID,User ID
2019-07-30T14:00:00,2019-07-30T15:00:00,vcpus,64.0,32.0,35be5437552f40cba2aa6e5cb47df613,b7d926a8-cd63-4205-8f90-e3c610aeaad5,53ed408e5a7a4e79baa76803e1df61d6
2019-07-30T15:00:00,2019-07-30T16:00:00,vcpus,64.0,32.0,35be5437552f40cba2aa6e5cb47df613,b7d926a8-cd63-4205-8f90-e3c610aeaad5,53ed408e5a7a4e79baa76803e1df61d6

A downside of using Ceilosca instead of Ceilometer with Gnocchi is that metadata such as instance flavour is not available for CloudKitty to use for rating by default, at least in the Rocky release that we used. We will update this post if we can develop a configuration for Ceilosca that supports this feature.

OpenStack usage metrics without Ceilometer

Monasca has plans to capture OpenStack notifications and store them with the Monasca Events API, although this is not yet implemented. CloudKitty would require changes to support charging based on these events, since it is currently designed around metrics. It is worth pointing out that an ElasticSearch storage driver has just been proposed in CloudKitty, so these two new designs may line up in the future.

In the meantime, an alternative is to bypass Ceilometer completely and rely on another mechanism to publish metrics to Monasca. As mentioned earlier in this article, Monasca can provide instance metrics via the Libvirt plugin. However, this won't cover other services for which we may want to charge, such as volume usage.

Since the Monasca Agent can scrape metrics from Prometheus exporters, we are exploring whether we can leverage openstack-exporter to provide metrics to be rated by CloudKitty. Stay tuned for the next blog post on this topic!

by Pierre Riteau at July 31, 2019 08:50 AM

Colleen Murphy

How to get work done in open source

After working in open source for a while, I've been on both sides of the code submission dance: proposing a change, and reviewing a change.

Proposing a change to a project can feel harrowing: you don't know how the maintainers are going to respond to it, you don't know whether …

by Colleen Murphy at July 31, 2019 03:00 AM

July 30, 2019

OpenStack Superuser

Building a virtuous circle with open infrastructure: Inclusive, global, adaptable

Technology is constantly evolving. From couch surfing to surfing the web, creative problem solving either rides the wave in real time or gets towed under.

This is where open infrastructure is barreling ahead.

Take, for example, traffic. Great for websites – not so much for commutes. Just as a city expands roads and adds stoplights to meet the needs of a growing population, technology infrastructure demands the same type of thinking. Adaptability is key to keeping things moving.

These two words — open infrastructure — carry a lot of meaning. Let’s break them down in context.

Open refers to open-source components, meaning that the source code is available to anyone to use, study and share with others. Open source is becoming increasingly important to organizations worldwide. Why? Because they’re not locked into closed proprietary boxes, people can build on them and tailor them as needed, with the freedom and flexibility to innovate more effectively. It also allows them to find and create custom solutions faster than the market can provide.

But open source is only half of this equation. As Mark Collier, OpenStack Foundation COO says, “Open source is required, but it’s not enough.” It rests on a base of infrastructure.

Infrastructure is the backbone that supports hardware, software, networks, data centers, facilities and related equipment used to develop, test, operate, monitor, manage or support information technology services.

In its simplest form, Open Infrastructure is IT infrastructure built from open-source technologies, available for all users to work with, to improve and contribute back.

To ensure all these users benefit from open-source software, are able to engage with the community and chart the future course for its development, the Four Opens were set out in early 2010 as guiding principles.

  • Open source – all software developed must be done under an open source license. The software must be usable and scalable for all users anywhere in the world and cannot be feature or performance limited.
  • Open community – the OpenStack community is a level playing field, where anyone can rise or get elected to leadership positions. There is no reserved seat: contribution is the only valid currency.
  • Open development – all development processes are transparent and inclusive. Everyone is welcome to participate and suggestions are considered regardless of prior levels of contribution.
  • Open design – design is not done behind closed doors. It is done in the open, and includes as many people as possible.

These principles have been fundamental to making the OpenStack community what it is today. Currently, the community is drafting blog posts around the Four Opens chronicling the learnings and successes from this approach to share with the broader Open Infrastructure ecosystem. If you want to get involved, you can do so here.

“From the beginning, OpenStack knew there would be a need to interact and integrate with external projects,” says Allison Randal, a board member of OpenStack Foundation since 2012, in an Open Infra Summit lightning talk. That said, as the number of adjacent use cases expanded — from edge, CI/CD, containers and more — the landscape also shifted. The shift was towards offering broader solutions to organizations and projects, supporting their overall IT infrastructure needs.

As the projects grew in number and in size, it became clearer that the ongoing success of OpenStack and these projects were interdependent. This made the shift in focus even more natural and further aligned OpenStack with the goal of supporting and strengthening community.

To further encourage collaboration and boost inclusivity, the OpenStack Summit changed its name to the  Open Infrastructure Summit. The Summit provides common ground from all corners of the community — from 5G to hybrid cloud and dozens of adjacent open-source projects — and the name change reflects this. The goal is to make the summit more open and welcoming to all projects and invite everyone to learn alongside the people building and operating open infra.

Recognizing the shift in open infrastructure, the OSF is embracing it by supporting the communities helping to shape this movement. The OSF also recognizes that no single technology solution is going to support this transition and the integration and knowledge sharing around these open technologies is key to successful implementation.

As our world continues to evolve, open infrastructure will adapt and evolve with it.

Superuser is always interested in community content. Got something to say? Get in touch: editorATopenstack.org

The post Building a virtuous circle with open infrastructure: Inclusive, global, adaptable appeared first on Superuser.

by Ashleigh Gregory and Nicole Martinelli at July 30, 2019 02:04 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 6: Faucet


The final Open Source Software Defined Networking (SDN) Controller to be compared in this series is Faucet. Built on top of Ryu, Faucet is a lightweight SDN Controller adding a critical northbound function for operations teams.

Faucet is a compact open source OpenFlow controller, which enables network operators to run their networks the same way they do server clusters. Faucet moves network control functions (like routing protocols, neighbor discovery, and switching algorithms) to vendor independent server-based software, versus traditional router or switch embedded firmware, where those functions are easy to manage, test, and extend with modern systems management best practices and tools.

Architecture

As shown in the figure below, architecturally each Faucet instance has two connections to the underlying switches: one for control and configuration updates, the other (Gauge) a read-only connection specifically for gathering, collating and transmitting state information for processing elsewhere, using Influxdb or Prometheus.

Comparison of Software Defined Networking (SDN) Controllers. Faucet Diagram

Modularity and Extensibility

Python based controllers provide a well-defined API for developers to change the way components are managed and configured.

Adding functionality to Faucet is achieved through modifying the systems that make use of its Northbound interfaces. This provides the added flexibility of using different tools and languages depending on the problem being solved. Additionally, increasing the complexity of northbound interactions does not negatively impact the SDN directly.

Scalability

Faucet is designed to be deployed at scale such that each instance is close to the subset of switches under its control. Each instance of Faucet is self-contained and can be deployed directly to server hardware or through containers, moving the administration back into well understood areas of automation.

Due to the lightweight nature of the code and the smaller control space for each instance, no clustering is required – each instance is completely idempotent and concerns itself with only what it is configured to control.

Cluster Scalability

  • Faucet contains no intrinsic clustering capability and requires external tools such as Zookeeper to distribute state if this is desired. Extra instances of the controller can be started independently as long as the backing configuration remains identical.
  • PCE functionality for these controllers could be pushed down to the instance in the form of modules, or implemented in a similar manner to OpenKilda, backed by a processing cluster of choice.

Architectural Scalability

  • It does not yet support a cooperative cluster of controllers.

Interfaces

  • Southbound: It manages devices via OpenFlow, with support for VLANs, IPv4, IPv6, static and BGP routing, port mirroring, policy-based forwarding and ACL matching.
  • Northbound: YAML configuration files track the intended system state instead of instantaneous API calls, requiring external tools for dynamically applying configuration. However, it does open the SDN to administration by well-understood CI/CD pipelines and testing apparatus.
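To give a flavour of this configuration-driven approach, a minimal, hypothetical faucet.yaml describing a single switch and VLAN might look like the following (the names and datapath ID are invented):

vlans:
  office:
    vid: 100
dps:
  sw1:
    dp_id: 0x1
    hardware: "Open vSwitch"
    interfaces:
      1:
        name: "host1"
        native_vlan: office
      2:
        name: "host2"
        native_vlan: office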

Telemetry

Faucet can export telemetry into Influxdb, Prometheus or flat text log files. While Prometheus saves data locally, it can also be federated, allowing centralised event aggregation and processing, while maintaining a local cache to handle upstream processing outages and maintenance.

Resilience and Fault Tolerance

Faucet has no inbuilt clustering mechanism, instead relying on external tools to maintain availability. High availability is achieved by running multiple, identically configured instances, or a single instance controlled by an external framework that detects and restarts failed nodes.

For Faucet in particular, which is designed to sit in a distributed, shared SDN and be controlled by static configuration files, restarting a controller is a quick, stable exercise that has no reliance on upstream infrastructure once the configuration is written.

Programming Language

Faucet is written in Python.

Community

Faucet has an active community developing the framework and it is well supported.

Conclusion

Faucet is configured via a YAML file, which makes it a suitable option for CI/CD and testing environments. Faucet uses Prometheus for telemetry processing, while other components such as a PCE need to be developed.
This is the last controller we will evaluate as part of this series. The next post will include a scored rating and detailed evaluation for each SDN controller.

SDN Controller Comparisons:

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Comparison of Software Defined Networking (SDN) Controllers. Part 6: Faucet appeared first on Aptira.

by Farzaneh Pakzad at July 30, 2019 01:32 PM

July 29, 2019

OpenStack Superuser

Beating the learning curve at OSCON

PORTLAND, Ore. — When I registered for this year’s Open Source Conference (OSCON), which was also my first, I selected four tutorials as a part of my ticket. Options ranged from making art with open-source libraries to database management. Many of the sessions looked like the winning card for buzzword bingo: Blockchain! Machine learning! Serverless! After some deliberation, I went with hands-on sessions about Rust, p5.js, building an AI assistant and constructing a programming language.

First day jitters

Leading up to the event, participants were asked to have all the prerequisites set up for the tutorials. We got daily reminders. Daily. For each tutorial. This was annoying to say the least, especially since many of the tutorials told you to clone some git repo and then they didn’t include a link to it in the daily email. There was no way to opt out when you had completed the requirements either.

Monday morning, I arrived bright and early at the venue courtesy of MAX. I’d expected the light rail system to be jammed with conference goers (think: the rib-crushing crowds on trains to FOSDEM) and was pleasantly surprised when it wasn’t. I walked right up to registration, typed in my email address, looked up my ticket and got my badge printed in a flash. (For people still intent on making QR codes good for something, you could log in that way, too.) You were offered a weekly MAX ticket along with your voucher for the conference t-shirt and conference book (it is O’Reilly, after all).

The first tutorial was great! I learned the basics of Rust via a fun lab that involved sword fighting. The instructor was deft at breaking up the material into smaller topics complete with examples before jumping into exercises in the lab covering the new material.

If only the second one had been a bit less dry. It sounded promising: build an AI assistant that you could interact with using the open-source project Rasa. The tutorial was essentially ‘teaching’ the AI assistant by listing thousands of example inputs into the config. The more examples you provide, the more accurate the response of the assistant. It was less engaging than the earlier tutorial, but the instructor was much newer to teaching than the previous one. With a few more repetitions, this could improve.

Tuesday, it was time for the next two hands-on sessions: Processing Foundation’s p5.js project and building a programming language. Despite all the preparatory nudges, it wasn’t clear to me that it was basically a refresher for a few of my college classes  (I had a visualization class that used processing and a C++ class where we built a natural language processor). That said, both instructors were very good. The first tutorial was similar to that of the previous day where some slides walked through particular aspects of the language and some examples before offering a wider view on a larger project to apply the knowledge. The afternoon was a little more continuous and sans slides.

General assembly

The next two days kicked off with keynotes before hitting a roster of presentations. As carefully staged as these performances are, you can’t control everything: Wednesday morning’s keynotes were interrupted by a fire alarm. OSCON organizers still managed to end on time with only a few small changes to the keynote schedules – most speakers had their time cut a few minutes across both days and one keynote got bumped to day two.

The content of the keynotes split between two main themes: the importance of community and how it adds value, stability and marketability to any open-source project, and how open source is part of the future for most businesses (hopefully not the entire business plan, but definitely playing a role). I found myself agreeing with many of the key messages and appreciating the general rallying cry for open source and not one specific project or foundation. A single project won’t solve all the industry’s problems just as a single foundation is not the best home for all projects.

I crammed my agenda with sessions on open-source community, governance models in open source and themes like how to be a good community member etc. Nothing was earth-shattering or exactly new, but it’s always a pleasure to see a lot of good speakers share their experiences and observations. I also really enjoyed that many of the things to aspire to/good community traits/best practices are already built into the OpenStack community. It made me appreciate the stability of our community and the efforts of those who’ve come before me.

Overall, I’d give the event an A- or B+. OSCON brought together a cool mix of open-source projects from many foundations on a relatively level playing field.

Next year, you’ll still find me haunting the halls — even if my talks don’t get picked again, ahem — and I hope next time OpenStack, Kata,  Zuul, Airship and StarlingX can have a larger presence there.

The post Beating the learning curve at OSCON appeared first on Superuser.

by Kendall Nelson at July 29, 2019 02:01 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 5: Ryu


Our Open Source Software Defined Networking (SDN) Controller comparison continues with Ryu. Ryu is a very different proposition to the other options being put forward. Although boasting a core set of programs that are run as a ‘platform’, Ryu is better thought of as a toolbox, with which SDN controller functionality can be built.

Ryu is a component-based software defined networking framework. It provides software components with a well-defined API that makes it easy for developers to create new network management and control applications. Ryu means “flow” in Japanese and is pronounced “ree-yooh”.

Architecture

A Ryu SDN controller is composed of these components:

Comparison of Software Defined Networking (SDN) Controllers. Ryu Diagram
  • Southbound interfaces allow communication of SDN switches and controllers
  • Its core supports limited applications (e.g. Topology discovery, Learning switch) and libraries
  • External applications can deploy network policies to data planes via well-defined northbound APIs such as REST

Modularity and Extensibility

Ryu is structured differently from other solutions in that it provides simple supporting infrastructure that users of the platform must write code to utilise as desired. While this requires development expertise, it also allows complete flexibility of the SDN solution.
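As a brief illustration of what writing code against the platform looks like, here is a minimal Ryu application sketch (a naive hub that floods every packet, assuming OpenFlow 1.3), runnable with ryu-manager; it is only an example, not part of the evaluation:

from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class FloodHub(app_manager.RyuApp):
    # A trivial application: flood every packet-in out of all ports.
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg
        datapath = msg.datapath
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser
        # Only resend the raw packet data if the switch did not buffer it.
        data = None
        if msg.buffer_id == ofproto.OFP_NO_BUFFER:
            data = msg.data
        actions = [parser.OFPActionOutput(ofproto.OFPP_FLOOD)]
        out = parser.OFPPacketOut(datapath=datapath,
                                  buffer_id=msg.buffer_id,
                                  in_port=msg.match['in_port'],
                                  actions=actions,
                                  data=data)
        datapath.send_msg(out)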

Scalability

Ryu does not have an inherent clustering ability and requires external tools to share the network state and allow failover between cluster members.

Cluster Scalability

  • External tools such as Zookeeper distribute a desired state. Extra instances of the controller can be started independently as long as the backing configuration remains identical.

Architectural Scalability

  • While Ryu supports high availability via a Zookeeper component, it does not yet support a co-operative cluster of controllers.

Interfaces

  • Southbound: It supports multiple southbound protocols for managing devices, such as OpenFlow, NETCONF, OF-Config, and partial support of P4
  • Northbound: Offer RESTful APIs only, which are limited compared to ONOS and ODL

Telemetry

Ryu doesn’t provide any telemetry functionality. This needs to be provided via external tools.

Resilience and Fault Tolerance

Ryu has no inbuilt clustering mechanism, instead relying on external tools to maintain availability. High availability is achieved by running multiple, identically configured instances, or a single instance controlled by an external framework that detects and restarts failed nodes.

Fault tolerance can be provided by Zookeeper, which monitors the controllers in order to detect controller failures and shards state between cluster members.

Programming Language

Ryu is written in Python.

Community

There is an active community developing the framework, and it is a well-supported, targeted controller.

Conclusion

Ryu is like a toolbox of software components which provide SDN controller functionality. It supports various southbound interfaces for managing network devices. It is very popular in academia and has been used in OpenStack as a network controller.
Next, we will be evaluating Faucet.

SDN Controller Comparisons:

Remove the complexity of networking at scale.
Learn more about our SDN & NFV solutions.

Learn More

The post Comparison of Software Defined Networking (SDN) Controllers. Part 5: Ryu appeared first on Aptira.

by Farzaneh Pakzad at July 29, 2019 01:19 PM

July 26, 2019

Ed Leafe

Why OpenStack Failed, or How I Came to Love the Idea of a BDFL

OK, so the title of this is a bit clickbait-y, but let me explain. By some measures, OpenStack is a tremendous success, being used to power several public clouds and many well-known businesses. But it has failed to become a powerful player in the cloud space, and I believe the reason is not technical in … Continue reading "Why OpenStack Failed, or How I Came to Love the Idea of a BDFL"

by ed at July 26, 2019 04:56 PM

Chris Dent

Placement Update 19-29

Welcome to a rushed pupdate 19-29. My morning was consumed by other things.

A reminder: The Placement project holds office hours every Wednesday at 1500 UTC in the #openstack-placement IRC channel. If you have a topic that needs some synchronous discussion, then that is an ideal time. Just start talking!

Most Important

The two main things on the Placement radar are implementing Consumer Types and cleanups, performance analysis, and documentation related to nested resource providers.

What's Changed

  • The api-ref has moved to docs.openstack.org from developer.openstack.org. Redirects are in place.

  • Both traits and resource classes are now cached per request, allowing for some name to id and id to name optimizations.

  • A new zuul template is being used in placement that means fewer irrelevant tempest tests are run on placement changes.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 21 (-1) stories in the placement group. 0 (0) are untagged. 2 (0) are bugs. 5 (0) are cleanups. 10 (0) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 12 microversions.

  • https://review.opendev.org/666542 Add support for multiple member_of. There's been some useful discussion about how to achieve this, and a consensus has emerged on how to get the best results.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

Other Placement

Miscellaneous changes can be found in the usual place.

There are two os-traits changes being discussed. And zero os-resource-classes changes (yay!).

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

If we get the chance, it will be interesting to start working with placement with 1 million providers. Just to see.

by Chris Dent at July 26, 2019 03:40 PM

OpenStack Superuser

How to run a packaged function with Qinling

In my previous post about Qinling I explained how to run a simple function, how to get the output returned by this one and what Qinling really does behind the scenes from a high-level perspective.

Here I’ll explain how to run a packaged function including external Python libraries from PyPi (the Python Package Index), or your own repository or even directly from the code itself like a sub-package or other options.

The main difference between a simple function and a packaged function is that with a simple function you are limited to the libraries/packages installed within the runtime, from a serverless perspective.

Most of the time, only the built-in packages available (JSON, HTTP, etc…) allow you to do the basics, but they will constrain your creativity — and we don’t want that!

Qinling can distinguish between the two when you create the function.

A function a bit more complex this time

Compared to the previous post, this function will be a little bit more complex — but not too much, don’t worry. It’s written in Python 3, so just to reiterate, you’ll need a Python 3 runtime.

This function will just return information about a CIDR. By default no argument is required, but I’ll explain how to override the default value when creating an execution with openstack function execution create.

import json
from IPy import IP


def details(cidr="192.168.0.0/24", **kwargs):
    network = IP(cidr)
    version = network.version()
    iptype = network.iptype().lower()
    reverse = network.reverseName()
    prefix = network.prefixlen()
    netmask = str(network.netmask())
    broadcast = str(network.broadcast())
    length = network.len()

    payload = {"ip_version": version, "type": iptype, "reverse": reverse,
               "prefix": prefix, "netmask": netmask, "broadcast": broadcast,
               "length": length, "cidr": cidr}

    print("----------------------")
    print("Function:", details.__name__)
    print("JSON payload:", payload)
    print("----------------------\n")

    return build_json(payload)


def build_json(data):
    indentation_level = 4

    print("----------------------")
    print("Function:", build_json.__name__)
    print("JSON options")
    print("  - indentation:", indentation_level)
    print("  - sort: yes")
    print(json.dumps(data, sort_keys=True, indent=indentation_level))
    print("----------------------")

    return data

The important part of this code is line 2, the import of the IPy library, which doesn’t exist in the runtime. If this code is uploaded as-is, the function execution will fail.

To make this work, the library needs to be at the same level as the ip_range.py file.

$ mkdir ~/qinling
$ wget -O ~/qinling/ip_range.py https://git.io/fj0SQ
$ pip install IPy -t ~/qinling

The ~/qinling directory should look like this after the previous commands:

$ ls ~/qinling
ip_range.py  IPy-1.0.dist-info  IPy.py  __pycache__

Just a quick warning: the pip command used should match the Python version of the runtime; if not, some surprises are to be expected.

The next step is to generate an archive. Qinling has a restriction on the format of the archive: it has to be a ZIP archive generated with the zip command[1].

$ cd ~/qinling/
$ zip -r9 ~/qinling/ip_range.zip ~/qinling/

Run the best function ever ^^

As mentioned above, Qinling has a mechanism to determine whether you’re running a package or not. There are four options available:

  • file: used only with a file, hello_qinling.py
  • package: used only with a ZIP archive, ip_range.zip
  • container/object: will be discussed in a different Medium post
  • image: will be discussed in a different Medium post

So, did you guess which one will be the winner this time? Well… package!

The file option is kind of a “wrapper”: based on the python-qinlingclient code[2], when this option is selected the client takes the filename, removes the extension and creates a ZIP archive.
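
Roughly speaking, that wrapper behaviour boils down to something like the following — a simplified, assumed sketch, not the actual python-qinlingclient code:

import os
import zipfile


def package_single_file(path):
    # e.g. "hello_qinling.py" -> "hello_qinling.zip"
    base, _ = os.path.splitext(path)
    archive = base + ".zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(path, arcname=os.path.basename(path))
    return archive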

$ openstack function create --name func-pkg-1 --runtime python3 --entry ip_range.details --package ~/qinling/ip_range.zip

If the wrong option is used, let’s say --file for a package, then the function will not be executed properly and an error will be raised. When the function is properly created, the execution will return something like the following as output value.

$ openstack function execution create 1030e1ea-2374-40a7-bfbe-216bc5966f55
| result           | {"duration": 0.036, "output": "{
    "broadcast": "192.168.0.255",
    "cidr": "192.168.0.0/24",
    "ip_version": 4,
    "length": 256,
    "netmask": "255.255.255.0",
    "prefix": 24,
    "reverse": "0.168.192.in-addr.arpa.",
    "type": "private"
}"} |

In the function, there are a few print calls used mostly for learning purposes; their output is available only via the openstack function execution log show command.

$ openstack function execution log show 5f2e7d71-7b26-4ab7-9e1a-854d8850e738
Start execution: 5f2e7d71-7b26-4ab7-9e1a-854d8850e738
----------------------
Function: details
JSON payload: {'ip_version': 4, 'type': 'private', 'reverse': '0.168.192.in-addr.arpa.', 'prefix': 24, 'netmask': '255.255.255.0', 'broadcast': '192.168.0.255', 'length': 256, 'cidr': '192.168.0.0/24'}
----------------------

----------------------
Function: build_json
JSON options
  - indentation: 4
  - sort: yes
{
    "broadcast": "192.168.0.255",
    "cidr": "192.168.0.0/24",
    "ip_version": 4,
    "length": 256,
    "netmask": "255.255.255.0",
    "prefix": 24,
    "reverse": "0.168.192.in-addr.arpa.",
    "type": "private"
}
----------------------
Finished execution: 5f2e7d71-7b26-4ab7-9e1a-854d8850e738

What do you think? Pretty nice, right?

Change the default CIDR value

As mentioned previously, no argument is required to execute the function. By default, the CIDR value has been hardcoded to 192.168.0.0/24, but what if you want to change it? You could update the code and create a new function, but there is an easier way.

The solution is to use the --input option and provide a JSON object to it.

$ openstack function execution create 1030e1ea-2374-40a7-bfbe-216bc5966f55 --input '{"cidr": "10.0.0.0/10"}'
| result           | {"duration": 0.035, "output": "{
    "broadcast": "10.63.255.255",
    "cidr": "10.0.0.0/10",
    "ip_version": 4,
    "length": 4194304,
    "netmask": "255.192.0.0",
    "prefix": 10,
    "reverse": "0-255.10.in-addr.arpa.",
    "type": "private"
}"} |

Now run the openstack function execution log show command to see the differences between the two CIDRs.

Conclusion

I’ve just demonstrated a packaged function, how to pass arguments to the function and how to get the output. My journey continues… To infinity and beyond!


About the author

Gaëtan Trellu is a technical operations manager at Ormuco. This post first appeared on Medium.

 

Superuser is always interested in open infra community topics, get in touch at editorATopenstack.org

 

Photo // CC BY NC

The post How to run a packaged function with Qinling appeared first on Superuser.

by Gaëtan Trellu at July 26, 2019 02:02 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 4: OpenKilda

Aptira Comparison of Software Defined Networking (SDN) Controllers. OpenKilda

Our Open Source Software Defined Networking (SDN) Controller comparison continues with OpenKilda. OpenKilda is a Telstra developed OpenFlow based SDN controller currently being used in production to control the large Pacnet infrastructure. It has been shown to be successful in a distributed production environment.

Designed to implement a distributed SDN control plane over a network that spans the globe, OpenKilda solves the problem of latency while providing a scalable SDN control and data plane along with end-to-end flow telemetry.

Architecture

The Architecture of OpenKilda is shown in the figure below:

Aptira Comparison of Software Defined Networking (SDN) Controllers. OpenKilda Diagram
  • Structurally, OpenKilda uses the Floodlight software to interact with switches using OpenFlow, but pushes decision making functionality into other parts of the stack.
  • Kafka is used as a message bus for the telemetry from the Floodlight and feeds information into an Apache Storm based cluster of agents for processing.
  • Storm passes the time-series data to OpenTSDB for storing and analysing.
  • Neo4j is used as a graph analysis and visualisation platform.

Modularity and Extensibility

OpenKilda is built on several well-supported open-source components to implement a decentralised, distributed control plane, backed by a unique, well-designed cluster of agents to drive network updates as required. The modular nature of the architecture makes it reasonably easy to add new features.

Scalability

OpenKilda is able to scale process intensive profiling and decision-making functionality horizontally and independently of the control plane.

Cluster Scalability

  • OpenKilda approaches cluster scalability in a modular way. While Floodlight is used as a Southbound interface to the switch infrastructure, responsibility for PCE and telemetry processing is pushed northward into a completely separate Apache Storm based cluster. Each Floodlight instance is idempotent, with no requirement to share state. The Apache Storm cluster is by design horizontally scalable and allows throughput to be increased by adding nodes.

Architectural Scalability

  • BGP is currently not implemented and may need to be developed.

Interfaces

  • Southbound: Supports OpenFlow
  • Northbound: Offers RESTful APIs only, which are limited compared to ONOS and ODL

Telemetry

Extracting usable telemetry from the infrastructure was a core design principle of OpenKilda, so one output from the Storm agents is streams of time-series data, collected by a Hadoop backed, OpenTSDB data store. This data can be used in a multitude of ways operationally, from problem management to capacity planning.

Resilience and Fault Tolerance

OpenKilda has no inbuilt clustering mechanism, instead relying on external tools to maintain availability. High availability is achieved by running multiple, identically configured instances, or a single instance controlled by an external framework that detects and restarts failed nodes.

Programming Language

OpenKilda is written in Java.

Community

While the functionality of OpenKilda in its intended space is promising, community support is still being cultivated, leaving much of the development and maintenance burden on its current users, with feature velocity slow. OpenKilda needs your support – chat with us to get involved.

Conclusion

OpenKilda has been introduced by Telstra and is already used in production within Telstra. It has a distributed architecture and leverages other well-supported open source projects for telemetry processing and implementing PCE functionality. From a technical point of view, it may not be suitable for geo-redundant environments or segment routing due to the lack of BGP and MPLS tagging.
Next, we will be evaluating Ryu.


The post Comparison of Software Defined Networking (SDN) Controllers. Part 4: OpenKilda appeared first on Aptira.

by Farzaneh Pakzad at July 26, 2019 01:18 PM

July 25, 2019

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

Spotlight on: Airship elections

The Airship team completed its first Technical Committee election.
The five elected members are:

  • James Gu, SUSE
  • Alexander Hughes, Accenture
  • Jan-Erik Mångs, Ericsson
  • Alexey Odinokov, Mirantis
  • Ryan van Wyk, AT&T

The Technical Committee, one of two governing bodies for Airship, is responsible for ensuring that Airship projects adhere to core principles, promote standardization and define and organize the project’s versioning and release process. The candidates represented six leaders from six different companies, a reflection of the growth of the Airship project since it launched in early 2018.

Congrats to the new Technical Committee members and also thanks to everyone who participated. Governance by community elected officials is one of the cornerstones of the Four Opens and a major step forward in the maturation of Airship. The Technical Committee is organizing its first meeting and will soon publish the schedule and agenda to the Airship mailing list.

The Technical Committee is one of two governing bodies within the Airship community. With that election wrapped up, they’ve turned their attention to the Working Committee election. The Working Committee guides the project strategy, helps arbitrate disagreements between core reviewers within a single project or between Airship projects, defines core project principles, assists in marketing and communications, provides product management, and offers ecosystem support.

Nominations for the Airship Working Committee are now open until July 30, 19:00 UTC. Anyone who has contributed to the Airship project within the last 12 months is eligible to run for the Working Committee and vote.

Visit the website for more information about how Airship can manage infrastructure deployments and life cycle. You’ll also learn more about how to get started by using Airship in a Bottle, attending one of the weekly meetings and getting involved with development.

Open Infrastructure Summit Shanghai and Project Teams Gathering (PTG)

OpenStack Foundation news

  • Registration is open. Summit tickets also grant access to the PTG. You can pay in U.S. dollars or yuan if you need an official invoice (fapiao.)
  • If your organization can’t fund your travel, apply for the Travel Support Program by August 8.
  • If you need a travel visa, get started now: Information here.
  • Put your brand in the spotlight by sponsoring the Summit: Learn more here.
  • Is your team coming to the PTG? Remember to answer the survey by August 11. If you’re a team lead and missed the email with the survey, please contact Kendall Nelson (knelson@openstack.org)

OpenStack Foundation Project News

OpenStack

  • July 25 marks the second milestone in the development  of the Train release. Feature development is now being finalized in preparation for the final release, planned for October 16.
  • The 2019 OpenStack User Survey is open until August 22. If you’re running OpenStack, please share your deployment choices and feedback.

StarlingX

  • Check out the new StarlingX main Wiki page for updates on current activities, tools, processes and how to participate in the community.
  • StarlingX and the OSF Edge Computing Group are collaborating to test minimal reference architectures to suit different edge use cases. Community members are deploying StarlingX with a distributed control architecture on hardware donated by Packet.com. See the StarlingX Wiki for more about what the deployment configuration looks like, which locations the components are running on and more.

Upcoming open infrastructure community events

August

OpenInfra Day Vietnam OpenStack Upstream Institute

September

24-26 OpenStack Day DOST, Berlin, Germany

24-26 Ansible Fest Atlanta, Georgia

Zuul booth

26-27 OpenCompute Regional Summit, Amsterdam, The Netherlands

OSF booth #B23

October

November

  • 18-21 KubeCon+CloudNativeCon, San Diego, California
  • OSF reception on Monday, November 18 at the Hilton Bayfront Hotel
  • OSF booth

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you!
If you have feedback, news or stories that you want to share, reach us through community@openstack.org . To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by OpenStack Foundation at July 25, 2019 04:31 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 3: OpenDayLight (ODL)

Aptira Comparison of Software Defined Networking (SDN) Controllers. OpenDayLight ODL

Our Open Source Software Defined Networking (SDN) Controller comparison continues with OpenDayLight (ODL). ODL is more focused on the SD-LAN and Cloud integration spaces.

OpenDaylight is a modular open platform for customising and automating networks of any size and scale. The OpenDaylight Project arose out of the SDN movement, with a clear focus on network programmability. It was designed from the outset as a foundation for commercial solutions that address a variety of use cases in existing network environments.

Architecture

ODL consists of 3 layers:

Aptira Comparison of Software Defined Networking (SDN) Controllers. OpenDayLight ODL Diagram
  • Southbound plugins to communicate with the network devices
  • Core services that can be used by means of the Service Abstraction Layer (SAL), which is based on OSGi and allows components to be loaded and unloaded while the controller is running
  • Northbound interfaces (e.g. REST/NETCONF) that allow operators to apply high-level policies to network devices or integration of ODL with other platforms

Modularity and Extensibility

Built-in mechanisms provided by ODL simplify the connection of code modules. The controller takes advantage of OSGi containers for loading bundles at runtime, allowing a very flexible approach to adding functionality.

Scalability

ODL uses a model-based approach, which implies a global, in-memory view of the network is required to perform logic calculations. ODL’s latest release further advances the platform’s scalability and robustness, with new capabilities supporting multi-site deployments for geographic reach, application performance and fault tolerance.

Cluster Scalability

  • ODL contains internal functionality for maintaining a cluster: an Akka-based distributed datastore shares the current SDN state and allows controllers to fail over in the event of a cluster partition
  • As a cluster grows, however, communication and coordination activities rapidly increase, limiting performance gains per additional cluster member

Architectural Scalability

  • ODL includes native BGP routing capabilities to coordinate traffic flows between the SDN islands
  • Introduction of OpenDaylight into OpenStack provided multi-site networking while boosting networking performance

Interfaces

  • Southbound: It supports an extensive list of Southbound interfaces including OpenFlow, P4, NETCONF, SNMP, BGP, RESTCONF and PCEP.
  • Northbound: ODL offers the largest set of northbound interfaces with gRPC and RESTful APIs. The northbound interfaces supported by ODL include OSGi for applications in the same address space as the controller and the standard RESTful interface. DLUX is used to represent Northbound interfaces visually to ease integration and development work.

Telemetry

At a project level, ODL has limited telemetry related functionality. With the latest development release, there are moves toward providing northbound telemetry feeds, but they are in early design and not likely to be ready for production in the short term.

Resilience and Fault Tolerance

ODL’s fault tolerance mechanism is similar to that of ONOS, with an odd number of SDN controllers required to provide fault tolerance in the system. In the event of master node failure, a new leader is selected to take control of the network. The mechanism for choosing a leader is slightly different in these controllers – while ONOS focuses on eventual consistency, ODL focuses on high availability.

Programming Language

From a language perspective, ODL is written in Java.

Community

ODL is the second of the SDN controllers under the Linux Foundation Networking umbrella. This project has the largest community support of all open source SDN controllers in the market, with several big-name companies actively involved with development.

Conclusion

OpenDayLight is the most pervasive open-source SDN controller with extensive northbound and southbound APIs. In addition to resiliency and scalability, the modular architecture of ODL makes it a suitable choice for different use-cases. This is why OpenDayLight has been integrated into other open-source SDN/NFV orchestration and management solutions such as OpenStack, Kubernetes, OPNFV and ONAP which are very popular platforms in telco environments.
Next, we will be evaluating OpenKilda.


The post Comparison of Software Defined Networking (SDN) Controllers. Part 3: OpenDayLight (ODL) appeared first on Aptira.

by Farzaneh Pakzad at July 25, 2019 01:32 PM

July 24, 2019

SUSE Conversations

SUSE OpenStack Cloud 9 – Now Included in SUSE YES Certification

More and more, businesses are seeking cloud solutions that provide an easy to deploy and manage, heterogeneous cloud infrastructure for provisioning development, test and production workloads in a way that is supportable, compliant and secure. In addition, they want a solution that has gone through an official certification program to give them confidence that their […]

The post SUSE OpenStack Cloud 9 – Now Included in SUSE YES Certification appeared first on SUSE Communities.

by Daryl Stokes at July 24, 2019 03:49 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 2: Open Network Operation System (ONOS)

Aptira Comparison of Software Defined Networking (SDN) Controllers. Open Network Operation System (ONOS)

We begin our Open Source Software Defined Networking (SDN) Controller comparison with the Open Network Operating System (ONOS). ONOS is designed to be distributed, stable and scalable with a focus on Service Provider networks.

The Open Network Operation System is the only SDN controller platform that supports the transition from legacy “brown field” networks to SDN “green field” networks. This enables exciting new capabilities, and disruptive deployment and operational cost points for network operators.

Architecture

ONOS is designed as a three-tier architecture as follows:

Aptira Comparison of Software Defined Networking (SDN) Controllers. Open Network Operation System (ONOS) Diagram
  • Tier 1 comprises modules related to the protocols which communicate with the network devices (Southbound in the figure)
  • Tier 2 comprises the core of ONOS and provides network state without relying on any particular protocol
  • Tier 3 comprises applications, ONOS apps, which use the network state information presented by Tier 2

Modularity and Extensibility

ONOS has built-in mechanisms for connecting/disconnecting components while the controller is running. This allows a very flexible approach to adding functionality to the controller.

Scalability

ONOS is designed specifically to horizontally scale for performance and geo-redundancy across small regions.

Cluster Scalability

  • The cluster configuration is simple, with new controllers being able to join and leave dynamically, giving flexibility over time.
  • The Atomix distributed datastore, which prioritises data consistency, should reduce the outages caused by cluster partitioning as all hosts are guaranteed to have the correct data.
  • As a cluster grows, however, communication and coordination activities rapidly increase, limiting performance gains per additional cluster member.

Architectural Scalability

  • ONOS includes native BGP routing capabilities to coordinate traffic flows between the SDN islands.
  • There are several documented instances of ONOS (e.g. ICONA, SDN-IP) being used successfully in a geo-redundant architecture for controlling large scale SD-WANs.

Interfaces

  • Southbound: It supports an extensive list of Southbound interfaces including OpenFlow, P4, NETCONF, TL1, SNMP, BGP, RESTCONF and PCEP.
  • Northbound: ONOS offers the largest set of northbound interfaces with gRPC and RESTful APIs.
  • GUI: The ONOS GUI is a single-page web-application, providing a visual interface to the Open Network Operation System controller (or cluster of controllers).
  • Intent-based framework: ONOS includes an inbuilt intent-based framework. By abstracting a network service into a set of criteria a flow should meet, the generation of the underlying OpenFlow (or P4) configuration is handled internally, with the client system specifying only what the functional outcome should be (a rough example follows this list).
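
As a rough illustration (not part of the original article), a host-to-host intent can be submitted to the ONOS REST API with a few lines of Python. The endpoint, the default onos/rocks credentials and the host IDs below are assumptions for a local test deployment and may vary with the ONOS version.

import json
import requests

# Assumed: a local ONOS instance with the REST API on port 8181 and the
# default onos/rocks credentials. Host IDs are MAC/VLAN pairs as reported
# by the ONOS "hosts" command; the format shown here is an assumption.
intent = {
    "type": "HostToHostIntent",
    "appId": "org.onosproject.cli",
    "one": "00:00:00:00:00:01/None",
    "two": "00:00:00:00:00:02/None",
}

resp = requests.post("http://localhost:8181/onos/v1/intents",
                     auth=("onos", "rocks"),
                     headers={"Content-Type": "application/json"},
                     data=json.dumps(intent))
resp.raise_for_status()
# ONOS compiles the intent into the necessary OpenFlow (or P4) rules itself.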

Telemetry

Telemetry feeds are available through pluggable modules that come with the software, with Influx DB and Grafana plug-ins included in the latest release.

Resilience and Fault Tolerance

ONOS has a very simple administration mechanism for clusters with native commands for adding and removing members.

The Open Network Operation System provides fault tolerance in the system with an odd number of SDN controllers. In the event of master node failure, a new leader is selected to take control of the network.

Programming Language

ONOS is written in Java.

Community

The Open Network Operation System is supported under the Linux Foundation Networking umbrella and boasts a large developer and user community.

Conclusion

Given this evaluation, the Open Network Operation System is a suitable choice for Communication Service Providers (CSPs). This is because ONOS supports an extensive list of northbound and southbound APIs, so vendors do not have to write their own protocol to configure their devices. It also supports the YANG model, which enables vendors to write their applications against this model. The scalability of ONOS makes it highly available and resilient against failure, which improves the user experience. Finally, the software modularity of ONOS allows users to easily customise, read, test and maintain it.
Next, we will be evaluating OpenDayLight.


The post Comparison of Software Defined Networking (SDN) Controllers. Part 2: Open Network Operation System (ONOS) appeared first on Aptira.

by Farzaneh Pakzad at July 24, 2019 02:52 AM

OpenStack Superuser

Superuser Awards open for Open Infrastructure Summit Shanghai

Nominations are open for Open Infrastructure Summit Shanghai Superuser Awards. The deadline is September 27.

All nominees will be reviewed by the community and the Superuser editorial advisors will determine the winner that will be announced onstage at the Summit in early November.

Open infrastructure provides resources to developers and users by integrating various open source solutions. The benefits are obvious, whether that infrastructure is in a private or a public context: the absence of lock-in, the power of interoperability opening up new possibilities, the ability to look under the hood, tinker with, improve the software and contribute back your changes.

The Superuser Awards recognize teams using open infrastructure to meaningfully improve business and differentiate in a competitive industry, while also contributing back to the open-source community.  They aim to cover the same mix of open technologies as our publication, namely OpenStack, Kubernetes, Kata Containers, Airship, StarlingX, Ceph, Cloud Foundry, OVS, OpenContrail, Open Switch, OPNFV and more.

Teams of all sizes are encouraged to apply. If you fit the bill, or know a team that does, we encourage you to submit a nomination here.

After the community has reviewed all nominees, the Superuser editorial advisors will select the winner.

When evaluating winners for the Superuser Award, judges take into account the unique nature of the use case(s), as well as integrations and applications by a particular team. Questions include how the team is innovating with open infrastructure, for example working with container technology, NFV or unique workloads.

Additional selection criteria includes how the workload has transformed the company’s business, including quantitative and qualitative results of performance as well as community impact in terms of code contributions, feedback and knowledge sharing.

Winners will take the stage at the Open Infrastructure Summit in Shanghai. Submissions are open now until September 27, 2019. You’re invited to nominate your team or someone you’ve worked with, too.

Launched at the Paris Summit in 2014, the community has continued to award winners at every Summit to users who show how open infrastructure is making a difference and providing strategic value in their organization. Past winners include AT&T, CERN, City Network, Comcast, NTT Group, Tencent TStack and VEXXHOST.

For more information about the Superuser Awards, please visit http://superuser.openstack.org/awards.

The post Superuser Awards open for Open Infrastructure Summit Shanghai appeared first on Superuser.

by Ashlee Ferguson at July 24, 2019 12:02 AM

July 23, 2019

Gorka Eguileor

Making Host and OpenStack iSCSI devices play nice together

OpenStack services assume that they are the sole owners of the iSCSI connections to the iSCSI portal-targets generated by the Cinder driver, and that is fine 98% of the time, but what happens when we also want to have other non-OpenStack iSCSI volumes from that same storage system present on boot? In OpenStack the OS-Brick […]

by geguileo at July 23, 2019 05:49 PM

OpenStack Superuser

OpenStack Ironic Bare Metal Program case study: StackHPC

The OpenStack Foundation announced in April 2019 that its Ironic software is powering millions of cores of compute all over the world, turning bare metal into automated infrastructure ready for today’s mix of virtualized and containerized workloads.

Some 30 organizations joined for the initial launch of the OpenStack Ironic Bare Metal Program, and Superuser is running a series of case studies to explore how people are using it.

StackHPC provides specialist consultancy services around the convergence of HPC and cloud. The company dedicates significant effort to upstream development of Ironic for scientific computing use cases.

Why did you choose OpenStack Ironic for bare-metal provisioning?

StackHPC’s technical team has world-leading expertise in the use of OpenStack cloud infrastructure for scientific computing. Our clients in technical computing often have very different priorities for trade-offs between performance and flexibility. Ironic stands at a point where extreme performance is needed as the overriding priority. We work to develop solutions in which Ironic is used in flexible ways, enabling clients to exploit the major advantages offered by open infrastructure without sacrificing the performance they require.

We also use Ironic as a standalone service called Bifrost in Kayobe, our open-source deployment tool. (Kayobe is an open-source bare metal OpenStack deployment framework that builds on Kolla-Ansible to add hardware provisioning and infrastructure-as-a-service capabilities.) Used this way, Ironic provides the minimal subset of OpenStack needed to support deployment of private cloud control planes using Kolla and Kolla-Ansible.

What was your solution before implementing Ironic?

The concept of the HPC-enabled cloud predates StackHPC — we’ve been pursuing the perfection of this paradigm from day one. StackHPC has been both a user and a developer of Ironic from the very beginning. Ironic’s potential was apparent from the outset and our team has been active in shaping some aspects of its evolution.

The first contribution of our team members was very simple – a power driver to enable Ironic to control smart power strips for powering servers. We needed this because our lab at the time was very basic and didn’t have hardware equipped with baseboard management controllers.

Over time, the team has contributed to a range of more challenging areas. For example, we were active in the development of multi-tenant network isolation and improved support for bare metal in Magnum. Most recently we’ve been doing some really powerful work around deep reconfiguration of bare metal (e.g., redundant array of independent disks (RAID) and basic input/output system (BIOS) setup) driven by the requirements of the image or flavor of the instance being deployed. More on that in a session from the Summit here.

We’re now at the point where Ironic is developing world-leading capabilities and it’s exciting for StackHPC to be a part of that.

What benefits does Ironic provide your users?

Ironic enables our clients to deploy on-premise high-performance computing infrastructure using the same methods they would use to deploy infrastructure in the cloud. This is driving a revolution in research computing infrastructure management.

Our clients are diverse and span a wide range of domains, but are united by a requirement for high-performance cloud. Many of the use cases are breathtaking in their ambition and their enthusiasm for a challenge rubs off on us. Bare metal – yet specialized to the workload – is the best tool for achieving these goals.

What feedback do you have for the upstream OpenStack Ironic team?

Ironic is easily the most exciting area in which OpenStack is developing today and I don’t think we have any specific feedback other than to keep up the good work!

Stig Telfer, StackHPC CTO, also co-chairs the Scientific Special Interest Group and there’s a great relationship between the SIG and the Ironic team. The active members of the Scientific SIG have provided user stories for Ironic that have influenced the implementation of powerful features. In turn, the Ironic team has been so helpful and supportive of the Scientific SIG. It’s a close connection that has yielded many benefits.

Learn more

You’ll find an overview of Ironic on the project Wiki.

Discussion of the project takes place in #openstack-ironic on irc.freenode.net. This is a great place to jump in and start your Ironic adventure. The channel is very welcoming to new users – no question is a wrong question!

The team also holds one-hour weekly meetings at 1500 UTC on Mondays in the #openstack-ironic room on irc.freenode.net, chaired by Julia Kreger (TheJulia) or Dmitry Tantsur (dtantsur).

Stay tuned for more case studies from organizations using Ironic.

Photo // CC BY NC

The post OpenStack Ironic Bare Metal Program case study: StackHPC appeared first on Superuser.

by Superuser at July 23, 2019 02:01 PM

Aptira

Comparison of Software Defined Networking (SDN) Controllers. Part 1: Introduction

Aptira Software Defined Networking SDN Controllers

The Software Defined Networking (SDN) technology landscape has evolved quickly over the last two years. Due to the developing nature of the SDN controller space, there is a plethora of software available for use.

The core concept of Software Defined Networking is separating the intelligence and control (e.g. routing) from forwarding elements (i.e. switches) and concentrating the control of the network management and operation in a logically centralised component – an SDN Controller. We’ve discussed this topic in more detail here.

Whilst many SDN controllers exist, we will compare the maturity of the most popular Open Source SDN controllers in industry and academia including: the Open Network Operation System (ONOS), OpenDayLight (ODL), OpenKilda, Ryu and Faucet. These SDN controllers will be rated against the following assessment criteria:

  • Architecture
  • Modularity and Extensibility
  • Scalability
    • Cluster Scalability
    • Architectural Scalability
  • Interfaces
    • Northbound API support
    • Southbound API support
  • Telemetry
  • Resilience and Fault Tolerance
  • Programming Language
  • Community

It is important to understand the motivations behind the available platforms. Each design has different use cases as usage depends not only on the capability matrix, but also on the cultural fit of the organisation and the project.

Our team of Solutionauts have used Software Defined Networking controllers for many different use cases, including: Traffic Engineering, Segment Routing, Integration and Automated Traffic Engineering.

Over the next few days, we will be comparing, rating and evaluating each of the most popular Open Source SDN controllers in use today. This comparison will be useful for organisations to help them select the right SDN controller for their platform which match their network design and requirements.


The post Comparison of Software Defined Networking (SDN) Controllers. Part 1: Introduction appeared first on Aptira.

by Farzaneh Pakzad at July 23, 2019 01:05 PM

July 22, 2019

Galera Cluster by Codership

Galera Cluster with new Galera Replication Library 3.27 and MySQL 5.6.44, MySQL 5.7.26 is GA

Codership is pleased to announce a new Generally Available (GA) release of Galera Cluster for MySQL 5.6 and 5.7, consisting of MySQL-wsrep 5.6.44-25.26 and MySQL-wsrep 5.7.26-25.18 with a new Galera Replication library 3.27 (release notes, download), implementing wsrep API version 25. This release incorporates all changes into MySQL 5.6.44 (release notes, download) and MySQL 5.7.26 (release notes, download) respectively.

Compared to the previous 3.26 release, the Galera Replication library has a few fixes: to prevent a protocol downgrade upon a rolling upgrade and also improvements for GCache page storage on NVMFS devices.

One point of note is that this is also the last release for SuSE Linux Enterprise Server 11, as upstream has also put that release into End-of-Life (EOL) status.

You can get the latest release of Galera Cluster from http://www.galeracluster.com. There are package repositories for Debian, Ubuntu, CentOS, RHEL, OpenSUSE and SLES. The latest versions are also available via the FreeBSD Ports Collection.

by Colin Charles at July 22, 2019 03:06 PM

OpenStack Superuser

Using Istio’s Mixer for network request caching: What’s next

The humble micro-service gets a lot of love, says Zach Arnold, dev-ops engineering manager. In theory, micro-services are great with absolutely no drawbacks, offering loose coupling, independent management and “all the other stuff that we say we love,” he adds.

In his experience working with them at financing startup Ygrene Energy Fund, that love doesn’t exactly come for free. He tallies up the costs for adding network hops, complex debugging scenarios, authorization and authentication, version coordination and a management burden for third-party dependencies, especially when security patches come in for other frameworks.

“I’m not leading a revolution in micro-services, I’m just hoping that maybe one less thing becomes a problem for people,” he says in a talk at KubeCon + CloudNativeCon.

Specifically, by employing a network tool like Istio to handle request caching. Right now the service-mesh project handles request routing, retries, fault tolerance, authentication and authorization but it doesn’t handle request caching — yet.

Currently, Istio acts as a harness for Envoy. Istio uses an extended version of the Envoy proxy, a high-performance proxy developed in C++, to mediate all inbound and outbound traffic for all services in the service mesh. A team is at work building eCache: a multi-backend HTTP cache for Envoy; check out their efforts here.

Once this work is completed, it will be upstreamed into Istio, be configurable using the same Policy DSL, and will likely also offer support for TTLs, L1/L2 caching and warming.

Check out his full talk here and the slides here.

Get involved

Istio

Check out the documentation, join the Slack channel and get up to speed with the roadmap by reading the feature stages page and release notes.

Related projects

  • The GIT repository for eBay’s Envoy caching interface to ATS (Apache Traffic Server)’s cache back end.
  • Varnish, the de facto standard for HTTP caching in OSS.
  • The caching system for mod_pagespeed [blog] [code] is one implementation of an open-source multi-cache infrastructure.
  • Casper: Caching HTTP proxy for internal Yelp API calls.

Photo // CC BY NC

The post Using Istio’s Mixer for network request caching: What’s next appeared first on Superuser.

by Nicole Martinelli at July 22, 2019 02:03 PM

Aptira

Automated Network Traffic Engineering and Tunneling

Aptira Automated Network Traffic Engineering and Tunneling

Previously, network engineers were required to provision network services and keep track of changes in real time in order to implement Network Traffic Engineering (TE). This process was all manual – until we setup a process for automated Network Traffic Engineering and Tunneling.


The Challenge

One of our customers wanted to automate and manage their network services at the Service Orchestration level. They intended to build an orchestration platform to automate network services and remove manual processes. One of the key capabilities they were seeking to validate was the automation of network traffic engineering.

With the advent of Software Defined Networking (SDN) and its ability to provide a global view of the network, the provisioning of network services is now possible in real time. The challenge was to validate that the designed components could not only respond to traffic demands in real time but could also be programmed to respond to future traffic demand.


The Aptira Solution

Aptira’s team of world-class SDN, Service Orchestration and Cloud engineers recognised the customer’s problem and were able to design a solution, using a combination of Software Defined Network (SDN) and Service Orchestration techniques.

To solve this challenge, we demonstrated how the combination of technologies such as Service Orchestration (i.e. Cloudify), SDN controller (i.e. ODL), and TICK stack can be used to implement network traffic engineering.

Aptira designed a Software Defined Networking Wide Area Network (SDN-WAN) topology and employed OpenDayLight (ODL) as an SDN controller to manage network resources. We then configured Cloudify as a Service Orchestrator (SO) to implement new service designs using TOSCA blueprints.

We designed the TOSCA blueprints in order to get updated information about the network topology based on recent updates in the network. The TOSCA blueprint triggered Cloudify to send a REST API request to OpenDayLight, querying the network topology and receiving the topology data of any changes. The TOSCA blueprint was then able to design a new network service based on these changes.
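
For illustration, the topology query driven by the blueprint is essentially a RESTCONF call against ODL. A minimal sketch (not from the original case study, and assuming a default OpenDayLight install on localhost with RESTCONF enabled and admin/admin credentials) looks like this:

import requests

# Assumed: ODL RESTCONF on the default port 8181 with default credentials.
url = ("http://localhost:8181/restconf/operational/"
       "network-topology:network-topology")
resp = requests.get(url, auth=("admin", "admin"),
                    headers={"Accept": "application/json"})
resp.raise_for_status()

# Print a one-line summary of each topology ODL knows about.
for topo in resp.json()["network-topology"]["topology"]:
    print(topo.get("topology-id"),
          len(topo.get("node", [])), "nodes,",
          len(topo.get("link", [])), "links")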

As an example of Traffic Engineering in real time, our solution performed the following steps (as shown in the figure) in a fully automated process, without human intervention:

  • Step 1: SDN switches (OVSes) send an update of the network topology in certain intervals to ODL
  • Step 2: The Telegraf agent (TICK stack’s module) running on the ODL detects a change in the network topology and sends an event to the TICK stack’s Policy Engine
  • Step 3: This “changed topology” event in turn triggers PCE blueprint in the Cloudify Service Orchestrator
  • Step 4: The PCE blueprint in Cloudify activates the path computation engine (PCE) module (developed by Aptira)
  • Step 5: The PCE module asks ODL for an update of the network topology
  • Step 6: The PCE module then sets up a new traffic engineering path to optimise network performance and guarantee the SLA, and passes the newly computed path to the SDN controller
  • Step 7: The SDN controller installs new rules on the switches included in the path and removes other rules from the switches if required
Aptira Automated Traffic Engineering and Tunneling: OpenFlow Diagram

This solution is self-healing and self-optimising in the case of link or network device failure. Moreover, the solution is not dependent on a particular SDN controller, which means any customer can adapt it to their production network with their own controller.


The Result

The solution was designed, implemented and tested by Aptira and configured into the evaluation platform, passing all use case scenarios devised by the customer. By automating the network traffic engineering and tunneling, this solution not only reduces manual intervention, but also reduces operational and development costs.



The post Automated Network Traffic Engineering and Tunneling appeared first on Aptira.

by Aptira at July 22, 2019 01:59 PM

Johan Guldmyr

Contributing To OpenStack Upstream

Recently I had the pleasure of contributing upstream to the OpenStack project!

A link to my merged patches: https://review.opendev.org/#/q/owner:+guldmyr+status:merged

In a previous OpenStack summit (these days called OpenInfra Summits), (Vancouver 2018) I went there a few days early and attended the Upstream Institute https://docs.openstack.org/upstream-training/ .
It was 1.5 days long or so if I remember right. Looking up my notes from that these were the highlights:

  • Best way to start getting involved is to attend weekly meetings of projects
  • Stickersssss
  • A very similar process to RDO with Gerrit and reviews
  • Underlying tests are all done with ansible and they have ARA enabled so one gets a nice Web UI to view results afterward. Logs are saved as part of the Zuul testing too so one can really dig into and see what is tested and if something breaks when it’s being tested.

Even though my patches were one baby and a bit over 1 year in time after the Upstream Institute I could still figure things out quite quickly with the help of the guides and get bugs created and patches submitted. My general plan when first attending it wasn’t to contribute code changes, but rather to start reading code, perhaps find open bugs and so on.

The thing I wanted to change in puppet-keystone was apparently also possible to change in many other puppet-* modules, and less than a day after my puppet-keystone change got merged into master someone else picked up the torch and made PRs to like ~15 other repositories with similar changes :) Pretty cool!

Testing is hard! https://review.opendev.org/#/c/669045/1 is one backport I created for puppet-keystone/rocky, and the Ubuntu testing was not working initially (started with an APT mirror issue and later it was slow and timed out)… After 20 rechecks and two weeks, it still hadn’t successfully passed a test. In the end we got there though with the help of a core reviewer that actually updated some mirror and later disabled some tests :)

Now the change itself was about “oslo_middleware/max_request_body_size”, so that we can increase it from the default of 114688. The Pouta Cloud had issues where our Federation User Mappings were larger than 114688 bytes and we couldn’t update them anymore; it turns out they were blocked by oslo_middleware.

(does anybody know where 114688 bytes comes from? Some internal speculation has been that it is 128 kilobytes minus some headers)

Anyway, the mapping we have now is simplified: just a long [ list ] of “local_username”: “federation_email”, domain: “default” entries. I think the next step might be to try to figure out whether we can write the rules using something like the below instead of hardcoding the values into the rules

"name": "{0}" 

It’s been quite hard to find examples that are exactly like our use-case (and playing about with it is not a priority right now, just something in the backlog, but it could be interesting to look at when we start accepting more federations).
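
For what it’s worth, a rough and untested sketch of such a templated mapping rule is below, shown as a Python literal for readability; the actual mapping is stored as JSON, and the remote attribute name used here is an assumption that depends on the federation protocol in use.

# Rough, untested sketch of a templated Keystone federation mapping rule.
# "{0}" is substituted with the first matched remote attribute; the
# "OIDC-email" attribute name is an assumption and varies per deployment.
rules = [
    {
        "local": [
            {
                "user": {
                    "name": "{0}",
                    "domain": {"name": "default"},
                },
            },
        ],
        "remote": [
            {"type": "OIDC-email"},
        ],
    },
]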

All in all, I’m really happy to have gotten to contribute something to the OpenStack ecosystem!

by guldmyr at July 22, 2019 05:57 AM

July 19, 2019

OpenStack Superuser

Why open source today is necessary but not sufficient – and what we should do about it

You could say that open source was born out of both frustration and necessity.

In 1980, Richard Stallman, irked after being blocked from modifying the program a glitchy new laser printer, kicked off the movement. He launched the GNU operating system which, combined with Linux, runs on tens of millions of computers today.

Stallman’s four essential freedoms – the right to run, copy, distribute, study, change and improve the software – define the free software world. Then open source was coined as a term to soothe the suits, promoting the practical values of freedom and downplaying the ethical principles that drove Stallman.

A generation later, necessity is not enough. “Despite being more business-friendly, open source was never a ‘business model,’” writes Thierry Carrez, VP of engineering at the OpenStack Foundation, in a three-part series on the topic. “Open source, like free software before it, is just a set of freedoms and rights attached to software. Those are conveyed through software licenses and using copyright law as their enforcement mechanism. Publishing software under a F/OSS license may be a component of a business model, but if it is the only one, then you have a problem.”

Call it the free beer conundrum: the idea that you pay nothing for the product means that it’s free (as in free speech) but not something that comes without cost. The price tag, or lack thereof, has always been a red herring, Carrez notes. It’s really more about allowing the user to kick the tires before going all in. “You don’t have to ask anyone for permission (or enter any contractual relationship) to evaluate the software for future use, to experiment with it, or just to have fun with it. And once you are ready to jump in, there’s no friction in transitioning from experimentation to production.”

What really sets open source apart?  Sustainability, transparency, its appeal to developers and the community that supports it.

“With open source you have the possibility to engage in the community developing the software and to influence its direction by contributing directly to it. This is not about ‘giving back…’ Organizations that engage in open-source communities are more efficient, anticipate changes and can voice concerns about decisions that might adversely affect them. They can make sure the software adapts to future needs by growing the features they’ll need tomorrow.”

His series comes at a critical time for open source. It’s undeniably a success — powering everything from TVs, smartphones, supercomputers, servers and desktops to a $17,000 rifle — but that’s exactly why it’s so easy to take for granted. And notable companies still find ways to profit from the code while flipping the bird at the open-source ethos. It’s bigger than a few bad actors or lawsuits: some say there’s a fight on for the very soul of open source.

So now what? Carrez closes with a call to action: “Open-source advocates and enthusiasts need to get together, defining clear, standard terminology on how open source software is built and start communicating heavily around it with a single voice.  Beyond that, we need to create forums where those questions on the future of open source are discussed. Because whatever battles you win today, the world does not stop evolving and adapting.”

Check out his whole series here.

Photo // CC BY NC

The post Why open source today is necessary but not sufficient – and what we should do about it appeared first on Superuser.

by Nicole Martinelli at July 19, 2019 02:01 PM

Aptira

DevConf.IN 19: Cloud Orchestration using Cloudify

DevConf.IN is the annual developer’s conference organised by Red Hat, India. The conference provides a platform to the FOSS community participants and enthusiasts to come together and engage in knowledge sharing activities through technical talks, workshops, panel discussions, hackathons and much more.

The primary tracks for this year are:

  • Trending Tech
  • AI / ML
  • Storage
  • Networking
  • Open Hybrid Cloud
  • Kernel
  • FOSS Community & Standards
  • Academic Research/White Paper, and
  • Security

Our Senior Software Engineer, Alok Kumar, will be presenting at DevConf.IN, discussing Cloud Orchestration using Cloudify. Alok has been working with OpenStack, k8s and many telco tools, and loves to share and gather knowledge from folks of different backgrounds.

Application deployment, configuration management and system orchestration are easy now with Ansible and other tools. But multi-Cloud Orchestration can still be a challenging task and the tools aren’t very mature yet. During this session, Alok would like to share his experiences with one such Open Source automation tool – Cloudify – to help users understand how easy it can be. The session can be divided into the following topics:

  • Different types of orchestration and best suitable tool for each
  • The current problem with bashifying all your tasks
  • The solution
  • Additional details about Cloudify and some other use cases

If you are a technology enthusiast interested in the latest trends in Open Source and emerging digital technologies, this is the place for you to be.

When: August 2nd -3rd, 2019

Venue: Christ University – Bengaluru, India

Click here for more updates about DevConf.

Last Date of Registration: 31st July, 2019


The post DevConf.IN 19: Cloud Orchestration using Cloudify appeared first on Aptira.

by Jessica Field at July 19, 2019 01:44 PM

Chris Dent

Placement Update 19-28

This is pupdate 19-28. Next week is the Train-2 milestone.

Most Important

Based on the discussion on the PTG attendance thread and the notes on the related etherpad I'm going to tell the Foundation there will be approximately seven Placement team members at Shanghai but formal space or scheduling will not be required. Instead any necessary discussions will be arranged on premise. If you disagree with this, please speak up soon.

The main pending feature is consumer types, see below.

What's Changed

  • A bug in the resource class cache used in the placement server was found and fixed. It will be interesting to see how this impacts performance. While it increases database reads by one (for most requests) it removes a process-wide lock, so things could improve in threaded servers.

  • os-resource-classes 0.5.0 was released, adding FPGA and PGPU resource classes.

    It's been discussed in IRC that we may wish to make 1.x releases of both os-resource-classes and os-traits at some point to make it clear that they are "real". If we do this, I recommend we do it near a cycle boundary.

  • An integrated-gate-placement zuul template has merged. A placement change to use it is ready and waiting to merge. This avoids running some tests which are unrelated; for example, cinder-only tests.

Specs/Features

Since spec freeze for most projects is next week and placement has merged all its specs, until U opens, I'm going to skip this section.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 22 (-1) stories in the placement group. 0 (0) are untagged. 2 (-1) are bugs. 5 (0) are cleanups. 10 (-1) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 11 microversions.

  • https://review.opendev.org/666542 Add support for multiple member_of. There's been some useful discussion about how to achieve this, and a consensus has emerged on how to get the best results.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As mentioned for a few weeks, one of the important cleanup tasks that is not yet in progress is updating the gabbit that creates the nested topology that's used in nested performance testing. We've asked the StarlingX community for input.

Another cleanup that needs to start is satisfying the community wide goal of PDF doc generation. I did some experimenting on this and while I was able to get a book created, the number of errors, warnings, and manual interventions required meant I gave up until there's time to do a more in-depth exploration and learn the tools.

Other Placement

Miscellaneous changes can be found in the usual place.

There are two os-traits changes being discussed. And zero os-resource-classes changes (yay!).

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

If you've done any performance or scale testing with placement, I'd love to hear about your observations. Please let me know.

by Chris Dent at July 19, 2019 11:13 AM

July 18, 2019

OpenStack Superuser

Help bring key contributors to the Open Infrastructure Summit with the Travel Support Program

Most of us are more frequently awash in the glow of a monitor than actual sunlight, but we all know how much gets done talking to people in real life.

You can help bring more key contributors face-to-face at the Open Infrastructure Summit in Shanghai by donating to the Travel Support Program. Individual donors can help out by donating at registration for the November 4-6 event. If your organization can contribute to the Travel Support Program, check out the sponsor prospectus.

“It takes somewhere between $2,000 and $3,000 to send someone via the TSP program to an Open Infra Summit. If 20ish people donate $100, that’s one more person able to attend,” the OSF’s upstream developer advocate Kendall Nelson said in an impromptu fund-raising thread on Twitter. “Consider donating when you go to register.”

For every Summit, the OpenStack Foundation funds attendance for about 30 dedicated contributors from the open infrastructure community.  These include projects like Kubernetes, Kata Containers, AirShip, StarlingX, Ceph, Cloud Foundry, OVS, OpenContrail, Open Switch, OPNFV.   In addition to developers and reviewers, the program welcomes documentation writers, organizers of user groups around the world, translators, forum moderators and even first-time attendees. Applications for Shanghai are open until August 8, 2019.

TSP grantees from the previous Summit in Denver provide a typical snapshot. The committee picked a diverse group: five nationalities, of which five are Active Technical Contributors, two are Active User Contributors, four are Active User Groups members and the group as a whole contributes to 11 projects.

After every Summit, Superuser profiles the TSP grantees and asks them how it went. Invariably, they say the real-world connections and interactions make them more productive community members.

“The OpenStack community is spread all over the world. We work every day using mostly IRC and experiencing the difficulties of interacting with people in different time zones,” said Rossella Sblendido, software engineer at SUSE and Neutron core reviewer, who traveled from her native Italy to the Tokyo Summit on a TSP grant.

“The Summit is the only time when we’re all there in person at the same time. It’s really crucial to exchange ideas, coordinate and get ready for the next release. Being there makes the difference.”

Photo // CC BY NC

The post Help bring key contributors to the Open Infrastructure Summit with the Travel Support Program appeared first on Superuser.

by Superuser at July 18, 2019 02:01 PM

Aptira

Multi-Cloud Orchestration with Kubernetes, ONAP, Cloudify, Azure & OpenStack

Multi-Cloud Orchestration with Kubernetes, ONAP, Cloudify, Azure & OpenStack

This customer is the research and development arm of a major Australian telecommunications company.

They wanted to evaluate options to deploy and manage Virtual Network Functions (VNF) workloads across a Public Cloud platform (Microsoft Azure) and a Private Cloud platform (OpenStack) to determine whether the platform could be adopted within the organisation. The customer wanted to validate the operation of Service Chaining of VNFs deployed across these market-leading Private and Public Cloud platforms.

The customer also wanted to evaluate a new open-source automation platform called ONAP (Open Network Automation Platform). In particular, the customer wanted to assess its ability to integrate with the selected VIM (OpenStack) and to orchestrate VNF workloads using a market-leading product called Cloudify.


The Challenge

The network infrastructure of telecommunications organisations is increasingly virtualised in the same way that enterprise servers have been virtualised. Such virtualised network components are called Virtual Network Functions (VNFs). Once running on virtual resources, it is possible to move these VNFs between Cloud platforms as required by customer needs. For example, a firewall might be deployed onto a Microsoft Azure Cloud to connect Azure resources into a customer’s enterprise network.

As the overall profitability of the telecom market declines, cost control is of vital importance to telecommunications companies and automating the deployment of VNFs onto virtual resources is an important step in reducing operating costs while also improving agility. Overall, automating these services reduces a process that used to take days or weeks to something which takes minutes, and therefore enables customers to build their systems more efficiently.

Our client sought to evaluate the capabilities of Cloudify and the Open Source ONAP network orchestrator to automate the deployment and provisioning of VNFs to customer Clouds.

They had identified the Open Source ONAP network orchestrator as a potentially strategic product but needed to evaluate its current state and maturity, given that it was relatively new. Cloudify is a relatively mature product and has been working with the ONAP project on a number of integrations.

The customer needed to run up an instance of ONAP to make this assessment. Installing ONAP is a challenge because it is not a packaged product with a general-purpose deployment process. ONAP also consumes a significant amount of resources which must be present for the deployment to succeed, even if they are not necessary for the particular use-case.

Overall this configuration required integration between multiple components, some for the first time.


The Aptira Solution

The end deployment was complex since it required integrations between different components, some of which had not been attempted before. The following integrations were completed as part of the evaluation:

  • Integrated Cloudify as Service Orchestrator with Azure to deploy Clearwater vIMS
  • Integrated Cloudify as Service Orchestrator with ONAP to deploy the F5 vLB VNF
  • Deployed ONAP atop Kubernetes and integrated OpenStack (VIM) with ONAP to deploy F5 vLB VNFs
  • Adapted the existing Clearwater TOSCA blueprint to model and deploy the vIMS Telco service on Microsoft Azure using Cloudify
  • Modelled a service chaining blueprint to route the SIP traffic to Clearwater vIMS via the F5 load balancer
  • Orchestrated connections between the OpenStack and Azure Clouds to route the SIP traffic

A high-level design diagram (not reproduced here) detailed the multi-Cloud orchestration and service chaining.

Aptira deployed the Beijing release of ONAP onto an OpenStack Cloud run by the customer. Kubernetes and Helm were used to deploy ONAP and manage the deployment post-installation. This ONAP instance was configured with Cloudify to provision Virtual Network Functions (VNFs) onto an OpenStack Cloud, as well as Microsoft Azure.

The VNFs selected for this evaluation included:

  • The Open Source Clearwater IP Multimedia Subsystem (vIMS), which is used to deliver voice, video and multimedia services to mobile telephony users
  • A virtual load balancer from F5

We also demonstrated service chaining between F5 vLB VNF running on OpenStack and Clearwater vIMS running on Azure.


The Result

During the validation process, Aptira successfully:

  • Deployed and managed an F5 virtual load balancer on OpenStack and the Clearwater vIMS system on Microsoft Azure using Cloudify TOSCA blueprints and ONAP Artifacts
  • Registered multiple SIP clients and made calls between them
  • Triggered autoscaling of the virtual service elements to validate the ability of the configuration to handle Closed-loop automation based on increased traffic load

Now that the validation process is complete, the customer is able to orchestrate and service chain VNF workloads across multiple Clouds.


Take control of your Cloud.
Get a customised Cloud strategy today.

Learn More

The post Multi-Cloud Orchestration with Kubernetes, ONAP, Cloudify, Azure & OpenStack appeared first on Aptira.

by Aptira at July 18, 2019 01:37 PM

Mirantis

Quick tip: Enable nested virtualization on a GCE instance

There are times when you need to run a virtual machine -- but you're already ON a virtual machine.  Fortunately, it's possible, but you need to enable nested virtualization.
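If you want to try it before reading the full article, here is a minimal sketch of the approach Google documented at the time: build an image that carries the special VMX license, boot an instance from it, and check that the vmx CPU flag is visible inside the guest. The disk, image and instance names and the zone below are placeholders of our own, not values from the article.

gcloud compute disks create nested-disk --image-family centos-7 --image-project centos-cloud --zone us-central1-b
gcloud compute images create nested-image --source-disk nested-disk --source-disk-zone us-central1-b \
  --licenses "https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"
gcloud compute instances create nested-vm --image nested-image --zone us-central1-b --min-cpu-platform "Intel Haswell"
grep -cw vmx /proc/cpuinfo   # run inside the new VM; a non-zero count means nested virtualization is on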

by Nick Chase at July 18, 2019 02:39 AM

July 17, 2019

OpenStack Superuser

Report: Open-source object storage “more mainstream than ever”

It’s taken a while, but object storage has moved out of the margins. The roots of the computer data storage architecture that manages data as objects stretch back to 1994 — the same year eBay was founded and the DVD launched — but now it’s “becoming more mainstream than ever,” according to a recent GigaOm report.

Typically employed for second-tier storage, backup and long-term archives, now it’s supporting cloud-native workloads that require data to remain always and quickly accessible. The proliferation of devices and data streaming from them is a key driver in this change. The 12-page report titled “Enabling Digital Transformation with Hybrid Cloud Data Management,” outlines new use cases along with adding a few inevitable buzzwords.

Author Enrico Signoretti notes that even the most conservative execs are “now confidently building hybrid cloud infrastructures for multiple use cases.”

Some key examples:

  • Cloud bursting: leveraging the vast amounts of available computing power in the cloud for highly-demanding workloads and fast analysis, while keeping full control over data and paying only for the time required.
  • Cloud tiering: offloading cold data to the cloud to take advantage of the low $/GB while maintaining flexibility.
  • Business continuity (BC) and disaster recovery (DR): eliminating the expense of a secondary DR site without sacrificing data protection or infrastructure resiliency.
  • Advanced data management and governance: complying with increasingly demanding regional regulations while serving global customers.
  • The proliferation of edge services: supporting users, applications, and data generators that are pushing and pulling data to and from core and cloud infrastructures.

The report, sponsored by Scality, spells out the benefits of open source object storage.

Object storage with the right orchestration solution can manage huge amounts of data safely and cost-effectively, Signoretti says, making it accessible from everywhere and from every device.
An object storage solution should:

  • Allow data mobility across cloud and on-premises infrastructure to support the use cases described earlier
  • Possess strong security features
  • Provide advanced data management capabilities to enable both architectural and business flexibility.

While the report specifically examines the merits of Scality’s products Zenko and RING8, the characteristics outlined above apply to projects in the open-source object storage panorama such as Ceph, Swift and Minio. For more, you can read the full report here.

Cover image // CC BY NC

The post Report: Open-source object storage “more mainstream than ever” appeared first on Superuser.

by Superuser at July 17, 2019 02:01 PM

Aptira

System Integration

Aptira System Integration

Why is System Integration Important?

Building a solution out of one technology is generally not going to give you the best results. By integrating specific technologies to build a customised solution, you’re able to solve problems in new and innovative ways. In Engineering, System Integration is defined as the process of bringing together component sub-systems into one system. In our experience, integration extends to more than just technologies – it also involves bringing together many practices, occupations and organisational units, which previously were quite distinct and separate, into one discipline. It can be challenging, but the results are worth it.

We understand that you are probably looking for integration with your existing billing, monitoring and provisioning systems rather than having yet another system pushed on you. Your day-to-day processes can be streamlined, increasing efficiency and enabling you to focus on what’s actually important for your business.

System Integration Training

In order to help businesses successfully integrate new technologies into their organisations, we’ve specifically designed a course that will cover all the core features of Agile System Integration for Open Networking Projects. This course has been custom designed by our team, who have successfully integrated new technology solutions for some of Australia’s largest and best-known brands.

This course will enable attendees to understand the holistic end-to-end scope of complex technical projects and the pressures that these projects place on existing methodologies. Graduates of this course will be able to select the right tools, processes and operating paradigms to manage or participate in these projects and contribute to high levels of success. We’ll cover a range of topics, including:

  • What is Agile System Integration?
  • Why is Agile System Integration necessary?
  • What problems indicate that I need Agile System Integration?
  • Definitions and context
  • Scope of concerns
  • Stakeholder management
  • Dealing with “multi-everything”
  • Managing precision with uncertainty
  • Reconciling different viewpoints, processes and paradigms
  • Practical considerations

The course will be delivered in workshop format as a combination of lectures and other media. Attendees work individually and in groups, and will take part in practical exercises, enabling students to gain real life experience with System Integration. As we have designed this course from the experiences of our customers, we can fully customise the content to suit your requirements. Generally, this course can be completed in 3-5 days, however as with all of our training courses we can tailor this in order to focus on particular technologies, needs and learning outcomes.

Enabling System Integration

We can work alongside you to provide mentoring and lead your team to achieve your integration goals. Or if you’d rather someone else take care of the hard part for you, our team can develop an integration plan that will drive innovation and increase efficiency for your organisation. We can work with your existing infrastructure teams to get the desired level of integration between your new and existing systems. Each solution is comprehensive and unique to fit your requirements and can easily be integrated into your business workflow, resulting in reduced cost and complexity. Chat with a Solutionaut today for more info.

Become more agile.
Get a tailored solution built just for you.

Find Out More

The post System Integration appeared first on Aptira.

by Jessica Field at July 17, 2019 01:01 PM

July 16, 2019

OpenStack Superuser

Running an OpenStack cloud? Check out the next Operators Meetup

If you run an OpenStack cloud, attending the next Ops Meetup is a great way to trade best practices and share war stories.

Ops Meetups give people who run clouds a place to meet face-to-face, share ideas and give feedback. The vibe is more round table-working group-unconference, with only a small number of presentations. The aim is to gather feedback on frequent issues and work to communicate them across the community, offer a forum to share best practices and architectures and increase constructive, proactive involvement from those running clouds. Ops Meetups are typically held as part of the six-monthly design summit and also once mid-cycle.

This time around, it’ll be held September 3-4 in New York City, hosted by Bloomberg LP at the company’s super-central Park Avenue offices.

You still have time to influence the sessions, so check out the Etherpad. Current session topics include deployment tools, long-term support, RDO and TripleO, Ceph and the always popular tracks dedicated to architecture show-and-tell and war story lightning talks. (Stay tuned for ticket info.)

In the meantime, if you have questions or want to get involved, the Ops Meetup Team holds meetings Tuesdays at 10:00 a.m. EST (UTC -5) and welcomes items added to the meeting agenda on an Etherpad. You’ll also find folks on the #openstack-operators IRC channel or post to the unified OpenStack mailing list using the [ops] tag.

“It’s another way Stackers come together to keep the momentum going as a community, in between Summits, in smaller, focused groups,” says OSF COO Mark Collier.

His experience of attending one of the early ones? “I was really struck by the atmosphere: all focus, no flash…Real life superusers from companies like Comcast, Time Warner Cable, GoDaddy, Yahoo, Sony Playstation, Symantec, Cisco, Workday, IBM, Bluebox, Intel and Paypal made the time to attend and collaborate.”

And, more importantly, the operators came ready to work, to talk about the pain points they’d encountered running thousands of nodes, battles with upgrades and making tough configuration decisions.

For more on what to expect from an Ops Meetup, check out these reports from previous editions held in Manchester and Mexico City.

Photo // CC BY NC

The post Running an OpenStack cloud? Check out the next Operators Meetup appeared first on Superuser.

by Superuser at July 16, 2019 02:02 PM

Aptira

Replacing an OSD in Nautilus

Now that you’ve upgraded Ceph from Luminous to Nautilus, what happens if a disk fails or the administrator needs to convert from filestore to bluestore? The OSD needs to be replaced.

The OSD to be replaced was created by ceph-disk in Luminous. But in Nautilus, things have changed. The ceph-disk command has been removed and replaced by ceph-volume. By default, ceph-volume deploys OSD on logical volumes. We’ll largely follow the official instructions here. In this example, we are going to replace OSD 20.

On MON, check if OSD is safe to destroy:


[root@mon-1 ~]# ceph osd safe-to-destroy osd.20
OSD(s) 20 are safe to destroy without reducing data durability.

If yes on MON, destroy it:


[root@mon-1 ~]# ceph osd destroy 20 --yes-i-really-mean-it
destroyed osd.20

The OSD will be shown as destroyed:


[root@mon-1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 66.17834 root default
-7 22.05945 host compute-1
......
19 hdd 1.83829 osd.19 up 1.00000 1.00000
20 hdd 1.83829 osd.20 destroyed 0 1.00000
22 hdd 1.83829 osd.22 up 1.00000 1.00000

On the OSD node, after replacing the faulty disk, use perccli to create a new VD with the same sdX device name. Then zap it.


[root@compute-3 ~]# ceph-volume lvm zap /dev/sdl
--> Zapping: /dev/sdl
--> --destroy was not specified, but zapping a whole device will remove the partition table
Running command: /usr/sbin/wipefs --all /dev/sdl
Running command: /bin/dd if=/dev/zero of=/dev/sdl bs=1M count=10
stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied
stderr: , 0.0634846 s, 165 MB/s
--> Zapping successful for:

Find out the existing db/wal partitions used by the old OSD. Since the ceph-disk command is not available any more, I have written a script to show current mappings of OSD data and db/wal partitions.

Firstly, run “ceph-volume simple scan” to generate OSD json files in /etc/ceph/osd/. Then run this script.


[root@compute-3 ~]# cat ceph-disk-list.sh
#!/bin/bash
# For every OSD json file produced by "ceph-volume simple scan",
# print the OSD id, its data partition and the real db/wal devices.
JSON_PATH="/etc/ceph/osd/"
for i in `ls $JSON_PATH`; do
    OSD_ID=`cat $JSON_PATH$i | jq '.whoami'`
    DATA_PATH=`cat $JSON_PATH$i | jq -r '.data.path'`
    DB_PATH=`cat $JSON_PATH$i | jq -r '."block.db".path'`
    WAL_PATH=`cat $JSON_PATH$i | jq -r '."block.wal".path'`
    echo "OSD.$OSD_ID: $DATA_PATH"
    # resolve the by-partuuid symlinks to the underlying device nodes
    DB_REAL=`readlink -f $DB_PATH`
    WAL_REAL=`readlink -f $WAL_PATH`
    echo " db: $DB_REAL"
    echo " wal: $WAL_REAL"
    echo "============================="
done

It will show the mapping of existing Ceph OSDs (created by ceph-disk) to their db/wal partitions.


[root@compute-3 ~]# ./ceph-disk-list.sh
OSD.1: /dev/sdb1
db: /dev/nvme0n1p27
wal: /dev/nvme0n1p28
=============================
OSD.11: /dev/sdg1
db: /dev/nvme0n1p37
wal: /dev/nvme0n1p38
=============================
OSD.13: /dev/sdh1
db: /dev/nvme0n1p35
wal: /dev/nvme0n1p36
=============================
OSD.14: /dev/sdi1
db: /dev/nvme0n1p33
wal: /dev/nvme0n1p34
=============================
OSD.18: /dev/sdj1
db: /dev/nvme0n1p51
wal: /dev/nvme0n1p52
=============================
OSD.22: /dev/sdm1
db: /dev/nvme0n1p29
wal: /dev/nvme0n1p30
=============================
OSD.3: /dev/sdc1
db: /dev/nvme0n1p45
wal: /dev/nvme0n1p46
=============================
OSD.5: /dev/sdd1
db: /dev/nvme0n1p43
wal: /dev/nvme0n1p44
=============================
OSD.7: /dev/sde1
db: /dev/nvme0n1p41
wal: /dev/nvme0n1p42
=============================
OSD.9: /dev/sdf1
db: /dev/nvme0n1p39
wal: /dev/nvme0n1p40
=============================

Compare this list with the output of lsblk to find out free db/wal devices. Then create a new OSD with them:


[root@compute-3 ~]# ceph-volume lvm create --osd-id 20 --data /dev/sdl --bluestore --block.db /dev/nvme0n1p49 --block.wal /dev/nvme0n1p50
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new e795fd7b-df8d-48d7-99d5-625f41869e7a 20
Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e /dev/sdl
stdout: Physical volume "/dev/sdl" successfully created.
stdout: Volume group "ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e" successfully created
Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e
stdout: Logical volume "osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-20
Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-20
Running command: /bin/chown -h ceph:ceph /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a
Running command: /bin/chown -R ceph:ceph /dev/dm-2
Running command: /bin/ln -s /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a /var/lib/ceph/osd/ceph-20/block
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-20/activate.monmap
stderr: got monmap epoch 9
Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-20/keyring --create-keyring --name osd.20 --add-key AQD38iNdxf89GRAAO6HbRFcgCj6HSuyOsJRGeA==
stdout: creating /var/lib/ceph/osd/ceph-20/keyring
added entity osd.20 auth(key=AQD38iNdxf89GRAAO6HbRFcgCj6HSuyOsJRGeA==)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20/keyring
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20/
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p50
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p49
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 20 --monmap /var/lib/ceph/osd/ceph-20/activate.monmap --keyfile - --bluestore-block-wal-path /dev/nvme0n1p50 --bluestore-block-db-path /dev/nvme0n1p49 --osd-data /var/lib/ceph/osd/ceph-20/ --osd-uuid e795fd7b-df8d-48d7-99d5-625f41869e7a --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: /dev/sdl
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20
Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a --path /var/lib/ceph/osd/ceph-20 --no-mon-config
Running command: /bin/ln -snf /dev/ceph-e65e64a4-eeec-434f-a93c-82d3e2cfa51e/osd-block-e795fd7b-df8d-48d7-99d5-625f41869e7a /var/lib/ceph/osd/ceph-20/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-20/block
Running command: /bin/chown -R ceph:ceph /dev/dm-2
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-20
Running command: /bin/ln -snf /dev/nvme0n1p49 /var/lib/ceph/osd/ceph-20/block.db
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p49
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-20/block.db
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p49
Running command: /bin/ln -snf /dev/nvme0n1p50 /var/lib/ceph/osd/ceph-20/block.wal
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p50
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-20/block.wal
Running command: /bin/chown -R ceph:ceph /dev/nvme0n1p50
Running command: /bin/systemctl enable ceph-volume@lvm-20-e795fd7b-df8d-48d7-99d5-625f41869e7a
stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-20-e795fd7b-df8d-48d7-99d5-625f41869e7a.service to /usr/lib/systemd/system/ceph-volume@.service.
Running command: /bin/systemctl enable --runtime ceph-osd@20
Running command: /bin/systemctl start ceph-osd@20
--> ceph-volume lvm activate successful for osd ID: 20
--> ceph-volume lvm create successful for: /dev/sdl

The new OSD will be started automatically, and backfill will start.

For further information, check out the official Ceph documentation to replace an OSD. If you’d like to learn more, we have Ceph training available, or ask our Solutionauts for some help.

The post Replacing an OSD in Nautilus appeared first on Aptira.

by Shunde Zhang at July 16, 2019 05:12 AM

July 15, 2019

OpenStack Superuser

How to run a simple function with Qinling

In my previous post on Qinling, I showed how to get started with OpenStack’s function-as-a-service project. Here, I’ll explain how to use it with a very simple function and show what Qinling does behind the scenes from a high-level perspective.

High level OpenStack Qinling workflow (thanks draw.io)

The diagram above offers a quick look at what happens when a function is executed. First of all, the execution can be triggered by multiple clients such as GitHub, Prometheus, Mistral, your dear colleague, etc… But keep in mind that there aren’t really any limits about what can act as a trigger — a lollipop, your dog, E.T.

The sidecar container deployed within the pod is only used to download the code from the Qinling API; the /download endpoint is called by the runtime container via http://localhost:9091/download. You’ll find the runtime here.

A very simple function

Let’s use a very simple function to demonstrate Qinling basics. The function is written in Python, so you’ll need a Python runtime. (Stay tuned: runtimes will be the subject of an upcoming post.)

def main(**kwargs):
    print("Hello Qinling \o/")

Here we have a function named main() that prints “Hello Qinling \o/”. To interact with the Qinling API, a client is required; it could be python-qinlingclient, httpie or curl. I’m going with the easiest option: the official client, which I installed with pip.
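For reference, installing that client is a one-liner; python-qinlingclient is the name it is published under on PyPI:

pip install python-qinlingclient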

$ openstack function create --name func-hq-1 \
--runtime python3 --file hello_qinling.py --entry hello_qinling.main

The command is simple: I asked Qinling to create a function. hello_qinling.py is a simple file (not a package), and I used the python3 runtime to execute my function. Last but not least, the entry point tells Qinling how to enter the function, which in the example is hello_qinling.main (the file name and the function to execute in this file).

Get the function ID returned by the command.

Run, Qinling, run… and show me the output!

When a function is executed it should return a value, depending on how it has been coded, of course. Qinling can provide two different outputs/results:

  • return: Terminates and returns a value from a function
  • print: Displays a value to the standard output/console

Let’s execute the function with the ID from the function created above and see the result line.

$ openstack function execution create 484c45df-6637-4349-86cc-ecaa6041352e | grep result
| result           | {"duration": 0.129, "output": null}  |

By default, Qinling will display the return value under the output JSON keyword (yes, the result field is JSON formatted). Of course, if no return exists (as in our sample function) then the output value will be null. But then where will the print defined in our function be displayed?

Qinling provides a way to show messages printed during the function execution. The function execution returned an ID, which is required to display the function log.

$ openstack function execution log show 4a3cc6ae-3353-4d90-bae5-3f4bf89d4ae9
Start execution: 4a3cc6ae-3353-4d90-bae5-3f4bf89d4ae9
Hello Qinling \o/
Finished execution: 4a3cc6ae-3353-4d90-bae5-3f4bf89d4ae9

As expected, “Hello Qinling \o/” has been printed.

Your turn

Now that all the tools have been provided, let’s try a little test. What will be the result of running this function?

def main(**kwargs):
    msg = "Hello Qinling \o/"
    return msg

Just post the answer/output in the comments section and let’s see if you get the concept, if not then I failed!

About the author

Gaëtan Trellu is a technical operations manager at Ormuco. This post first appeared on Medium.

Superuser is always interested in open infra community topics, get in touch at editorATopenstack.org

Photo // CC BY NC

The post How to run a simple function with Qinling appeared first on Superuser.

by Gaëtan Trellu at July 15, 2019 02:02 PM

Aptira

Swinburne Ceph Upgrade and Improvement

Swinburne is a large and culturally diverse organisation. A desire to innovate and bring about positive change motivates their students and staff. The result is an institution that grows and evolves each year.


The Challenge

Aptira deployed SUSE Enterprise Storage 4 for Swinburne University a year ago. When SUSE Enterprise Storage 5 was released, Swinburne wanted to take advantage of its new features, such as CephFS and BlueStore, so they planned to upgrade to this latest version. The Ceph Filesystem (CephFS) is a POSIX-compliant filesystem that uses a Ceph Storage Cluster to store its data.

BlueStore is a new back-end object store for the OSD daemons. The original object store, FileStore, requires a file system on top of raw block devices; objects are then written to the file system. BlueStore does not require a file system; it stores objects directly on the block device. Thus BlueStore provides a high-performance backend for OSD daemons in a production environment.

As a SUSE partner, we were called in to help them upgrade their existing Ceph storage system and expand it with more storage nodes.

Moreover, Swinburne was concerned about an emerging problem with their Netapp instance, which was soon to be out of warranty and therefore needed to be decommissioned, with its data migrated from Netapp to Ceph.


The Aptira Solution

Aptira’s solution was to upgrade and expand the storage cluster to Storage 5 using DeepSea, which was the tool originally used to deploy their Storage 4 cluster. DeepSea is a tool developed by SUSE, based on SaltStack, for deploying, managing and automating Ceph.

We set up an environment on our own lab equipment to simulate their environment and test the full upgrade process. This included installing the SUSE storage environment and corresponding services, such as SUSE Manager and a local SMT server, on Aptira’s own infrastructure (“cloud-q”). On this test environment the full upgrade process was tested and proven to work.

For Swinburne’s production system upgrade, the base OS of each node was upgraded to SLES SP3 and SUSE Enterprise Storage 4.0 was upgraded to version 5.5. At the same time we expanded both Ceph clusters: nine additional OSD nodes were added to each cluster, bringing the total storage of each cluster to 3 PB.

In addition, Aptira deployed NFS and Samba services on top of CephFS, in order to provide a seamless transition from Netapp to Ceph.

Samba is also integrated with Swinburne’s Windows Active Directory (AD), so users can easily access Ceph storage with their own Windows credentials. Since DeepSea does not allow fully customised installation for Samba and NFS, Aptira have written Ansible playbooks to install Samba and NFS.


The Result

Swinburne’s storage was successfully upgraded to SUSE Storage 5 and their node expansion has been successfully completed as well. Samba and NFS are running as gateways to CephFS.

Performance tests show that they are performing well and the users are satisfied. An as-built document was written and handed to Swinburne to conclude this project.


Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Swinburne Ceph Upgrade and Improvement appeared first on Aptira.

by Aptira at July 15, 2019 01:14 PM

Stackmasters team

Mastering OpenStack as a Stackmasters intern

I was looking for an opportunity to gather practical knowledge of OpenStack and Ansible. As a final-year student and a member of the CONSERT lab at the University of West Attica, I had already had some hands-on exposure to cloud technology. And since Stackmasters is heavily involved in cloud management and OpenStack in particular, I chose to apply to them for my internship. A Stackmasters intern, then!

Mastering OpenStack as a Stackmasters intern

Joining the team as a Stackmasters intern, I anticipated a typical environment like most companies in the Greek labor market offer. But I was quickly proven wrong. Stackmasters, being part of the Starttech Ventures portfolio of companies, is very different. The friendly atmosphere makes you feel at home from day one, and the open mindset of the people I had the chance to meet and work with makes for a unique working environment.

Experimenting with the Cloud

Starting with Ansible

My major goal for my three-month stay as a Stackmasters intern was to deepen my knowledge of cloud technology!
At first, I started working with Ansible. Having some experience beforehand, I realized very quickly that there is much more to get from such a tool. Ansible is a simple, agent-less IT automation tool that helps you automate common or even more complex jobs. With Ansible you are able to automate tasks across a computing environment, such as provisioning, deployment, or even changing the behavior of services and resources. Without Ansible you have to tackle such tasks manually or with a bunch of scripts, along with whatever consequences in terms of delays, maintenance and human mistakes that method brings about.

My first project was to understand Ansible’s best practices for structuring work into tasks, so that I would get the hang of it. And that’s exactly what I did. Following my mentor Thanassis’ directions, I started with the basics, and I think I can proudly say that I finally managed to create a playbook that automatically handles the installation and configuration of an Apache server.

OpenStack Services’ turn

While setting up an Apache server felt like a solid start in the first few days, I was soon thirsty for a bigger challenge!
And so, after completing my first project, I moved on to OpenStack. I was keen to gain expertise in OpenStack deployments and the management of such environments. The first step in this journey was to study the OpenStack documentation guide in order to manually deploy a small lab, with its core services running on two virtual machines.
And the learning goal? To understand the architecture of a cloud with OpenStack services — Keystone, Glance, Nova, Neutron, Horizon, Swift — as its components, and to comprehend how each of those services interconnects and contributes to the features it provides.

OpenStack-Ansible

I have to admit, I got frustrated with the complexity of such a system, and I was pretty unsure whether I could see the project through successfully.
The team at Stackmasters helped me understand the OpenStack architecture in great detail, and what options I had to move forward with my project. I got acquainted with a few OpenStack projects developed by the community to ease the pain of management tasks such as deployment. Kolla and OpenStack-Ansible (aka OSA) were the next things I checked, and it felt natural to opt for OSA.

Then, preparing and applying an OpenStack deployment became easier using OSA. As a next step, I practiced upgrades to the existing OpenStack installation.
Mission accomplished! I had gained a good understanding of how things are run when it comes to OpenStack!

Wrapping up my internship on Cloud Management, as a Stackmasters intern

I have to say, I have had the chance to gain great experiences in OpenStack. Mostly, I got lessons from fantastic professionals in Stackmasters. And I met interesting people at Starttech Ventures.

Thanassis Parathyras, CTO at Stackmasters, helped me into a smooth start and guided me, so that I could gradually delve into the concepts of OpenStack technology and community. Stackmasters team were very helpful for those three months.

The whole experience was definitely of benefit for me. Not only at a professional level, but also at a more personal one. For these reasons, I am confident that the skills I’ve gained as a Stackmasters intern, will have a major contribution to my future career development.

As a final point, what I’d definitely recommend to young IT and Computer Science graduates, is this:

Grab internship opportunities, get hands-on experience and explore how brilliant the Greek startup ecosystem is.


The post Mastering OpenStack as a Stackmasters intern appeared first on Stackmasters.

by Nikos Kaftantzis at July 15, 2019 09:46 AM

July 12, 2019

OpenStack Superuser

The ABCs of open-source license compliance

With open source software ubiquitous and irreplaceable, setting up a license compliance and procurement strategy for your business is indispensable. No software engineer I know wants to voluntarily talk about open source compliance, but avoiding those conversations can lead to a lot of pain. Remember the litigation for GPL-violations with D-Link, TomTom and many more in the early 2000s?

It’s better to keep in mind open-source license compliance from the early stages of development when creating a product: you want to know where all its parts are coming from and if they’re any good. Nobody thinks they will be asked for the bill of material for their software product until they are.

“Open source compliance is the process by which users, integrators and developers of open source software observe copyright notices and satisfy license obligations for their open source software components” — The Linux Foundation

Objectives for open source software (OSS) compliance in companies:

  • Protect proprietary IP
  • Facilitate the effective use of open source software
  • Comply with open source licensing
  • Comply with the third-party software supplier/customer obligations

What’s a software license, anyway?

Put very simply, a software license is a document that states what users are permitted to do with a piece of software. Open source software (OSS) licenses are licenses that the Open Source Initiative (OSI) has reviewed for respecting the Open Source Definition. There are approximately 80 open source licenses (OSI maintains a list, and so does the Free Software Foundation, although the latter calls them “free software” licenses), split between two larger families:

  • So-called “copyleft” licenses (GPLv2 and GPLv3) are designed to guarantee users long-term freedoms and make it harder to lock the code into proprietary/non-free applications. The most important clause in these is that if you want to modify software under a copyleft license, you have to share the modifications under a compatible license.
  • Permissive/BSD-like open source licenses guarantee the freedom to use the source code, modify it and redistribute it, including as part of a proprietary product (for example MIT, Apache).

Despite the variety of licenses, companies sometimes invent new ones, modify them with new clauses and apply them to their products. This creates even more confusion among engineers. If your company’s looking to use open source software, tracking and complying with every open source license and hybrids can be a nightmare.

Establish an open-source license compliance policy

The goal is to have a full inventory of all the open source components in use and their  dependencies. It should be clear that there are no conflicts between licenses, all clauses are met and necessary attributions to the authors are made.

Whether you have an open source project using other open source components, or a proprietary project using open source components, it is important to establish a clear policy regarding OSS compliance. You want to create a solid, repeatable policy to outline what licenses are acceptable for your specific project.

Ways to execute OSS compliance

Manual

A surprising number of companies are still using this approach.  Basically, you create a spreadsheet and manually fill it out with components, versions, licenses and analyze it against your policy.

This works out well for smaller projects if they established a compliance policy (list of licenses or clauses acceptable in the company) from the beginning to spare themselves trouble in the future. In this scenario, every developer must review and log a software’s license before introducing the open source component.

The downside of this approach is that as the quantity of OSS components in the project grows, it becomes more difficult to keep track of the relationships between licenses (whether they all work together or there are conflicts). It is vital to list dependencies as well, since a dependency might have a different license than the library you are actually using.

Semi-automated

This is a more reliable approach and is becoming more popular as the importance of open source compliance practices grows, along with the risks of ignoring them. There are so many tools available that the prospect of automating can feel overwhelming. Why semi-automated? Because there are always false positives if the license is not explicitly referenced in the header, and you still have to read through some of them to discover special terms or conditions. (A small command-line sketch follows the list of approaches below.)

Of the tools I’ve seen, there are four main approaches:

  1. File scanners – these use all sorts of heuristics to detect licenses or components that would otherwise be missed by developers, and typically offer different formats for the output.
  2. Code scanners – exactly what it sounds like. You can use them periodically to check for new open-source components.
  3. Continuous integration (CI) scanners – these tools work with continuous integration or build tools. This will automatically detect all open-source components in the code every time you run a build. The idea is to create a unique identifier for each open-source component in the build and reference it against a database of existing components. You can also set policies to break the build if a blacklisted license is found.
  4. Component identification tools – these tools can help you produce a software bill-of-material (SBOM), the list of OSS components in your product.
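As a small, hedged illustration of the scanner idea for a Python project (using the third-party pip-licenses tool as an example of our own choosing, not something named in the report), you can dump an inventory of installed packages and their declared licenses and then check it against your policy:

pip install pip-licenses
pip-licenses --with-urls --format=csv > oss-inventory.csv  # one row per component: name, version, license, URL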

A good place to start? The tools highlighted by the Open Compliance Program, a Linux Foundation initiative.

Conclusions

For smaller projects, fully manual tracking might be sufficient to achieve license compliance. For more complex projects, especially the ones built in an agile style with regular releases, automation is better. Whichever way you choose to handle OSS compliance, don’t ignore it for the sake of your project and sustaining the open-source community.

Dasha Gurova is the technical community manager at Scality, where a version of this post first appeared in the forum.

Superuser is always interested in open infrastructure community content, get in touch: editorATopenstack.org

Photo // CC BY NC

The post The ABCs of open-source license compliance appeared first on Superuser.

by Superuser at July 12, 2019 02:06 PM

Aptira

Upgrading Ceph from Luminous to Nautilus

Ceph Nautilus was released earlier in the year and it has many new features.

In CentOS, the Ceph Nautilus packages depend on CentOS 7.6 packages. Since we are using local YUM repo mirrors, we needed to download all CentOS 7.6 RPMs and Ceph Nautilus RPMs to our local YUM repo servers, and then update yum configs in all Ceph nodes to CentOS 7.6 and Ceph Nautilus.

Next we just followed the instructions in the official Ceph documentation. The process is as follows (a hedged command sketch for the MON/OSD and MDS steps appears after the MDS list below):

Upgrade Ceph MONs:

  • yum update ceph mon packages
  • restart ceph mon service one by one

Upgrade Ceph Mgrs:

  • yum update ceph mgr packages
  • restart ceph mgr service one by one

Upgrade Ceph OSDs:

  • yum update ceph osd packages
  • restart ceph osd service one by one

Upgrade Ceph MDSs:

  • reduce the number of MDS ranks (max_mds) to 1
  • stop all standby mds services
  • restart the remaining active mds service
  • start all other mds services
  • restore original value of max_mds
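As referenced above, here is a hedged command sketch of those rolling steps for a CentOS cluster. The package and systemd unit names, the cephfs filesystem name and the placeholders in angle brackets are illustrative and should be adapted to your own cluster; wait for the cluster to report healthy before moving from one host to the next.

# On each MON host (one at a time), then each Mgr host, then each OSD host:
yum update -y ceph-mon             # ceph-mgr / ceph-osd on the respective hosts
systemctl restart ceph-mon.target  # ceph-mgr.target / ceph-osd.target respectively
ceph -s                            # confirm quorum and health before the next host

# The MDS "ranks dance", assuming a single filesystem named cephfs:
ceph fs set cephfs max_mds 1
systemctl stop ceph-mds@<standby-name>       # stop every standby MDS
systemctl restart ceph-mds@<active-name>     # restart the remaining active MDS
systemctl start ceph-mds@<standby-name>      # bring the standbys back
ceph fs set cephfs max_mds <original-value>  # restore the original number of ranks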

Upgrade Ceph RADOSGW:

  • update and restart radosgw

Update CRUSH buckets:

  • switch any existing CRUSH buckets to straw2


ceph osd getcrushmap -o backup-crushmap
ceph osd crush set-all-straw-buckets-to-straw2

Enable V2 Network Protocol:

  • enable v2 network protocol using “ceph mon enable-msgr2”

Configure the Ceph Dashboard:

Ceph’s dashboard has been changed to enable SSL by default, so it will not work without certificates. You will need to either create certificates, or disable SSL. Since our Ceph is running in an internal network, we disabled SSL using the following command:


ceph config set mgr mgr/dashboard/ssl false

For more details about creating certificates, see the dashboard documentation.

We also enabled the Prometheus plugin in Ceph Mgr to collect metrics. In order to enable the plugin, simply run this command:


ceph mgr module enable prometheus

Then configure the scrape targets and some rules in Prometheus to scrape data from Ceph. More details can be found here.
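As a hedged sketch of what that looks like (the job name and target host below are placeholders; 9283 is the default port the ceph-mgr prometheus module listens on), the scrape job in prometheus.yml is just:

scrape_configs:
  - job_name: ceph
    honor_labels: true
    static_configs:
      - targets: ['ceph-mgr-host:9283']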

After the scraping is working, you can use Grafana to visualise the metrics. Grafana Dashboards can be installed from RPMs or downloaded from its github repository.

For more information, check out the official Ceph documentation, as it describes how to upgrade from Luminous in great detail. If you’d like to learn more, we have Ceph training available, or ask our Solutionauts for some help.

Keep your data in safe hands.
See what we can do to protect and scale your data.

Secure Your Data

The post Upgrading Ceph from Luminous to Nautilus appeared first on Aptira.

by Shunde Zhang at July 12, 2019 01:50 PM

Chris Dent

Eight Hour Day Update

Since Denver in early May I've been running a timer to limit my work day to eight hours. I've stuck to it pretty well, long enough that I have a few observations.

Some of the expected positive outcomes are there: I have more time to attend to non-work tasks like feeding myself, getting a bit more exercise, and making plans and actions for things around my home.

But there are some negatives which suggest further effort is required.

The one that is on my mind today is that with only eight hours of continuous work in a day, it's been difficult to get anything of substance done. Not that I'm getting nothing done. Rather, the time I have available is mostly consumed by reputedly urgent requests—that are initially small but turn out not to be—from co-workers both internal to $EMPLOYER and in the OpenStack community.

In the past I would attend to these requests, clear them off the plate, and then do what I felt to be "the real work" (which could be defined, vaguely, as "improving OpenStack for the long term").

Now that I have a time limit, I rarely get to the point where I have a clean plate. If I do there's not enough time to gain the focus and flow required to do "the real work". As a result, my day to day satisfaction is poor.

I can think of two strategies for resolving this, but both will be difficult to integrate with the social mores of my daily environments:

  • Reserve entire days without attention to IRC, Slack and perhaps even email. That is, avoid interruption by being unreachable. This will be hard to do. The majority of the population in these environments is addicted to 24 hour synchronous communication and expects the same from others. People get used to it in some kind of bizarre form of Stockholm Syndrome and synchronous becomes the only reliable way to reach them.

    24 hours, seven days a week contact is not how work is supposed to work. If you are a member of a team and you work this way, you are effectively encouraging, and in some cases requiring, other members of your team to work the same way.

  • Enforce a queuing mechanism. Don't let other people turn me into a stack on which they can put themselves.

These are both hard because there is always a perceived urgency. Sometimes real, sometimes not. Denying or resisting that is easily perceived as rude or unhelpful.

One of the reasons I write the placement updates is to make it clear there is a queue of placement-related work for people who either cannot or do not want to be a part of the synchronous flow of information.

I need more tools like that.

I'm not going to go back to greater than eight hour days. Until I find some better ways to manage tasks and inputs my apologies (to me and to you) for not getting the good stuff done.

by Chris Dent at July 12, 2019 12:00 PM

Placement Update 19-27

Pupdate 19-27 is here and now.

Most Important

Of the features we planned to do this cycle, all are done save one: consumer types (in progress, see below). This means we have a good opportunity to focus on documentation, performance, and improving the codebase for maintainability. You do not need permission to work on these things. If you find a problem and know how to fix it, fix it. If you are not sure about the solution, please discuss it on this email list or in the #openstack-placement IRC channel.

This also means we're in a good position to help review changes that use placement in other projects.

The Foundation needs to know how much, if any, Placement time will be needed in Shanghai. I started a thread and an etherpad.

What's Changed

  • The same_subtree query parameter has merged as microversion 1.36. This enables a form of nested provider affinity: "these providers must all share the same ancestor".

  • The placement projects have been updated (pending merge) to match the Python 3 test runtimes for Train community wide goal. Since the projects have been Python3-enabled from the start, this was mostly a matter of aligning configurations with community-wide norms. When Python 3.8 becomes available we should get on that sooner than later to catch issues early.

Specs/Features

All placement specs have merged. Thanks to everyone for the frequent reviews and quick followups.

Some non-placement specs are listed in the Other section below.

Stories/Bugs

(Numbers in () are the change since the last pupdate.)

There are 23 (0) stories in the placement group. 0 (0) are untagged. 3 (1) are bugs. 5 (0) are cleanups. 11 (0) are rfes. 4 (0) are docs.

If you're interested in helping out with placement, those stories are good places to look.

osc-placement

osc-placement is currently behind by 11 microversions.

Main Themes

Consumer Types

Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting.

Cleanup

Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things.

As mentioned last week, one of the important cleanup tasks that is not yet in progress is updating the gabbit that creates the nested topology that's used in nested performance testing. The topology there is simple, unrealistic, and doesn't sufficiently exercise the several features that may be used during a query that desires a nested response. This needs to be done by someone who is more closely involved with real-world use of nested providers than me. efried? gibi?

Another cleanup that needs to start is satisfying the community wide goal of PDF doc generation. Anyone know if there is a cookbook for this?

Other Placement

Miscellaneous changes can be found in the usual place.

There are three os-traits changes being discussed. And one os-resource-classes change.

Other Service Users

New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed.

End

A colleague suggested yesterday that the universe doesn't have an over subscription problem, rather there's localized contention, and what we really have is a placement problem.

by Chris Dent at July 12, 2019 09:56 AM

July 11, 2019

OpenStack Superuser

How to test your developer workflow with TripleO

In this post we’ll see how to use TripleO for developing and testing changes in OpenStack Python-based projects.

Even though Devstack remains a popular tool, it’s not the only one that can handle your development workflow.

TripleO wasn’t just built for real-world deployments but also for developers working on OpenStack related projects, like Keystone for example.

Let’s say the Keystone directory where I’m writing code is in /home/emilien/git/openstack/keystone.

Now I want to deploy TripleO with that change and my code in Keystone. For that I’ll need a server (or a virtual machine) with at least 8GB of RAM, 4 vCPUs, 50GB of disk, and CentOS 7 or Fedora 28 installed.

First, prepare the repositories and install python-tripleoclient:

sudo yum install -y git python-setuptools
git clone https://github.com/openstack/tripleo-repos
cd tripleo-repos
python setup.py install
tripleo-repos current # use -b if you're deploying a stable version
sudo yum install python-tripleoclient

If you’re deploying on recent Fedora or RHEL8, you’ll also need to install python3-tripleoclient.

Now, let’s prepare your environment and deploy TripleO:



# Change the IP you have on your host
export IP=192.168.33.20
export NETMASK=24
export INTERFACE=eth1

# cleanup
rm -f $HOME/containers-prepare-parameters.yaml $HOME/standalone_parameters.yaml

cat <<EOF > $HOME/containers-prepare-parameters.yaml
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      name_prefix: centos-binary-
      name_suffix: ''
      namespace: docker.io/tripleomaster
      neutron_driver: ovn
      tag: current-tripleo
    tag_from_label: rdo_version
  - push_destination: true
    includes:
    - keystone
    modify_role: tripleo-modify-image
    modify_append_tag: "-devel"
    modify_vars:
      tasks_from: dev_install.yml
      source_image: docker.io/tripleomaster/centos-binary-keystone:current-tripleo
      python_dir:
        - /home/emilien/git/openstack/keystone
EOF

cat <<EOF > $HOME/standalone_parameters.yaml
parameter_defaults:
  CloudName: $IP
  ControlPlaneStaticRoutes: []
  Debug: true
  DeploymentUser: $USER
  DnsServers:
    - 1.1.1.1
    - 8.8.8.8
  DockerInsecureRegistryAddress:
    - $IP:8787
  NeutronPublicInterface: $INTERFACE
  # domain name used by the host
  NeutronDnsDomain: localdomain
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: $HOME
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  NovaComputeLibvirtType: qemu
EOF

sudo openstack tripleo deploy \
  --templates \
  --local-ip=$IP/$NETMASK \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  --output-dir $HOME \
  --standalone

Note: adjust the YAML to your own needs. If you need more help on how to configure Standalone, please check out the official manual.

Now, let’s say your code needs a change and you need to retest it. Once you’ve modified your code, just run:

sudo buildah copy keystone /tmp/keystone /tmp/keystone
sudo podman exec -it -u root -w /tmp/keystone keystone python setup.py install
sudo systemctl restart tripleo_keystone 
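
Optionally, as a quick sanity check (not part of the original workflow), you can verify the container came back up and look at its recent logs:

# Optional: confirm the keystone container restarted and inspect its logs.
sudo podman ps --filter name=keystone
sudo podman logs --tail 20 keystone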

At this stage, if you need to test a review that’s already pushed in Gerrit and you want to run a fresh deployment with it, here’s how you do it:

# Change the IP on your host
export IP=192.168.33.20
export NETMASK=24
export INTERFACE=eth1
# cleanup
rm -f $HOME/containers-prepare-parameters.yaml $HOME/standalone_parameters.yaml

cat <<EOF > $HOME/containers-prepare-parameters.yaml
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      name_prefix: centos-binary-
      name_suffix: ''
      namespace: docker.io/tripleomaster
      neutron_driver: ovn
      tag: current-tripleo
    tag_from_label: rdo_version
  - push_destination: true
    includes:
    - keystone
    modify_role: tripleo-modify-image
    modify_append_tag: "-devel"
    modify_vars:
      tasks_from: dev_install.yml
      source_image: docker.io/tripleomaster/centos-binary-keystone:current-tripleo
      refspecs:
        - project: keystone
          refspec: refs/changes/46/664746/3
EOF

cat <<EOF > $HOME/standalone_parameters.yaml
parameter_defaults:
  CloudName: $IP
  ControlPlaneStaticRoutes: []
  Debug: true
  DeploymentUser: $USER
  DnsServers:
    - 1.1.1.1
    - 8.8.8.8
  DockerInsecureRegistryAddress:
    - $IP:8787
  NeutronPublicInterface: $INTERFACE
  # domain name used by the host
  NeutronDnsDomain: localdomain
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: $HOME
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  NovaComputeLibvirtType: qemu
EOF

sudo openstack tripleo deploy \
  --templates \
  --local-ip=$IP/$NETMASK \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  --output-dir $HOME \
  --standalone
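
Once the deployment finishes you can point a client at it; assuming the standalone deploy wrote a clouds.yaml entry named standalone (the default), something like:

# Quick post-deploy check: list endpoints and request a token from the
# freshly built Keystone container.
export OS_CLOUD=standalone
openstack endpoint list
openstack token issue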

I hope these tips help you understand how to test any OpenStack Python-based project in a painless way, and pretty quickly. In my environment, the whole deployment takes less than 20 minutes.

About the author

Emilien Macchi, software engineer at Red Hat, describes himself as a French guy hiding somewhere in Canada. This post was first published on his blog.

Superuser is always interested in community content; get in touch: editorATopenstack.org

Photo // CC BY NC

The post How to test your developer workflow with TripleO appeared first on Superuser.

by Emilien Macchi at July 11, 2019 02:05 PM

Aptira

Lifecycle and Operational Management Orchestration

An international Telecommunications provider requires an Orchestration solution to perform a range of lifecycle and operational management functions.


The Challenge

One of our overseas customers has requested help with an orchestration solution, and had quite a list of requirements that this solution needed to address, including:

  • Deploy and configure Juniper vSRX as vCPE (Virtual Customer Premises Equipment) on the customer's Cloud (OpenStack)
  • Create a Multiprotocol Label Switching Layer 2 Virtual Private Network (MPLS L2vPN) between the vCPE and routers in the data center
  • Provision bandwidth-on-demand for the L2VPN tunnel, so the required bandwidth can be updated with zero downtime
  • Monitor the vCPE operational performance
  • Auto healing vCPE in case of failure
  • Autoscaling up/down depending on the load

In addition to their long list of requirements, being located overseas meant that we were required to operate remotely and across multiple time zones.


The Aptira Solution

In order to meet these specific requirements, Aptira proposed a solution based on the Cloudify Service Orchestrator product which has capabilities to provision both Virtual Network Functions (VNFs) and Physical Network Functions (PNFs).

Using Cloudify, we were able to:

  • Develop TOSCA templates to:
    • Provision the vCPE (Juniper vSRX) on customer OpenStack Cloud
    • Configure the Cisco switches and routers
    • Create an L2VPN tunnel between the on-premise Cisco routers and vCPE (Juniper vSRX) present on the customer's OpenStack Cloud
  • Develop a custom Cloudify workflow to provision bandwidth on demand for MPLS L2vPN
  • Implement TOSCA templates for auto-scaling and auto-healing vCPE

As the client did not have their own lab for us to test this solution on, and given the restrictions of operating remotely, we developed the solution in our own internal lab.


The Result

Having implemented the fully-configured Cloudify solution to meet the requirements above, the Telecommunications provider is now able to successfully provision vCPE (Juniper vSRX) on their OpenStack Cloud, as well as provision the PNF (Cisco router and switch) configuration in their data center. In addition to this, a L2vPN tunnel has been created between the vCPE and PNF.

We have tested the L2VPN in both Port mode and VLAN mode, and also tested multiple scenarios of a production environment, including bandwidth on demand, auto scale up/down and auto healing.


How can we make OpenStack work for you?
Find out what else we can do with OpenStack.

Find Out Here

The post Lifecycle and Operational Management Orchestration appeared first on Aptira.

by Aptira at July 11, 2019 01:24 PM

July 10, 2019

Mirantis

C’mon! OpenStack ain’t that tough

OpenStack is still viewed as difficult to install and administer, but it's not that tough -- especially after you’ve taken OpenStack training and hands-on lab exercises.

by Paul Quigley at July 10, 2019 02:42 PM

OpenStack Superuser

Inside open infrastructure: The latest from the OpenStack Foundation

Welcome to the latest edition of the OpenStack Foundation Open Infrastructure newsletter, a digest of the latest developments and activities across open infrastructure projects, events and users. Sign up to receive the newsletter and email community@openstack.org to contribute.

Spotlight: Upstream investment opportunities launch with Glance

It takes a global village to develop the OpenStack open infrastructure platform. From time to time, the community identifies activities where additional volunteers could make a substantial impact. OpenStack recently started to revamp the help wanted list with a new process to better underscore the investment opportunities available upstream.

The first request for more brainpower comes from the Glance team. As OpenStack’s disk image management service, Glance provides a crucial component for a vast majority of deployments. Its source code is maintained by a small but dedicated team looking to expand their ranks to take on some additional challenges in upcoming development cycles.

Collaboration with a focused team like Glance can be a rewarding experience even for seasoned developers and provides a platform for newcomers to grow professionally due to its central nature and interdependence with other services. Working on a project like this also enhances an organization’s understanding of OpenStack (both the software and the people who come together to produce it) and can be instrumental in improving its own efficacy in the broader ecosystem.

Assistance is especially appreciated with code review, bug triage and fixes, development of new features, and bringing the software in line with emerging standards. If you or your employer want to help with Glance, please see the Glance contributors upstream investment opportunity for details on how to get involved.

OpenStack Foundation news

  • The OpenStack Foundation joined the Open Source Initiative as an affiliate member. This provides a unique opportunity to work together to identify and share resources that foster community and facilitate collaboration to support the awareness and integration of open-source technologies.

Open Infrastructure Summit Shanghai and Project Teams Gathering (PTG)

  • Registration is open. Summit tickets also grant access to the PTG. You can pay in U.S. dollars, or in yuan if you need an official invoice (fapiao).
  • If your organization can’t fund your travel, apply for the Travel Support Program by August 8.
  • If you need a travel visa, get started now: Information here.
  • Put your brand in the spotlight by sponsoring the Summit: Learn more here.
  • Next week, we’ll be sending surveys to teams about their participation at the PTG.

OpenStack Foundation project news

OpenStack

  • The upcoming OpenStack release, Train, is planned to arrive October 16. But what about the name of the release after that? The release naming process calls for a moniker starting with the letter U, ideally related to a geographic feature close to Shanghai, China. Post your picks on the release naming Wiki page.
  • A great way to get involved in our community is to help run community elections. If you’re interested in helping for the next round in September, reach out to the current election officials.
  • The 2019 OpenStack User Survey is currently open. If you’re operating OpenStack, please share your deployment choices and feedback by August 22.

Airship

  • The first Airship Technical Committee elections are underway. With six nominations from six companies, the elections reflect how much the Airship community has grown since the project launch, just over a year ago. Polls close on July 9.
  • Drawing on the project’s telecom roots, the Airship community has proposed a new OPNFV project for an infrastructure deployment and lifecycle management tool to provide cloud and NFV infrastructure to support VNF testing and certification. The project already has a large cross-industry contributor base and plans to land a first release for fall 2019.

StarlingX

  • The community has been working hard on the second release of the project and recently reached their third milestone. This means that the release timeline is on track and the community is focusing on bug fixes, testing and a few features that were granted exceptions so they can still fit into the release.
  • The community has also started planning for the third release, which will happen later this year and will include features such as the Train version of the OpenStack services, Time Sensitive Networking, and Redfish support.

Zuul

  • Zuul 3.9.0 is released. Ansible 2.8 support has been added to Zuul for job execution. Pipelines may be configured to “fail fast”, stopping a buildset and reporting its results after the first failure. More details in the release notes.
  • Nodepool 3.7.0 launched. A new driver supporting unprivileged OpenShift clusters has been added. Improvements to networking and host key management have been added to the OpenStack driver. More details can be found in Nodepool’s release notes.
  • Join the Zuul community at AnsibleFest in Atlanta, September 24-26.

Upcoming Open Infrastructure Community Events

July

August

September

October

November

  • OSF reception on Monday, November 18 at the Hilton Bayfront Hotel
  • OSF booth

Questions / feedback / contribute

This newsletter is written and edited by the OpenStack Foundation staff to highlight open infrastructure communities. We want to hear from you!
If you have feedback, news or stories that you want to share, reach us through community@openstack.org. To receive the newsletter, sign up here.

The post Inside open infrastructure: The latest from the OpenStack Foundation appeared first on Superuser.

by OpenStack Foundation at July 10, 2019 02:03 PM
